When writing code, developers often need to fill gaps in an incomplete program, and they make mistakes when fitting new pieces into existing snippets or structures. The difficulty lies in making the new code agree with both what precedes it and what follows it, especially when the broader context is not taken into account. In recent years, Fill-in-the-Middle (FIM) has become integral to code language models, enabling them to generate missing code given both the left and right contexts. Current FIM training works by reordering code sequences and then applying next-token prediction (NTP) to fill the gap. FIM also requires planning capabilities, and a lack of planning can hinder prediction of the missing code.
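The reordering step can be sketched as follows. This is a minimal illustration, not the setup of any particular model: the sentinel strings and the helper names `to_fim_example` and `to_fim_prompt` are hypothetical, and the prefix-suffix-middle ordering is assumed as one common arrangement.

```python
# Hypothetical sentinel strings; real code models define their own special tokens.
PRE, SUF, MID, EOT = "<fim_prefix>", "<fim_suffix>", "<fim_middle>", "<eot>"


def to_fim_example(code: str, start: int, end: int) -> str:
    """Rearrange a document so the middle span comes last.

    The span code[start:end] is treated as the missing middle. Placing it
    after the prefix and suffix lets ordinary left-to-right next-token
    prediction learn to generate it with both contexts visible.
    """
    if not 0 <= start <= end <= len(code):
        raise ValueError("span must lie inside the document")
    prefix, middle, suffix = code[:start], code[start:end], code[end:]
    return f"{PRE}{prefix}{SUF}{suffix}{MID}{middle}{EOT}"


def to_fim_prompt(prefix: str, suffix: str) -> str:
    """Build the inference-time prompt; the model continues after MID."""
    return f"{PRE}{prefix}{SUF}{suffix}{MID}"


if __name__ == "__main__":
    code = "def add(a, b):\n    return a + b\n"
    start = code.index("return")
    end = code.index("\n", start)
    print(to_fim_example(code, start, end))
```

At inference time the model sees only the output of `to_fim_prompt` and must decide for itself when the middle is complete, which is where the planning problem discussed here arises.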
Current FIM methods rely mainly on reordering training sequences and performing next-token prediction (NTP) to estimate the missing code. However, they do not work well in real-world coding scenarios because they depend on strict rules, such as generating exactly the number of lines present in the original code. Model performance on FIM tasks deteriorates significantly without these unrealistic assumptions. Standard NTP training does not efficiently prepare models for this long-horizon planning task. Consequently, models often struggle to maintain coherence over the longer sequences required in FIM, particularly when approaching the transition to the right context. The underlying hypothesis is that NTP alone does not teach models to plan well enough around the distant code that follows the missing section, which is crucial for generating an accurate middle.
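To make the "exact number of lines" rule concrete, here is a hypothetical sketch of such a post-processing step. The function name `truncate_to_line_count` is an illustration rather than code from any benchmark, and it assumes the evaluator already knows how many lines the original middle had, which is exactly the information that is unavailable in real-world use.

```python
def truncate_to_line_count(generated: str, n_lines: int) -> str:
    """Keep only the first n_lines lines of a model's infilled output.

    n_lines is assumed to come from the ground-truth middle, so this rule
    only works in a benchmark setting where the answer length is known.
    """
    if n_lines < 0:
        raise ValueError("n_lines must be non-negative")
    lines = generated.splitlines(keepends=True)
    return "".join(lines[:n_lines])


if __name__ == "__main__":
    # The model ran past the true middle and started repeating the suffix.
    raw_output = "    total = a + b\n    return total\n\nprint(add(1, 2))\n"
    print(truncate_to_line_count(raw_output, 2))
```

A rule like this hides the model's failure to stop at the right point instead of fixing it, which is why performance drops once the rule is removed.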
To address this problem, researchers from the University of Illinois Urbana-Champaign and AWS AI Labs propose Horizon-Length Prediction (HLP), an auxiliary training objective that improves the planning capabilities of LLMs over long horizons. Given the hidden state of the current token, HLP tasks the model with predicting the number of future tokens still required to complete the middle (the horizon length) at each step.
HLP is implemented as a linear layer with its own weight on top of the transformer model, taking as input the hidden state from the last attention layer. It improves Fill-in-the-Middle (FIM) by teaching models to plan ahead and take the broader context into account. This helps models learn naturally how to fill gaps between arbitrary left and right code sections, without special rules or extra adjustments. Unlike rule-based post-processing, HLP generalizes because it does not require any task-specific knowledge.
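The idea of predicting the remaining horizon from a hidden state can be sketched in plain Python. Several details here are assumptions rather than facts stated in this article: the target is normalized to the fraction of the middle still to be generated, the linear layer's output is squashed with a sigmoid, and the error is measured with an L1 loss. All function names are hypothetical.

```python
import math


def hlp_targets(num_middle_tokens: int) -> list[float]:
    """Normalized horizon length at each middle position t = 0..M-1.

    Assumption: the target is (M - t) / M, the fraction of the middle
    still to be generated, so it lies in (0, 1] for any middle length.
    """
    m = num_middle_tokens
    if m <= 0:
        raise ValueError("the middle must contain at least one token")
    return [(m - t) / m for t in range(m)]


def hlp_head(hidden_state: list[float], weight: list[float], bias: float = 0.0) -> float:
    """A single linear layer over one hidden state, followed by a sigmoid."""
    z = sum(h * w for h, w in zip(hidden_state, weight)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def hlp_loss(hidden_states: list[list[float]], weight: list[float], bias: float = 0.0) -> float:
    """Mean absolute error between predicted and true normalized horizons."""
    targets = hlp_targets(len(hidden_states))
    preds = [hlp_head(h, weight, bias) for h in hidden_states]
    return sum(abs(p - y) for p, y in zip(preds, targets)) / len(targets)


if __name__ == "__main__":
    # Toy hidden states for a 4-token middle; real ones come from the transformer.
    states = [[0.2, -0.1], [0.0, 0.3], [-0.4, 0.1], [0.5, 0.5]]
    print(hlp_targets(len(states)))
    print(hlp_loss(states, weight=[0.1, -0.2]))
```

In a real training run this loss would be added to the usual next-token loss, and the head would simply be dropped at inference time, which is consistent with the claim that HLP adds no inference cost.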
The researchers' evaluation shows that HLP improves code infilling by up to 24% across various benchmarks without any rule-based or dataset-specific post-processing, and that it also enhances performance on code reasoning. HLP is efficient as well: it incurs only negligible training overhead and adds no inference overhead, making it practical for real-world applications.
In conclusion, this paper introduces Horizon-Length Prediction (HLP), a novel training objective designed to enhance Fill-in-the-Middle (FIM) capabilities in code language models. By teaching models to predict the number of remaining tokens, HLP significantly improves the planning and coherence of generated code, achieving performance gains of up to 24% on diverse benchmarks without relying on restrictive post-processing methods. Moreover, the enhanced planning capability acquired through HLP training also boosts models’ performance on code reasoning tasks, suggesting that HLP may broadly improve language models’ reasoning capabilities. HLP is also efficient: it adds no inference overhead, and its training overhead is negligible. This research marks a significant step toward more effective code language models for real-world applications.
Check out the Paper. All credit for this research goes to the researchers of this project.
The post Are LLMs Failing to Match with Suffix in Fill-in-the-Middle (FIM) Code Completion? Horizon-Length Prediction: A New AI Training Task to Advance FIM by Teaching LLMs to Plan Ahead over Arbitrarily Long Horizons appeared first on MarkTechPost.