LLMs exhibit striking parallels to neural activity within the human language network, yet the specific linguistic properties that give rise to these brain-like representations remain unclear. Understanding the cognitive mechanisms that enable language comprehension and communication is a key objective in neuroscience. The brain’s language network (LN), a collection of left-lateralized frontotemporal regions, is central to processing linguistic input. Recent advances in machine learning have positioned LLMs, which are trained on vast text corpora using next-word prediction, as promising computational models for studying LN function. When exposed to the same language stimuli that humans encounter during neuroimaging and electrophysiology experiments, these models account for a substantial share of the variance in neural responses, reinforcing their relevance to cognitive neuroscience research.
Studies on model-to-brain alignment suggest that certain artificial neural networks encode representations resembling those in the human brain. This resemblance was first identified in vision research and has since been extended to auditory and language processing. Research indicates that even untrained neural networks can exhibit substantial alignment with brain activity, implying that certain architectural properties contribute to this resemblance independently of experience-based training. Investigations into the inductive biases of different network architectures show that randomly initialized models do not behave as arbitrary functions but instead capture basic structural patterns inherent in sensory and linguistic processing. These insights deepen our understanding of the neural basis of language and offer potential pathways for refining LLMs to better simulate human cognition.
Researchers from EPFL, MIT, and Georgia Tech analyzed 34 training checkpoints across eight model sizes to examine the relationship between brain alignment and linguistic competence. Their findings indicate that brain alignment correlates more strongly with formal linguistic competence (knowledge of linguistic rules) than with functional competence, which involves reasoning and world knowledge. While functional competence continues to develop with further training, its link to brain alignment weakens. Moreover, model size does not predict brain alignment once the number of features is controlled for. The results also suggest that current brain-alignment benchmarks remain unsaturated, highlighting opportunities to refine LLMs for closer alignment with human language processing.
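As a rough illustration of how such a checkpoint-level analysis can be set up, the sketch below correlates per-checkpoint brain-alignment scores with formal and functional benchmark scores. The numbers, array names, and the choice of Spearman correlation are illustrative assumptions for demonstration only, not the authors' actual data or statistics.

```python
# Illustrative sketch: relating brain alignment to benchmark scores across
# training checkpoints. All numbers below are made up for demonstration only.
import numpy as np
from scipy.stats import spearmanr

# One entry per training checkpoint (hypothetical values).
brain_alignment   = np.array([0.10, 0.25, 0.38, 0.42, 0.43, 0.42, 0.41])
formal_competence = np.array([0.30, 0.55, 0.70, 0.78, 0.80, 0.81, 0.82])  # e.g., grammar benchmark accuracy
functional_comp   = np.array([0.20, 0.25, 0.32, 0.41, 0.50, 0.58, 0.65])  # e.g., reasoning benchmark accuracy

rho_formal, _ = spearmanr(brain_alignment, formal_competence)
rho_functional, _ = spearmanr(brain_alignment, functional_comp)
print(f"alignment vs. formal competence:     rho = {rho_formal:.2f}")
print(f"alignment vs. functional competence: rho = {rho_functional:.2f}")
```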
The study evaluates brain alignment in language models using diverse neuroimaging datasets that vary in recording modality, context length, and stimulus presentation (auditory or visual). The analysis follows a functional localization approach to identify language-selective units. Brain alignment is assessed by mapping model activations to neural responses with ridge regression and scoring predictions with Pearson correlation, while cross-subject consistency estimates serve as a noise ceiling. Formal competence is tested with BLiMP and SyntaxGym, while functional competence is assessed with reasoning and world-knowledge benchmarks. Results show that contextualization affects alignment and that untrained models retain partial alignment. The study emphasizes robust evaluation metrics and generalization tests to ensure meaningful comparisons across models.
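For a concrete picture of what such an encoding analysis looks like, below is a minimal sketch: model activations are mapped to voxel (or electrode) responses with ridge regression and scored with Pearson correlation on held-out stimuli. The array shapes, regularization grid, and cross-validation scheme are assumptions for illustration, not the authors' exact pipeline.

```python
# Minimal sketch of an encoding-model brain-alignment analysis (illustrative only;
# shapes, alpha grid, and cross-validation scheme are assumptions, not the paper's pipeline).
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import KFold
from scipy.stats import pearsonr

def brain_alignment(model_activations, brain_responses, n_splits=5):
    """Predict brain responses from LLM activations and score with Pearson r.

    model_activations: (n_stimuli, n_features) hidden states for each stimulus
    brain_responses:   (n_stimuli, n_voxels) measured responses to the same stimuli
    Returns the mean correlation across voxels and cross-validation folds.
    """
    scores = []
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    for train_idx, test_idx in kf.split(model_activations):
        # Fit ridge regression (one linear map shared across all voxels/targets).
        reg = RidgeCV(alphas=np.logspace(-2, 4, 7))
        reg.fit(model_activations[train_idx], brain_responses[train_idx])
        pred = reg.predict(model_activations[test_idx])
        # Pearson correlation between predicted and observed responses, per voxel.
        fold_r = [pearsonr(pred[:, v], brain_responses[test_idx, v])[0]
                  for v in range(brain_responses.shape[1])]
        scores.append(np.nanmean(fold_r))
    return float(np.mean(scores))
```

In practice, the raw score would then be normalized by a cross-subject consistency (noise-ceiling) estimate, as in the study, so models are compared against the best prediction achievable given measurement noise.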
Untrained models, though they score roughly half as high as pretrained ones, still exhibit notable brain alignment, clearly surpassing random token sequences. This alignment arises from inductive biases: sequence-based architectures (GRU, LSTM, Transformer) show stronger alignment than token-based ones (MLP, Linear). Temporal integration, particularly through positional encoding, plays a key role. Brain alignment peaks early in training (~8B tokens) and tracks formal linguistic competence rather than functional understanding. Larger models do not necessarily yield better alignment. Overtraining reduces behavioral alignment, suggesting that as models surpass human proficiency they diverge from human processing and come to rely on different mechanisms.
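To illustrate the trained-versus-untrained comparison, the sketch below extracts sentence-level activations from a pretrained GPT-2 and from a randomly initialized copy of the same architecture using Hugging Face Transformers. The choice of GPT-2, the layer, and mean pooling over tokens are illustrative assumptions, not the study's exact models or readout.

```python
# Sketch: activations from a pretrained vs. randomly initialized Transformer
# (GPT-2 is an illustrative choice here, not necessarily one of the study's models).
import torch
from transformers import GPT2Model, GPT2Config, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
pretrained = GPT2Model.from_pretrained("gpt2")   # trained weights
untrained = GPT2Model(GPT2Config())              # same architecture, random initialization

def sentence_features(model, sentences, layer=-1):
    """Mean-pooled hidden states per sentence, usable as regressors for brain data."""
    model.eval()
    feats = []
    with torch.no_grad():
        for s in sentences:
            ids = tokenizer(s, return_tensors="pt")
            hidden = model(**ids, output_hidden_states=True).hidden_states[layer]
            feats.append(hidden.mean(dim=1).squeeze(0))  # average over tokens
    return torch.stack(feats).numpy()

# Either feature matrix can then be passed to the ridge-regression routine sketched
# above to compare how much of the brain response each model variant explains.
```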
In conclusion, the study examined how brain alignment in LLMs evolves during training, showing that it closely follows formal linguistic competence while functional competence continues developing independently. Brain alignment peaks early, suggesting that the human language network primarily encodes syntactic and compositional structures rather than broader cognitive functions. Model size does not predict alignment; architectural biases and training dynamics play a key role. The study also confirms that brain alignment benchmarks remain unsaturated, indicating room for improvement in modeling human language processing. These findings refine our understanding of how LLMs relate to biological language processing, emphasizing formal over functional linguistic structures.
Check out the Paper. All credit for this research goes to the researchers of this project. Also, feel free to follow us on Twitter and don’t forget to join our 80k+ ML SubReddit.