The rise of large language models (LLMs) has sparked questions about their computational abilities compared to traditional models. While recent research has shown that LLMs can simulate a universal Turing machine by using an external memory through prompting, this setup often relies on external controls, which shift much of the computational load away from the LLM itself. Until now, it has remained an open question whether an LLM, unaided by such external elements, could independently achieve Turing completeness.
In a new paper, Autoregressive Large Language Models are Computationally Universal, a research team from Google DeepMind and the University of Alberta presents evidence that transformer-based LLMs using autoregressive decoding can indeed support universal computation without any external adjustments or modifications to model weights.
This study investigates whether an LLM can achieve universal computation on its own by leveraging an unbounded “chain of thought” process. Their findings provide an affirmative answer, demonstrating that an unaided LLM can simulate a universal Turing machine.
To achieve this, the researchers took a broader approach to autoregressive decoding, one that allows the model to handle infinitely long input sequences. Specifically, they introduced a generalized autoregressive decoding method in which each generated token is added to the end of the input sequence, allowing continuous processing of new contexts as the input expands. This method essentially defaults to standard autoregressive decoding when the input remains within the model’s context window.
The researchers first lay out their extended perspective on autoregressive decoding to accommodate long input sequences. They then demonstrate that this extension enables the model to emulate a restricted Lag system—a simplified form of one of the earliest general computational models. A Lag system, which can organize memory in a circular queue, allows for bidirectional access to memory. Building on the concept of finite memory simulation for Turing machines, they show that a Lag system with a context length of 2 can simulate any Turing machine.
Next, they apply this reduction method to a specific universal Turing machine, constructing a universal Lag system defined by 2027 production rules across 262 symbols. Finally, they develop a system prompt that directs a specific LLM, gemini-1.5-pro-001, to follow each of the 2027 rules under greedy decoding conditions.
These results allow the researchers to conclude that, through extended autoregressive (greedy) decoding, gemini-1.5-pro-001 can exactly replicate the computations of U15,2 for any input, effectively making it a general-purpose computer.
The paper Autoregressive Large Language Models are Computationally Universal is on arXiv.
Author: Hecate He | Editor: Chain Zhang
The post Unlocking Turing Completeness: How Large Language Models Achieve Universal Computation Without Assistance first appeared on Synced.