Infinitely Fallible

Machine Intelligence via Recursive Self-Improvement

"The models just want to learn." - Ilya Sutskever

Earlier this year a research group called the AI Futures Project released AI 2027, a detailed scenario describing the rise of a superintelligence. In this scenario, superintelligence arrives within the next couple of years. The report's authors themselves vary in how much likelihood they assign to it.

According to AI 2027, the path to superintelligence runs roughly as follows:

1. AI agents become steadily more capable, eventually automating AI research itself.
2. AI-driven research accelerates further progress (the research multiplier discussed below).
3. This recursive self-improvement compounds until the resulting systems are superintelligent.

The report also mentions that the data bottleneck can be partially solved by paying an army of humans to create and label new data. AIs will also get better at generating and labeling data themselves; data produced this way is called synthetic data.
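As a rough illustration, here is a minimal Python sketch of a synthetic-data pipeline: one model generates candidate examples and a second model grades them. The `generate` function, the prompts, and the score threshold are all hypothetical stand-ins, not a real API.

```python
# Minimal sketch of a synthetic-data pipeline: one model generates
# candidate examples and a second model grades them. `generate` is a
# stand-in for a real LLM API call (hypothetical, not a real library).

def generate(prompt: str) -> str:
    """Placeholder LLM call; returns canned text for illustration."""
    return "2 + 2 = 4" if "Write" in prompt else "0.9"

def build_synthetic_dataset(seed_topics, min_score=0.8):
    dataset = []
    for topic in seed_topics:
        # Generator model proposes a new training example.
        example = generate(f"Write a worked math problem about {topic}.")
        # Labeler model grades it; the score is parsed from its reply.
        score = float(generate(f"Rate the correctness of {example!r} from 0 to 1."))
        if score >= min_score:
            dataset.append({"topic": topic, "text": example, "score": score})
    return dataset

print(build_synthetic_dataset(["arithmetic", "fractions"]))
```

A production pipeline would also deduplicate and diversity-check the generated examples, since a model trained on its own unfiltered output tends to collapse toward its most common patterns.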

They claim that using increasingly intelligent AI to perform AI research leads to an AI research multiplier: the factor by which AI research speeds up when AI itself is doing the research.
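To make the feedback loop concrete, here is a toy simulation of a compounding multiplier. The assumption that the multiplier grows linearly with capability, and every constant below, are illustrative choices of mine, not figures from the report.

```python
# Toy model of the research-multiplier feedback loop: faster research
# raises capability, and higher capability raises the research speed.

def simulate(years=4, steps_per_year=12, gain=0.5):
    capability = 1.0   # current AI capability, in arbitrary units
    multiplier = 1.0   # research speed relative to human-only pace
    for step in range(years * steps_per_year):
        capability += multiplier / steps_per_year      # faster research -> more capability
        multiplier = 1.0 + gain * (capability - 1.0)   # more capability -> faster research
        if (step + 1) % steps_per_year == 0:
            print(f"year {(step + 1) // steps_per_year}: "
                  f"capability={capability:.2f} multiplier={multiplier:.2f}")

simulate()
```

Even in this crude model the growth is faster than exponential, which is the basic intuition behind the report's intelligence-explosion dynamics.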

The report claims that AI progress is largely attributable to two factors:

1. Scaling compute, parameters, and inference
2. Algorithmic improvements

Scaling Compute, Parameters and Inference

In January 2025, OpenAI announced the Stargate Project, a $500 billion initiative to greatly expand compute. Construction of a datacenter is already underway in Abilene, Texas.

What do we mean by scale? There are a few dimensions across which models can scale:

- Training compute: the total FLOPs spent training the model
- Parameters: the size of the model itself
- Inference compute: how much computation the model spends at answer time
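One way to see how these dimensions trade off is a Chinchilla-style scaling law, L(N, D) = E + A/N^alpha + B/D^beta, which predicts loss from parameter count N and training tokens D. The sketch below plugs in the published Hoffmann et al. (2022) fit; treat the exact constants as approximate.

```python
# Chinchilla-style scaling law: predicted loss as a function of
# parameter count N and training tokens D. Constants are the published
# Hoffmann et al. (2022) fit; treat them as approximate.

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    E, A, B, alpha, beta = 1.69, 406.4, 410.7, 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

# Roughly Chinchilla itself: 70B parameters trained on 1.4T tokens.
print(f"{chinchilla_loss(70e9, 1.4e12):.3f}")
# Ten times the parameters on the same data helps less and less:
print(f"{chinchilla_loss(700e9, 1.4e12):.3f}")
```

The diminishing returns visible here are why labs scale parameters and data together, and why inference-time compute has become a third lever.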

Algorithmic Improvements

What algorithmic improvements can be made to LLMs to boost their performance? If you think of the discovery of superintelligence as a search through program space guided by some heuristic, then speeding up that search should let us find a program that implements AGI/ASI sooner.
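To make the metaphor concrete, here is a toy hill-climbing search over a trivially small "program space". The programs, the mutation scheme, and the scoring heuristic are purely illustrative.

```python
# Toy "search through program space": random mutation over a tiny
# space of constant lists, guided by a heuristic score. An illustration
# of heuristic search, not of real AI research.

import random

TARGET = 42

def score(program: list[int]) -> int:
    """Heuristic: closer to TARGET when the program's constants are summed."""
    return -abs(sum(program) - TARGET)

def search(steps: int = 1000) -> list[int]:
    best = [random.randint(0, 10) for _ in range(5)]
    for _ in range(steps):
        candidate = best[:]
        candidate[random.randrange(len(candidate))] = random.randint(0, 10)
        if score(candidate) > score(best):  # keep the better "program"
            best = candidate
    return best

program = search()
print(program, "sums to", sum(program))
```

A better heuristic, or a faster way to evaluate candidates, shortens the search; that is the sense in which algorithmic improvements compound.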

In 2017 the Transformer architecture was introduced, enabling the LLM revolution currently underway. In AI 2027, the authors mention neuralese recurrence as a potential algorithmic improvement: rather than writing every intermediate thought out as text tokens, a model would pass its high-dimensional hidden states back to itself, letting it reason for longer.
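Here is a minimal numpy sketch of that intuition, assuming a model's reasoning can be caricatured as repeatedly transforming a hidden vector. The weights are random and untrained; this only shows the data flow.

```python
# Sketch of the intuition behind neuralese recurrence: the model feeds
# its full hidden vector back into itself instead of discretizing each
# step into a token. Random untrained weights; data flow only.

import numpy as np

rng = np.random.default_rng(0)
d = 16                                   # hidden dimension
W = rng.normal(size=(d, d)) / np.sqrt(d)

def step(h: np.ndarray) -> np.ndarray:
    """One latent reasoning step: next hidden state from the last one."""
    return np.tanh(W @ h)

h = rng.normal(size=d)                   # initial "thought" vector
for _ in range(8):                       # recur in latent space, no text emitted
    h = step(h)
print(h[:4])                             # decoded to tokens only at the end
```

A token can carry at most a few bits, while the hidden vector carries thousands of floating-point values, which is why latent recurrence could raise the bandwidth of a model's own chain of thought.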

Iterated Distillation and Amplification (IDA) is also mentioned. IDA comprises two phases:

1. Amplification
2. Distillation

With IDA you begin with two models: a very large, slow LLM (the "teacher") and a smaller, faster LLM (the "student"). First, you use a large amount of compute to train the highly capable teacher. This is the Amplification phase.

Next, during the Distillation phase, you give the teacher a set of prompts. The teacher answers them, providing its chain-of-thought, and the student is then trained on these "distilled" answers until it is functionally almost as smart as the teacher. In theory, this lets you produce a capable student with far less compute than training one from scratch. AlphaGo Zero was trained with an analogous loop: tree search amplified its policy network, and the network was then distilled to imitate the search's choices.
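Here is a minimal numpy sketch of the distillation step, with both "models" reduced to linear maps so the whole thing runs standalone. Real distillation would train a student LLM on the teacher's answers and chain-of-thought; everything below is an illustrative assumption.

```python
# Minimal numpy sketch of distillation: a "student" is trained to match
# a frozen "teacher's" output distribution over a small vocabulary.

import numpy as np

rng = np.random.default_rng(0)
n_vocab, d_in = 8, 4
X = rng.normal(size=(32, d_in))               # stand-in prompt features
W_teacher = rng.normal(size=(d_in, n_vocab))  # frozen teacher
W_student = np.zeros((d_in, n_vocab))         # student, trained below

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

teacher_probs = softmax(X @ W_teacher)
for _ in range(500):
    student_probs = softmax(X @ W_student)
    # Gradient of the cross-entropy between teacher and student outputs.
    grad = X.T @ (student_probs - teacher_probs) / len(X)
    W_student -= 0.5 * grad

kl = (teacher_probs * np.log(teacher_probs / softmax(X @ W_student))).sum(axis=1).mean()
print(f"mean KL(teacher || student) after training: {kl:.4f}")
```

The student never sees ground-truth labels, only the teacher's distribution, which is exactly what makes distillation so much cheaper than training from scratch.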

In Conclusion

We may be on the precipice of one of the most profound and influential technological transformations our species has undergone. Will it end well for us? No one knows; the question of alignment remains unanswered. This is both exciting and terrifying. Of course, we may look back on this time and decide these fears were unwarranted. The technology may be much further away than we think.