You can’t really say it is just predicting continuations when it is learning to write proofs for Erdős problems, formalise significant math results, or perform automated AI research. Those are far beyond what you get from a mere copying-and-reforming machine; many of these problems require sophisticated application of logic.
I don’t know if this can reach AGI, or if that term makes any sense to begin with. But to say these models have not learnt from their RL seems a bit ludicrous. What do you think training to predict when to use different continuations is other than learning?
I would say LLMs' failure cases, like failing at riddles, are more akin to our own optical illusions and blind spots than indicative of the nature of LLMs as a whole.
I think you're conflating mechanism with function/capability.
I'm not sure what I wrote that made you conclude that I thought these models are not learning anything from their RL training?! Let me say it again: they are learning to steer towards reasoning steps that during training led to rewards.
The capabilities of LLMs, both with and without RL, are a bit counter-intuitive, and I think that, at least in part, comes down to the massive size of the training sets and the even more massive number of novel combinations of learnt patterns they can therefore potentially generate...
In a way it's surprising how FEW new mathematical results they've been coaxed into generating, given that they've probably encountered a huge portion of mankind's mathematical knowledge, and can potentially recombine all of these pieces in at least somewhat arbitrary ways. You might have thought that there are results A, B and C hiding away in some obscure mathematical papers that no human has thought to put together (just because of the vast number of such potential combinations), and that combining them might lead to some interesting result.
If you are unsure yourself about whether LLMs are sufficient to reach AGI (meaning full human-level intelligence), then why not listen to someone like Demis Hassabis, one of the brightest and best placed people in the field to have considered this, who says the answer is "no", and that a number of major new "transformer-level" discoveries/inventions will be needed to get there.
> they are still predicting training set continuations
But this is underselling what they do. Probably a large part of what they predict is learnt from their training set, but RL has added a layer on top that does not come from mimicry alone.
Again, I doubt this is enough for “AGI” but I think that term is not very well-defined to begin with. These models have now shown they are capable of novel reasoning, they just have to be prodded in the right way.
It’s not clear to me that there isn’t scaffolding that can use LLMs to search for novel improvements, like Karpathy’s recent autoresearch. The models, with the help of RL, seem to be getting to the point where this actually works to some extent, and I would expect this to happen in other fields in the next few years as well.
In general there's a difference between being novel and discovering something genuinely new.
Pretraining has given the LLM a huge set of lego blocks that it can assemble in a huge variety of ways (although still limited by the "assembly patterns" it has learnt). If the LLM assembles some of these legos into something that wasn't directly in the training set, then we can call that "novel", even though everything needed to do it was present in the training set. I think maybe a more accurate way to think of this is that these "novel" lego assemblies are all part of the "generative closure" of the training set.
Things like generating math proofs are an example of this - the proof itself, as an assembled whole, may not be in the training set, but all the piece parts and thought patterns necessary to construct the proof were there.
I'm not much impressed with Karpathy's LLM autoresearch! I guess this sort of thing is part of the day to day activities of an AI researcher, so might be called "research" in that regard, but all he's done so far is just hyperparameter tuning and bug fixing. No doubt this can be extended to things that actually improve model capability, such as designing post-training datasets and training curriculums, but the bottleneck there (as any AI researcher will tell you) isn't the ideas - it's the compute needed to carry out the experiments. This isn't going to lead to the recursive self-improvement singularity that some are fantasizing about!
I would say these types of "autoresearch" model improvements, and pretty much anything current LLMs/agents are capable of, all fall under the category of "generative closure", which includes things like tool use that they have been trained to do.
It may well be possible to retrofit some type of curiosity onto LLMs, to support discovery and go beyond the "generative closure" of things they already know, and I expect that's the sort of thing we may see from Google DeepMind in the next 5 years or so in their first "AGI" systems - hybrids of LLMs and hacks that add functionality but don't yet have the elegance of an animal cognitive architecture.
You laid out the theoretical limitations well, and I tend to agree with them.
I just get frustrated when people downplay how big of an impact filling in the gaps at the frontier of knowledge would have. 99.9% of researchers will never have an idea that adds a new spike to the knowledge frontier (rather than filling in holes), and 99.99% of research is just filling in gaps by combining existing ideas (numbers made up). In this realm, autoresearch may not be groundbreaking, but it can do the job. AlphaEvolve is similar.
If LLMs can actually get closer to something like that, it leaves human researchers a whole lot more time to focus on new ideas that could move entire fields forward. And their iteration speed can be a lot faster if AI agents can help with implementing and testing those ideas.
> What do you think training to predict when to use different continuations is other than learning?
Sure, training = learning, but the problem with LLMs is that this is where it stops, other than a limited amount of ephemeral in-context learning/extrapolation.
With an LLM, learning stops post-training when it is "born" and deployed, while with an animal that's when it starts! The intelligence of an animal is a direct result of its lifelong learning, whether that's imitation learning from parents and peers (and subsequent experimentation to refine the observed skill), or the never-ending process of observation/prediction/surprise/exploration/discovery which is what allows humans to be truly creative - not just behaving in ways that are endless mashups of things they have seen and read about other humans doing (cf training set), but generating truly novel behaviors (such as creating scientific theories) based on their own directed exploration of gaps in mankind's knowledge.
Application of AGI to science and new discovery is a large part of why Hassabis defines AGI as human-equivalent intelligence, and understands what is missing, while others like Sam Altman are content to define AGI as "whatever makes us lots of money".
Memory systems built on top of LLMs could provide continual learning. I do not agree that it is some fundamental limitation.
Claude Code already writes its own memory files. And people already finetune models. There is clear potential to use the former as a form of short-term memory and the latter for long-term “learning”.
The main blockers to this are that models aren’t good enough at managing their own memory, and finetuning is expensive and difficult. But both of these seem like solvable engineering problems.
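The short-term half of that stack really can be this simple. A minimal hypothetical sketch (the file name and helper functions are invented for illustration, not Claude Code's actual mechanism): the model appends durable notes to a plain text file during a session, and the file is prepended to the prompt at the start of the next one.

```python
from pathlib import Path

# Hypothetical memory file; Claude Code's real mechanism differs in detail.
MEMORY_FILE = Path("AGENT_MEMORY.md")

def remember(note: str) -> None:
    """Append a note the model has decided is worth keeping across sessions."""
    with MEMORY_FILE.open("a") as f:
        f.write(f"- {note}\n")

def load_memory() -> str:
    """Return stored notes, to be prepended to the next session's prompt."""
    return MEMORY_FILE.read_text() if MEMORY_FILE.exists() else ""

# Session 1: the model records a preference it inferred.
remember("User prefers pytest over unittest")

# Session 2: the note survives the context window being cleared.
prompt = load_memory() + "\nNew task: add tests for parser.py"
```

The hard part is not the plumbing but the policy: deciding what is worth writing down, and keeping the file from growing into noise.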
Continual learning isn't a "fundamental limitation" or unsolvable problem. Animal brains are an existence proof that it's possible, but it's tough to do, and quite likely SGD is not the way to do it, so any attempt to retrofit continual learning to LLMs as they exist today is going to be a hack...
Memory and learning are two different things. Memorization is a small subset of learning. Memorizing declarative knowledge and personal/episodic history (cf. LLM context) are certainly needed, but an animal (or AI intern) also needs to be able to learn procedural skills which need to become baked into the weights that are generating behavior.
Fine tuning is also no substitute for incremental learning. You might think of it as addressing somewhat the same goal, but really fine tuning is about specializing a model for a particular use, and if you repeatedly fine tune a model for different specializations (e.g. what I learnt yesterday, vs what I learnt the day before) then you will run into the catastrophic forgetting problem.
I agree that incremental learning seems more like an engineering problem rather than a research one, or at least it should succumb to enough brain power and compute put into solving it, but we're now almost 10 years into the LLM revolution (attention paper in 2017) and it hasn't been solved yet - it's not easy.
Fundamentally, I’m more optimistic on how far current approaches can scale. I see no reason why RL could not be used to train models to use memory, and fine-tuning already works, it’s just expensive.
The continual learning we get may be a bit hamfisted, and not fit into a neat architecture, but I think we could actually see it work at scale in the next few years. Whereas new techniques like what Yann LeCun has demonstrated still live heavily in the realm of research. Cool, but not useful yet.
Fine tuning is also not so limited as you suggest. For one, we don’t need to fine tune the same model over and over, you can just start with a frontier model each time. And two, modern models are much better at generating synthetic data or environments for RL. This could definitely work, but it might require a lot of work in data collection and curation, and the ROI is not clear. But if large companies continue to allocate more and more resources to AI in the next few years, I could see this happening.
OpenAI already has a custom model service, and labs have stated they already have custom models built for the military (although how custom those models are is unclear). It doesn’t seem like a huge leap to also fine-tune models over a company's internal codebases and tooling. Especially for large companies like Google, Amazon, or Stripe that employ tens of thousands of software engineers.