Andrej Karpathy on the Decade of Agents, the Limits of RL, and Why Education Is His Next Mission
A summary of key takeaways from Andrej Karpathy’s conversation with Dwarkesh Patel
In a wide-ranging conversation with Dwarkesh Patel, Andrej Karpathy — former head of AI at Tesla, founding member of OpenAI, and creator of some of the most popular AI educational content on the internet — shared his views on where AI is headed, what’s still broken, and why he’s now pouring his energy into education. Here are the key takeaways.
“It’s the Decade of Agents, Not the Year of Agents”
Karpathy’s now-famous quote is a direct pushback on industry hype. Early agents like Claude Code and Codex are impressive, and he uses them daily, but they are missing too much to be considered mature: they lack continual learning, robust multimodality, and general cognitive reliability. Based on 15 years in the field watching predictions overshoot, he estimates roughly a decade of hard work remains before you’d genuinely hire an agent the way you’d hire an intern.
He frames this through an analogy to self-driving. He saw a flawless Waymo demo back in 2014 and assumed the technology was nearly there. Over a decade later, full-scale autonomous driving is still not done. The lesson: demos are easy, products are hard, and progress is a “march of nines” where each additional nine of reliability takes the same amount of effort as the last.
Pre-training Is “Crappy Evolution”
Karpathy draws a careful analogy between pre-training and biological evolution. Both produce a starting point rich with capability — but they work very differently. Evolution encodes algorithms for learning into DNA; pre-training compresses the internet into neural network weights. The result is what he calls “ghosts” or “spirits” — fully digital entities that mimic humans rather than replicate the biological process that created human intelligence.
A key distinction: pre-training gives models both knowledge and intelligence, but the knowledge may actually be holding them back. Models rely too heavily on memorized patterns from their training data and struggle when they need to go off that data manifold. Karpathy’s vision for the future is a “cognitive core” — an intelligent entity stripped of memorized knowledge that retains only the algorithms for thought, problem-solving, and reasoning. When it needs a fact, it looks it up.
In-Context Learning Is Where the Real Intelligence Lives
One of the most fascinating segments: Karpathy argues that the real intelligence of LLMs is visible in their in-context learning — the way they reason, self-correct, and adapt within a conversation window. He compares this to human working memory, contrasting it with the “hazy recollection” that lives in the model’s weights from pre-training. The difference in information density is staggering — roughly a 35-million-fold gap between how much information per token is stored in the KV cache versus the weights.
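The 35-million-fold figure can be reproduced as a back-of-envelope calculation. The specific numbers below (a ~70B-parameter model in bf16, ~15T training tokens, and a particular KV-cache layout) are illustrative assumptions, not figures from the conversation, but they show how a ratio of that magnitude falls out:

```python
# Back-of-envelope: information stored per token in weights vs. KV cache.
# All model dimensions here are assumed for illustration.

# Weights: a ~70B-parameter model in bf16, pre-trained on ~15T tokens.
params = 70e9
weight_bits = params * 16
train_tokens = 15e12
bits_per_token_weights = weight_bits / train_tokens  # ~0.075 bits/token

# KV cache: per generated token, every layer stores a key and a value
# vector. Assume 80 layers, 8 KV heads of dim 128, bf16 (2 bytes).
layers, kv_heads, head_dim, bytes_per = 80, 8, 128, 2
kv_bytes_per_token = layers * 2 * kv_heads * head_dim * bytes_per  # ~320 KB
bits_per_token_kv = kv_bytes_per_token * 8

ratio = bits_per_token_kv / bits_per_token_weights
print(f"{ratio:,.0f}x")  # on the order of tens of millions
```

Under these assumptions the ratio lands around 35 million, matching the order of magnitude Karpathy cites; different model configurations shift the exact number but not the conclusion.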
He also raises the tantalizing possibility, backed by research, that in-context learning may internally implement something like gradient descent within the transformer’s layers — a kind of learning-within-learning that emerged spontaneously from pre-training.
Reinforcement Learning Is “Terrible”
Karpathy doesn’t mince words here. Current RL for LLMs is crude: you generate hundreds of parallel attempts at a problem, check which ones got the right answer, and then upweight every single token in those successful trajectories — even the wrong turns and dead ends. He calls it “sucking supervision through a straw.” A human would never learn this way; they’d reflect on what specifically went right and wrong.
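The “straw” critique can be made concrete with a toy sketch of outcome-only policy gradient: every token in a rewarded trajectory receives the same scalar credit, dead ends included. This is a hypothetical two-token tabular example for illustration, not any production RL stack:

```python
import math, random

random.seed(0)
logits = {"a": 0.0, "b": 0.0}  # a trivial unigram "policy"

def p_a():
    za, zb = math.exp(logits["a"]), math.exp(logits["b"])
    return za / (za + zb)

def sample_token():
    return "a" if random.random() < p_a() else "b"

def reward(traj):
    # Outcome-only reward: 1 iff the FINAL token is "a". Intermediate
    # tokens are never judged individually.
    return 1.0 if traj[-1] == "a" else 0.0

lr = 0.1
for _ in range(200):
    traj = [sample_token() for _ in range(5)]
    r = reward(traj)
    # REINFORCE update: identical credit for every token in the
    # trajectory -- wrong turns in winning runs get upweighted too.
    for tok in traj:
        grad_a = (1.0 - p_a()) if tok == "a" else -p_a()
        logits["a"] += lr * r * grad_a
        logits["b"] -= lr * r * grad_a

print(logits)  # the policy drifts toward "a" despite noisy token credit
```

The policy does eventually learn, but only because the noise from irrelevant tokens averages out over many rollouts; that inefficiency is exactly what process-based supervision tries to remove.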
Process-based supervision — giving feedback at every step rather than just at the end — is the obvious fix, but it’s tricky to implement. Using LLM judges for intermediate rewards runs into adversarial examples: models will find bizarre inputs (like strings of “dhdhdhdh”) that trick the judge into giving perfect scores. These aren’t prompt injections; they’re simply out-of-distribution inputs that exploit the judge model’s weaknesses.
The path forward likely involves some kind of reflection and review mechanism — models that can analyze their own trajectories, generate synthetic training examples, and learn from their mistakes in a structured way. Several papers are exploring this direction, but nothing has convincingly worked at scale yet.
Model Collapse and the Entropy Problem
When models generate synthetic data to train on, the output looks reasonable sample by sample — but the distribution is “silently collapsed.” Ask ChatGPT for a joke ten times and you’ll notice it only knows about three. This collapse means you can’t just scale up synthetic data generation and expect returns. Train too long on your own outputs and performance degrades.
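One way to see why per-sample inspection misses the problem is to measure entropy over many samples. The joke lists below are made up for illustration; the point is that a collapsed generator can look fine one sample at a time while its distribution has far fewer modes than a diverse source:

```python
import math
from collections import Counter

def entropy(samples):
    """Shannon entropy (bits) of the empirical sample distribution."""
    counts = Counter(samples)
    n = len(samples)
    return -sum(c / n * math.log2(c / n) for c in counts.values())

# A diverse source: ten requests, ten distinct jokes.
diverse = [f"joke_{i}" for i in range(10)]
# A collapsed generator: ten requests, the same three jokes recycled.
collapsed = ["joke_A", "joke_B", "joke_C", "joke_A", "joke_B",
             "joke_A", "joke_C", "joke_A", "joke_B", "joke_A"]

print(entropy(diverse))    # ~3.32 bits (log2 of 10 distinct outputs)
print(entropy(collapsed))  # ~1.49 bits -- collapse visible only in aggregate
```

Training repeatedly on samples from the collapsed distribution can only preserve or shrink that entropy, which is why naive synthetic-data loops degrade.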
Karpathy draws a surprising parallel to human aging: people also collapse over time, revisiting the same thoughts, saying more of the same things. Children, by contrast, haven’t overfit yet — they say shocking, unpredictable things precisely because they haven’t collapsed. He even references research suggesting that dreaming may be evolution’s mechanism for preventing this kind of overfitting by putting the brain in weird, novel situations.
The Cognitive Core Could Be Surprisingly Small
Here’s a contrarian prediction: Karpathy believes a genuinely intelligent cognitive core could fit in roughly a billion parameters — perhaps even smaller. Current frontier models are around a trillion parameters, but most of that capacity is devoted to memorizing facts from the internet’s vast (and mostly terrible) training data. Strip away the memory, distill from a better model, and train on a much cleaner dataset, and you might get something remarkably compact that thinks well but has to look things up.
AGI Will Blend Into 2% GDP Growth
Perhaps the most provocative take: Karpathy doesn’t expect AI to cause a visible discontinuity in economic growth. He points out that transformative technologies — computers, smartphones, the internet — are invisible in the GDP curve. The same exponential continues. He sees AI as an extension of computing, not a separate revolution, and expects it to diffuse gradually across the economy just like every technology before it.
He’s skeptical of the “God in a box” framing where a single superintelligence solves everything at once. Instead, he expects a gradual “autonomy slider” where AI handles more and more of the routine work, humans supervise teams of AI agents, and society slowly reorganizes around what’s automatable — similar to how self-driving still has hidden teleoperators and radiologists still have jobs despite excellent computer vision.
Coding Agents: Impressive but Still Limited
Karpathy built his recent nanochat repository — an 8,000-line end-to-end ChatGPT clone — largely with autocomplete assistance rather than agent-based “vibe coding.” The agents kept trying to force conventional patterns on his deliberately unconventional code, inserting unnecessary error handling, using deprecated APIs, and failing to internalize that he had custom implementations of standard components.
His sweet spot remains the middle ground: you write the code and architect the system, but the model autocompletes as you type. Agents are useful for boilerplate and for languages you’re less familiar with (he used them more when rewriting a tokenizer in Rust), but for novel, intellectually dense code, they produce what he bluntly calls “slop.”
This has direct implications for AI-accelerated AI research: if models are worst at exactly the kind of novel, never-been-written-before code that advances the field, the recursive self-improvement story may play out more slowly than optimists expect.
Why Education, and Why Now
Karpathy’s new venture, Eureka, is building what he calls a “Starfleet Academy” — starting with an elite AI course, with nanochat as its capstone project. He’s motivated by a deep concern that humanity could be sidelined as AI advances, ending up in a WALL-E or Idiocracy scenario. His antidote is education.
His vision is informed by a personal experience learning Korean with a one-on-one tutor — an experience so good it made him realize how far current AI tutors are from the real thing. A great human tutor instantly understands where you are, serves you exactly the right material, and makes you the only bottleneck to your own learning. That capability doesn’t exist in LLMs yet, so for now he’s building something more conventional: beautifully crafted courses with human faculty, a physical institution, and a digital tier below it.
Long-term, he sees education transforming in the same way physical fitness did. Just as nobody was casually bench-pressing two plates a century ago but gym culture now makes that common, AI-powered tutoring could make it trivial for anyone to speak five languages or master an undergraduate curriculum — not for career reasons, but because learning, done right, feels good.
This post summarizes the conversation between Andrej Karpathy and Dwarkesh Patel. Watch the full interview here.