AI's Frozen Present: The Need for Continual On-the-Job Learning

Original Title: Why We Need Continual Learning

AI + a16z · April 28, 2026

The AI models we deploy today are like Leonard Shelby in Memento: stuck in a perpetual present, unable to learn from their own experiences. Long context windows and retrieval-augmented generation (RAG) are sophisticated workarounds, but they mask a fundamental limitation: these systems are largely frozen after their initial training. In this conversation, Malika Abakirova, partner on the AI infrastructure team at a16z, argues that the "ultimate test" for AI lies not in reasoning or generation but in its capacity for genuine on-the-job learning and improvement, much as humans learn. Anyone building or relying on AI, from researchers to product managers, benefits from understanding the limits of current approaches and the emerging landscape of continual learning.

The Frozen Present: Why In-Context Learning Isn't Enough

The current AI landscape is dominated by what's termed "in-context learning," a paradigm where models can process and respond to information provided within their immediate context window. This approach, exemplified by tools that leverage file systems or extensive prompts, has undeniably delivered impressive results. However, as Malika Abakirova explains, this is akin to applying sticky notes and tattoos to a patient with amnesia, as depicted in Memento. The core model remains static; it doesn't fundamentally update its knowledge or capabilities based on new interactions.

This limitation becomes stark in scenarios that require true adaptation. Consider adversarial security, where a new "jailbreak" attack emerges. Simply updating the system prompt--the AI equivalent of a sticky note--won't suffice if the model's underlying parameters are already tuned to be helpful in ways that can be exploited. Knowledge of the new attack vector needs to be embedded more deeply, in the model's weights, which the attacker cannot reach. Similarly, when a software library like React ships a breaking change, a model trained on an older version will struggle. No amount of context can override its deeply ingrained "knowledge" of a function that no longer exists.

"The model is basically frozen, but new experiences and knowledge still persist. Humans are not AI, but we still learn on the job; we learn from experience, and that's what makes humans unique."

The problem isn't that these systems fail to retrieve information, but that they fail to learn from it in a way that permanently alters their behavior or knowledge base. We've built elaborate scaffolding--agent harnesses, RAG systems, system prompts--to compensate for this inherent lack of adaptability. While these workarounds are effective, they raise a critical question: are we merely papering over a fundamental limitation, or have we hit the ceiling of what the frozen paradigm can achieve? The implication is that relying solely on these external mechanisms might be a temporary fix, creating a fragile system that struggles with novel, evolving challenges. The true innovation lies in moving beyond these external layers to models that can genuinely learn and adapt.

The Compaction Spectrum: From Non-Parametric to Parametric Evolution

The conversation delves into a framework for understanding where "learning" or, more precisely, "compaction" occurs within AI systems, categorizing it into three buckets: context, modules, and weights. This spectrum reveals the trade-offs and limitations inherent in each approach, offering a systems-level view of AI development.

Context (Non-Parametric Learning): This is the realm of in-context learning, where models leverage external data stores or large context windows. Companies like Pinecone building RAG systems and Latta or Mantis creating agent harnesses fall here. The primary constraint is the finite context length. The challenge is not whether it works, but how to use that limited context most efficiently. This approach is powerful for accessing vast amounts of information but cannot permanently integrate new knowledge into the model's core. It’s like a student who can look up answers in a textbook but doesn't internalize the concepts.

Modules (Hybrid Approaches): This middle ground explores updating parts of the model without retraining the entire thing. The mention of a Stanford paper on updating KV caches hints at techniques that modify specific components of the model's memory or processing. This offers a potential balance, allowing for more dynamic adaptation than pure context-based methods, but without the full complexity of weight updates. It’s akin to a student who can update specific chapters in their notes but not rewrite the entire textbook.

Weights (Parametric Learning): This is the frontier of true continual learning, where the model's core parameters are updated through experience. This is where the most profound, yet nascent, progress is being made. Abakirova notes that this field is still in its early stages, with various teams exploring different paradigms. Some focus on reinforcement learning data and systems, while others question the fundamental transformer architecture itself, suggesting that novel architectures are needed for genuine, continuous learning. This is the ideal state: a student who not only looks up information but truly understands, integrates, and builds upon it, becoming demonstrably smarter over time.

The critical insight here is that while all these mechanisms involve learning, the depth and permanence of that learning vary dramatically. The current reliance on context is a workaround, not a solution for continuous improvement. The real advantage lies in developing systems that can genuinely update their weights, creating a learning loop where use directly translates to improvement, mirroring human cognitive development.
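
To make the contrast between context and weights concrete, here is a minimal toy sketch in Python. It is an illustration only, not anything described in the episode: the class names FrozenWithNotes and OnlineLearner, the one-parameter model y = w * x, and the learning rate are all assumptions chosen for brevity. The "world" shifts from y = 2x to y = 3x after training, loosely analogous to a library shipping a breaking change.

```python
# Toy contrast: context-style notes vs. weight updates (illustrative assumptions only).

class FrozenWithNotes:
    """Frozen weight plus an external note store, standing in for context/RAG."""

    def __init__(self, w):
        self.w = w          # never changes after "training"
        self.notes = {}     # external memory: exact inputs seen so far

    def observe(self, x, y):
        self.notes[x] = y   # write a sticky note; the weight is untouched

    def predict(self, x):
        # Use a note if one exists, otherwise fall back to the stale weight.
        return self.notes.get(x, self.w * x)


class OnlineLearner:
    """Same starting weight, but each observation nudges the weight (online SGD)."""

    def __init__(self, w, lr=0.05):
        self.w = w
        self.lr = lr

    def observe(self, x, y):
        error = self.w * x - y
        self.w -= self.lr * error * x   # gradient step on 0.5 * error**2

    def predict(self, x):
        return self.w * x


# Both systems were "trained" when the world was y = 2x; it is now y = 3x.
frozen = FrozenWithNotes(w=2.0)
learner = OnlineLearner(w=2.0)

for _ in range(50):
    for x in (1.0, 2.0, 3.0):
        y = 3.0 * x
        frozen.observe(x, y)
        learner.observe(x, y)

print("seen input 2.0:", frozen.predict(2.0), learner.predict(2.0))
print("unseen input 10.0:", frozen.predict(10.0), learner.predict(10.0))
```

On inputs it has notes for, the frozen system answers correctly; on the unseen input it falls back to the stale weight, while the online learner's updated weight carries over. Real continual learning is far harder than this toy suggests, not least because weight updates in large models can overwrite earlier knowledge.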

The Memento Metaphor and the Future of AI

The recurring Memento metaphor is not just a stylistic choice; it's a potent encapsulation of the current AI paradigm's core limitation: a frozen present. The protagonist, Leonard Shelby, cannot form new memories, forcing him to rely on external aids to navigate his life. Similarly, today's AI models, after their initial training cutoff, operate without the ability to genuinely integrate new experiences into their core understanding.

Malika Abakirova posits that the "ultimate test" for AI is its capacity for continual learning--the ability to learn on the job and improve through use, just as humans do. This isn't about achieving Artificial General Intelligence (AGI) in a brute-force sense, but about replicating the more nuanced, adaptive learning process that defines human intelligence. The development of benchmarks specifically designed to measure continual learning, as researchers at Berkeley and other labs are undertaking, is crucial for defining and advancing this capability.

The implication is that the very definition of an AI "model" may need to evolve. Instead of static entities trained once, we might need to think of them as dynamic systems that are constantly refining themselves. This shift has profound consequences for how we build, deploy, and interact with AI. It suggests that the most advanced AI won't just be the most powerful in a single moment, but the one that demonstrably gets better and more capable over time, adapting to new information and user interactions.

"Is there a system that is able to learn on the job and get better through use, just like humans? I think that would be kind of the question."

The emergence of early examples of "on-the-job learning," like test-time training that allows models to adapt to out-of-distribution data, offers a glimpse into this future. These aren't just incremental improvements; they represent a fundamental departure from the frozen model paradigm. The competitive advantage will accrue to those who can build or leverage systems that don't just respond to the world but actively learn from it, creating a virtuous cycle of improvement and adaptation.
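
As a rough illustration of adapting at test time, the sketch below uses a deliberately simple stand-in: re-estimating input normalization statistics from the incoming batch. This is a simpler relative of test-time training (which typically takes gradient steps on a self-supervised objective), not the method discussed in the episode, and the classifier, data, and names here are hypothetical.

```python
from statistics import mean, pstdev


class ThresholdClassifier:
    """Predicts 1 when the normalized input is above zero, else 0."""

    def __init__(self, train_xs):
        # Normalization statistics frozen at training time.
        self.mu = mean(train_xs)
        self.sigma = pstdev(train_xs) or 1.0

    def predict(self, xs, adapt=False):
        mu, sigma = self.mu, self.sigma
        if adapt:
            # Test-time adaptation: re-estimate statistics from the unlabeled batch.
            mu = mean(xs)
            sigma = pstdev(xs) or 1.0
        return [1 if (x - mu) / sigma > 0 else 0 for x in xs]


train_xs = [-2.0, -1.0, 1.0, 2.0]   # training distribution centered at 0
clf = ThresholdClassifier(train_xs)

shifted_xs = [3.0, 4.0, 6.0, 7.0]   # same pattern, shifted by +5 (out of distribution)
true_labels = [0, 0, 1, 1]

frozen_preds = clf.predict(shifted_xs)
adapted_preds = clf.predict(shifted_xs, adapt=True)

print("frozen: ", frozen_preds)
print("adapted:", adapted_preds)
```

The adaptation here lasts only for the batch; nothing is written back to the stored statistics, so this shows adjustment to shifted data but not the persistent learning the episode argues for. It also assumes the test batch is roughly balanced around the decision boundary.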

Embrace the Memento Analogy: Recognize that current AI models are largely "frozen" after training. Understand the limitations this imposes on their ability to adapt to new information or evolving circumstances.
Prioritize Weight Updates over Context Augmentation: While RAG and large context windows are useful, focus on research and development that aims to update the model's core parameters (weights) for true, lasting learning. This is a longer-term investment but offers greater durability.
Explore Hybrid Learning Mechanisms: Investigate and experiment with "module" level updates or other middle-ground approaches that offer more dynamic adaptation than purely non-parametric methods, but with less complexity than full weight retraining.
Develop Continual Learning Benchmarks: Support or contribute to the creation of robust benchmarks that accurately measure an AI's ability to learn on the job and improve through use, moving beyond static performance metrics.
Redefine "Model" as a Dynamic System: Shift the mindset from viewing AI models as static artifacts to dynamic systems capable of continuous improvement and adaptation. This requires a strategic re-evaluation of deployment and update cycles.
Invest in "On-the-Job" Learning Capabilities: Seek out or build AI systems that demonstrate early signs of test-time adaptation and out-of-distribution learning. This capability will be a key differentiator for systems facing unpredictable real-world environments.
Foster Cross-Disciplinary Research: Encourage collaboration between AI researchers, engineers, and cognitive scientists to bridge the gap between current AI limitations and the nuanced learning processes observed in humans. This is a multi-year endeavor.
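
One way to picture what a continual learning benchmark might report is sketched below. It computes two summary numbers that are commonly used in the continual learning literature, final average accuracy and backward transfer, from a matrix of accuracies. This is an assumption about what such a benchmark could measure, not a description of the benchmarks mentioned in the episode; the function names and the accuracy values are made up for illustration.

```python
def average_accuracy(acc):
    """Mean accuracy over all tasks after training on the final task.

    acc[i][j] is the accuracy on task j measured after learning task i.
    """
    final_row = acc[-1]
    return sum(final_row) / len(final_row)


def backward_transfer(acc):
    """Average change on earlier tasks between when each was learned and the end.

    Negative values indicate forgetting; positive values mean later learning
    improved performance on earlier tasks.
    """
    num_tasks = len(acc)
    if num_tasks < 2:
        raise ValueError("backward transfer needs at least two tasks")
    deltas = [acc[-1][j] - acc[j][j] for j in range(num_tasks - 1)]
    return sum(deltas) / len(deltas)


# Hypothetical accuracies for three tasks learned in sequence.
acc = [
    [0.90, 0.20, 0.10],   # after learning task 0
    [0.80, 0.85, 0.20],   # after learning task 1
    [0.70, 0.80, 0.90],   # after learning task 2
]

print("average accuracy: ", round(average_accuracy(acc), 3))
print("backward transfer:", round(backward_transfer(acc), 3))
```

A system that "gets better through use" would be expected to show high average accuracy without strongly negative backward transfer; a static performance metric captures only the first of these.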
