Object-Centered AI Models Grounded in Physics for True Understanding
The current AI boom, fueled by massive scaling and powerful tools like automatic differentiation, has delivered impressive function approximators. However, this conversation with Dr. Jeff Beck reveals a critical blind spot: we've been building AI backwards by treating language, not physics, as the foundational model of intelligence. This approach, while yielding striking results in areas like LLMs, fundamentally misunderstands how biological intelligence operates. The brain doesn't predict text; it builds causal models of a physical world composed of objects and forces. This distinction is crucial because it unlocks the potential for AI that can truly understand, interact with, and adapt to the complexities of the real world, moving beyond mere pattern matching to genuine problem-solving and invention. Those who grasp this shift from prediction to causal modeling will gain a significant advantage in building more robust, adaptable, and genuinely intelligent systems.
The Illusion of the Prediction Machine
The prevailing paradigm in AI, particularly with large language models (LLMs), is that intelligence is fundamentally about prediction. We train colossal models on vast datasets, enabling them to predict the next word, pixel, or action with remarkable accuracy. This approach, while producing impressive feats of fluency and pattern recognition, is akin to mistaking a sophisticated mimic for a true thinker. Dr. Jeff Beck argues that this focus on prediction, particularly when grounded in language, is a fundamental misdirection. Language, he points out, is notoriously unreliable as a descriptor of thought processes or reality. Self-reporting, a common method for understanding human behavior, is often inconsistent with actual observed actions.
"Self-report is the least reliable form of data that one gets out of a cognitive or psychological experiment."
-- Dr. Jeff Beck
This reliance on language as the bedrock of AI leads to models that are excellent at generating plausible outputs but lack a deep, causal understanding of the world. They operate in a statistical space of tokens rather than a physical space of objects and forces. This is where the "Cat in the Warehouse Problem" becomes a stark illustration. An AI trained solely on warehouse operations might excel at managing forklifts and boxes but would be utterly stumped by the unexpected appearance of a cat. It wouldn't know what it doesn't know, leading to potential system failures or dangerous actions. The consequence of this predictive, language-centric approach is AI that is brittle, unable to generalize to novel situations, and fundamentally incapable of the kind of creative problem-solving that characterizes human intelligence.
The Brain as a Scientist: Causal Models and Uncertainty
In contrast to the prediction machine, Beck proposes that the brain operates more like a scientist, constantly building and testing causal models of the world. This perspective is rooted in Bayesian inference, a framework that describes how we update our beliefs in the face of new evidence, inherently accounting for uncertainty. The brain, in this view, isn't just predicting; it's actively inferring, hypothesizing, and experimenting. This is evident in human behavior, particularly in tasks involving sensory integration, where we optimally combine information from different senses, adjusting for the reliability of each cue on a trial-by-trial basis.
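The trial-by-trial cue weighting Beck describes has a compact mathematical core. The sketch below (illustrative, not taken from the conversation) fuses two Gaussian cues by precision weighting, the standard Bayes-optimal rule for combining independent Gaussian evidence: the noisier cue contributes less, and the fused estimate is more certain than either cue alone.

```python
def fuse_cues(mu_a, var_a, mu_b, var_b):
    """Precision-weighted (Bayes-optimal) fusion of two Gaussian cues.

    Each cue is weighted by its reliability (inverse variance), so a
    noisy cue pulls the combined estimate less than a reliable one.
    """
    w_a = 1.0 / var_a  # precision of cue A
    w_b = 1.0 / var_b  # precision of cue B
    mu = (w_a * mu_a + w_b * mu_b) / (w_a + w_b)
    var = 1.0 / (w_a + w_b)  # fused variance shrinks: precisions add
    return mu, var

# A reliable visual cue at 0.0 and a noisy auditory cue at 4.0:
mu, var = fuse_cues(mu_a=0.0, var_a=1.0, mu_b=4.0, var_b=4.0)
# The fused estimate (0.8) sits much closer to the reliable cue, and the
# fused variance (0.8) is smaller than either cue's variance alone.
```

Adjusting the weights as `var_a` and `var_b` change from trial to trial is exactly the reliability-sensitive integration observed in human sensory experiments.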
"Bayesian inference provides us with like a normative approach to empirical inquiry and encapsulates the scientific method writ large."
-- Dr. Jeff Beck
The implication here is profound: true intelligence requires a model of the world that is structured around objects, their properties, and the forces that govern their interactions. This object-centered, causal approach allows for a more robust understanding of how the world works, enabling AI to not only predict but also to reason, adapt, and invent. The advantage of this approach lies in its ability to handle uncertainty explicitly. Instead of confidently producing incorrect outputs, an AI grounded in causal modeling can recognize when it lacks information and actively seek to acquire it, much like the warehouse AI that can "phone a friend" for information about the cat. This "knowing what you don't know" is a critical step towards more reliable and trustworthy AI systems.
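One simple way to make "knowing what you don't know" concrete is to measure the entropy of a model's predictive distribution and defer whenever it is too high. The threshold and function names below are invented for illustration:

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def act_or_ask(probs, max_entropy=0.5):
    """Act only when the predictive distribution is sharp; otherwise
    flag the knowledge gap and request help ('phone a friend')
    rather than guessing confidently."""
    return "act" if entropy(probs) <= max_entropy else "ask"

# A confident prediction triggers action; a near-uniform one triggers a query.
act_or_ask([0.98, 0.01, 0.01])        # -> "act"
act_or_ask([1 / 3, 1 / 3, 1 / 3])     # -> "ask"
```

The point is architectural, not the specific threshold: a system that represents its own uncertainty has a principled trigger for information-seeking behavior, which a pure next-token predictor lacks.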
The Future: A Symphony of Small Models
The current trend towards ever-larger, monolithic AI models might be misguided. Beck suggests that a more effective and efficient architecture for AI mirrors the complexity of video game engines: a vast collection of smaller, modular "object models." Each model represents a specific object or concept, with defined properties and interaction rules. When faced with a new environment or task, the AI can dynamically select and instantiate only the relevant models, creating a sparse and computationally efficient system.
This "lots of little models" approach offers several advantages. Firstly, it allows for more efficient learning and adaptation. Instead of retraining an entire massive model, individual object models can be updated or replaced as needed. Secondly, it promotes better generalization. By understanding the fundamental properties and interactions of objects, the AI can combine them in novel ways to solve new problems, akin to systems engineering where known components are assembled into new creations. This contrasts sharply with current LLMs, which operate in a pixel or token space, where macroscopic concepts are implicit rather than explicit. The consequence of this modular approach is AI that is not only more capable but also more interpretable and debuggable, as the behavior of individual components can be more easily understood.
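A toy sketch of the game-engine analogy (class names and dynamics invented for illustration): small object models live in a registry, and a scene instantiates only the components actually present, yielding a sparse world model.

```python
class ObjectModel:
    """Base class for a small, reusable object model with its own
    properties and interaction rules."""
    name = "object"

    def step(self, state, dt):
        raise NotImplementedError

class Box(ObjectModel):
    name = "box"

    def step(self, state, dt):
        return state  # inert unless acted on

class Forklift(ObjectModel):
    name = "forklift"

    def step(self, state, dt):
        x, v = state
        return (x + v * dt, v)  # simple constant-velocity dynamics

# Registry of known object models, keyed by name.
REGISTRY = {cls.name: cls for cls in (Box, Forklift)}

def instantiate(scene):
    """Build a sparse world model from only the objects present.
    (A production system would flag unknown names -- the cat! --
    instead of silently dropping them.)"""
    return [REGISTRY[name]() for name in scene if name in REGISTRY]

world = instantiate(["forklift", "box"])
```

Updating one object model leaves the rest untouched, and unfamiliar objects surface explicitly as registry misses rather than being absorbed into an opaque monolithic network.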
Actionable Insights for Building Smarter AI
The insights from this conversation offer a roadmap for developing more sophisticated AI, moving beyond the limitations of current approaches.
- Shift from Prediction to Causal Modeling: Prioritize building AI systems that model the causal relationships in the physical world, rather than solely focusing on predicting sequential data like text. This requires grounding models in physics and object interactions.
- Embrace Uncertainty Explicitly: Develop AI that can represent and reason about its own uncertainty. This enables systems to identify knowledge gaps and seek clarification, rather than generating confident but incorrect outputs.
- Adopt a Modular Architecture: Explore building AI from a collection of specialized, reusable object models, similar to video game engines. This allows for greater flexibility, efficiency, and interpretability.
- Ground Models in Physics, Not Just Language: Recognize the limitations of language as a primary grounding mechanism for AI. Prioritize models that understand the physical properties and dynamics of objects and their environment.
- Invest in Continual Learning: Implement systems that can continuously learn and adapt from new interactions and data, rather than relying on static, pre-trained models. This is crucial for real-world adaptability.
- Develop Object-Centric Representations: Focus on creating AI that understands the world in terms of discrete objects and their relationships, mirroring human conceptualization. This will be key for enabling systems to perform complex tasks like systems engineering.
- Prioritize Simulation Fidelity: If using simulated environments for training, ensure they accurately reflect real-world physics and dynamics. This is essential for successful transfer learning to robotic systems.
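As a minimal illustration of continual, uncertainty-aware learning (a sketch of the general idea, not a method prescribed in the conversation), a conjugate Beta-Bernoulli model updates its posterior in closed form with every new observation, so learning never stops at deployment:

```python
class BetaBernoulli:
    """Beta-Bernoulli model with conjugate updates: each observation
    refines the posterior in closed form, so the system keeps adapting
    after deployment instead of freezing at training time."""

    def __init__(self, alpha=1.0, beta=1.0):
        self.alpha, self.beta = alpha, beta  # Beta(1, 1) = uniform prior

    def update(self, success):
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def mean(self):
        # Posterior mean estimate of the success probability.
        return self.alpha / (self.alpha + self.beta)

model = BetaBernoulli()
for outcome in (True, True, True, False):
    model.update(outcome)
# Posterior mean after 3 successes and 1 failure: 4 / 6
```

Because the posterior carries both an estimate and its uncertainty, the same machinery supports the "embrace uncertainty" insight above: a wide posterior is itself a signal to gather more data before acting.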