Spatial Intelligence: AI's Next 3D Frontier Unlocked

The a16z Show · November 13, 2025

Original Title: The Frontier of Spatial Intelligence with Fei-Fei Li

Resources

Essays

"The Bitter Lesson" (Rich Sutton) - A 2019 essay arguing that general methods which scale with available compute ultimately outperform approaches built on hand-crafted algorithmic cleverness.

Research & Studies

"AlexNet" (2012 paper) - Credited as the breakthrough moment for deep learning in computer vision, when a deep neural network succeeded on the ImageNet challenge.
"Neural Radiance Fields" (NeRF) (Ben Mildenhall) - This paper presented a clear method for reconstructing 3D structure from 2D observations, significantly impacting the field of 3D computer vision.
"A Neural Algorithm of Artistic Style" (Leon Gatys) - This 2015 paper showed that neural networks could transfer artistic styles onto real-world photographs, a precursor to generative AI.

People Mentioned

Fei-Fei Li (Co-founder of World Labs) - A prominent researcher in AI and computer vision, known for her work on ImageNet and her current focus on spatial intelligence.
Justin Johnson (Co-founder of World Labs) - A researcher who made significant contributions to generative AI and spatial intelligence.
Martin Casado (a16z General Partner) - Co-host of the discussion, providing insights from the venture capital perspective.
Honglak Lee (Google Brain) - Mentioned in relation to an early influential paper on deep learning.
Andrew Ng (Google Brain) - Mentioned in relation to an early influential paper on deep learning and teaching machine learning.
Pietro Perona (Caltech) - The undergraduate advisor for Justin Johnson and the PhD advisor for Fei-Fei Li.
Daphne Koller - Mentioned as an instructor of a complicated Bayesian modeling course.
Geoffrey Hinton - Mentioned for his papers on generative models.
Leon Gatys - Lead author of the paper on artistic style transfer.
Christoph Lassner (Co-founder of World Labs) - Recognized for his work in computer graphics, including a precursor to Gaussian splat representations.
Ben Mildenhall (Co-founder of World Labs) - Known for his seminal work on NeRF.

Organizations & Institutions

World Labs - A company focused on developing spatial intelligence for machines.
Caltech - Where Justin Johnson completed his undergraduate studies and Fei-Fei Li completed her PhD.
Stanford - Where Fei-Fei Li was a professor and where the ImageNet project was significantly developed.
Google Brain - Where early influential papers in deep learning were published.
OpenAI - Mentioned in the context of large language models and multimodal models.
Nvidia - Mentioned for its high-performance GPUs.
FAIR (Meta AI) - Where Justin Johnson worked on 3D computer vision.

Websites & Online Resources

ImageNet - A large-scale dataset of images used for computer vision research, instrumental in the development of modern computer vision.
arXiv - A preprint server where research papers are often first published.
X (formerly Twitter) - Mentioned as a platform for following a16z.
a16z.com - The website for a16z, including disclosures.
a16z.substack.com - The a16z newsletter on Substack.

Other Resources

The Cat Paper - An early, famous paper on deep learning from Google Brain.
Transformer paper ("Attention Is All You Need") - An algorithmic unlock that has been foundational for modern AI, particularly language models.
Stable Diffusion - A generative AI model mentioned as a key unlock in the current wave of AI.
CLIP - A model mentioned in the context of using internet data and human labeling (alt tags) for image understanding.
GANs (Generative Adversarial Networks) - A type of generative model described as difficult to use; early work fed them structured inputs such as scene graphs.
LSTM (Long Short-Term Memory) - A type of recurrent neural network architecture used before transformers.
RNN (Recurrent Neural Network) - A type of neural network architecture.
GRU (Gated Recurrent Unit) - A type of recurrent neural network architecture.
GPT-2 - A large language model mentioned as requiring significant resources to train.
Scene Graphs - A structured way of representing objects and their relationships, used as input for early generative models.
Gaussian Splat Representation - A 3D modeling technique that has recently gained traction.
VR Headset - Mentioned as a transformative technology experience.
Apple Vision Pro - A spatial computing device released by Apple.