Spatial Intelligence: Beyond LLMs to Generative 3D Worlds

Original Title: After LLMs: Spatial Intelligence and World Models — Fei-Fei Li & Justin Johnson, World Labs

Resources & Recommendations

Books

  • Fei-Fei Li's book - Mentioned in connection with her thoughts and experiences in the field of AI, particularly the development of image captioning.

Research & Studies

  • "Dense Captioning" (Justin Johnson, Andrej Karpathy, Fei-Fei Li) - This paper, presented at CVPR 2016, describes a system that takes a single image, draws bounding boxes around interesting objects, and generates a short descriptive snippet for each.
  • "Image Captioning" (Andrej Karpathy, Fei-Fei Li) - This paper, presented at CVPR 2015, describes one of the first systems to combine convolutional neural networks and LSTMs to generate a single-sentence caption from an image.
  • RTFM Model (World Labs) - This model generates frames one at a time as a user interacts with the system, showcasing an alternative approach to 3D world generation.
  • BEHAVIOR (Stanford lab) - An open dataset and benchmark for robotic learning in simulated environments, promoting open science in AI research.

People Mentioned

  • Yann LeCun - Mentioned as a prominent proponent of world models.
  • Andrej Karpathy - Referenced as a former student and collaborator of Fei-Fei Li who worked on image captioning and language modeling.
  • John Markoff (New York Times) - A reporter who covered the independent development of image captioning research by both Google and Fei-Fei Li's lab.
  • Howard Gardner - A psychologist mentioned for his theory of multiple intelligences, which includes linguistic and spatial intelligence.
  • Francis Crick and James Watson - Mentioned in the context of their spatial reasoning during the deduction of the DNA double helix structure.
  • Sir Isaac Newton - Referenced in the discussion of formalizing empirical, spatial understanding into language, specifically regarding the laws of gravity.

Organizations & Institutions

  • World Labs - The company founded by Fei-Fei Li and Justin Johnson, focused on spatial intelligence and world models.
  • Stanford's Institute for Human-Centered AI (Stanford HAI) - Fei-Fei Li is a co-director, advocating for public sector and academic AI work.
  • University of Michigan - Where Justin Johnson was a professor before joining World Labs.
  • Meta - Where Justin Johnson worked previously.
  • Google - Referenced for its independent work on image captioning and for a team that advised Justin Johnson to pursue a PhD in computer vision.

Websites & Online Resources

  • National AI Research Resource (NAIRR) Bill - A bill discussed with policymakers to scope out a national AI compute cloud and data repository.
  • World Labs Homepage - The website for World Labs, which showcases different use cases for Marble, including visual effects, gaming, and simulation.

Other Resources

  • AlexNet - A convolutional neural network that significantly advanced image recognition, mentioned as a historical turning point in AI and a benchmark for later advancements.
  • ImageNet - A large visual database that was crucial for the training and success of AlexNet.
  • Gaussian Splats - The native output format of Marble, described as tiny, semi-transparent particles with position and orientation in 3D space, which can be rendered efficiently in real time.
  • Linux Source Code - Used as a dataset to train an RNN for language modeling, allowing researchers to analyze the internal workings of the LSTM.

---
Handpicked links, AI-assisted summaries. Human judgment, machine efficiency.
This content is a personally curated review and synopsis derived from the original podcast episode.