Spatial Intelligence: Beyond LLMs to Generative 3D Worlds
After LLMs: Spatial Intelligence and World Models — Fei-Fei Li & Justin Johnson, World Labs
Resources & Recommendations
Books
- Fei-Fei Li's book - Mentioned in the discussion of her work in AI, including the development of image captioning.
Research & Studies
- "Dense Captioning" (Justin Johnson, Andrej Karpathy, Fei-Fei Li) - This paper, presented at CVPR 2016, describes a system that takes a single image, draws bounding boxes around interesting objects, and generates a short descriptive snippet for each.
- "Image Captioning" (Andrej Karpathy, Fei-Fei Li) - This paper, presented at CVPR 2015, describes one of the first systems to combine convolutional neural networks and LSTMs to generate a single-sentence caption from an image.
- RTFM model (World Labs) - This model generates frames one at a time as a user interacts with the system, an alternative approach to 3D world generation.
- BEHAVIOR (Stanford lab) - An open dataset and benchmark for robot learning in simulated environments, cited as an example of open science in AI research.
People Mentioned
- Yann LeCun - Mentioned as a prominent proponent of world models.
- Andrej Karpathy - Referenced as a former student and collaborator of Fei-Fei Li who worked on image captioning and language modeling.
- John Markoff (New York Times) - A reporter who covered the independent development of image captioning research by both Google and Fei-Fei Li's lab.
- Howard Gardner - A psychologist mentioned for his theory of multiple intelligences, which includes linguistic and spatial intelligence.
- Francis Crick and James Watson - Mentioned in the context of their spatial reasoning during the deduction of the DNA double helix structure.
- Sir Isaac Newton - Referenced in the discussion of formalizing empirical, spatial understanding into language, specifically regarding the laws of gravity.
Organizations & Institutions
- World Labs - The company founded by Fei-Fei Li and Justin Johnson, focused on spatial intelligence and world models.
- Stanford's Institute for Human-Centered AI (Stanford HAI) - Fei-Fei Li is a co-director and has advocated for public sector and academic AI work.
- University of Michigan - Where Justin Johnson was a professor before joining World Labs.
- Meta - Where Justin Johnson worked previously.
- Google - Referenced for its independent work on image captioning and for a team that advised Justin Johnson to pursue a PhD in computer vision.
Websites & Online Resources
- National AI Research Resource (NAIRR) Bill - A bill discussed with policymakers to scope out a national AI compute cloud and data repository.
- World Labs Homepage - The website for World Labs, which showcases use cases for its Marble product, including visual effects, gaming, and simulation.
Other Resources
- AlexNet - A convolutional neural network that significantly advanced image recognition, mentioned as a historical turning point in AI and a benchmark for later advancements.
- ImageNet - A large visual database that was crucial for the training and success of AlexNet.
- Gaussian Splats - The native output format of Marble, described as tiny, semi-transparent particles with position and orientation in 3D space, which can be rendered efficiently in real time.
- Linux Source Code - Used as a dataset to train an RNN for language modeling, allowing researchers to analyze the internal workings of the LSTM.