Transformer's Local Minimum: A New AI Path Emerges
Resources & Recommendations
Books
- "Why Greatness Cannot Be Planned" by Kenneth Stanley - This book discusses the importance of allowing people to follow their own interests unfettered by objectives to foster epistemic foraging and discover novelty, which is a core philosophy of Sakana AI.
Research & Studies
- "Continuous Thought Machines" (Sakana AI) - This paper introduces a new recurrent model with native adaptive compute, using higher-level concepts for neurons and synchronization as a representation, designed to solve problems in more human-like ways. It was a spotlight at NeurIPS 2025.
- "Fractured Entangled Representations" - This paper was alluded to in the context of the shortcut learning problem in current language models.
- "Intelligence Matrix Exponentiation" - An obscure paper, possibly rejected, that serves as a poster child for representing data more intrinsically, demonstrating a spiral decision boundary for a spiral dataset.
- "Adaptive Computation Time" (Alex Graves) - This paper details a method for neural networks to decide how much computation to perform based on the input, though it required careful balancing of a penalty loss.
- "Anthropic Biology" - A paper mentioned in the context of planning and thinking in AI systems.
- "Neural Turing Machine" (Alex Graves) - A type of recurrent neural network that can interact with an external memory, discussed in relation to the Continuous Thought Machines.
- "Sudoku Bench" - A dataset created to benchmark powerful reasoning systems, featuring variant Sudoku puzzles with complex, natural language-described rules, requiring meta-reasoning and deep understanding.
People Mentioned
- Kenneth Stanley (Author) - His book "Why Greatness Cannot Be Planned" is a huge inspiration for Sakana AI's philosophy.
- Sepp Hochreiter - Mentioned as having new architectural ideas that are not being implemented by major AI labs.
- Sara Hooker - Referenced for her work on "the hardware lottery," drawing a parallel to an "architecture lottery" in AI.
- Randall Balestriero - Mentioned for his spline theory of neural networks.
- Alex Graves - Referenced for his work on Adaptive Computation Time and Neural Turing Machines.
- Andrej Karpathy - Quoted for his idea that true AGI would require learning from "thought traces" rather than just text data.
- Chris Moore - Mentioned in relation to the Sudoku Bench dataset.
Organizations & Institutions
- Sakana AI - The company co-founded by Llion Jones, focused on pursuing research directions that foster freedom and exploration in AI.
- Google DeepMind - Omar, a product and design lead, works here.
- Tufa AI Labs - A research lab based in Zurich, known for winning the ARC AGI 3 pub competition and hiring ML engineers and research scientists focused on AI safety.
- OpenAI - Mentioned as an example of a company that, despite huge achievements, is now facing commercialization pressures that could lead to "technology capture."
Websites & Online Resources
- AI Studio (ai.studio/build) - A platform by Google DeepMind for mixing and matching AI capabilities and building applications with Gemini.
- Tufa AI Labs Website - Where one can learn about their approach to winning the ARC AGI 3 pub competition and explore job opportunities.
- Cracking the Cryptic (YouTube channel) - Two British gentlemen solve extremely difficult Sudoku puzzles, providing detailed reasoning that was scraped to create the Sudoku Bench dataset for imitation learning.
Other Resources
- NeurIPS 2025 - The conference where the "Continuous Thought Machines" paper was a spotlight.