
Transformer's Local Minimum: A New AI Path Emerges

Resources & Recommendations

Books

  • "Why Greatness Cannot Be Planned" by Kenneth Stanley - This book discusses the importance of allowing people to follow their own interests unfettered by objectives to foster epistemic foraging and discover novelty, which is a core philosophy of Sakana AI.

Research & Studies

  • "Continuous Thought Machines" (Sakana AI) - This paper introduces a new recurrent model with native adaptive compute, using higher-level concepts for neurons and synchronization as a representation, designed to solve problems in more human-like ways. It was a spotlight at NeurIPS 2025.
  • "Fractured Entangled Representations" - This paper was alluded to in the context of the shortcut learning problem in current language models.
  • "Intelligence Matrix Exponentiation" - An obscure paper, possibly rejected, that serves as a poster child for representing data more intrinsically, demonstrating a spiral decision boundary for a spiral dataset.
  • "Adaptive Computation Time" (Alex Graves) - This paper details a method for neural networks to decide how much computation to perform based on the input, though it required careful balancing of a penalty loss.
  • "On the Biology of a Large Language Model" (Anthropic) - A paper mentioned in the context of planning and thinking in AI systems.
  • "Neural Turing Machine" (Alex Graves) - A type of recurrent neural network that can interact with an external memory, discussed in relation to the Continuous Thought Machines.
  • "Sudoku Bench" - A dataset created to benchmark powerful reasoning systems, featuring variant Sudoku puzzles with complex, natural language-described rules, requiring meta-reasoning and deep understanding.
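The Adaptive Computation Time entry above can be illustrated with a minimal sketch, assuming a toy recurrent cell. Names such as `act_forward` and `halting` weights are illustrative, not from the Graves paper; the point is the mechanism: a sigmoid halting unit accumulates probability each ponder step, the network stops once the total exceeds 1 - epsilon, and the leftover "remainder" feeds the ponder-cost penalty that the paper's loss must carefully balance.

```python
# Minimal sketch of Adaptive Computation Time (Graves, 2016).
# The cell, weights, and helper names here are illustrative assumptions,
# not the paper's reference implementation.
import numpy as np

rng = np.random.default_rng(0)

def rnn_step(state, x, W, U):
    # One recurrent update on the same input x (a single "ponder" step).
    return np.tanh(W @ state + U @ x)

def act_forward(x, W, U, w_h, b_h, eps=0.01, max_steps=10):
    """Run the cell repeatedly; a sigmoid halting unit decides when the
    accumulated halting probability exceeds 1 - eps."""
    state = np.zeros(W.shape[0])
    probs, states = [], []
    remainder, ponder_cost = 1.0, 0.0
    for n in range(max_steps):
        state = rnn_step(state, x, W, U)
        h = 1.0 / (1.0 + np.exp(-(w_h @ state + b_h)))  # halting probability
        if sum(probs) + h >= 1.0 - eps or n == max_steps - 1:
            probs.append(remainder)  # final step absorbs the leftover mass
            states.append(state)
            # Ponder cost N + R: what the auxiliary penalty loss discourages.
            ponder_cost = (n + 1) + remainder
            break
        probs.append(h)
        remainder -= h
        states.append(state)
    # Output is the halting-weighted mixture of the intermediate states.
    out = sum(p * s for p, s in zip(probs, states))
    return out, ponder_cost

d = 4
W, U = rng.normal(size=(d, d)) * 0.5, rng.normal(size=(d, d)) * 0.5
w_h, b_h = rng.normal(size=d), 0.0
out, cost = act_forward(rng.normal(size=d), W, U, w_h, b_h)
```

Because the halting weights are learned, the network can spend more steps on harder inputs; the delicate part the episode alludes to is weighting the ponder-cost term so the model neither halts immediately nor ponders forever.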

People Mentioned

  • Kenneth Stanley (Author) - His book "Why Greatness Cannot Be Planned" is a huge inspiration for Sakana AI's philosophy.
  • Sepp Hochreiter - Mentioned as having new architectural ideas that are not being implemented by major AI labs.
  • Sara Hooker - Referenced for her work on "the hardware lottery," drawing a parallel to an "architecture lottery" in AI.
  • Randall Balestriero - Mentioned for his spline theory of neural networks.
  • Alex Graves - Referenced for his work on Adaptive Computation Time and Neural Turing Machines.
  • Andrej Karpathy - Quoted for his idea that true AGI would require learning from "thought traces" rather than just text data.
  • Chris Moore - Mentioned in relation to the Sudoku Bench dataset.

Organizations & Institutions

  • Sakana AI - The company founded by Llion Jones, focused on pursuing research directions that foster freedom and exploration in AI.
  • Google DeepMind - Omar, a product and design lead, works here.
  • Tufa AI Labs - A research lab based in Zurich that won the ARC AGI 3 public competition and is hiring ML engineers and research scientists focused on AI safety.
  • OpenAI - Mentioned as an example of a company that, despite huge achievements, is now facing commercialization pressures that could lead to "technology capture."

Websites & Online Resources

  • AI Studio (ai.studio/build) - A platform by Google DeepMind for mixing and matching AI capabilities and building applications with Gemini.
  • Tufa AI Labs Website - Where one can learn about their approach to winning the ARC AGI 3 public competition and explore job opportunities.
  • Cracking the Cryptic (YouTube channel) - Two British gentlemen solve extremely difficult Sudoku puzzles, providing detailed reasoning that was scraped to create the Sudoku Bench dataset for imitation learning.

Other Resources

  • NeurIPS 2025 - The conference where the "Continuous Thought Machines" paper was a spotlight.

---
Handpicked links, AI-assisted summaries. Human judgment, machine efficiency.
This content is a personally curated review and synopsis derived from the original podcast episode.