2025 AI Advancements: AGI Arrival, Small Models, and Regulatory Divergence

TL;DR

  • The first major copyright case settlement, involving Anthropic and authors, signals a potential shift towards a "pay to train" model for AI development, impacting intellectual property usage.
  • AI influencers are increasingly replacing human user-generated content at scale, with platforms like TikTok and Meta providing tools that enable brands to leverage AI avatars for marketing campaigns.
  • Non-technical individuals are now building software on the fly through "vibe coding" and low-code AI tools, with a recent study indicating 70% of new enterprise apps in 2025 were built this way.
  • Global AI regulations are tightening, particularly in the EU with the AI Act, impacting feature availability and imposing significant penalties for non-compliance, while the US has largely abstained from similar regulation.
  • Narrow AI agents, focused on specific tasks or verticals, are dominating the landscape, outperforming broader general-purpose agents which are still in early development stages.
  • LLM memory and context caching have become a major focus, with major front-end LLMs now offering chat memory and developers standardizing context caching APIs for improved efficiency.
  • Smaller, efficient language models are now more capable and more widely used than the large models of early 2025, with open-source models like GPT-OSS 20B outperforming previous state-of-the-art large models.
  • Mixture of models, an approach where multiple independent models process input and aggregate outputs, is emerging as a strategy for enhanced accuracy and modular AI development.
  • Artificial General Intelligence (AGI) has arguably been achieved, with AI models surpassing human experts on elite thinking tests and on economically valuable work, yet its arrival remains largely unnoticed because there is no agreed definition and no public proclamation.

Deep Dive

In 2025, the AI landscape saw significant advancements that validated bold predictions, particularly in the evolution of AI models and their integration into professional workflows. While the year did not culminate in a universally recognized declaration of Artificial General Intelligence (AGI), evidence suggests that AI capabilities have begun to surpass human performance in economically valuable tasks, blurring the lines of what constitutes AGI and signaling a fundamental shift in human-machine collaboration.

The year's AI developments revealed a clear trend toward more specialized and efficient AI models. Predictions that smaller, more capable language models would outperform larger, earlier versions proved accurate, with open-source models like GPT-OSS 20B outperforming previous state-of-the-art models like GPT-4o despite being significantly smaller. This shift toward "small language models" (SLMs) points to greater accessibility and efficiency. The concept of "mixture of models," where multiple independent AI models are orchestrated to process the same input and their outputs are aggregated, also gained traction. This approach, exemplified by Zoom's success in AI benchmarks by combining off-the-shelf models, suggests a future where complex AI tasks are handled by ensembles of specialized tools rather than a single monolithic model.
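
The "mixture of models" idea described above can be illustrated with a minimal sketch. It assumes the simplest possible aggregation rule, a majority vote, which the episode does not specify; the three `model_*` functions are hypothetical stand-ins for independent models, not real APIs, and nothing here reflects how Zoom or any other vendor actually combines models.

```python
from collections import Counter
from typing import Callable, Sequence

# A "model" here is any function from a prompt to an answer string.
Model = Callable[[str], str]

# Hypothetical stand-ins; a real system would call separate model APIs.
def model_a(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "unknown"

def model_b(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "unsure"

def model_c(prompt: str) -> str:
    return "5"  # deliberately wrong, to show the vote absorbing one bad answer

def mixture_of_models(prompt: str, models: Sequence[Model]) -> str:
    """Send the same prompt to every model and return the most common answer."""
    answers = [model(prompt) for model in models]
    winner, _count = Counter(answers).most_common(1)[0]
    return winner

if __name__ == "__main__":
    print(mixture_of_models("What is 2 + 2?", [model_a, model_b, model_c]))
```

Because each model is just a function behind a shared signature, one can be swapped out when a better model ships without touching the others, which is the modular benefit the episode attributes to this approach.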

Beyond model architecture, AI's impact on professional life became increasingly evident. The prediction that non-technical individuals would be able to build software on the fly, now termed "vibe coding," materialized with tools that democratize software creation. Similarly, the rise of AI influencers and AI-generated content began to displace human user-generated content (UGC) at scale, as seen with the success of virtual influencers like Lil Miquela and the increasing adoption of AI tools by platforms like TikTok and Meta. This trend has implications for marketing, content creation, and the very definition of authenticity online, with Gen Z jobs in the influencer industry facing significant disruption. The integration of AI into enterprise workflows also accelerated, with predictions of "reasoner wrappers" for enterprise data and the increased relevance of virtual machines for AI agents coming to fruition. Microsoft's announcement of Windows 365 for agents, and Google's Project Mariner, highlight the growing need for dedicated computational environments for AI agents.

The year also saw AI become deeply entangled in politics and policy. Regulations tightened globally, with the EU AI Act imposing significant penalties for non-compliance, while the US largely pursued a deregulatory approach. This divergence affects how AI features are deployed and accessed across regions. The prediction that AI would become overly political was validated by executive orders addressing ideological biases in federal AI procurement and by significant campaign donations from major tech companies, underscoring the growing influence of AI on governance and geopolitical strategy.

Ultimately, the most profound implication of 2025's AI advancements is the subtle yet significant arrival of AGI-level capabilities. While a definitive proclamation of AGI remained elusive, AI models consistently outperformed humans on elite thinking tests, including the International Mathematical Olympiad. New benchmarks like OpenAI's GDPval, which assesses performance on real-world, economically valuable tasks, show AI models now exceeding human expert levels in a majority of comparisons. This suggests that while the definition of AGI may continue to evolve, AI systems are increasingly capable of performing tasks that were once exclusively the domain of human intelligence, impacting careers, economies, and societal structures in ways that are only beginning to be understood.

Action Items

  • Audit AI influencer usage: Analyze 5-10 marketing campaigns to quantify AI avatar impact on user engagement and conversion rates.
  • Create runbook for AI agent deployment: Define 4 required sections (setup, common failures, rollback, monitoring) to ensure reliable enterprise AI agent operation.
  • Implement LLM memory tracking: Measure memory usage and effectiveness across 3-5 core LLM applications to optimize context window utilization.
  • Evaluate small model performance: Benchmark 3-5 open-source small language models against proprietary counterparts on domain-specific tasks.
  • Design multi-model orchestration strategy: Develop a framework for integrating 3-5 distinct AI models to handle complex, multi-stage decision-making processes.

Key Quotes

"well yes that happened it was a 1.5 billion settlement where anthropic had a class action lawsuit regarding um allegedly just stealing information from books right that's what was alleged against them and there was a 1.5 billion settlement so this is i think one of the first big pieces that eventually we're going to move to a pay to train model right"

The author argues that the $1.5 billion settlement in the Anthropic class action lawsuit, concerning the alleged use of copyrighted books for training data, signals a shift towards a "pay to train" model in AI development. This highlights the growing legal and financial implications of data sourcing for large language models.


"my actual prediction was that ai influencers or avatars are going to start replacing human user generated content at scale so i'm talking specifically ugc influencer type videos so although there isn't a huge ai like there's not a huge study on this i think it's happening and the crazy thing is is most people don't know"

The author explains their prediction that AI influencers and avatars are beginning to replace human user-generated content (UGC) on a large scale, particularly in influencer-style videos. This trend is happening, but the author notes that most people are unaware of its prevalence.


"y'all this was this is how fast time flies in ai world this was before vibe coding right vibe coding was literally not a thing and i just kind of in january i just kind of predicted there's going to be this thing called vibe coding right except i said it's going to be non technical people building on the fly software i think vibe coding is a much better term right"

The author reflects on their prediction that non-technical individuals would start building software "on the fly," a concept they later recognized as "vibe coding." They note the rapid evolution of AI, as this prediction predated the popularization of the term "vibe coding" by a month.


"reasoners wrappers will hit the scene so my prediction back in january was that tools are going to emerge that wrap reasoning models in enterprise data to drive decisions so essentially how the transformer old school models work on structured data i said well there's going to be a a kind of movement uh for reasoning data or a company's kind of how they make decisions right"

The author discusses their prediction that "reasoner wrappers" would emerge, which are tools designed to integrate reasoning models with enterprise data to inform decision-making. This concept was based on the idea of applying AI's decision-making capabilities to a company's specific data and processes.


"ai becomes overly political i said it ai is going to become deeply entangled in politics and policy conflict yeah that happened you know and i'm looking at this i'm basing the us our biggest listenership is here in the us so looking at this mainly through the us's point of view but in december uh so this week uh president trump signed an executive order blocking states from regulating ai framing deregulation of ai as a critical critical weapon in the ai race against china"

The author asserts that AI has become deeply entangled in politics and policy conflicts, citing President Trump's executive order blocking state regulation of AI as a key example. This action framed AI deregulation as a strategic move in the competition with China, highlighting the political dimension of AI development.


"global ai regulations tighten just not in the us right and yeah that happened and it honestly impacted how users worldwide were able to interact or not interact with their favorite large language model of choice right obviously in the eu very restrictive with the eu ai act and a lot of features like openai's long term memory upgrade got really delayed in the eu and the uk because of this strict ai regulations"

The author confirms their prediction that global AI regulations would tighten, noting that this has significantly impacted user interaction with AI models worldwide. They specifically mention the EU AI Act's restrictiveness, which led to delays in features like OpenAI's long-term memory upgrade in the EU and UK.


"narrow ai agents or narrow agi also achieved but anyways my prediction when it came to agents that there wasn't going to be a runaway general purpose agent the general purpose agents were not going to be good but narrow agents would absolutely dominate right and that prediction came true because you can there's no one great agent out there look at the big players right microsoft copilot i'd say is probably the closest to having a general agent but even those are built on a company's certain use cases right so they're narrow they're not general for the most part"

The author validates their prediction that narrow AI agents, rather than general-purpose ones, would dominate the landscape. They argue that no single general agent has broken out: even Microsoft Copilot, which they consider the closest to a general agent, is built around a company's specific use cases and so remains narrow.


"large language models become small language models well what i meant was my prediction there was that smaller efficient models would become more prominent and capable than january 2025's big models right that's what i was alluding to essentially i said hey in a year there's going to be small models that are way better than big models and people are going to be using small models a lot more than they're using them now"

The author explains their prediction that smaller, more efficient language models would surpass the capabilities of larger models from early 2025. They state that this shift has occurred, with smaller models becoming more prominent and widely used due to their improved performance and efficiency.


"speaking of models mixture of models becomes a thing all right a year ago no one's talking about mixture models we talk about mixture of experts which is a little different but my prediction was that there would be systems that would orchestrate multiple models in parallel as a bundle"

The author discusses their prediction that "mixture of models," systems orchestrating multiple AI models in parallel as a bundle, would become a significant trend. They differentiate this from "mixture of experts," noting that while mixture of experts was discussed previously, mixture of models was a novel concept at the time of their prediction.


"agi is going to be achieved but no one notices right so my prediction was that agi artificial general intelligence would arrive in 2025 but daily life doesn't feel any differently and there's no bold proclamation of you know hey agi has been achieved you know we're going to throw up the agi flag on the federal buildings and you know now the humans taken that"

The author reiterates their prediction that Artificial General Intelligence (AGI) would be achieved in 2025, but without a noticeable change in daily life or a public declaration of its arrival. They suggest that AGI's emergence would be subtle, rather than a dramatic, widely recognized event.

Resources

External Resources

Books

  • "Bartz v. Anthropic" - Mentioned as the formal name of the class action lawsuit against Anthropic regarding the alleged use of copyrighted books for training data.

Articles & Papers

  • "Gen Z Job Warning as New AI Trends Set to Destroy 80% of Influencer Industry" (Yahoo Finance) - Referenced as evidence for the impact of AI trends on the influencer industry.
  • "America's AI Action Plan" - Mentioned as a document signed by the Trump administration that included an executive order preventing the federal government from procuring AI models with ideological biases.
  • "Global AI Assurance Pilot" - Mentioned as an initiative launched by Singapore in February.
  • "National AI Law" - Mentioned as a law that took effect in Italy in October.
  • "AI IQ Test" (Tracking AI) - Referenced as a resource that provides offline IQ tests for large language models.
  • "Mensa Norway Test" - Mentioned as a test used by Tracking AI to assess the IQ of large language models.
  • "GDPval" (OpenAI) - Described as an OpenAI benchmark that tests models on real job tasks across various economic sectors, measuring their performance against human experts.

Research & Studies

  • "70% of new enterprise apps in 2025 were built using low-code AI tools" - Cited as a statistic regarding the adoption of low-code AI tools for enterprise app development.
  • "7.3 billion enterprise investment into departmental AI" (Menlo Ventures report) - Cited as evidence of enterprise investment in narrow AI use cases.
  • "70% of top AI-driven enterprises will use multi-model architectures by 2028" (IDC report) - Cited as a projection for the adoption of multi-model architectures in AI-driven enterprises.

Tools & Software

  • "Symphony" (TikTok) - Mentioned as TikTok's generative AI ad tools.
  • "AI Studio" - Mentioned as a platform where non-technical people can build AI applications without coding knowledge.
  • "Lada" - Mentioned as a tool that helps parse and structure company decision-making processes, pairing reasoning models with human or company reasoning.
  • "Copilot Studio" - Mentioned as a tool from Microsoft related to AI agents.
  • "Agentforce 2.0" (Salesforce) - Mentioned as a tool that handles end-to-end CRM workflows.
  • "Fusion Cloud Apps" - Mentioned as a platform where Oracle deployed role-based AI agents.
  • "GitHub Copilot" - Mentioned as a coding agent.
  • "Claude Code" (Anthropic) - Mentioned as a coding agent.
  • "Interactions API" (Google) - Mentioned as an API that dynamically routes queries and shares information between models in real-time.

People

  • Andrej Karpathy - Mentioned as the source of the term "vibe coding."
  • Paige Bailey - Mentioned as one of the heads at Google who stated that non-technical people are winning hackathons.
  • President Trump - Mentioned for signing an executive order blocking states from regulating AI and for signing "America's AI Action Plan."
  • Sam Altman - Mentioned for defining AGI in the OpenAI charter and for his company's development of GPT-5.
  • Boris Power - Mentioned as the head of applied research at OpenAI, who stated that winning the IMO gold was seen as an AGI-level difficulty problem.

Organizations & Institutions

  • Anthropic - Mentioned in relation to a $1.5 billion settlement for a class action lawsuit and for its financial models.
  • OpenAI - Mentioned in relation to an agreement with Disney, the New York Times lawsuit, its charter definition of AGI, its open-source model GPT-OSS, and its models GPT-5 and GPT-5.2.
  • Disney - Mentioned for entering an agreement with OpenAI involving equity and payment for IP usage.
  • The New York Times - Mentioned for its ongoing lawsuit against OpenAI.
  • Microsoft - Mentioned for its involvement in AI development and its "Phi" series of models.
  • Google - Mentioned for its involvement in AI development, its "Gemini" models, its "Disco" experiment, and its "DeepMind" division.
  • Meta - Mentioned for introducing and expanding its AI ad tools.
  • TikTok - Mentioned for expanding its "Symphony" generative AI ad tools.
  • Samsung - Mentioned as a brand that has partnered with AI influencer Lil Miquela.
  • Calvin Klein - Mentioned as a brand that has partnered with AI influencer Lil Miquela.
  • Prada - Mentioned as a brand that has partnered with AI influencer Lil Miquela.
  • Salesforce - Mentioned for its "Agentforce 2.0" tool.
  • Oracle - Mentioned for deploying role-based AI agents in Fusion Cloud Apps.
  • GitHub - Mentioned for its "Copilot" coding agent.
  • Menlo Ventures - Mentioned for a report on enterprise investment in departmental AI.
  • Google - Mentioned for its "Gemma" series of models.
  • Area - Mentioned as a company with a mixture of models architecture that raised $100 million.
  • Zoom - Mentioned for achieving the highest score on "Humanity's Last Exam" using a mixture of models approach.
  • IDC - Mentioned for a report on multi-model architectures.
  • EU (European Union) - Mentioned for its restrictive "EU AI Act" and its rules for general purpose AI models.
  • UK - Mentioned in relation to strict AI regulations impacting OpenAI's long-term memory upgrade.
  • Singapore - Mentioned for launching its "Global AI Assurance Pilot."
  • Italy - Mentioned for its national AI law.
  • IMO (International Mathematical Olympiad) - Mentioned as a difficult math competition where OpenAI and Google DeepMind won gold.

Websites & Online Resources

  • "youreverydayai.com" - Mentioned as the website for the podcast where video versions of discussions are available and where users can sign up for a daily newsletter.
  • "LMArena" - Mentioned as a benchmark that is judged by humans.

Other Resources

  • AGI (Artificial General Intelligence) - Discussed extensively as a concept that may have been achieved in 2025 without widespread notice, with various definitions and benchmarks explored.
  • Vibe Coding - Mentioned as a term coined by Andrej Karpathy for non-technical people building on-the-fly software.
  • User Generated Content (UGC) - Discussed in the context of AI influencers potentially replacing human UGC.
  • Pay to Train Model - Mentioned as a future direction for AI training, following copyright case settlements.
  • Intellectual Property (IP) - Mentioned in the context of Disney's agreement with OpenAI.
  • Reasoning Data - Mentioned as a concept related to enterprise data used to drive decisions.
  • Agency Layer - Mentioned as a concept related to business decision-making and expertise.
  • Agent Frameworks - Mentioned as public evidence supporting the growth of the agency layer.
  • Observability/Tracing - Mentioned as necessary components for wrappers to function reliably in business settings.
  • Virtual Machines (VMs) - Discussed as becoming trendy or a thing again due to the needs of AI agents.
  • Cloud PCs - Mentioned in relation to Microsoft's announcement for AI agents.
  • General Purpose AI Models - Mentioned in the context of EU regulations.
  • Narrow AI Agents - Discussed as dominating over general-purpose agents.
  • LLM Memory - Mentioned as becoming a major focus, with updates from major LLM providers.
  • Context Caching - Mentioned as a standard API feature and a form of memory for developers.
  • Context Window - Discussed in relation to LLM memory and its utilization.
  • Personalization - Mentioned in relation to Gemini's ability to work with search history.
  • Connectors - Mentioned as a component that works with context and memory for LLMs.
  • Small Language Models (SLMs) - Discussed as becoming more prominent and capable than larger models.
  • Parameters - Explained as a measure of model size, particularly for open-source models.
  • Mixture of Models - Discussed as a system that orchestrates multiple independent models, contrasting with Mixture of Experts.
  • Mixture of Experts (MoE) - Explained as a single sparse model that uses a gating mechanism to activate specific parameters.
  • Humanity's Last Exam - Mentioned as a difficult AI test that Zoom's mixture of models approach excelled at.
  • Model Routing - Mentioned as a related but different concept to mixture of models.
  • Modular Approach - Discussed as a smart strategy for AI, particularly in light of model updates.
  • Economically Valuable Work - Mentioned as a key part of Sam Altman's definition of AGI.
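
The Mixture of Experts entry above describes a single sparse model whose gating mechanism activates only some of its parameters. A minimal sketch of that idea follows, assuming top-k gating with a softmax over the selected scores. The `experts` and `gate_scores` values are hypothetical toy data; in a real model the gate scores come from a learned router inside the network rather than being supplied by hand.

```python
import math
from typing import Callable, List, Sequence

Expert = Callable[[float], float]

def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(
    x: float,
    gate_scores: Sequence[float],
    experts: Sequence[Expert],
    top_k: int = 2,
) -> float:
    """Run only the top_k experts by gate score and blend their outputs."""
    ranked = sorted(range(len(experts)), key=lambda i: gate_scores[i], reverse=True)
    active = ranked[:top_k]  # every other expert is skipped entirely
    weights = softmax([gate_scores[i] for i in active])
    return sum(w * experts[i](x) for w, i in zip(weights, active))

# Hypothetical toy experts standing in for sub-networks of one model.
experts: List[Expert] = [lambda x: x + 1, lambda x: x * 2, lambda x: x ** 2]
gate_scores = [0.0, 1.0, 1.0]  # assumed router output for this input

if __name__ == "__main__":
    print(moe_forward(3.0, gate_scores, experts))
```

The contrast with Mixture of Models is that the experts here live inside one model and most are skipped for any given input, whereas a mixture of models runs several complete, independent models and aggregates their answers.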

---
Handpicked links, AI-assisted summaries. Human judgment, machine efficiency.
This content is a personally curated review and synopsis derived from the original podcast episode.