AI's Diverging Paths: Power Versus Practicality

Original Title: Claude Fable 5 Is Incredible. And A Little Scary.

The two edges of AI: when Fable 5 costs what it costs, and Siri finally works

This conversation describes a split in the AI landscape. Anthropic released Fable 5, the most capable publicly available model yet, but deliberately limited its cybersecurity and biology capabilities by routing those queries to a weaker model. At the same time, Apple finally shipped on-device AI that works, but it's three years late and does nothing that other AI assistants couldn't already do. The less obvious point: the gap between "most powerful" and "most useful" is getting wider, not narrower. Anyone building on AI needs to understand which edge they're operating on, because the cost and latency profiles of these two worlds are diverging fast.

Why the obvious fix creates a new problem

Anthropic's safety approach with Fable 5 shows a fascinating systems dynamic. They didn't just add guardrails. They built a routing system that actively redirects any request touching cybersecurity or biology to Opus 4.8, a weaker, older model. The immediate benefit is clear: reduced risk of catastrophic misuse. But the downstream effect creates a new vulnerability.

"Our safety systems for Fable 5 automatically review requests that touch on high-risk areas like cybersecurity or biology. Those requests are then redirected to Opus 4.8. We do that intentionally."

The hidden consequence: the model's ability to classify what counts as a high risk request is itself a function of its intelligence. If Fable 5 is the most capable model ever released, it's also the most capable at evading its own safety systems. The system responds by routing ambiguous requests to a weaker model, but that weaker model is also worse at detecting whether a request should have been routed. Over time, this creates a feedback loop where sophisticated users learn to phrase dangerous requests in ways that bypass the classifier, while legitimate users get frustrated by false positives.
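The routing dynamic described above can be illustrated with a toy sketch. This is not Anthropic's implementation: the keyword list, the model names, and the `route` function are all hypothetical stand-ins for a real classifier, chosen only to show how a router produces both false positives and misses.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical keyword list standing in for a real risk classifier.
HIGH_RISK_TERMS = {"exploit", "malware", "pathogen", "toxin"}


@dataclass
class RoutingDecision:
    model: str
    flagged_terms: List[str]


def route(prompt: str,
          primary: str = "frontier-model",
          fallback: str = "older-model") -> RoutingDecision:
    """Send a prompt to the fallback model if any high-risk term appears."""
    # Normalize words: strip trailing punctuation and lowercase.
    words = {w.strip(".,!?;:").lower() for w in prompt.split()}
    flagged = sorted(words & HIGH_RISK_TERMS)
    return RoutingDecision(model=fallback if flagged else primary,
                           flagged_terms=flagged)


# A harmless request about a news story still trips the filter (false positive).
print(route("Write show notes about the chatbot exploit story."))
# A request with no listed term goes straight to the primary model.
print(route("Summarize my inbox"))
```

A keyword match is the crudest possible classifier, but the trade-off it shows holds for smarter ones too: loosening the rule lets rephrased requests through, and tightening it catches more legitimate work.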

Kevin and Gavin already experienced this dynamic with the previous Claude model, which refused to write show notes about a Chipotle AI hack story. "Claude was like, I don't think you should include this story. And I am not going to write the notes for it." The system's safety mechanisms create friction that compounds: users adapt by working around them, which makes the safety systems less effective, which forces Anthropic to tighten further.

The 18 month payoff nobody wants to wait for

The conventional wisdom says better AI models make everything easier. Fable 5's benchmarks tell a different story. The agentic coding score jumped from 69.2% (Opus 4.8) to 80.3%. On the Frontier Code (diamond) benchmark, Fable 5 scored 29.3% versus GPT 5.5's 5.7%. These are massive jumps. But the real insight isn't about capability. It's about cost and latency.

Fable 5 costs twice as much per token as Opus 4.8, which was already the most expensive model available. And it's slow. "I've heard this now from multiple people who have tested it: it is very slow. So you also have to plan on these things taking a while." This creates a counterintuitive dynamic: the most capable model is the least suitable for most tasks.

Gavin's wife burned through her entire pro account's token allocation in a single session trying to process her Gmail. That's not a product failure. It's a use case mismatch. As Kevin put it, using Fable 5 for email triage is "like summoning an asteroid because there's an ant on the sidewalk."

The systems thinking insight here is that Fable 5 is best used as a planner and orchestrator, not a worker. It should design the architecture, then delegate the grunt work to cheaper sub agents. This is where the delayed payoff lives: teams that invest in building orchestration layers now will have a competitive advantage in 12 to 18 months, when costs come down and the orchestration patterns are already established. Teams that just throw Fable 5 at every problem will burn through budgets and get frustrated.
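A minimal sketch of the planner-and-delegator pattern, assuming the models are exposed as plain callables. The `Step` type, the `orchestrate` function, and the stub planner and workers are hypothetical names introduced for illustration; real model calls would replace the stubs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    description: str
    needs_frontier: bool = False  # only the hard steps justify the expensive model


def orchestrate(task: str,
                planner: Callable[[str], List[Step]],
                cheap_worker: Callable[[str], str],
                frontier_worker: Callable[[str], str]) -> List[str]:
    """Call the expensive planner once, then run each step on the cheapest suitable worker."""
    results = []
    for step in planner(task):
        worker = frontier_worker if step.needs_frontier else cheap_worker
        results.append(worker(step.description))
    return results


# Stubs standing in for real model calls (assumptions, not a real API).
def stub_planner(task: str) -> List[Step]:
    return [
        Step(f"collect inputs for: {task}"),
        Step(f"design architecture for: {task}", needs_frontier=True),
        Step(f"write boilerplate for: {task}"),
    ]


def stub_cheap(description: str) -> str:
    return f"[cheap] {description}"


def stub_frontier(description: str) -> str:
    return f"[frontier] {description}"


for line in orchestrate("email triage", stub_planner, stub_cheap, stub_frontier):
    print(line)
```

The design choice worth noting is that the frontier model is called once to plan and again only for steps explicitly marked as needing it, so spend on the expensive model stays bounded no matter how many routine steps the plan contains.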

Noam Brown's post crystallized this: we need to judge AI models not just on capabilities, but on cost and token efficiency. The quality-speed-cost triangle is real, and Fable 5 is optimizing hard on quality at the expense of the other two.
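One rough way to put capability and cost on the same scale is expected cost per solved task. The sketch below rests on loud assumptions: the agentic coding scores quoted above are treated as per-attempt success rates, failed attempts are simply retried, both models use the same number of tokens per attempt, and prices are relative (Opus 4.8 as 1.0, Fable 5 as 2.0, from "twice as much per token"). None of these assumptions comes from the episode, and real workloads will differ.

```python
def cost_per_success(relative_price: float, success_rate: float) -> float:
    """Expected spend per solved task if failed attempts are simply retried.

    With independent attempts, the expected number of attempts is 1 / success_rate.
    """
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return relative_price / success_rate


# Relative prices: Opus 4.8 is the baseline (1.0); Fable 5 costs twice as much.
opus = cost_per_success(relative_price=1.0, success_rate=0.692)
fable = cost_per_success(relative_price=2.0, success_rate=0.803)
ratio = fable / opus

print(f"Opus 4.8: {opus:.2f}  Fable 5: {fable:.2f}  ratio: {ratio:.2f}")
```

Under these assumptions Fable 5 still costs roughly 1.7 times as much per solved task, so the higher score only pays for itself on tasks the cheaper model cannot solve at all or where a failed attempt is itself expensive.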

Where immediate pain creates lasting moats

Apple's WWDC announcements are the mirror image of Anthropic's strategy. Siri AI is finally functional: on-device, local, privacy-preserving, and actually working. But it does nothing that other AI assistants couldn't already do. The demos showed basic functionality: searching your texts, setting reminders based on context, reframing photos.

The non-obvious dynamic is that Apple's late arrival might actually be an advantage. They spent years not building a chatbot, which meant they could observe the entire industry's mistakes. Their approach is deeply integrated into the device ecosystem. Every Apple device, from watch to Vision Pro, gets the same AI layer. And they addressed the privacy problem by running models locally and, for requests that go to the cloud, using Google's infrastructure with Nvidia's confidential computing technology.

"They have a version of their secure cloud basically running with Google and Nvidia hardware in the cloud. So they're not fully hosting your secured cloud, but they have new technology from Nvidia that allows them to say with confidence that nobody, not even Apple, not even the host, can see what your requests are."

The consequence: Apple is betting that most users don't need Fable 5's capabilities. They need AI that works reliably, cheaply, and privately on the device they already own. This creates a moat that's hard to replicate: deep OS integration, hardware optimization, and a privacy narrative that competitors can't easily match.

The irony isn't lost on the hosts. Last year's Apple Intelligence shipped message summaries that were comically bad. The hosts recall "the message summarization issues from the past," where the summary was literally the same length as the original message. Now Apple has shipped something that works. The delayed payoff was real, but it required patience most companies don't have.

How the system routes around your solution

The two stories here, Fable 5's safety routing and Apple's local-first approach, reveal a deeper pattern. Every solution creates new problems that the system adapts to.

Fable 5's safety routing will create an arms race: users learn to phrase requests to avoid triggering the classifier, Anthropic tightens the classifier, users adapt again. The system doesn't settle into equilibrium. It oscillates between restriction and circumvention.

Apple's local-first approach creates a different dynamic. By running AI on device, they limit what the model can do: it can't access the full internet, and it can't run massive computations. But they also eliminate latency and privacy concerns. The system responds by making the device itself more valuable, which locks users deeper into the Apple ecosystem. Every time you use Siri AI to search your texts or reframe a photo, you're reinforcing the moat.

The hosts captured this tension perfectly: "We're seeing the two edges of AI right now. The mythos fable is like the far edge, but what people are going to be doing at the end, the AI, Apple AI is kind of like the mainstream."

Key action items

Over the next quarter, audit your AI usage patterns. If you're using expensive frontier models for routine tasks like email triage or basic summarization, switch to cheaper alternatives immediately. You're burning budget on capability you don't need.

Within 30 days, if you're building on Fable 5, design an orchestration layer. Use it as a planner and delegator, not a worker. Let it design the architecture, then route execution to cheaper sub agents. This pays off in 6 to 12 months as costs scale.

This month, test whether your use cases actually need frontier model capability. Most don't. The discomfort of downgrading to a weaker model now creates the advantage of lower costs and faster iteration later.

Over the next 6 months, watch for Apple's on device AI to create a new baseline expectation for privacy and reliability. If you're building consumer AI products, assume users will expect local processing as a default, not a premium feature.

Immediately, if you're in a regulated industry like healthcare, finance, or defense, pay close attention to Fable 5's safety routing. The model's ability to classify high risk requests is itself a function of its intelligence, and that creates unpredictable failure modes.

Over the next 12 to 18 months, invest in orchestration patterns and agent delegation architectures. The models will get cheaper and faster. The companies that already have the orchestration layer built will have a 12 month head start.

This quarter, if you're still using AI for tasks that feel like summoning an asteroid for an ant, stop. The discomfort of finding the right tool for the job now prevents the pain of massive token bills later.
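The usage audit in the first action item can start as a very small script. This is a sketch under stated assumptions: the log format, the task labels, the model names, and the token counts below are all made up for illustration, and a real audit would read from your provider's usage export instead.

```python
from collections import defaultdict

# Hypothetical labels for tasks that rarely need a frontier model.
ROUTINE_TASKS = {"email_triage", "basic_summarization"}


def audit(records, frontier_models):
    """Sum tokens spent on routine tasks that ran on a frontier model."""
    wasted = defaultdict(int)
    for r in records:
        if r["task"] in ROUTINE_TASKS and r["model"] in frontier_models:
            wasted[r["task"]] += r["tokens"]
    return dict(wasted)


# Illustrative, made-up usage records.
usage_log = [
    {"task": "email_triage", "model": "frontier", "tokens": 120_000},
    {"task": "architecture_design", "model": "frontier", "tokens": 40_000},
    {"task": "basic_summarization", "model": "small", "tokens": 8_000},
    {"task": "email_triage", "model": "frontier", "tokens": 30_000},
]

report = audit(usage_log, frontier_models={"frontier"})
print(report)
```

Anything that shows up in the report is a candidate for moving to a cheaper model; tasks that do not appear are either already on a small model or plausibly need the frontier one.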

---
This content is a personally curated review and synopsis derived from the original podcast episode.