The AI Future is Cyborg: How Complex Tooling Creates Unfair Advantages
The core thesis of this conversation is that the future of AI integration with the internet lies not in exposing ever more tools to language models, but in enabling sophisticated code execution within controlled environments. This shift has three less obvious consequences: current MCP (Model Context Protocol) designs hit hard limits, effective AI interaction demands significant engineering effort, and code execution may become the primary method of AI tool use. Those who understand this transition, especially developers and product managers, stand to gain an advantage by building robust code execution environments and by understanding the interplay between AI, APIs, and developer tooling. The conversation also shows where conventional wisdom about AI tool exposure breaks down as teams move toward more complex, production-ready AI applications.
The Illusion of Infinite Tools: Why MCP Falls Short
The promise of AI agents is seductive: imagine an AI that can perform any task a human operator can, navigating complex software interfaces with ease. This vision, often framed through the lens of MCP, suggests exposing a vast array of "tools"--essentially, API endpoints--for AI models to call. Alex Rattray, CEO of Stainless, argues that this approach is fundamentally flawed, not because the vision is wrong, but because the current implementation is inadequate. The sheer volume of tools and their associated documentation quickly overwhelms the context windows of even the most advanced LLMs, rendering them confused and inefficient.
"Everyone listening to this should already know you've just burned through your entire context budget. That's maybe hundreds of thousands of tokens just there, just in pretty much translating the Stripe OpenAPI spec directly over to MCP tools."
The quote points to a critical consequence: the "obvious" solution of mapping every API endpoint to an MCP tool creates an unmanageable information overload. Real-world software, with its hundreds of endpoints and intricate workflows, cannot simply be translated into a list of discrete AI tools. Attempting it leads to a cascade of issues: models struggle to find the right tool, to call it correctly, and to process the often-voluminous responses. The immediate benefit of having a tool available is quickly overshadowed by the downstream cost of its impracticality.
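To make the scaling argument concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the tool definitions are synthetic (they are not taken from Stripe's or any real API), the endpoint count of 500 is arbitrary, and the four-characters-per-token ratio is only a common rule of thumb, not a real tokenizer.

```python
import json


def make_tool(i: int) -> dict:
    """Build a synthetic, MCP-style tool definition for endpoint number i."""
    return {
        "name": f"endpoint_{i}",
        "description": "Hypothetical description of what this endpoint does. " * 5,
        "inputSchema": {
            "type": "object",
            "properties": {
                f"param_{j}": {
                    "type": "string",
                    "description": "A hypothetical parameter.",
                }
                for j in range(10)
            },
        },
    }


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Crude heuristic (an assumption): roughly four characters per token."""
    return int(len(text) / chars_per_token)


def tool_list_tokens(n_endpoints: int) -> int:
    """Estimate the context cost of listing n_endpoints tool definitions."""
    tools = [make_tool(i) for i in range(n_endpoints)]
    return estimate_tokens(json.dumps(tools))


two_tools = tool_list_tokens(2)     # a small, fixed tool surface
many_tools = tool_list_tokens(500)  # one tool per endpoint
print(f"2 tools:   ~{two_tools} tokens")
print(f"500 tools: ~{many_tools} tokens")
```

The cost grows roughly linearly with the number of endpoints, and it is paid before the model has done any work. The absolute figures depend entirely on the made-up schema size above, so only the trend should be read from this sketch.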
The Hidden Cost of "Easy" AI Interaction
The current MCP paradigm often requires significant manual effort to make tools work reliably for LLMs. This involves "handcrafting" tools, tailoring them to the specific way an LLM "thinks" and processes information. This is analogous to the decades of effort it took to develop robust Software Development Kits (SDKs) for human developers. Rattray points out that we haven't yet cracked the code for creating equally ergonomic interfaces for LLMs.
"We haven't figured out how to expose an API ergonomically to an LLM in the same way that we've figured out how to expose it ergonomically to a Python developer. That's kind of like a new research problem in a sense."
This presents a significant barrier to adoption. The promise of AI doing complex tasks is undermined by the reality of extensive engineering work to make even simple tasks reliable. The consequence is that many AI applications remain in a "playing around" mode, prone to disconnections and "paper cuts." The conventional wisdom that providing more tools equals better AI capability fails to account for the deep engineering required to make those tools usable and reliable within the constraints of LLM architecture. This delayed payoff, where significant upfront engineering yields future AI capabilities, is where competitive advantage can be built, but it requires patience and a focus beyond immediate productivity gains.
The Cyborg Future: Code Execution as the Next Frontier
Rattray's vision for the future of AI is one of "cyborgs": a fusion of LLMs and traditional code execution. Instead of an overwhelming array of discrete tools, the future lies in models that can write and execute code. This approach dramatically reduces the context window burden, because the LLM needs to interact only with a code interpreter and a limited set of documentation or SDKs. The complex, multi-step operations themselves run as ordinary code, which can be optimized and executed efficiently.
This has profound implications. Firstly, it shifts the burden from defining countless tools to ensuring a robust and secure code execution environment. Secondly, it allows for a more natural progression from one-off AI actions to permanent production software. A task that an AI performs once via code execution can be iterated upon, refined, and eventually committed as production code. This is where the delayed payoff truly emerges. Building this code execution infrastructure, ensuring its security, and enabling LLMs to interact with it effectively creates a durable advantage.
"The way I actually see that happening and what we're building towards is code execution. So rather than the model having a bajillion tools, the model has two tools. One to execute code... and then the other to kind of search the docs and ask questions to the docs."
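The two-tool shape described in the quote could be sketched roughly as follows. This is an illustration under stated assumptions, not Stainless's implementation: the names `execute_code`, `search_docs`, `DOCS`, and `TOOLS` are hypothetical, the doc snippets are invented, and running code in a child process with a timeout is not a security sandbox. A production system would need real isolation.

```python
import subprocess
import sys
from typing import Dict, List

# Hypothetical documentation snippets; a real system would index actual SDK docs.
DOCS: Dict[str, str] = {
    "customers.create": "client.customers.create(email=...) creates a customer.",
    "customers.list": "client.customers.list(limit=...) lists customers.",
    "invoices.create": "client.invoices.create(customer=...) creates an invoice.",
}


def search_docs(query: str) -> List[str]:
    """Tool 2: return snippets whose key or text contains every query word."""
    words = query.lower().split()
    return [
        text
        for key, text in DOCS.items()
        if all(word in (key + " " + text).lower() for word in words)
    ]


def execute_code(source: str, timeout_s: float = 5.0) -> dict:
    """Tool 1: run model-written Python in a child process and capture output.

    NOTE: a separate process plus a timeout only illustrates the tool's
    interface. It does not isolate the code from the host machine.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", source],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return {"ok": False, "stdout": "", "stderr": "timed out"}
    return {"ok": proc.returncode == 0, "stdout": proc.stdout, "stderr": proc.stderr}


# The entire tool surface the model sees, regardless of how large the API is.
TOOLS = {"execute_code": execute_code, "search_docs": search_docs}

# A multi-step computation runs in one call instead of many tool round trips.
result = execute_code("total = sum(n * n for n in range(4))\nprint(total)")
print(result["stdout"].strip())
```

The design point is that the tool list stays the same size as the API grows. New endpoints add documentation to search, not new entries in the model's context.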
The security implications are also critical. Rattray emphasizes that security must reside at the API layer, using mechanisms like OAuth with granular permissions, rather than depending solely on limiting which MCP tools are exposed. A well-designed code execution environment can enforce these API-level security controls, preventing the AI from venturing outside its permitted actions. This focus on secure, robust code execution, rather than a simple tool-listing approach, is where the real innovation and competitive advantage will lie.
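As a minimal sketch of what enforcing permissions at the API layer can look like, the example below checks a token's scopes on every request, no matter what code the model wrote to make the call. The scope names, routes, and the `Token`, `authorize`, and `handle_request` helpers are all hypothetical, and the token is assumed to have already been validated by a real OAuth flow, which is omitted here.

```python
from dataclasses import dataclass

# Hypothetical mapping from API operation to the scope it requires.
REQUIRED_SCOPE = {
    ("GET", "/customers"): "customers:read",
    ("POST", "/customers"): "customers:write",
    ("POST", "/refunds"): "refunds:write",
}


@dataclass(frozen=True)
class Token:
    """An already-validated access token (validation itself is out of scope)."""
    subject: str
    scopes: frozenset


class Forbidden(Exception):
    """Raised when a token lacks the scope an operation requires."""


def authorize(token: Token, method: str, path: str) -> None:
    """Deny by default: unknown operations and missing scopes are both refused."""
    required = REQUIRED_SCOPE.get((method, path))
    if required is None or required not in token.scopes:
        raise Forbidden(f"{token.subject} may not {method} {path}")


def handle_request(token: Token, method: str, path: str) -> dict:
    """Stand-in for an API server entry point; the check runs on every request."""
    authorize(token, method, path)
    return {"status": 200, "operation": f"{method} {path}"}


# An agent granted read-only access to customers.
agent_token = Token(subject="agent", scopes=frozenset({"customers:read"}))

print(handle_request(agent_token, "GET", "/customers"))
try:
    handle_request(agent_token, "POST", "/refunds")
except Forbidden as exc:
    print("denied:", exc)
```

Because the check lives in the API rather than in the tool list, arbitrary generated code cannot widen what the agent is allowed to do. It can only attempt calls the token already permits.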
Key Action Items
- Prioritize Code Execution Environments: Focus development efforts on building secure, flexible code execution sandboxes for LLMs, rather than solely on creating discrete MCP tools. (Immediate Action)
- Develop Robust API Security: Implement granular permissions and OAuth at the API layer to secure AI interactions, rather than relying on limiting tool exposure. (Immediate Action)
- Invest in LLM-to-SDK Integration: Research and develop better methods for LLMs to interact with and utilize Software Development Kits (SDKs), especially for complex libraries. (This pays off in 12-18 months)
- Refine Documentation for AI Consumption: Structure API documentation to be easily parsed and understood by LLMs, facilitating effective code generation and tool use. (Ongoing Investment)
- Establish Comprehensive Evaluation Frameworks: Build systems for evaluating AI performance in production, tracking tool usage, code execution success, and user feedback to identify regressions and improvements. (Immediate Action)
- Embrace Iterative Automation: View AI-driven tasks as opportunities for automation. Use code execution logs to identify recurring patterns that can be productized into more permanent software solutions. (This pays off in 6-12 months)
- Foster Developer-Centric AI Tooling: Create environments that cater to developers' needs for flexible, powerful AI interaction, even if it involves taking calculated risks with security controls for early adopters. (This pays off in 12-18 months)
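The evaluation and iterative-automation items above can be prototyped with very little machinery. The sketch below is one possible starting point, not a prescribed framework: the log format, the sample entries, and the `call_shape` and `summarize` helpers are all hypothetical. It computes a success rate over logged code executions and groups snippets by the SDK calls they make, so recurring patterns stand out as candidates for permanent software.

```python
import re
from collections import Counter

# Hypothetical execution log: the code the model ran and whether it succeeded.
LOGS = [
    {"code": "client.customers.list(limit=10)", "ok": True},
    {"code": "client.customers.list(limit=50)", "ok": True},
    {"code": "client.invoices.create(customer='cus_1')", "ok": False},
    {"code": "client.customers.list(limit=5)", "ok": True},
]


def call_shape(code: str) -> str:
    """Reduce a snippet to the dotted names it calls, ignoring the arguments."""
    return ";".join(re.findall(r"[A-Za-z_][\w.]*(?=\()", code))


def summarize(logs, min_count: int = 2):
    """Return (success_rate, call shapes seen at least min_count times)."""
    if not logs:
        return 0.0, []
    shapes = Counter(call_shape(entry["code"]) for entry in logs)
    success_rate = sum(1 for entry in logs if entry["ok"]) / len(logs)
    recurring = [shape for shape, n in shapes.most_common() if n >= min_count]
    return success_rate, recurring


success_rate, recurring = summarize(LOGS)
print(f"success rate: {success_rate:.0%}")
print("recurring call shapes:", recurring)
```

Grouping by call shape is a deliberately crude heuristic. It ignores argument values and control flow, so it can only suggest which tasks deserve a closer look.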