The future of software development is here, and it's less about writing code and more about orchestrating intelligent agents and leveraging raw data access. This episode of Python Bytes dives into a provocative new pattern: "Raw + DC," advocating for a shift away from traditional ORMs towards direct database queries coupled with Python dataclasses. The non-obvious implication? This approach not only aligns better with how AI coding assistants are trained but also unlocks significant performance gains and future-proofs applications against the churn of library maintenance. Developers and architects who grasp this shift will gain a competitive edge by building more efficient, AI-friendly systems. It’s a call to rethink established abstractions and embrace a more direct, performant, and agent-aligned data access strategy.
The Unseen Cost of Abstraction: Why ORMs Might Be Obsolete for AI
The conversation kicks off with a bold proposition: the Object-Relational Mapper (ORM) pattern, long a staple for developers seeking type safety and abstraction, is becoming misaligned with the evolving landscape of AI-assisted coding. Michael Kennedy, the article's author, argues that while ORMs offer undeniable benefits like preventing injection attacks and providing excellent IDE auto-completion, they are not the "native language" of AI agents.
The core of the argument lies in training data. AI models, particularly those assisting with code generation, are exposed to raw database queries far more frequently than specific ORM implementations. Kennedy points to staggering download statistics: the underlying library for raw MongoDB queries is downloaded 74 million times a month, dwarfing the 1.4 million downloads for Beanie, a popular ODM. This disparity suggests that AI agents will inherently be more proficient and generate better results when working with raw queries.
Furthermore, the lifespan and maintenance of ORMs present a significant risk. Kennedy highlights Beanie's infrequent releases and substantial open issues as an example of how popular libraries can become unmaintained, forcing developers into costly rewrites. In contrast, raw SQL and native query syntaxes for databases like MongoDB have remained remarkably stable for decades, offering a level of future-proofing that ORMs often fail to provide.
"The underlying library that lets you talk in the native query syntax is downloaded 74 million times per month. That's 53 times more. And there are other libraries for talking directly to Mongo as well. So there are like 50-75 times more bits of code out there that are actually just talking directly to the database than using this particular ORM."
-- Michael Kennedy
This doesn't mean abandoning all abstraction. Kennedy proposes a "Raw + DC" pattern: a data access layer that uses raw queries internally, allowing AI to excel, but exposes well-defined dataclasses to the rest of the application. This hybrid approach offers the best of both worlds: AI-friendly raw queries and type-safe, lightweight interfaces for developers. Brian Okken concurs, emphasizing that fewer dependencies are always a win. The implication here is that teams clinging to traditional ORMs might find their AI coding partners less effective and their applications more susceptible to the obsolescence of specific libraries.
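The episode frames the pattern around MongoDB, but the idea is database-agnostic: raw, parameterized queries live inside the data access layer, and only dataclasses cross its boundary. A minimal sketch using the standard library's sqlite3 (the table schema and function names here are illustrative, not from the episode):

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional

# The dataclass is the public contract the rest of the app sees.
@dataclass(frozen=True)
class User:
    id: int
    name: str
    email: str

def get_user_by_email(conn: sqlite3.Connection, email: str) -> Optional[User]:
    # Inside the layer: raw SQL only. Parameterization (the "?" placeholder),
    # not string formatting, preserves the injection safety ORMs provide.
    row = conn.execute(
        "SELECT id, name, email FROM users WHERE email = ?", (email,)
    ).fetchone()
    return User(*row) if row else None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Ada', 'ada@example.com')")

print(get_user_by_email(conn, "ada@example.com"))
# → User(id=1, name='Ada', email='ada@example.com')
```

Callers get type-safe, auto-completing objects, while the query text stays in the syntax AI assistants have seen most often in their training data.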
The Performance Dividend: Beyond Developer Convenience
The discussion extends beyond AI alignment to tangible performance benefits. Kennedy presents a graph illustrating the overhead of various data access methods. While raw dataclass access is nearly native speed, ORMs like Beanie (which relies on Pydantic) can be up to five times slower, and older libraries like MongoEngine can be fifteen times slower.
This isn't just a marginal difference; it's a significant performance penalty for the convenience of ORM abstractions. The "Raw + DC" pattern, by minimizing serialization and validation overhead at the data access layer, offers a substantial performance advantage. This is where delayed payoffs create competitive advantage: a system that can handle significantly more reads and writes with less computational cost will outperform competitors, especially under load. Conventional wisdom, which optimizes for developer speed via ORMs, breaks down once you account for long-term operational costs and performance ceilings.
"This style is actually almost native speed. There's just a few times where like the serialization maybe doubles the time to read say a thousand or 100 orders or something like that, but barely."
-- Michael Kennedy
The conversation touches on Pydantic as an alternative for runtime validation, but Kennedy suggests it might be overkill for the data access layer, preferring the lightweight nature of dataclasses. This highlights a key systems-thinking insight: not all layers of an application require the same level of abstraction or validation. Applying heavy-duty validation everywhere can become a performance bottleneck, a hidden cost that compounds over time.
Navigating the New Tooling Landscape: Trust and Agents
The latter half of the episode delves into the practical implications of these shifts, particularly concerning new tools and the increasing role of AI agents. Brian Okken discusses updates to pytest-check, emphasizing the ongoing maintenance of even niche tools and the importance of documentation. This serves as a counterpoint to the ORM maintenance issue -- active projects, even those with fewer users, can remain valuable.
A more significant discussion emerges around SQLiteo, a native macOS SQLite browser. Okken raises critical questions about trust in the age of AI-generated code. He notes that while he trusts the developer, Adam Hill, the use of AI agents in development introduces new considerations. The fact that agents are used is not hidden, which Okken appreciates, but the idea of agents committing code raises concerns about accountability and the contributor history.
"The trust part is something I’m thinking about a lot in these days of dev+agent built tools."
-- Brian Okken
This points to a crucial downstream effect: the development process itself is changing. Teams will need to develop new heuristics for evaluating the trustworthiness of software built with AI assistance. This involves looking beyond code quality to the developer's practices, the transparency of their tooling, and the presence of robust testing and review processes. The conventional wisdom of "trust the developer" now requires a more nuanced understanding of how that developer is building the software.
The episode also introduces Dataclass Wizard, a tool that brings Pydantic-like functionality to Python's built-in dataclasses. This offers a way to achieve flexible data parsing and serialization without the heavy dependencies of full ORMs or validation libraries, aligning with the "Raw + DC" philosophy. The wizard's ability to handle various formats like JSON, YAML, and environment variables, and its flexibility in opting in to validation, demonstrates a systems-level approach to tooling -- providing power where needed without imposing it everywhere.
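To make the "opt-in, lightweight" idea concrete, here is a stdlib-only approximation of what a parsing helper like Dataclass Wizard automates: turning loosely structured input (JSON, env vars) into a typed dataclass while ignoring unexpected keys. This is a hypothetical helper for illustration, not Dataclass Wizard's actual API:

```python
import json
from dataclasses import dataclass, fields

@dataclass
class Settings:
    host: str
    port: int
    debug: bool = False

def from_dict(cls, data: dict):
    # Keep only keys matching declared fields; silently drop extras,
    # as flexible parsing libraries typically do by default.
    names = {f.name for f in fields(cls)}
    return cls(**{k: v for k, v in data.items() if k in names})

raw = json.loads('{"host": "localhost", "port": 8080, "extra": "ignored"}')
settings = from_dict(Settings, raw)
print(settings)  # → Settings(host='localhost', port=8080, debug=False)
```

The appeal of the real library is that it layers niceties like nested dataclasses, key-case conversion, and optional validation on top of this idea without pulling in a heavy dependency tree.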
The Enduring Value of Human Insight in a Coded World
The "Extras" section offers further reflection on the role of AI in software development. An article by Carson Gross, "Yes, And," argues that a computer science degree remains valuable, not despite AI, but because of it. The core idea is that programming is fundamentally about problem-solving and controlling complexity. AI agents can handle the "Lego blocks" of known solutions, but humans are essential for defining novel problems, making trade-offs, and guiding the AI.
This echoes the Jevons paradox: increased efficiency (through AI) doesn't necessarily reduce demand for human effort; instead it drives more overall activity and shifts the nature of that effort. The ability to direct agents, understand system capabilities, and make nuanced decisions based on context--like company culture or industry norms--is a skill that AI currently lacks.
"The top level thing is programming is fundamentally problem solving using computers and learning to control complexity while solving these problems."
-- Carson Gross (as discussed on the podcast)
The discussion about COBOL and the IBM stock crash further illustrates this point. While AI might threaten legacy systems, it also creates new opportunities for developers who can bridge the gap between old and new, or who can leverage AI to understand and maintain complex, older codebases. The ultimate takeaway is that while AI tools will change how software is built, the need for human insight, strategic decision-making, and deep understanding of problem domains will only become more critical. The advantage lies not in resisting AI, but in learning to work with it effectively, focusing on the areas where human intelligence excels -- defining problems, making trade-offs, and orchestrating complex systems.
Key Action Items
- Adopt the "Raw + DC" Pattern: For new projects or significant refactors, prioritize direct database queries within a dedicated data access layer, returning Python dataclasses.
- Immediate Action: Begin evaluating existing ORM usage for performance bottlenecks and AI alignment.
- This pays off in 6-12 months through improved performance and easier integration with AI coding tools.
- Investigate Dataclass Enhancement Libraries: Explore tools like Dataclass Wizard to add flexible parsing and serialization capabilities to dataclasses without heavy dependencies.
- Immediate Action: Experiment with Dataclass Wizard on a small, non-critical feature.
- This pays off in 3-6 months by reducing boilerplate code for data handling.
- Develop AI Collaboration Skills: Actively practice directing AI coding assistants, focusing on providing clear prompts for raw queries and understanding their outputs.
- Immediate Action: Dedicate 30 minutes daily to using an AI coding assistant for specific, well-defined tasks.
- This pays off in 3-6 months through increased personal productivity and better AI-generated code.
- Re-evaluate Dependency Management: Conduct an audit of ORMs and other heavy abstraction libraries, considering their maintenance status and performance implications.
- Immediate Action: Identify critical ORM dependencies and research their maintenance activity.
- This pays off in 12-18 months by mitigating risks of unmaintained libraries and reducing technical debt.
- Prioritize Developer Trust and Transparency: When evaluating new tools, especially those leveraging AI, scrutinize the developer's practices, transparency, and commitment to testing.
- Immediate Action: Add "developer trust and transparency" as a criterion in your tool evaluation checklist.
- This pays off long-term by avoiding integration issues with unreliable or poorly maintained tools.
- Focus on Problem Definition and Trade-offs: Recognize that AI excels at generating code, but humans are crucial for defining the right problems and making strategic trade-offs.
- Immediate Action: Spend more time upfront defining the problem and exploring potential trade-offs before coding begins.
- This pays off in 6-12 months by ensuring that the software being built truly addresses the core business needs and avoids costly downstream consequences.
- Embrace Continuous Learning in AI Tooling: Stay informed about the rapidly evolving landscape of AI coding assistants and related developer tools.
- This pays off in 12-18 months by allowing you to adapt to new workflows and maintain a competitive edge.