How Trusted Intermediaries Enable Scalable Open Source

Original Title: SE Radio 723: Dave Airlie on Linux Kernel Maintenance

The Linux kernel’s maintenance model is a masterclass in scaling trust, not control. While most organizations obsess over process and tooling, the kernel’s real innovation is its social architecture: a hierarchy of maintainers who delegate authority, enforce standards publicly, and absorb the fallout from Linus Torvalds--so their contributors don’t have to. This system, refined over two decades, reveals a non-obvious consequence: the most resilient open source projects aren't governed by rules, but by trusted intermediaries who manage both code and conflict. The hidden cost of fast patching? Delayed regressions that only surface in diverse hardware. The payoff? A self-correcting system where scale isn’t managed by centralization, but by distributed accountability. Engineers, architects, and technical leads should read this--not for Linux-specific tactics, but to understand how to build systems that survive beyond their founders. This isn't about git or patches. It's about how humans coordinate at scale when the stakes are real and the code runs everything.

The Hidden Cost of Vendor Code Sharing: Why "Collaboration" Often Fragments the Kernel

When companies are pushed to contribute Linux drivers, the expectation is better, upstreamable code. The reality, as Dave Airlie observes, is often the opposite. Companies optimizing for internal efficiency--not kernel health--create drivers designed to share code with their Windows counterparts. This leads to hardware abstraction layers, impedance mismatches, and second-class Linux citizens.

"The thing about what people often don't understand is the incentives for companies to do things cause results that are probably not what you expected... When you pass that into a company that is writing Windows drivers their first instinct is how can we share the code between our Windows driver and our Linux driver to cut the cost of doing this."

This creates a hidden consequence: the kernel becomes harder to maintain, not easier, because vendors and the kernel community extract commonality along different axes. The Linux kernel thrives because all drivers live in the same tree, allowing maintainers to factor out shared patterns, such as the 802.11 layer in wireless drivers. But when vendors prioritize cross-platform code sharing, they abstract Linux specifics away, making it harder for the kernel community to extract its own commonality. The result? More fragmentation, not less.

Over time, this creates a two-tier system: clean, community-driven drivers versus bloated, vendor-maintained ones that resist upstreaming. The immediate benefit--faster initial support--is outweighed by the downstream cost: longer review cycles, higher maintenance burden, and a harder time for new contributors to navigate inconsistent patterns. The competitive advantage goes to subsystems, like DRM, that actively resist this by enforcing strict, community-defined standards--making it harder for vendors to "just merge it" and move on.

The 18-Month Payoff Nobody Wants to Wait For: Why Hierarchy Scales When Ego Doesn't

Most open source projects stall at 20 contributors. The DRM subsystem handles 300. The difference? A deliberate, corporate-style hierarchy--and a maintainer, Airlie, who consciously stepped back.

The immediate reaction to scale is often more control: bigger PRs, longer reviews, bottlenecks at the top. Airlie’s insight was the opposite. He built a delegation pyramid--co-maintainers, sub-maintainers, committers--and then got out of the way. His job shifted from code reviewer to facilitator and shield.

This requires immediate discomfort: letting go of control, trusting others to make decisions, absorbing Linus’s wrath so his team doesn’t have to. But the payoff is massive. By not letting his "ego or niche workflows" dominate, Airlie enabled a system that could scale. The hierarchy isn't a failure of decentralization; it's what makes the decentralization work at scale.

"A lot of my job has been to just sort of step away and not let my, I suppose, ego or my own niche workflows drive everything else... I need to accept that, okay, this is something that makes the community better and maybe it makes my life a bit harder."

This connects to the larger kernel process. Linus doesn’t review every patch. He trusts maintainers like Airlie. Airlie trusts his sub-maintainers. The system only works because each layer accepts the responsibility to review, reject, and protect. The delayed payoff? A stable, predictable release cycle. The system routes around individual heroics and enforces process through delegation. Most projects fail here because they can’t scale trust. The kernel succeeds because it treats trust as the core API.

How the System Routes Around Your Solution: Why CI and AI Are Still Afterthoughts

The kernel’s development model resists modern tooling not because it’s stubborn, but because its scale breaks that tooling. Centralized CI? It doesn’t work. "There’s no like central point to pick these patches up and push them through a centralized CI system," Airlie notes. The problem isn’t the idea; it’s the reality of 300 contributors, thousands of patches, and hardware diversity. A central CI would be a bottleneck, a single point of failure, and a resource sink.

Instead, the system routes around it. Intel runs its own CI on patches from the mailing list. Qualcomm has internal CI. The graphics subsystem is building its own pipeline. It’s messy, fragmented, and inefficient--until you realize it’s the only thing that scales. The immediate solution (a unified CI) fails because it doesn’t account for the system’s distributed nature. The lasting advantage comes from bespoke, subsystem-level tooling that evolves independently.

AI review is following the same path. Rather than a single, mandated system, multiple experiments are happening: Google, Meta, and Airlie himself with his Claude-based reviewer. The model? "I built side public inboxing infrastructure... it's there as an option." No decree. No mandate. Just space for experimentation.

This is where conventional wisdom fails. Most organizations would standardize on one tool. The kernel lets the system decide. The delayed payoff? A solution that actually fits the workflow, not one that forces the workflow to fit the tool. The ones who win aren’t those who adopt AI first, but those who integrate it without breaking the existing social contract.

Where Immediate Pain Creates Lasting Moats: The Unpopular Discipline of Patch Design

In a world of merge requests and giant PRs, the kernel’s obsession with small, self-contained patches seems archaic. But it’s this very discipline that creates a durable advantage.

A patch isn’t just a change; it’s the unit of review, trust, and backporting. A well-written patch series tells a story: infrastructure first, then driver core, then features. It avoids "and" and "but" in the commit message because either word is a sign of a patch trying to do too much.
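The "and"/"but" rule of thumb can be checked mechanically before a series is sent out. The following is a minimal sketch, not a tool described in the episode: the function name, the regular expression, and the "drm/example" subject lines are all hypothetical, and the heuristic will produce false positives on subjects where "and" is legitimate.

```python
import re

# Hypothetical heuristic: a standalone "and" or "but" in a subject line
# hints that the patch bundles more than one logical change.
CONJUNCTIONS = re.compile(r"\b(and|but)\b", re.IGNORECASE)


def flag_overloaded_subjects(subjects):
    """Return the subject lines that contain a standalone 'and' or 'but'."""
    return [s for s in subjects if CONJUNCTIONS.search(s)]


# Made-up subject lines for illustration only.
series = [
    "drm/example: add helper for buffer mapping",
    "drm/example: fix probe ordering and rework suspend path",
    "drm/example: enable feature X",
]

flagged = flag_overloaded_subjects(series)
for subject in flagged:
    print("consider splitting:", subject)
```

The word-boundary match means words such as "handle" or "bandwidth" are not flagged. A check like this is best treated as a prompt to reread the patch, not as a hard gate.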

This is painful. It requires distilling messy development into linear, logical steps. It means doing extra work--breaking up changes, writing clear summaries--before anything is submitted. Most developers would rather "just get it in."

But over time, this creates a moat. Patches that are small and self-contained are easier to review, easier to revert, and easier to backport. A security fix in a new kernel can be adapted to an old one if the original change was clean. A regression can be pinpointed and removed without collateral damage.

The immediate benefit of a sloppy, monolithic PR? Faster submission. The downstream effect? A backport nightmare, a revert that breaks unrelated functionality, and a reviewer who gives up. The kernel’s insistence on patch quality isn’t about pedantry--it’s about ensuring the system can evolve without collapsing under its own weight. The ones who gain are those willing to do the unglamorous work of writing good patches, not just fast ones.

Key Action Items

  • Design your patch series as a narrative, not a dump. Break changes into logical, self-contained steps. If your commit message has "and" or "but," split it. This pays off in 12-18 months when backports and audits become trivial.

  • Delegate authority before you need to. Build a hierarchy of trust in your project early, even if you’re still the primary reviewer. The 18-month payoff? A team that can scale without you becoming a bottleneck.

  • Integrate tooling at the edge, not the center. Don’t wait for a perfect CI system. Start with what your team can build and run locally. The delayed advantage? A solution that actually fits your workflow.

  • Treat vendor contributions with skepticism. Assume they’re optimized for the vendor’s internal needs, not your project’s health. Over the next quarter, establish review criteria that force upstreamable design.

  • Use AI review as a side channel, not a gate. Run experimental reviewers (like Airlie’s Claude system) in parallel. The signal-to-noise ratio improves over 6-12 months as models and prompts mature.

  • Measure release health by trend, not size. Track patch volume and change size over cycles. A sudden spike in large, intricate changes after the merge window closes is a red flag. Start tracking this next cycle.

  • Encourage new contributors to scratch personal itches. Don’t assign tasks. Point them to real problems on their own hardware. The long-term payoff? Sustained engagement and better code.
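
The "measure release health by trend" item above can be sketched as a simple spike check. This is an illustrative sketch under stated assumptions, not a method from the episode: the function name, the threshold `factor`, the `window` size, and the numbers are all made up, and the per-cycle totals are assumed to have been collected separately (for example from `git log --shortstat`).

```python
from statistics import median


def flag_spikes(lines_changed_per_cycle, factor=2.0, window=3):
    """Return indices of cycles whose size exceeds `factor` times the
    median of the preceding `window` cycles."""
    flagged = []
    for i in range(window, len(lines_changed_per_cycle)):
        baseline = median(lines_changed_per_cycle[i - window:i])
        if baseline > 0 and lines_changed_per_cycle[i] > factor * baseline:
            flagged.append(i)
    return flagged


# Made-up numbers: lines changed after the merge window closed, per cycle.
late_changes = [1200, 1100, 1300, 1250, 4000, 1150]
print(flag_spikes(late_changes))  # prints [4]
```

Comparing against a rolling median instead of a fixed limit keeps the check focused on trend: a project that is always large is not flagged, while a cycle that departs sharply from its own recent history is.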

---
This content is a personally curated review and synopsis derived from the original podcast episode.