Culture, Not Tools, Sustains Organizational Experimentation
Most organizations don’t fail at experimentation because they lack tools or training--they fail because they confuse activity with culture. The real consequence of this confusion? Teams learn to perform experimentation rather than practice it, creating a theater of innovation that collapses the moment leadership attention wanes. This conversation reveals that sustainable experimentation isn’t about running more A/B tests; it’s about designing feedback loops so deeply embedded in decision-making that reverting feels unnatural. Leaders, product managers, and engineers who want to build organizations that adapt--not just react--need to understand the non-obvious systems at play: how resource allocation shapes risk-taking, how leadership vulnerability enables team autonomy, and how a “failed” product metric can signal a breakthrough. The advantage here is real: organizations that get this right don’t just ship better products--they become harder to compete with over time because their learning compounds while others stagnate.
Why the Obvious Fix--More Training--Makes Things Worse
When companies realize they’re not innovating, the instinct is to fix the symptom: train people in experimentation. But as David Bland points out, this often backfires. “They took this check-the-box mentality to it: so I ran experiments, check. What else do I need to do to just launch the thing I already want to launch?” That quote cuts to the core of the problem. When experimentation becomes a gate to pass rather than a method to learn, it doesn’t de-risk decisions--it delays them. Teams go through the motions, run a few tests, and then push forward with the original plan, now armed with a veneer of validation.
This creates a hidden cost: the illusion of rigor without the substance. The immediate benefit? Leadership feels confident they’re being data-driven. The downstream effect? Teams stop believing in the process because they see it being gamed. Over time, this erodes trust in data itself. People start seeing experiments as political tools, not learning tools. And once that happens, no amount of additional training will fix it--because the system has already routed around the solution.
Bland’s insight--that real change comes from practicing new tools on real opportunities--reveals a different path. Instead of abstract workshops, he forces immediacy: “I’ll introduce a concept, I’ll introduce a fine case study that’s really short, and then I’m like, okay, now we’re using it on your real stuff.” This does something subtle but powerful: it ties learning to accountability. You can’t treat a method as optional when it’s being applied to the very project you’re on the hook for.
And here’s the kicker: even when this works, it’s fragile. Bland shares a story that should unnerve every leader: a company that was a poster child for innovation stopped the moment leadership stopped talking about it. “It was in our bloodstream but it wasn’t in our DNA.” That metaphor is more than poetic--it’s diagnostic. Bloodstream is circulation. It moves things around, but it doesn’t replicate. DNA is replication. It ensures continuity even when the host isn’t actively maintaining it.
So what happens when the system responds to inconsistent reinforcement? It reverts. Not because people are resistant to change, but because culture follows incentives. If leaders only reward shipping, not learning, then learning becomes a tax--not an investment.
"They stopped talking about it and everybody in the company stopped doing it, and they were really surprised. The quote they said to me, which I'll never forget, was: it was in our bloodstream but it wasn't in our DNA."
-- David Bland
This is where conventional wisdom fails. Most advice stops at “leaders must champion change.” But Bland shows the deeper truth: championing isn’t a one-time speech. It’s a repetitive, almost tedious, act of reinforcement. The advantage isn’t in launching the program--it’s in maintaining it long after the novelty wears off. That’s where others won’t go. That’s where the moat forms.
The Leadership Behavior That Changes Everything (And It’s Not What You Think)
If training doesn’t stick, what does? Monica Lewis offers a counterintuitive answer: leaders must model failure publicly. Not just admit it--own it, celebrate it, shout it from the rooftops. “I’ve been in front of the company all-hands and told that story, that like, hey, I was wrong, and it was a great thing that the team drove out.”
This is not vulnerability as inspiration. This is vulnerability as system design. When a leader admits they were wrong, they’re not just being humble--they’re altering the incentive structure. They’re signaling that being right is less valuable than learning. And that single shift changes how teams behave downstream.
Think about the typical product team. They’re under pressure to deliver. Roadmaps are locked. Deadlines are set. In that environment, testing something risky is dangerous--because if it fails, it’s on them. But if the leader has already shown that being wrong is safe--even celebrated--then the risk profile changes. The cost of experimentation drops.
Lewis also changes the timing of input. Instead of sharing polished plans, she shares “the skeleton doc” early. This isn’t just about collaboration--it’s about shifting the feedback loop upstream. Most teams get feedback after they’ve committed. Lewis builds it in before. That means discovery happens earlier, reducing the cost of being wrong.
And she pairs this cultural work with a structural one: the 70/20/10 portfolio framework. Sure bets (70%), strategic bets (20%), venture bets (10%). On the surface, this looks like a budgeting tool. But it’s really a risk thermostat. It makes uncertainty explicit. It says: we expect 30% of our portfolio to be high-risk. We’re resourcing it that way. No surprises.
But here’s the non-obvious part: the numbers aren’t fixed. “Maybe it's a more disruptive time and I feel like there's more risk for a business and I want to swing a lot more for the venture bets.” That flexibility is key. The system adapts to context. It doesn’t force-fit a rigid model onto a changing reality.
This creates a delayed payoff. In the short term, protecting 10 to 30% of your team’s time for high-risk bets feels inefficient. You could be “shipping more.” But over 12 to 18 months, that minority share becomes a source of optionality. While other teams are optimizing last year’s roadmap, yours is exploring next year’s.
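The arithmetic behind the portfolio split can be made concrete with a minimal sketch. Everything in it is an assumption for illustration: the `allocate` helper, the 40 engineer-weeks of capacity, and the "disruptive" mix are hypothetical, since the conversation gives only the 70/20/10 baseline and says the mix can shift.

```python
def allocate(capacity, mix):
    """Split a capacity figure (e.g. engineer-weeks) across bet types.

    `mix` maps each bet type to a whole-number percentage.
    """
    if sum(mix.values()) != 100:
        raise ValueError("portfolio percentages must sum to 100")
    return {bet: capacity * pct / 100 for bet, pct in mix.items()}


# Baseline split named in the text: sure / strategic / venture bets.
DEFAULT_MIX = {"sure": 70, "strategic": 20, "venture": 10}

# Hypothetical mix for a more disruptive period; the source gives no figures.
DISRUPTIVE_MIX = {"sure": 50, "strategic": 25, "venture": 25}

# 40 engineer-weeks is an illustrative number, not one from the source.
baseline = allocate(40, DEFAULT_MIX)
shifted = allocate(40, DISRUPTIVE_MIX)

print(baseline)
print(shifted)
```

With these illustrative numbers, the baseline reserves 12 of 40 engineer-weeks for the two higher-risk buckets, and the shifted mix reserves 20. Writing the mix down as data is one way to make the level of expected risk explicit and easy to revisit as conditions change.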
How a “Failing” Product Metric Created a Category
Then there’s GitHub Copilot. A product that, by conventional metrics, should have been killed. “When we first released it, acceptance rates were like in the 20, 30 [percent]. Just think about this--70 [percent] of the time that we suggested something to you, you did not accept it.” By any standard A/B test logic, that’s a failure. Most organizations would have sunsetted it.
But Mario Rodriguez and his team didn’t. Why? Because they were outcome-driven, not output-driven. They weren’t measuring feature adoption--they were measuring developer love. And developers loved it. Even when it failed, it felt magical.
"You would think that makes a horrible product, but no, because of all of the value when you did, people just absolutely loved it, and that was a learning even for me."
-- Mario Rodriguez
That disconnect between metric and experience is critical. It reveals a deeper truth: innovation often breaks existing evaluation frameworks. If you’re using old metrics to judge new categories, you’ll kill the future.
Copilot didn’t win because it was right more often. It won because of how it was wrong--and how users adapted. The team noticed people using comments to program. Writing in Spanish. Asking for help mid-line. These weren’t test results. They were emergent behaviors.
And the team responded not with a pivot, but with iteration. “We value the learning loop or like your innovation velocity through that.” They ran three experiments a week, not because they had to, but because they wanted to. Each failure narrowed the path to what worked.
The real breakthrough? “Fill in the middle.” Most code isn’t written at the end of a line. It’s edited in the middle. That insight didn’t come from a roadmap. It came from watching people struggle. From failing. From learning.
And that’s the final layer: expertise matters. Rodriguez is clear--this wasn’t just about iteration. “You do need expertise in developer tools.” Fast learning amplifies good taste. Without it, you’re just failing faster in the wrong direction.
The consequence? Copilot didn’t just become a product. It became a new way of programming. And because the team was willing to sit with discomfort--low acceptance rates, unclear ROI, ambiguous metrics--they got to a place others couldn’t. Not because they were smarter. Because they were willing to be uncomfortable longer.
Key Action Items
- Stop training for behavior change--start modeling it. Leadership repetition isn’t optional. If you haven’t said it 40 times, it hasn’t sunk in. Make experimentation part of every review, every all-hands, every 1:1.
- Publicly own your failures. Use all-hands meetings to share when you were wrong and what you learned. This isn’t humility--it’s system design. It lowers the cost of risk for your team.
- Adopt the 70/20/10 portfolio framework. Allocate resources across sure bets, strategic bets, and venture bets. Adjust the mix based on market disruption. Protect time for learning.
- Measure outcomes, not just outputs. If a product feels magical but has low acceptance, dig deeper. Don’t kill innovation because it breaks your current metrics.
- Run two to three experiments per week on high-uncertainty problems. Speed up the learning loop. Reward learning velocity, not just delivery. This pays off in 12 to 18 months.
- Share unfinished thinking early. Circulate “skeleton docs” before plans are locked. Give teams time to do discovery before commitments are made.
- Invest in domain expertise. Fast iteration amplifies taste. Without deep user understanding, you’ll optimize the wrong things. This is a long-term moat--start now.