Biohub: The Future of Biology is Open-Source with Co-Founders Mark Zuckerberg, Priscilla Chan, and Head of Science Alex Rives

Biohub's virtual biology initiative is about more than just applying AI to science. The core bet is that the biggest leverage comes from building hierarchical world models that scale from proteins up to cells and whole systems, and then putting those tools in everyone's hands. The unexpected consequence might be that the biggest breakthroughs won't come from directly curing common diseases. Instead, they might come from empowering a broad range of researchers to tackle rare conditions, generating knowledge that flows back to benefit the rest of biology. If you are investing in AI-driven drug development or scientific infrastructure, the real edge is understanding why patience with hierarchical modeling and open-source distribution creates compounding advantages that closed, disease-focused approaches cannot match.

Most efforts to model biology try to skip levels. They want a cellular model without first understanding the proteins that make a cell work. Biohub's co-founders argue that this is a trap. Mark Zuckerberg explained it this way:

You need to understand protein interactions to understand how cells work. You cannot just go straight to cells without understanding protein modeling. And if you are trying to understand how the immune system works or how a bunch of cells interact, it is tough to do that without first understanding cells.

This is not just theoretical elegance. It has downstream consequences. If you skip a level, your model will generalize poorly. An antibody designed without understanding its off-target binding across cell types is far more likely to fail in human trials. The hidden cost of that shortcut is spending years on a therapy that looks good in a dish but collapses in Phase I. Building the hierarchy from the ground up feels slower, because you spend months on protein modeling before tackling cells. But each layer becomes a foundation that accelerates everything above it. The payoff is delayed, but it compounds.

Priscilla Chan connected this to a deeper shift: moving biology from a discovery science to an engineering science. She asked what would happen if we could actually understand how biology works, so that we could systematically understand how living beings and living cells function and why things go wrong. That is the second-order effect. Once you have a hierarchical world model, you do not just predict. You design. You intervene at the mechanistic level for an individual patient, not a population average.

The nonprofit, open-source model is not just charity. It is a strategic choice that creates a feedback loop most for-profit efforts ignore. By releasing ESMFold2 as an open discovery engine, Biohub enables thousands of independent researchers to explore problems no single organization would prioritize. Priscilla Chan explained the consequence.

If you decentralize the effort and put the tools in many people's hands, you start getting people who are super interested in spinal muscular atrophy. They care deeply about it. If you put the tools in that person's hands, they will make progress in a way you probably would not if you had to focus your efforts and make big bets. You would not, because it is a niche disease that affects a small group of individuals. Understanding that disease process helps unlock knowledge about how the human body works.

Rare diseases are edge cases. Edge cases stress-test models and reveal mechanisms that common diseases hide. Every rare disease solved by a motivated researcher becomes a data point that improves the world model for everyone. Over time, the long tail feeds back into the core, making the entire system smarter. Conventional wisdom says to focus on big markets. Biohub's bet is the opposite: distributing tools to the edges creates more total knowledge per dollar. And because it is open source, that knowledge is not siloed.

Mark Zuckerberg underlined the philosophy. He said they believe a positive future is one where you build a technology as a tool and put it in individuals' hands, and that this is how society makes progress. The immediate discomfort is obvious. There is no proprietary data moat, no exclusivity. The ten-year advantage is that hundreds of independent teams are extending your model in directions you never could.

One of the most underappreciated insights from the conversation is the role of mechanistic interpretability. It is not just about predicting protein structures. It is about using the model to discover new biology. Alex Rives described how the protein language model, trained on billions of sequences, develops representations that correspond to known biological structures purely from token prediction. But it also captures connections between proteins we know nothing about and those we do.

He said they did not design a model for antibodies. They did not design a model to bind one particular target. They just designed a model that could understand proteins, and protein design emerged as a property.

This is a radical departure from traditional drug discovery, where you start with a target and engineer a molecule. Here, the model understands the grammar of protein language so deeply that design emerges. Instead of screening millions of antibodies in the lab, you can compute hundreds of thousands of trajectories, test 96 in a well plate, and find nanomolar binders. The speed gain is obvious. The non-obvious gain is that the model's internal representations become a map of unknown biology. Ask it why it thinks a certain protein binds, and it reveals connections that no human had noticed. Over time, every digital experiment enriches the map, turning biology into a queryable system rather than a series of lucky breaks.
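The funnel described above, many computed designs narrowed down to a single 96-well plate, is at heart a top-k selection. The sketch below is only an illustration of that narrowing step and not Biohub's actual pipeline: the function names are hypothetical, and the scores are random placeholders standing in for whatever ranking a real design model would produce.

```python
import heapq
import random

PLATE_WELLS = 96  # one 96-well plate, as mentioned in the text


def select_for_plate(scored_designs, wells=PLATE_WELLS):
    """Return the `wells` highest-scoring (design_id, score) pairs, best first."""
    return heapq.nlargest(wells, scored_designs, key=lambda pair: pair[1])


def make_placeholder_scores(n, seed=0):
    """Hypothetical stand-in for model output: n designs with random scores."""
    rng = random.Random(seed)
    return [(f"design_{i}", rng.random()) for i in range(n)]


# Assumption: 200,000 computed candidates, each with a placeholder score.
candidates = make_placeholder_scores(200_000)
plate = select_for_plate(candidates)
print(len(plate), "designs selected for lab testing")
```

Only the selected designs would go on to the wet lab. Whether any of them actually bind is something the plate experiment decides, not the ranking.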

---
This content is a personally curated review and synopsis derived from the original podcast episode.