Compound
Thesis DBAboutPortfolioWriting

2026 Compound

Discovery Bio Platforms

Discover biological insights and / or develop drugs for targets

TL;DR

Discovery‑based biotech platforms use a repeatable engine (novel assays + automation/ML + proprietary data) to discover novel targets and/or develop drug candidates. They monetize via partnerships and an internal pipeline.

  • Sales cycle on partnerships: 6-18 months of hand-to-hand combat with pharma
  • Typical partnership value:
    • Avg: $5–10M upfront + $200–250M biobucks
    • 90th percentile: $20–30M upfront + $700–800M biobucks
  • Preclinical licensors typically capture <20% of the total deal value after nine years
  • Value capture: 3/5 (partnering on early assets is generally a good model; historically, discovery hasn’t been as easy to monetize as novel modalities, but we expect that to change)
  • Companies

    Millennium, Vertex, Isomorphic, DESRES, Relay, Schrödinger, Recursion, NewLimit

    Compound portfolio companies: Achira, Juvena

    Overview

    Discovery Platforms

    Discovery-focused platforms have historically been harder to build large businesses around than modality-focused ones. As seen in the table above, the only truly standout discovery platforms in history are the pioneers of rational drug design and of genetic evidence for targets.

    Both turned out to be rare paradigm shifts in the process and efficacy of drug development. Nowadays, developing drugs without any semblance of rational design feels prehistoric. Meanwhile, genetic evidence for a drug target has been shown to double the odds of clinical success, with the odds even higher for clear causal genes (Mendelian traits and GWAS associations linked to coding variants).

    To frame the rarity of truly paradigm-shifting discovery processes, consider that the number of novel targets approved by the FDA each year is usually between two and four.

    Discovery platforms may have been harder to build around because getting a novel discovery process to a minimally disruptive product, and then to product-disease fit, is difficult, slow, and capital intensive. Because such platforms start off as point solutions, it takes unusual clarity to get to a disruptive process within a couple of years.

    Put more bluntly, it’s historically been easier to commercialize a novel modality capable of expanding the targetable space than to reinvent the scientific process itself.

    Modality platform biotechs, by contrast, sell end products to pharma, not information. As Steven Holtzman puts it:

    The output of Discovery Platform Companies is data/information/insights, NOT, as with the Modality Companies, New Chemical Entities and biologic therapeutics. The history of data in the biopharmaceutical industry is the history of its commoditization. Companies whose “life’s blood” is drugs/products have a vested interest in rendering data “pre-competitive” (or, at least doing so after they have had proprietary access for a time). They win based on their products; they don’t want to be held captive by the owner of the information. A more restrictive intellectual property (IP) environment: gone are the days when a transcriptional profile showing over-expression of a gene in a diseased tissue (or a genetic mutation in the diseased state) could get you an issued patent of the logical form, “a method of treating disease X comprising modulating target A by any means” (with dependent claims stating that the “means” could be an antibody, an antisense, an RNAi, a gene therapy, a small molecule, etc.). Moreover, the demands of the customer base became more expansive over time. Whereas, in the 1990s, most big pharma customers were willing to agree to terms that restricted their use of the data to the discovery and development of small molecule drugs (because that is all they did), in the present all pharmas/big biotechs would demand the right to exploit the data for all therapeutic modalities.

    Because pharma is reluctant to pay more than minuscule amounts for preclinical information or to change its workflow around novel technology, discovery-focused platforms tend to adopt a JV-like business strategy of orchestrating their own tech to help pharma hit difficult targets. It also means that discovery platforms tend to need to move up the value chain into an integrated internal pipeline faster than modality companies.

    This industry dynamic of pharma paying pennies for preclinical discovery tools may change if and only if pharma starts to view novel approaches as existential threats. We’re currently seeing an analogous shift play out with our portfolio company Wayve. Just within the last year, the legacy car OEMs have internalized that AVs are here and are existential to their very survival. Negotiations have shifted from contracts for tiny margins won via hand-to-hand combat over 2-year sales cycles to gargantuan deals over 3-month sales cycles.

    We at Compound expect these problems will be increasingly alleviated under the theses that:

    • It may be quicker to build computational tooling that’s useful than whole wet lab flows
    • There are now enough tools at the industry’s disposal that a newcomer building a step-function better point solution can slot right in before expanding its offerings over time
    • Pharma will internalize in the next five years that these information-based platforms are serious threats
    • ML discovery engines that benefit from ever-more data can get a better tradeoff from partnerships. Historically, partnerships have been viewed as selling future value for cash in the present. But in theory, platforms that compound with data can enjoy both the cash / legitimization in the present and the increased likelihood of future programs.

    We are exhilarated by the dozens of emerging technologies to bend the probabilities of discovery and have been investing in many different versions.

    As Holtzman framed Millennium’s pioneering of the discovery platform business model:

    In reality, using large partnerships to create out-size value requires a keen sense of the external partnering environment, identifying the potential partner(s) which at that specific moment in its/their history has/have come to perceive a critical need for what you have, and crafting deals in which you, while you sell rights, retain the potential for value creation by, for example, retaining ownership of the knowledge. That, not a cookie-cutter model, is your legacy in this arena.

    Broad vs Deep Discovery Platforms

    Finally, not all insights platforms are the same. First, there are technologies with broad applicability to many different biological pathways and drug targets, like Millennium with genomics. Second, there are platforms with a deep disease-specific focus. One of the most successful examples is Agios, which primarily focuses on metabolic insights into cancer. The narrower the focus of an insights platform, the more its strategy starts to resemble that of a product-focused company. There are fewer “shots on goal,” so each one needs to count. As a result, deep platforms realistically need to land a sizable multi-target “foundational corporate partnership” earlier on, as Agios did with Celgene, and then push even harder to advance their own internal programs.

    Building Platforms More Broadly

    Over the last 10+ years, the dominant meta in biotech has swung violently and irrationally between being all-in on maximally general platforms and conservative single-asset plays.

    We at Compound push back on the recently popularized platform playbook: raise hundreds of millions, take 5+ years to build out the most generalized tech infrastructure possible, only then start thinking about which target/disease to apply it to, and finally pursue a pipeline of 10+ drug candidates under the flawed logic of maintaining maximal optionality (which just means that no candidate will receive adequate focus).

    The reality is that no matter what, ultimately all platforms:

    • Will be judged by their first asset progressed to clinic
    • Have little pricing power until proven by stand-out clinical read-outs
    • Have at best slightly better odds (i.e. still very low) of developing a successful drug, so building a large, unfocused portfolio is a negative-expected-value strategy

    Moreover, nearly 50% of biotechs go bankrupt due to lack of continued funding vs just 33% due to clinical failure. This further shows how misguided a maximally general, long-duration infra buildout is without targeted end-points in mind.

    We at Compound focus our bio investing efforts on how to build platforms thoughtfully. We firmly believe that platforms should be built in a step-wise, capital-constrained / gated way with an intense focus on building the initial technology towards a specific target/disease uniquely unlocked by that technology.

    • Teams should from the beginning have informed views on the range of targets / diseases that they would be uniquely positioned to address
    • Not only will this inform the initial tech buildout, but it will also put you in a stronger position when you negotiate with pharma. Instead of accepting their list of targets (which will be those that have proven impossible for them to drug internally), you can give them a list of target types or targets you’re exploring and partner with them on ones similar to those of interest to you. This may also better align your early partnerships with your vision for your long-term tech buildout, rather than being a speculative distraction.
    • Then, build the minimum viable tech infra/platform to develop the maximally effective drug candidate for that target(s)/disease(s). The first target/disease is usually not as obvious as Genentech making human insulin, so it’s fine to iterate on a handful of targets/diseases before narrowing down and focusing efforts on 1-2 that have an exceptionally strong biological rationale for approval and commercial success.
    • Then, as those progress towards the clinic, a more full-fledged platform infra buildout commences

    Not coincidentally, the rough playbook above is how all the greatest biotech platforms in history were built.

    Non-dilutive capital from early pharma partnerships funds further investments into the platform → making it more capable → making odds of drug development success tick up and/or making it useful to more researchers and targets → drawing more non-dilutive opportunities → meanwhile, partnering with pharma gives the team a first-hand look at drug development → strengthening the case for internal pipeline development, which is the way to capture the value you create.


    Millennium got acquired for $9B off of only $8M in VC funding (despite building out custom wet lab automation equipment, assays, etc. which required $50M in annual CAPEX by its third year).

    To execute this playbook, it’s essential to have a strong idea of which potential pharma partners have a strategic imperative to succeed in your focus areas, because those are the only entities with whom you’ll have the pricing power to earn a nice premium for all your efforts.


    Lastly, our team at Compound gathered clinical, partnerships, and financial data on the 90+ most successful platform biotechs of all time to more precisely understand what these companies typically look like as they scale. Please treat the results as illustrative, not as fact.

    • Just 16% historically have gone public with an asset later than Phase I
    • 25% of the 90 companies' initial lead candidates ultimately got approved
    • 50% of the most successful platform biotechs ever ultimately got 1+ drug approved. That means that for 25% of them, it wasn't their lead asset at IPO (many companies were pre-clinical, in Phase I, or in diagnostics at IPO)
    • Just 25% of the most successful platforms ever got multiple drugs approved. 15% got 3+.
    • Ownership retained from partnership deal structures typically goes from ~60% for the first two drugs to 80%+ thereafter
    • Revenue plateaus from years ~5-15. The very few companies that make it past that 10-year lull in sales become the Amgens of the world.
    • It takes 17 years after IPO for these most successful platforms to have revenues cover R&D alone... and then revenue goes vertical
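
To make the approval percentages above easier to reason about, here is a minimal sketch that splits the cohort by number of approved drugs. It assumes the quoted shares all refer to the same cohort and that "multiple" means two or more; the variable names are illustrative and the output inherits the caveat that these figures are illustrative, not fact.

```python
# Shares of the cohort in whole percentage points, as quoted in the list above.
lead_approved = 25        # initial lead candidate ultimately approved
any_approved = 50         # at least one drug approved
multiple_approved = 25    # two or more drugs approved (assumed meaning of "multiple")
three_plus_approved = 15  # three or more drugs approved

# Companies that reached an approval with something other than their initial lead.
non_lead_success = any_approved - lead_approved

# Split of the cohort by number of approved drugs.
by_approvals = {
    "no approvals": 100 - any_approved,
    "exactly one": any_approved - multiple_approved,
    "exactly two": multiple_approved - three_plus_approved,
    "three or more": three_plus_approved,
}

print(f"approved via a non-lead asset: {non_lead_success}%")
for bucket, share in by_approvals.items():
    print(f"{bucket}: {share}%")
```

Read this way, half of even the most successful platforms never got a drug approved, and only about one in seven reached three or more approvals.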

    Deal Structure Heuristics

    • Partnerships for the hottest later-stage startups can be $30M+ upfront with $0.5–1.5B in milestones. From 2010-2015, that would correspond to the 90th percentile deal.
    • The average is roughly $5–10M upfront + $200–250M biobucks
    • Preclinical licensors typically capture <20% of the total deal value after nine years. Payouts follow a steep power law, with the majority of dollars going to Phase III readouts, top 10 deals, or blockbuster status milestones. Deals on novel or unspecified targets delivered only 5–7% of potential value, while those for targets already being pursued by others in trials secured 20–40%.
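
As a rough worked example of what these heuristics imply, the sketch below applies the quoted capture rates to the midpoint of the average deal. It assumes "total deal value" means upfront plus biobucks; the function name and the midpoint figures are illustrative choices, not deal data.

```python
def realized_value(upfront_musd: float, biobucks_musd: float, capture_rate: float) -> float:
    """Return the $M a licensor collects if it realizes `capture_rate`
    of the headline deal value (assumed here to be upfront + biobucks)."""
    return (upfront_musd + biobucks_musd) * capture_rate

# Midpoints of the average deal quoted above: $5-10M upfront + $200-250M biobucks.
avg_upfront, avg_biobucks = 7.5, 225.0
headline = avg_upfront + avg_biobucks

# Capture rates taken from the ranges quoted in the list above.
scenarios = {
    "novel target, low end (5%)": 0.05,
    "novel target, high end (7%)": 0.07,
    "typical ceiling (20%)": 0.20,
    "validated target, high end (40%)": 0.40,
}

results = {
    name: realized_value(avg_upfront, avg_biobucks, rate)
    for name, rate in scenarios.items()
}

for name, value in results.items():
    print(f"{name}: ~${value:.1f}M realized of ${headline:.1f}M headline")
```

Under these assumptions, an average deal on a novel target yields only a few million dollars beyond the upfront payment, which is why the upfront figure matters far more than the headline biobucks number.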

    Further Reading

    "The human population, through explosive growth, has performed a comprehensive saturation mutagenesis experiment on itself. It is now the case that any single base substitution that is compatible with life is expected to be present somewhere among the nearly 8 billion living humans. Humanity has thus, in effect, done many of the natural experiments required to understand our own genotype-phenotype map; this leaves geneticists to catalog the outcomes of those experiments, and to leverage both observational and experimental approaches to understand the mechanisms by which variants alter biology."


    https://leadershipandbiotechnology.blogspot.com/2018/08/early-stage-biotech-value-creation_15.html

    https://www.bio.org/clinical-development-success-rates-and-contributing-factors-2011-2020

    https://shelbyann.substack.com/p/a-playbook-for-human-evidence

    https://medium.com/data-science/the-road-to-biology-2-0-will-pass-through-black-box-data-bbd00fabf959

    https://www.mackenziemorehead.com/autonomous-science-part-i-everythings-an-api-away/

    https://shelbyann.substack.com/p/commercializing-autonomous-science

    https://www.mackenziemorehead.com/the-fickleness-of-scaling-laws/

    Lessons learned from the fate of AstraZeneca's drug pipeline: a five-dimensional framework

    Can the flow of medicines be improved? Fundamental pharmacokinetic and pharmacological principles toward improving Phase II survival

    On NPV, DCF and IRR(elevance)

    https://rapport.racap.com/all-stories/semper-maior-2026-biotech-ma

    https://www.science.org/doi/10.1126/science.abi8207

    https://shelbyann.substack.com/p/predicting-protein-dynamics-moats

    https://www.mackenziemorehead.com/untitled-2/

    https://reconstrategy.com/2025/04/preclinical-licensing-deals-realized-value/

    https://www.mckinsey.com/industries/life-sciences/our-insights/small-but-mighty-priming-biotech-first-time-launchers-to-compete-with-established-players

    https://centuryofbio.com/p/on-biotech-platform-strategy

    https://www.michaeldempsey.me/blog/2025/10/03/sequencing-vs-equal-odds-applied-research/

    The Entrepreneur’s Guide to a Biotech Startup https://ott.emory.edu/_includes/documents/sections/startups/guide_to_biotech_startup.pdf

    Company Histories

    Comprehensive list of discovery platforms

    Physics / computation-first small-molecule design

  • Vertex — first industrial SBDD/rational design at scale
  • D. E. Shaw Research (DESRES) — MD (Anton)–driven physics simulations → ligand design
  • Schrödinger — physics-based modeling → partnered drugs
  • Nimbus — computation-native, asset-centric SBDD
  • Exscientia — active-learning + AI design–make–test loops
  • Insilico Medicine — gen-AI from target ID→design
  • Genesis Therapeutics — GNNs for potency/ADMET
  • Atomwise — CNN docking at massive scale
  • Chai Discovery — computational antibody design
  • Isomorphic Labs — DL on structure/interaction graphs
  • XtalPi — physics-informed ML (QM + DL)
  • Verseon — proprietary physics search of chem space
  • Agouron — early X-ray SBDD pioneer (historical)

    Fragment-based lead discovery

  • Astex — industrialized X-ray/NMR FBLD
  • Sunesis — “tethering” covalent fragment method
  • Vernalis — pragmatic FBLD to clinic
  • Plexxikon — scaffold-based, structure-guided evolution

    DNA-encoded libraries / DNA-templated chemistry

  • Ensemble Therapeutics — DNA-templated macrocycles (historical)
  • X-Chem — industrial DEL hit-finding
  • HitGen — ultra-large DELs, broad partnering
  • Vipergen — in-solution DEL (YoctoReactor)
  • Nuevolution — Chemetics DEL (Amgen acq.)

    Chemoproteomics & ligandability mapping

  • Vividion — proteome-wide covalent ligand discovery
  • Belharra — photoaffinity chemoproteomics for non-covalents
  • Scorpion Therapeutics — hotspot-centric oncology mapping
  • Kymera (chemo-proteomics arm) — degrader E3/target mapping

    Phenotypic / cell-state modeling

  • Recursion — high-content cell imaging phenomics + in-house automation/ML loops
  • Juvena Therapeutics — AI/ML maps the stem-cell secretome to rank & engineer protein biologics for tissue regeneration (multi-dimensional protein library + phenotypic screens)
  • Insitro — iPSC + high-throughput phenotyping + ML
  • Cellarity — cell-state trajectories as the drug target
  • Phenomic AI — single-cell/secretome-aware phenomics
  • Turbine — in-silico cell simulations for combos
  • CytoReason — immune-system digital twins

    Human genetics / in-human functional genomics–first

  • Millennium — genomics-first target discovery + in-house automation
  • deCODE (Amgen) — population genetics → targets
  • 23andMe Therapeutics — consumer genetics → genetically backed targets
  • Maze Therapeutics — human genetics + CRISPR validation
  • Verge Genomics — patient multi-omics networks → first-in-class
  • BioAge Labs — longitudinal proteomics of aging cohorts
  • Alchemab — “resilient” patient antibodies → protective biology

    Antibody & binder discovery platforms

  • Cambridge Antibody Tech — phage display pioneer (historical)
  • MorphoSys — human Ab libraries (HuCAL)
  • Dyax — phage libraries → kallikrein drug (historical)
  • Adimab — yeast display at industrial scale
  • AbCellera — single-cell microfluidic Ab mining + ML
  • BigHat Biosciences — ML-guided wet lab for Ab optimization
  • Biolojic Design — function-first AI Abs (allostery/multispecifics)
  • A-Alpha Bio — high-throughput PPI mapping (AlphaSeq)

    Natural-products, genome mining & metabolomics-first

  • Hexagon Bio — fungal genome mining + ML
  • Enveda Biosciences — metabolomics + ML from plants
  • Lodo Therapeutics — eDNA genome mining (historical)

    Hotspot/allostery & dynamic pocket discovery

  • Relay Therapeutics — MD-based motion-guided cryptic-pocket/allostery engine
  • HotSpot Therapeutics — regulatory “hotspot” mapping → allosterics
  • Black Diamond Therapeutics — mutation-defined allosteric sites (MAP)
  • Revolution Medicines — tri-complex/allosteric RAS pathway chemistry

    https://www.youtube.com/watch?v=IFbJIV8Sidw&ab_channel=NewLimit

    https://centuryofbio.com/p/manifold

