Understanding biology's foundation models

We apply mechanistic interpretability to single-cell foundation models — turning black-box AI into causal, verifiable insights about gene regulation, cell programs, and perturbation responses.

4 Research Tracks · 38 Active Projects · 10 Research Outputs · 3 Atlas Modules

Biology demands more than predictions — it demands understanding

Foundation models like scGPT learn powerful representations from millions of cells. But in biology, a prediction without a mechanism is just a correlation. If we can't explain why a model predicts a gene interaction, we can't trust it to guide experiments, discover drug targets, or advance scientific knowledge.

Biodyn bridges this gap. We apply mechanistic interpretability — the science of understanding what neural networks learn internally — to biological foundation models. Our goal: reduce the time from biological question to reproducible, mechanistic result by 10–100× using rigorous, causally grounded methods.

Principle 01

Beyond Black Boxes

We open the hood of foundation models to find biologically meaningful circuits — gene programs, pathways, and cell-state representations.

Principle 02

Causal, Not Correlational

Every interpretability claim must survive causal intervention tests. We ablate, patch, and perturb to verify mechanistic hypotheses.

Principle 03

Automation as Advantage

Every solved research step becomes reusable infrastructure, compounding our R&D velocity across projects.

Four interconnected tracks

Our research spans mechanistic interpretability, network inference, perturbation modeling, and automated R&D — each feeding into the others.

Track 01

Mechanistic Interpretability

Convert black-box single-cell foundation models into mechanistically understood systems. We use representation probes, sparse autoencoders, activation patching, and targeted ablations to identify gene programs, pathways, and cell-state circuits within transformer models — with causal verification at every step.

Activation Patching · Sparse Features · Concept Probes · Causal Tracing
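
Activation patching is the workhorse intervention here. Below is a minimal sketch of the core move on a toy network — everything in it is a hypothetical stand-in (the real pipeline hooks transformer blocks inside models like scGPT): cache an activation from a clean input, splice it into a corrupted run, and check whether the prediction recovers.

```python
# Minimal activation-patching sketch on a toy network. The real pipeline
# targets transformer blocks inside scGPT/Geneformer; everything here is
# a stand-in to show the intervention itself.
import torch
import torch.nn as nn

torch.manual_seed(0)

model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 2))
clean_x = torch.randn(1, 8)    # e.g. embedding of an unperturbed cell
corrupt_x = torch.randn(1, 8)  # e.g. embedding after a knockout

# 1. Cache the intermediate activation on the clean input.
cache = {}
def save_hook(module, inputs, output):
    cache["act"] = output.detach()

layer = model[0]
handle = layer.register_forward_hook(save_hook)
clean_out = model(clean_x)
handle.remove()

# 2. Re-run the corrupted input, splicing in the clean activation.
def patch_hook(module, inputs, output):
    return cache["act"]  # returning a value replaces the layer's output

handle = layer.register_forward_hook(patch_hook)
patched_out = model(corrupt_x)
handle.remove()

corrupt_out = model(corrupt_x)

# 3. If patching restores the clean prediction, this layer causally
#    carries the signal that distinguishes the two inputs.
print("recovery gap:", torch.norm(patched_out - clean_out).item())
print("corruption gap:", torch.norm(corrupt_out - clean_out).item())
```
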
Track 02

Biological Network Inference

Build and benchmark gene regulatory network (GRN) and signaling inference pipelines from single-cell data. We extract attention-based interaction scores, calibrate against ground-truth databases (TRRUST, DoRothEA), and produce versioned, queryable network objects for downstream analysis.

GRN Recovery · Signaling Chains · Confidence Calibration
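
As a minimal sketch of the calibration step — assuming a precomputed gene-by-gene attention matrix, with all gene names, scores, and edges below as placeholders — edge recovery can be scored as an AUROC against a TRRUST-style reference edge list:

```python
# Sketch: scoring attention-derived gene-gene edges against a reference
# network. `attn` is a hypothetical [G, G] matrix of attention weights
# averaged over heads, layers, and cells; edges are illustrative.
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
genes = ["TP53", "MDM2", "MYC", "E2F1", "RB1"]
G = len(genes)
attn = rng.random((G, G))  # placeholder attention scores

# Ground-truth regulatory edges (regulator -> target), TRRUST-style.
true_edges = {("TP53", "MDM2"), ("MYC", "E2F1"), ("RB1", "E2F1")}

idx = {g: i for i, g in enumerate(genes)}
labels, scores = [], []
for reg in genes:
    for tgt in genes:
        if reg == tgt:
            continue  # skip self-loops
        labels.append(int((reg, tgt) in true_edges))
        scores.append(attn[idx[reg], idx[tgt]])

print(f"edge-recovery AUROC: {roc_auc_score(labels, scores):.3f}")
```
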
Track 03

Perturbation Modeling

Predict cellular responses to CRISPR knockouts, drug treatments, and genetic perturbations across cell types and doses. We use perturbation-derived edges from Perturb-seq experiments as ground truth to validate model predictions and build perturbation-to-network benchmarks.

Perturb-seq · CRISPR Screens · Active Learning
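
One common way to score such predictions — sketched below on synthetic arrays — is to correlate predicted and observed expression shifts (deltas) relative to control cells, rather than raw expression, so the metric rewards capturing the perturbation effect itself instead of just the baseline:

```python
# Sketch: delta-correlation scoring for a perturbation model, in the
# style of Perturb-seq benchmarks. All arrays are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(0)
n_genes = 200

control_mean = rng.random(n_genes)                          # control cells
observed_ko = control_mean + rng.normal(0, 0.1, n_genes)    # after a CRISPR KO
predicted_ko = control_mean + rng.normal(0, 0.1, n_genes)   # model prediction

obs_delta = observed_ko - control_mean
pred_delta = predicted_ko - control_mean

# Pearson correlation of deltas rewards getting the direction and size
# of the perturbation response right, not reproducing baseline expression.
r = np.corrcoef(obs_delta, pred_delta)[0, 1]
print(f"delta correlation: {r:.3f}")
```
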
Track 04

Agentic R&D Automation

Automate the entire research loop — from data ingestion and quality control to experiment design, execution, evaluation, and reporting. Coordinated AI agents handle the repetitive work while humans provide scientific steering and strategic direction.

Dataset Factory · Experiment Factory · Auto-Reporting
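As an illustration of the loop's shape only — stage names and payloads below are invented, not the actual Biodyn agent stack — the automation reduces to a sequence of stages that read and extend a shared run context:

```python
# Hypothetical sketch of the research-loop structure: each stage consumes
# the shared context and contributes its outputs to it.
from dataclasses import dataclass
from typing import Callable

@dataclass
class Stage:
    name: str
    run: Callable[[dict], dict]  # takes the shared context, returns updates

def ingest(ctx):   return {"dataset": "pbmc_demo_v1"}
def qc(ctx):       return {"qc_passed": True}
def design(ctx):   return {"hypothesis": "layer-5 features track cell cycle"}
def evaluate(ctx): return {"metric": 0.87}
def report(ctx):   return {"report": f"{ctx['hypothesis']}: {ctx['metric']}"}

pipeline = [Stage("ingest", ingest), Stage("qc", qc),
            Stage("design", design), Stage("evaluate", evaluate),
            Stage("report", report)]

context: dict = {}
for stage in pipeline:
    context.update(stage.run(context))  # each stage extends the context
    print(f"[{stage.name}] -> {context}")
```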

The R&D Flywheel

Our operating loop compounds progress. Every cycle produces reusable infrastructure, rigorous evaluation, and mechanistic insight.

01

Discover

Continuous scanning of research opportunities, market signals, and emerging datasets. AI agents produce scored Opportunity Briefs.

02

Design

Experiments are designed with falsifiable hypotheses, explicit controls, and pre-registered evaluation criteria. No fishing expeditions.

03

Implement

Reproducible pipelines with pinned data versions, tracked configurations, and deterministic seeds. Every run is auditable (a minimal sketch follows step 06 below).

04

Evaluate

Standardized benchmarks with ablations, baselines, robustness checks, and bias-aware evaluation protocols to prevent misleading claims.

05

Interpret

Mechanistic reports with causal intervention evidence, boundary conditions, and explicit separation between biological insights and suggestive observations.

06

Automate

Every repeated step becomes a reusable command, template, or agent skill — compounding speed and consistency across future projects.
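
The sketch promised in step 03 shows the minimal reproducibility scaffold the Implement stage relies on: a pinned seed, a frozen configuration, and a content hash that doubles as an auditable run ID. Field names and version tags are illustrative, not Biodyn's actual run schema.

```python
# Minimal reproducibility scaffold: deterministic seeds, a frozen config,
# and a hash-derived run ID so silent changes surface as new runs.
import hashlib
import json
import random

import numpy as np

config = {
    "dataset_version": "pbmc10k@v2",  # pinned data version (hypothetical tag)
    "model": "scgpt-base",
    "seed": 1234,
}

# Seed every RNG the pipeline touches.
random.seed(config["seed"])
np.random.seed(config["seed"])

# Hash the frozen config: any change yields a different run ID.
run_id = hashlib.sha256(
    json.dumps(config, sort_keys=True).encode()
).hexdigest()[:12]

print(f"run {run_id}: {config}")
```
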

Opening biology's black boxes

Mechanistic interpretability of biological foundation models isn't just an academic exercise — it's a prerequisite for trustworthy, actionable AI in the life sciences.

Priority 01

Drug Target Discovery

Understanding which internal model features correspond to real gene regulatory mechanisms enables principled identification of drug targets — grounded in causal evidence rather than statistical correlation.

Priority 02

Scientific Rigor

Biology demands explanations that survive falsification. Our causal intervention framework — ablation, patching, perturbation validation — ensures mechanistic claims are testable and reproducible, not just pattern-matching.

Priority 03

Evaluation Integrity

Current benchmarks are brittle: mapping and candidate-set choices dominate metrics, causing misleading ranking reversals. Our evaluation bias protocols expose and correct these hidden confounds.

Foundation models are transforming biology — learning rich, compressed representations from millions of single cells across tissues, conditions, and perturbations. But predictive power without interpretability is a liability. In domains like drug discovery and precision medicine, deploying a model that "just works" without understanding why it works can lead to false confidence, wasted experiments, and missed therapeutic opportunities.

Mechanistic interpretability changes this equation. By mapping a model's internal representations to known biology — gene programs, signaling pathways, cell-state transitions — we can verify that models learn real mechanisms rather than dataset artifacts. And by testing these circuits with causal interventions, we produce insights that are not just plausible but falsifiable — meeting the standard that biology demands.

Explore the full project portfolio

The complete catalog of active projects now lives on a dedicated portfolio page, where you can browse all active Biodyn projects in one place, with concise descriptions and current status labels.

Open Portfolio Page →

Mechanistic interpretability explorations

Interactive atlas modules for sparse autoencoder (SAE) feature analysis across Geneformer, scGPT, and Novae.
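
For readers new to the method: a sparse autoencoder decomposes a model's activation vectors into an overcomplete set of sparsely active features. The minimal sketch below — dimensions illustrative; real SAEs train on activations from the models listed here — shows the standard reconstruction-plus-L1 objective the atlases are built around.

```python
# Minimal sparse autoencoder of the kind the atlas modules analyze:
# an overcomplete dictionary with an L1 sparsity penalty on the codes.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_model=512, d_features=4096):
        super().__init__()
        self.enc = nn.Linear(d_model, d_features)
        self.dec = nn.Linear(d_features, d_model)

    def forward(self, x):
        feats = torch.relu(self.enc(x))  # sparse, non-negative feature codes
        return self.dec(feats), feats

sae = SparseAutoencoder()
acts = torch.randn(32, 512)  # stand-in for foundation-model activations
recon, feats = sae(acts)

# Reconstruction loss plus L1 penalty encouraging few active features.
loss = ((recon - acts) ** 2).mean() + 1e-3 * feats.abs().mean()
print(loss.item())
```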

Atlas Module

Geneformer Atlas

Interactive SAE mechanistic interpretability exploration for Geneformer, focused on feature-level biological semantics and circuit inspection.

SAE · Mechanistic Interpretability · Model: Geneformer
Open atlas

Atlas Module

scGPT Atlas

Interactive SAE mechanistic interpretability exploration for scGPT, including atlas views for feature behavior across biological contexts.

SAE · Mechanistic Interpretability · Model: scGPT
Open atlas

Atlas Module

Novae Atlas

Interactive SAE mechanistic interpretability exploration for Novae, with atlas views for feature structure and biological program organization.

SAE · Mechanistic Interpretability · Model: Novae
Open atlas

Research outputs

Preprints, papers, and public research outputs from the Biodyn pipeline.

Academic collaborations and research engagements

We collaborate with academic labs and also maintain technical exchanges around models developed externally. Where noted as a research engagement, this reflects discussion and input rather than formal co-development.

Research labs

Academic groups with whom we collaborate directly on biological foundation models and adjacent interpretability questions.

Research Collaboration

Theodoris Lab

Collaborative research around biological foundation models, network biology, and mechanistic interpretability in the Geneformer ecosystem.

Gladstone Institutes · Geneformer

Research Collaboration

Université Paris-Saclay, Laboratory of Mathematics and Computer Science

Collaborative research around spatial foundation models and interpretable analysis for spatial transcriptomics and tissue organization.

Spatial Transcriptomics · Novae

Research engagements with model developers

Independent mechanistic interpretability work informed by direct discussion and feedback from the teams behind the models.

Research Engagement

GenBio AI

We are applying our mechanistic interpretability toolkit to GenBio-PathFM, a histopathology foundation model by GenBio AI. The work benefits from discussion and input from the team while remaining an independent interpretability effort.

GenBio-PathFM · Histopathology

Research Engagement

InstaDeep

We are applying our mechanistic interpretability toolkit to Nucleotide Transformer. This work has been informed by direct exchange with the team while remaining separate from model development.

Nucleotide Transformer · Genomics

Led by

Ihor Kendiukhov

Founder & Principal Researcher
University of Tübingen, Computer Science Department

Building at the intersection of AI interpretability and systems biology. Research focus on mechanistic understanding of biological foundation models, gene regulatory network inference, and agentic R&D automation.