Research Project

Understanding biology's foundation models

We apply mechanistic interpretability to single-cell foundation models — turning black-box AI into causal, verifiable insights about gene regulation, cell programs, and perturbation responses.

4

Research Tracks

8+

Workshop Projects

3

Evaluation Benchmarks

15

Agent Roles

Biology demands more than predictions — it demands understanding

Foundation models like scGPT learn powerful representations from millions of cells. But in biology, a prediction without a mechanism is just a correlation. If we can't explain why a model predicts a gene interaction, we can't trust it to guide experiments, discover drug targets, or advance scientific knowledge.

Biodyn bridges this gap. We apply mechanistic interpretability — the science of understanding what neural networks learn internally — to biological foundation models. Our goal: reduce the time from biological question to reproducible, mechanistic result by 10–100× using rigorous, causally grounded methods.

🧬

Beyond Black Boxes

We open the hood of foundation models to find biologically meaningful circuits — gene programs, pathways, and cell-state representations.

🔬

Causal, Not Correlational

Every interpretability claim must survive causal intervention tests. We ablate, patch, and perturb to verify mechanistic hypotheses.

Automation as Advantage

Every solved research step becomes reusable infrastructure, compounding our R&D velocity across projects.

Four interconnected tracks

Our research spans mechanistic interpretability, network inference, perturbation modeling, and automated R&D — each feeding into the others.

🧠

Mechanistic Interpretability

Convert black-box single-cell foundation models into mechanistically understood systems. We use representation probes, sparse autoencoders, activation patching, and targeted ablations to identify gene programs, pathways, and cell-state circuits within transformer models — with causal verification at every step.

Activation Patching · Sparse Features · Concept Probes · Causal Tracing
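The core causal test above — activation patching — can be illustrated with a minimal sketch. This is a toy two-layer network standing in for a transformer layer, not our actual models; the idea is the same: cache hidden activations from a "clean" input, patch them one unit at a time into a "corrupted" run, and rank units by how much they move the output back.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for one hidden layer: h = relu(x @ W1), y = h @ W2.
W1 = rng.normal(size=(8, 16))
W2 = rng.normal(size=(16, 1))

def forward(x, patch=None):
    """Run the model; `patch=(units, values)` overwrites those hidden
    units with activations cached from another run."""
    h = np.maximum(x @ W1, 0.0)
    if patch is not None:
        units, values = patch
        h = h.copy()
        h[:, units] = values
    return h, float((h @ W2).item())

x_clean = rng.normal(size=(1, 8))    # input where the behavior occurs
x_corrupt = rng.normal(size=(1, 8))  # counterfactual input

h_clean, y_clean = forward(x_clean)
_, y_corrupt = forward(x_corrupt)

# Patch each clean hidden unit into the corrupted run; units whose
# patch moves the output furthest are causal candidates for the behavior.
effects = [
    abs(forward(x_corrupt, patch=([i], h_clean[:, [i]]))[1] - y_corrupt)
    for i in range(16)
]
top_unit = int(np.argmax(effects))
print(top_unit, round(max(effects), 3))
```

Patching all units at once recovers the clean output exactly, which is the sanity check that the patched layer is causally sufficient for the output.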
🕸️

Biological Network Inference

Build and benchmark gene regulatory network (GRN) and signaling inference pipelines from single-cell data. We extract attention-based interaction scores, calibrate against ground-truth databases (TRRUST, DoRothEA), and produce versioned, queryable network objects for downstream analysis.

GRN Recovery · Signaling Chains · Confidence Calibration
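The edge-extraction step can be sketched in a few lines. This toy uses a random matrix in place of a model's gene-gene attention map and a two-edge hand-made truth set in place of TRRUST/DoRothEA queries; the gene symbols are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
genes = ["TP53", "MDM2", "CDKN1A", "MYC", "E2F1"]
n = len(genes)

# Stand-in for a model's attention map over genes (rows attend to columns).
attn = rng.random((n, n))
np.fill_diagonal(attn, 0.0)

# Symmetrize and rank all candidate gene pairs by attention weight.
score = (attn + attn.T) / 2
edges = sorted(
    ((score[i, j], genes[i], genes[j]) for i in range(n) for j in range(i + 1, n)),
    reverse=True,
)

# Toy ground-truth edges; the real pipeline calibrates against
# curated databases such as TRRUST and DoRothEA.
truth = {frozenset(("TP53", "MDM2")), frozenset(("TP53", "CDKN1A"))}

k = 3
hits = sum(frozenset((a, b)) in truth for _, a, b in edges[:k])
precision_at_k = hits / k
print(edges[:k], precision_at_k)
```

The ranked edge list, with per-edge scores, is what gets versioned into a queryable network object downstream.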
💊

Perturbation Modeling

Predict cellular responses to CRISPR knockouts, drug treatments, and genetic perturbations across cell types and doses. We use perturbation-derived edges from Perturb-seq experiments as ground truth to validate model predictions and build perturbation-to-network benchmarks.

Perturb-seq · CRISPR Screens · Active Learning
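A minimal sketch of the in-silico side of this loop, under strong simplifying assumptions: gene embeddings are random, and the "cell state" is their mean pool rather than a real foundation-model readout. Knocking out a gene means dropping it from the pool; genes are then ranked by predicted effect size, which in practice is validated against measured Perturb-seq deltas.

```python
import numpy as np

rng = np.random.default_rng(2)
n_genes, d = 20, 8

# Hypothetical gene embeddings; mean pooling stands in for a
# foundation model's pooled cell representation.
E = rng.normal(size=(n_genes, d))

def cell_state(mask):
    return E[mask].mean(axis=0)

baseline = cell_state(np.ones(n_genes, dtype=bool))

def knockout_shift(g):
    """In-silico knockout: drop gene g from the pooled representation
    and return the resulting shift in cell state."""
    mask = np.ones(n_genes, dtype=bool)
    mask[g] = False
    return cell_state(mask) - baseline

# Rank genes by predicted perturbation effect size.
effect = np.array([np.linalg.norm(knockout_shift(g)) for g in range(n_genes)])
ranking = np.argsort(-effect)
print(list(ranking[:3]))
```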
🤖

Agentic R&D Automation

Automate the entire research loop — from data ingestion and quality control to experiment design, execution, evaluation, and reporting. Coordinated AI agents handle the repetitive work while humans provide scientific steering and strategic direction.

Dataset Factory · Experiment Factory · Auto-Reporting

The R&D Flywheel

Our operating loop compounds progress. Every cycle produces reusable infrastructure, rigorous evaluation, and mechanistic insight.

01

Discover

Continuous scanning of research opportunities, market signals, and emerging datasets. AI agents produce scored Opportunity Briefs.

02

Design

Experiments are designed with falsifiable hypotheses, explicit controls, and pre-registered evaluation criteria. No fishing expeditions.

03

Implement

Reproducible pipelines with pinned data versions, tracked configurations, and deterministic seeds. Every run is auditable.
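The auditability claim above reduces to two mechanical habits, sketched here with stdlib tools only (the dataset and model names are illustrative, not real registry entries): hash the full run configuration into a stable run ID, and derive all randomness from a pinned seed so any run can be replayed exactly.

```python
import hashlib
import json
import random

# Hypothetical run manifest: pinned data version, model, and seed.
# Hashing the key-sorted JSON gives a stable, auditable run identifier.
config = {
    "dataset": "pbmc_demo",   # illustrative name
    "data_version": "v3.1",
    "model": "scgpt-base",
    "seed": 1234,
}
run_id = hashlib.sha256(
    json.dumps(config, sort_keys=True).encode()
).hexdigest()[:12]

# Deterministic seeding: re-seeding reproduces the exact trajectory,
# so a run can be replayed bit-for-bit from its manifest.
random.seed(config["seed"])
first = [random.random() for _ in range(3)]
random.seed(config["seed"])
assert first == [random.random() for _ in range(3)]
print(run_id)
```

Any change to the data version, model, or seed yields a new run ID, which is what makes "every run is auditable" enforceable rather than aspirational.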

04

Evaluate

Standardized benchmarks with ablations, baselines, robustness checks, and bias-aware evaluation protocols to prevent misleading claims.

05

Interpret

Mechanistic reports with causal intervention evidence, boundary conditions, and explicit separation between biological insights and suggestive observations.

06

Automate

Every repeated step becomes a reusable command, template, or agent skill — compounding speed and consistency across future projects.

Opening biology's black boxes

Mechanistic interpretability of biological foundation models isn't just an academic exercise — it's a prerequisite for trustworthy, actionable AI in the life sciences.

🎯

Drug Target Discovery

Understanding which internal model features correspond to real gene regulatory mechanisms enables principled identification of drug targets — grounded in causal evidence rather than statistical correlation.

🧪

Scientific Rigor

Biology demands explanations that survive falsification. Our causal intervention framework — ablation, patching, perturbation validation — ensures mechanistic claims are testable and reproducible, not just pattern-matching.

📊

Evaluation Integrity

Current benchmarks are brittle: mapping and candidate-set choices dominate metrics, causing misleading ranking reversals. Our evaluation bias protocols expose and correct these hidden confounds.
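The candidate-set effect is easy to demonstrate with made-up numbers. In this sketch (edge names and scores are invented for illustration), the same model scores yield a perfect AUROC against easy random-pair negatives but chance-level AUROC against hard negatives such as reversed true edges — the metric is dominated by the candidate set, not the model.

```python
import numpy as np

def auroc(scores, labels):
    # Rank-based AUROC: probability a true edge outranks a false one.
    pos = [s for s, l in zip(scores, labels) if l]
    neg = [s for s, l in zip(scores, labels) if not l]
    return float(np.mean([[p > n for n in neg] for p in pos]))

# One fixed set of model scores for true edges...
true_edges = {"A->B": 0.9, "C->D": 0.4}
# ...evaluated against two different negative candidate sets.
easy_negs = {"X->Y": 0.1, "Y->Z": 0.2}  # random gene pairs
hard_negs = {"B->A": 0.8, "D->C": 0.5}  # reversed true edges

def evaluate(negs):
    scores = list(true_edges.values()) + list(negs.values())
    labels = [1] * len(true_edges) + [0] * len(negs)
    return auroc(scores, labels)

print(evaluate(easy_negs), evaluate(hard_negs))  # 1.0 vs. 0.5
```

Identical predictions, opposite conclusions — which is why candidate-set construction has to be reported and controlled as part of the benchmark itself.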

Foundation models are transforming biology — learning rich, compressed representations from millions of single cells across tissues, conditions, and perturbations. But predictive power without interpretability is a liability. In domains like drug discovery and precision medicine, deploying a model that "just works" without understanding why it works can lead to false confidence, wasted experiments, and missed therapeutic opportunities.

Mechanistic interpretability changes this equation. By mapping a model's internal representations to known biology — gene programs, signaling pathways, cell-state transitions — we can verify that models learn real mechanisms rather than dataset artifacts. And by testing these circuits with causal interventions, we produce insights that are not just plausible but falsifiable — meeting the standard that biology demands.

Active research projects

Ranked by scientific leverage, feasibility, and differentiation potential.

Mechanistic interpretability explorations

Interactive atlas modules for sparse autoencoder (SAE) feature analysis across Geneformer and scGPT.

Atlas Module

Geneformer Atlas

Interactive SAE mechanistic interpretability exploration for Geneformer, focused on feature-level biological semantics and circuit inspection.

🧠 SAE MechInterp 🧬 Geneformer
Open atlas
Atlas Module

scGPT Atlas

Interactive SAE mechanistic interpretability exploration for scGPT, including atlas views for feature behavior across biological contexts.

🧠 SAE MechInterp 🧬 scGPT
Open atlas

Research outputs

Workshop papers, technical reports, and evaluation protocols from our research pipeline.

Led by


Ihor Kendiukhov

Founder & Principal Researcher
University of Tübingen, Computer Science Department

Building at the intersection of AI interpretability and systems biology. Research focus on mechanistic understanding of biological foundation models, gene regulatory network inference, and agentic R&D automation.