The architecture · how it works inside

How GenAIRR thinks.

Five ideas describe the engine. A persistent IR that records every decision. A composable pipeline of passes that transform it. Contracts that refuse to let it lie. A flat AIRR record that hands you the truth, by name, in a format every aligner already speaks. And a live-call layer that reports what an evidence-based caller would say about the same record, so you have a baseline to benchmark against.

Read these in order if you want the architecture; skip ahead if you came for a specific mechanism. Each chapter is self-contained.

Architecture ~8 min read

The Simulation Pipeline

Three phases — recombine, mutate, corrupt — built from small composable passes. Each pass takes an IR, returns a new one. Re-order them, swap them, add your own.
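
A minimal sketch of the pass idea, in Python. Nothing here is GenAIRR's API: ToyIR, the three pass functions, and run_pipeline are hypothetical names, and the sequence edits are deterministic stand-ins for real recombination, mutation, and corruption. The point is only the shape: every pass is a function from one immutable IR to a new one, so a pipeline is just an ordered list you can rearrange.

```python
from dataclasses import dataclass, replace
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class ToyIR:
    """Hypothetical stand-in for the IR: a sequence plus a log of passes applied."""
    sequence: str = ""
    log: Tuple[str, ...] = ()


Pass = Callable[[ToyIR], ToyIR]


def recombine(ir: ToyIR) -> ToyIR:
    # Toy recombination: join fixed V, D and J fragments.
    return replace(ir, sequence="ACGT" + "GG" + "TTAA", log=ir.log + ("recombine",))


def mutate(ir: ToyIR) -> ToyIR:
    # Toy mutation: overwrite the first base with T.
    return replace(ir, sequence="T" + ir.sequence[1:], log=ir.log + ("mutate",))


def corrupt(ir: ToyIR) -> ToyIR:
    # Toy corruption: drop the last base, as if the read were truncated.
    return replace(ir, sequence=ir.sequence[:-1], log=ir.log + ("corrupt",))


def run_pipeline(ir: ToyIR, passes: Sequence[Pass]) -> ToyIR:
    # Each pass receives the previous IR and returns a new one.
    for step in passes:
        ir = step(ir)
    return ir


start = ToyIR()
final = run_pipeline(start, [recombine, mutate, corrupt])
print(final.sequence, final.log)
```

Because the list of passes is plain data, reordering or swapping a pass is a one-line change, and `start` is still the empty IR after the run.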

Read chapter →

Data model ~6 min read

The Persistent IR

Every nucleotide lives in an arena pool. Every entity references it by a typed u32 handle. Every with_* method returns a new IR, leaving the old one intact — so any pass, any contract, any retry is reversible by design.
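
The arena-plus-handle layout can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the engine's data model: ToyArenaIR, NucHandle, with_base, and v_sequence are hypothetical names, a plain int checked against the u32 range stands in for the typed u32 handle, and the pool is copied on write where a real persistent structure would likely share storage.

```python
from dataclasses import dataclass, replace
from typing import NewType, Tuple

# Hypothetical handle type: an index into the pool, limited to the u32 range.
NucHandle = NewType("NucHandle", int)
U32_MAX = 2**32 - 1


@dataclass(frozen=True)
class ToyArenaIR:
    pool: Tuple[str, ...]              # arena: every nucleotide lives here
    v_segment: Tuple[NucHandle, ...]   # an entity refers to nucleotides by handle

    def with_base(self, handle: NucHandle, base: str) -> "ToyArenaIR":
        """Return a new IR with one pool slot changed; self is left untouched."""
        if not 0 <= handle <= U32_MAX or handle >= len(self.pool):
            raise IndexError(f"invalid handle: {handle}")
        new_pool = self.pool[:handle] + (base,) + self.pool[handle + 1:]
        return replace(self, pool=new_pool)

    def v_sequence(self) -> str:
        # Resolve handles against the pool to read the segment.
        return "".join(self.pool[h] for h in self.v_segment)


before = ToyArenaIR(
    pool=tuple("ACGT"),
    v_segment=(NucHandle(0), NucHandle(1), NucHandle(2)),
)
after = before.with_base(NucHandle(1), "T")
print(before.v_sequence(), after.v_sequence())
```

Keeping `before` around is what makes a retry cheap in this sketch: undoing a step means going back to the earlier value, with no inverse operation to write.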

Read chapter →

Invariants ~5 min read

Ground-Truth Contracts

Predicates over the IR with two modes: verify after the fact, or filter sampling choices before they happen. The productive() bundle composes three: anchor preserved, junction length divisible by three, no stop codons inside.
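
As a rough illustration of the two modes, here is a Python sketch of a contract as a plain predicate. All names are hypothetical (ToyRecord, all_of, verify, filter_choices, toy_productive), and the three checks are simplified readings of the ones named above: an anchor flag, junction length divisible by three, and no stop codon in the junction's reading frame. The real contracts operate on the IR and may differ.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

STOP_CODONS = {"TAA", "TAG", "TGA"}


@dataclass(frozen=True)
class ToyRecord:
    junction: str
    anchor_intact: bool


Contract = Callable[[ToyRecord], bool]


def anchor_preserved(r: ToyRecord) -> bool:
    return r.anchor_intact


def junction_in_frame(r: ToyRecord) -> bool:
    return len(r.junction) % 3 == 0


def no_stop_codons(r: ToyRecord) -> bool:
    # Walk complete codons only; a trailing partial codon is ignored here.
    codons = (r.junction[i:i + 3] for i in range(0, len(r.junction) - 2, 3))
    return not any(c in STOP_CODONS for c in codons)


def all_of(*contracts: Contract) -> Contract:
    # Compose several predicates into one bundle.
    return lambda r: all(c(r) for c in contracts)


toy_productive = all_of(anchor_preserved, junction_in_frame, no_stop_codons)


def verify(r: ToyRecord, contract: Contract) -> ToyRecord:
    # Mode 1: check after the fact and fail loudly.
    if not contract(r):
        raise ValueError("contract violated")
    return r


def filter_choices(candidates: Iterable[ToyRecord], contract: Contract) -> List[ToyRecord]:
    # Mode 2: drop violating candidates before one is sampled.
    return [c for c in candidates if contract(c)]


good = ToyRecord("TGTGCGAGATGG", True)
has_stop = ToyRecord("TGTTAAAGATGG", True)
out_of_frame = ToyRecord("TGTGCGAGATG", True)
print(filter_choices([good, has_stop, out_of_frame], toy_productive))
```

The same predicate serves both modes, so a bundle that filters choices during sampling can also be re-run afterwards as a check.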

Read chapter →

Output schema ~4 min read

The AIRR Record

Each .run() emits one flat record per sequence: ~70 fields with AIRR-compatible names, each ground-truth value side by side with its evidence-based twin (truth_v_call vs v_call). One CSV away from any aligner benchmark.

field	type	example / note
sequence_id	str	unique
sequence	str	320 bp
truth_v_call	str	sampled allele
v_call	str	evidence tie-set
truth_n_mutations	int	11
… 65 more fields
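
To show why the paired columns are convenient, here is a small Python sketch that reads records from CSV and measures how often the sampled allele appears in the evidence-based call. Only the column names truth_v_call, v_call, and sequence_id come from the listing above. The rows are made-up toy values, truth_in_call_rate is a hypothetical helper, and the comma-separated tie-set format is an assumption borrowed from the usual AIRR convention for ambiguous calls.

```python
import csv
import io

# Toy rows; real output has ~70 fields per record.
rows = [
    {"sequence_id": "s1", "truth_v_call": "IGHV1-2*02", "v_call": "IGHV1-2*02,IGHV1-2*04"},
    {"sequence_id": "s2", "truth_v_call": "IGHV3-23*01", "v_call": "IGHV3-23*01"},
    {"sequence_id": "s3", "truth_v_call": "IGHV4-34*01", "v_call": "IGHV4-34*02"},
]

# Round-trip through CSV in memory so the example needs no file on disk.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["sequence_id", "truth_v_call", "v_call"])
writer.writeheader()
writer.writerows(rows)
buf.seek(0)


def truth_in_call_rate(handle) -> float:
    """Fraction of records whose truth_v_call appears in the v_call tie-set."""
    hits = total = 0
    for rec in csv.DictReader(handle):
        calls = rec["v_call"].split(",")  # assumed comma-separated tie-set
        hits += rec["truth_v_call"] in calls
        total += 1
    return hits / total if total else 0.0


rate = truth_in_call_rate(buf)
print(f"truth allele recovered in {rate:.0%} of records")
```

Swapping the v_call column for an external aligner's calls on the same sequence_id values would turn the same function into an aligner benchmark.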

Read chapter →

Reading the evidence ~6 min read

Live Calls

The caller behind the v_call / d_call / j_call fields. Per-allele scoring, strict tie at max, trim-bounded elastic boundaries, truth-first ordering when the sampled allele survives the tie.
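
The tie-handling rules can be sketched as follows. This is a toy under loud assumptions: score counts matching positions, which is a placeholder for whatever per-allele scoring the caller really uses, the allele names and sequences are invented, and trim-bounded elastic boundaries are left out entirely. What it does show is a strict tie at the maximum score and truth-first ordering when the sampled allele survives that tie.

```python
from typing import Dict, List, Optional


def score(read: str, allele_seq: str) -> int:
    # Placeholder scoring: count positions where read and allele agree.
    return sum(a == b for a, b in zip(read, allele_seq))


def live_call(read: str, alleles: Dict[str, str], truth: Optional[str] = None) -> List[str]:
    """Return every allele tied at the top score, truth first if it is among them."""
    if not alleles:
        return []
    scores = {name: score(read, seq) for name, seq in alleles.items()}
    best = max(scores.values())
    # Strict tie: only alleles that reach the maximum exactly are kept.
    tied = sorted(name for name, s in scores.items() if s == best)
    if truth in tied:
        tied.remove(truth)
        tied.insert(0, truth)
    return tied


# Invented alleles: V1*01 and V1*02 differ only at the final base.
alleles = {"V1*01": "ACGTAC", "V1*02": "ACGTAT", "V2*01": "TTGTAC"}
read = "ACGTA"  # the distinguishing last base is missing from the read

print(live_call(read, alleles, truth="V1*02"))
```

With the last base missing, the evidence cannot separate the two V1 alleles, so both are reported. The truth-first rule only changes the order of the list and never adds an allele the evidence does not support.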

Read chapter →

Then

Walk a real example end‑to‑end.

The 5-lesson tutorial follows a single molecule through every pass and shows the AIRR record fill in field by field. Recommended after Chapter 01.

Open the tutorial →

Or skip to

Task-shaped guides.

How to benchmark an aligner, how to build a null model, how to compose a custom pass, how to wire GenAIRR into an existing ML training loop.

Browse guides →