Five ideas describe the engine. A persistent IR that records every decision. A composable pipeline of passes that transform it. Contracts that refuse to let it lie. A flat AIRR record that hands you the truth, by name, in a format every aligner already speaks. And a live-call layer that reports what an evidence-based caller would say about the same record, so you have something to benchmark against.
Read these in order if you want the architecture; skip ahead if you came for a specific mechanism. Each chapter is self-contained.
Three phases — recombine, mutate, corrupt — built from small composable passes. Each pass takes an IR, returns a new one. Re-order them, swap them, add your own.
Read chapter →
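A minimal sketch of the pass idea in plain Python, not the engine's actual API. The `IR` class, the three toy passes, and `run_pipeline` are hypothetical stand-ins; the only thing taken from the description above is the shape: a pass is a function from IR to IR, and a pipeline is an ordered list of them.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import Callable


@dataclass(frozen=True)
class IR:
    """Hypothetical stand-in for the engine's IR: a sequence plus a decision log."""
    sequence: str
    log: tuple[str, ...] = ()


# A pass is any function that takes an IR and returns a new one.
Pass = Callable[[IR], IR]


def recombine(ir: IR) -> IR:
    # Toy recombination: join three fixed fragments standing in for V, D and J.
    return replace(ir, sequence="CAGGTG" + "GGGTAC" + "TGGGGC", log=ir.log + ("recombine",))


def mutate(ir: IR) -> IR:
    # Toy mutation: deterministically substitute the first base.
    return replace(ir, sequence="T" + ir.sequence[1:], log=ir.log + ("mutate",))


def corrupt(ir: IR) -> IR:
    # Toy corruption: mask the last base as N.
    return replace(ir, sequence=ir.sequence[:-1] + "N", log=ir.log + ("corrupt",))


def run_pipeline(passes: list[Pass], ir: IR) -> IR:
    # Fold the passes left to right; each one sees the previous pass's output.
    return reduce(lambda acc, p: p(acc), passes, ir)


start = IR(sequence="")
final = run_pipeline([recombine, mutate, corrupt], start)
print(final.sequence, final.log)
```

Because a pipeline is only a list, re-ordering or swapping passes means editing that list, and a custom pass is one more function with the same signature.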
Every nucleotide lives in an arena pool. Every entity references it by a typed
u32 handle. Every with_* method returns a new IR, leaving
the old one intact — so any pass, any contract, any retry is reversible by design.
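The same idea in a few lines of Python, as a sketch under stated assumptions. `NucHandle`, `Segment`, `with_nucleotide`, and `render` are hypothetical names chosen for illustration; a real persistent structure would share storage between versions rather than copy the pool as this toy does.

```python
from dataclasses import dataclass, replace
from typing import NewType

# Stand-in for a typed u32 handle: an index into the pool, not the base itself.
NucHandle = NewType("NucHandle", int)


@dataclass(frozen=True)
class Segment:
    name: str
    handles: tuple[NucHandle, ...]


@dataclass(frozen=True)
class IR:
    pool: tuple[str, ...]           # the arena: every nucleotide lives here
    segments: tuple[Segment, ...]   # entities hold handles into the pool

    def with_nucleotide(self, handle: NucHandle, base: str) -> "IR":
        # Build a new pool with one slot changed; self is left untouched.
        new_pool = self.pool[:handle] + (base,) + self.pool[handle + 1:]
        return replace(self, pool=new_pool)

    def render(self, segment: Segment) -> str:
        # Resolve a segment's handles against this version of the pool.
        return "".join(self.pool[h] for h in segment.handles)


v = Segment("V", tuple(NucHandle(i) for i in range(4)))
before = IR(pool=tuple("ACGT"), segments=(v,))
after = before.with_nucleotide(NucHandle(1), "T")

print(before.render(v), after.render(v))
```

Keeping `before` around is what makes a retry cheap: discard `after` and the earlier state is still there, unmodified.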
Predicates over the IR with two modes: verify after the fact, or
filter sampling choices before they happen. The
productive() bundle composes three: anchor preserved,
junction length divisible by three, no stop codons inside.
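A sketch of the two modes with plain predicates. Everything here is an assumption made for illustration: records are dicts with a `junction` string, "anchor preserved" is approximated as a leading Cys codon, and `all_of`, `verify`, `filter_choices`, and `productive_sketch` are hypothetical helpers, not the engine's own functions.

```python
from typing import Callable

Record = dict[str, str]
Contract = Callable[[Record], bool]

STOP_CODONS = {"TAA", "TAG", "TGA"}


def anchor_preserved(rec: Record) -> bool:
    # Assumption: the anchor is a conserved Cys codon opening the junction.
    return rec["junction"][:3] in {"TGT", "TGC"}


def in_frame(rec: Record) -> bool:
    return len(rec["junction"]) % 3 == 0


def no_stop_codons(rec: Record) -> bool:
    j = rec["junction"]
    return all(j[i:i + 3] not in STOP_CODONS for i in range(0, len(j) - 2, 3))


def all_of(*contracts: Contract) -> Contract:
    # Compose contracts: the bundle holds only if every member holds.
    return lambda rec: all(c(rec) for c in contracts)


def productive_sketch() -> Contract:
    return all_of(anchor_preserved, in_frame, no_stop_codons)


def verify(contract: Contract, rec: Record) -> Record:
    # Mode 1: check after the fact and fail loudly.
    if not contract(rec):
        raise ValueError(f"contract violated: {rec['junction']}")
    return rec


def filter_choices(contract: Contract, candidates: list[Record]) -> list[Record]:
    # Mode 2: prune candidate choices before one is sampled.
    return [c for c in candidates if contract(c)]


contract = productive_sketch()
candidates = [
    {"junction": "TGTGCGAGATGG"},  # passes all three checks
    {"junction": "TGTGCGTAGTGG"},  # contains the stop codon TAG
    {"junction": "TGTGCGAGATG"},   # length 11, out of frame
    {"junction": "GGTGCGAGATGG"},  # anchor codon missing
]
allowed = filter_choices(contract, candidates)
```

Filtering before sampling means a violating choice is never drawn in the first place; verification is the backstop for anything assembled without the filter.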
Each .run() emits one flat record per sequence: about 70 fields,
AIRR-compatible names, every ground-truth field side-by-side with its
evidence-based twin (truth_v_call vs v_call). One CSV away from any
aligner benchmark.
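What "one CSV away" could look like, as a sketch using only the standard library. The three rows are made-up illustration data, not engine output; only the `truth_v_call` and `v_call` column names come from the description above. Treating `v_call` as a comma-separated list of tied alleles is an assumption.

```python
import csv
import io

# Made-up rows for illustration; a real record carries many more fields.
records = [
    {"sequence_id": "seq1", "truth_v_call": "IGHV1-2*02", "v_call": "IGHV1-2*02"},
    {"sequence_id": "seq2", "truth_v_call": "IGHV3-23*01", "v_call": "IGHV3-23*01,IGHV3-23*04"},
    {"sequence_id": "seq3", "truth_v_call": "IGHV4-34*01", "v_call": "IGHV4-59*01"},
]

# Write the flat records as CSV (an in-memory buffer stands in for a file).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(records[0]))
writer.writeheader()
writer.writerows(records)

# Read them back and count a hit when the true allele is among the called ones.
buffer.seek(0)
rows = list(csv.DictReader(buffer))
hits = sum(row["truth_v_call"] in row["v_call"].split(",") for row in rows)
accuracy = hits / len(rows)

print(f"V-call agreement: {hits}/{len(rows)}")
```

To score an external aligner instead, join its output on `sequence_id` and compare its call against `truth_v_call` in the same way.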
The caller behind the v_call / d_call / j_call
fields. Per-allele scoring, strict tie at max, trim-bounded elastic boundaries,
truth-first ordering when the sampled allele survives the tie.
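A sketch of the tie rule only, assuming per-allele scores have already been computed. `call_alleles` is a hypothetical helper, the scores are invented, and the alphabetical order of the remaining tied alleles is a choice made here for determinism. Trim-bounded elastic boundaries are not modeled.

```python
def call_alleles(scores: dict[str, int], truth: str) -> list[str]:
    """Return every allele tied at the maximum score, truth first if it survives."""
    if not scores:
        return []
    best = max(scores.values())
    # Strict tie at max: only alleles whose score equals the best are kept.
    tied = sorted(allele for allele, score in scores.items() if score == best)
    # Truth-first ordering: promote the sampled allele when it is among the tied.
    if truth in tied:
        tied.remove(truth)
        tied.insert(0, truth)
    return tied


# Invented scores for illustration.
scores = {"IGHV1-2*02": 290, "IGHV1-2*04": 290, "IGHV1-2*06": 288}

survives = call_alleles(scores, truth="IGHV1-2*04")
dropped = call_alleles(scores, truth="IGHV1-2*06")
print(survives, dropped)
```

When the true allele falls below the maximum it does not appear in the call at all, which is exactly the case a benchmark needs to be able to see.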
The 5-lesson tutorial follows a single molecule through every pass and shows the AIRR record fill in field by field. Recommended after Chapter 01.
Open the tutorial →
How to benchmark an aligner, how to build a null model, how to compose a custom pass, how to wire GenAIRR into an existing ML training loop.
Browse guides →