Lesson 02 · ~3 min

The pipeline scrubber

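A GenAIRR experiment is three phases that run in a fixed order: two biological (recombination, then somatic hypermutation) and one technical (wet-lab and sequencing corruption). Click through them to see exactly what each one does to your molecule.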


Why order matters

Biology first, lab second. Recombine before mutate — naive B cells acquire a receptor in the bone marrow before they meet antigen. Mutate before corrupt — somatic hypermutation happens in the germinal center, long before you stick the cell on a sequencer. Get the order wrong and your simulator is fiction.

Each phase corresponds to a real step. .recombine() builds the receptor in the bone marrow. .mutate() evolves it in the germinal center. .corrupt_*() models everything that happens once you've extracted the DNA and put it through wet-lab and sequencing — library prep, PCR, the sequencer itself, base calling. Order is biology.
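The ordering rule above can be written down as a tiny check. This is a minimal sketch in plain Python, not GenAIRR code: `PHASE_ORDER` and `check_phase_order` are hypothetical names invented for illustration, and the sketch makes no claim about how (or whether) GenAIRR itself validates a chain.

```python
# Hypothetical helper, not part of GenAIRR: encodes "recombine, then mutate, then corrupt".
PHASE_ORDER = ("recombine", "mutate", "corrupt")


def check_phase_order(phases):
    """Return the phases as a list if they respect PHASE_ORDER, else raise ValueError."""
    unknown = [p for p in phases if p not in PHASE_ORDER]
    if unknown:
        raise ValueError(f"unknown phase(s): {unknown}")
    # Map each phase to its position in the canonical order.
    ranks = [PHASE_ORDER.index(p) for p in phases]
    if ranks != sorted(ranks):
        raise ValueError(f"phases out of order: {list(phases)}")
    # Nothing can be mutated or corrupted before a receptor exists.
    if phases and phases[0] != "recombine":
        raise ValueError("no receptor exists before 'recombine'")
    return list(phases)


# A phase may be left out, but the ones that remain must keep their relative order.
print(check_phase_order(["recombine", "mutate", "corrupt"]))
print(check_phase_order(["recombine", "corrupt"]))
```

Passing `["mutate", "recombine"]` raises `ValueError`, which is the toy equivalent of a cell hypermutating before it has a receptor.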

Scrub through the pipeline

Click any node on the timeline. The same molecule, transformed by everything up to that point. Watch how the metadata at the bottom updates with each phase.

empty → recombine → mutate → corrupt

Legend: V · NP · D · J · mutation · N-base · trimmed

The corresponding code

The phases you scrubbed through map directly to method calls. Add only what you need — leave a phase off and that step never runs. Click any stop above to see the chain rebuild itself.

lesson_2.py
import GenAIRR as ga

result = (
    ga.Experiment.on("human_igh")
    # Phase calls (.recombine(), .mutate(...), .corrupt_*()) are chained here, in that order.
    .run_records(n=1000, seed=42)
)
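
To make "leave a phase off and that step never runs" concrete without needing the library installed, here is a toy chained pipeline in plain Python. Everything in it is an assumption made for illustration: `ToyExperiment`, `run_record`, the 60-base random "receptor", and the uniform substitution model are hypothetical stand-ins, not GenAIRR's classes, models, or internals. Only the method names and the two counter names echo the lesson.

```python
import random

BASES = "ACGT"


def _substitute(seq, n, rng):
    """Change n distinct positions to a different base (length is preserved)."""
    chars = list(seq)
    for pos in rng.sample(range(len(chars)), n):
        chars[pos] = rng.choice([b for b in BASES if b != chars[pos]])
    return "".join(chars)


class ToyExperiment:
    """Hypothetical stand-in for a chained pipeline. This is NOT GenAIRR."""

    def __init__(self):
        self._phases = []  # only phases the caller adds will ever run

    def recombine(self):
        def phase(rec, rng):
            # Toy receptor: 60 random bases instead of real V, D and J segments.
            rec["sequence"] = "".join(rng.choice(BASES) for _ in range(60))
        self._phases.append(phase)
        return self  # returning self is what allows method chaining

    def mutate(self, count):
        def phase(rec, rng):
            n = rng.randint(*count)  # inclusive range, e.g. (5, 15)
            rec["sequence"] = _substitute(rec["sequence"], n, rng)
            rec["n_mutations"] += n
        self._phases.append(phase)
        return self

    def corrupt_pcr(self, count):
        def phase(rec, rng):
            n = rng.randint(*count)
            rec["sequence"] = _substitute(rec["sequence"], n, rng)
            rec["n_pcr_errors"] += n  # separate counter from n_mutations
        self._phases.append(phase)
        return self

    def run_record(self, seed):
        """Run the queued phases, in the order they were added, on one record."""
        rng = random.Random(seed)
        rec = {"sequence": "", "n_mutations": 0, "n_pcr_errors": 0}
        for phase in self._phases:
            phase(rec, rng)
        return rec


shm_only = ToyExperiment().recombine().mutate(count=(5, 15)).run_record(seed=42)
with_pcr = (
    ToyExperiment()
    .recombine()
    .mutate(count=(5, 15))
    .corrupt_pcr(count=(1, 10))
    .run_record(seed=42)
)
print(shm_only["n_mutations"], shm_only["n_pcr_errors"])
print(with_pcr["n_mutations"], with_pcr["n_pcr_errors"])
```

In this toy, the two runs share a seed, so they draw the same random numbers through the recombine and mutate phases and report the same `n_mutations`. Only the run that chains `corrupt_pcr` reports a nonzero `n_pcr_errors`.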

Try it now

Run with only .recombine().mutate(model="s5f", count=(5, 15)): every record should have n_mutations > 0 and n_pcr_errors == 0. Add .corrupt_pcr(count=(1, 10)) and rerun: PCR errors should now appear in n_pcr_errors, while mutation_rate stays the same. Somatic mutations and PCR errors are tracked as independent fields on the record, so one never leaks into the other's count.

Exercise

Build the same molecule three times: (a) .recombine() only, (b) + .mutate(model="s5f", count=(5, 15)), (c) + .corrupt_5prime_loss(length=(10, 30)). Compare sequence_length across the three. Which phases preserve length, which change it, and why?