Recipe C · 03 · ~10 min · advanced

Replay a simulation with one knob changed.

The persistent IR makes counterfactual analysis cheap. Fix a seed, take any intermediate revision (say, the post-recombination snapshot), swap a single downstream pass, and re-run from that point. You isolate exactly the effect you wanted — no upstream randomness in the way, no full re-execution to pay for.

01 Run once: with seed · save the outcome
02 Pick the fork point: revision_after("recombine")
03 Re-run downstream: swapped pass · fresh seed
PART 01

Why this is cheap.

Each pass commits a new revision rather than mutating the previous one, so a snapshot taken after any pass (recombination, mutation, the first corruption step) is a valid starting point for a new pipeline. You don't have to re-recombine to test a different mutation model on the same record. Persistent IR pays for itself the moment you start asking counterfactual questions.
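The idea of committing revisions instead of mutating in place can be sketched with a toy model. This is an illustration only, not GenAIRR's implementation: `Revision`, `RevisionLog`, and `commit` are hypothetical names invented for this sketch, and `revision_after` here only mirrors the name used in this recipe.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Revision:
    """One immutable snapshot: the pass that produced it and the resulting sequence."""
    pass_name: str
    sequence: str


class RevisionLog:
    """Toy append-only log (hypothetical, not the GenAIRR IR)."""

    def __init__(self, revisions: Tuple[Revision, ...] = ()):
        self._revisions = tuple(revisions)

    def commit(self, pass_name: str, sequence: str) -> "RevisionLog":
        # Return a NEW log; the existing one is never modified.
        return RevisionLog(self._revisions + (Revision(pass_name, sequence),))

    def revision_after(self, pass_name: str) -> "RevisionLog":
        # Truncate the history right after the last revision written by pass_name.
        for i in range(len(self._revisions) - 1, -1, -1):
            if self._revisions[i].pass_name == pass_name:
                return RevisionLog(self._revisions[: i + 1])
        raise KeyError(pass_name)

    @property
    def head(self) -> Revision:
        return self._revisions[-1]

    def names(self) -> List[str]:
        return [r.pass_name for r in self._revisions]


# Baseline history: recombine, then mutate (sequences are made-up placeholders).
base = RevisionLog().commit("recombine", "ACGTACGT").commit("mutate", "ACGAACGT")

# Fork at the post-recombination snapshot and apply a different downstream pass.
fork = base.revision_after("recombine")
branch = fork.commit("mutate_alt", "ACCTACGT")
```

Because `commit` returns a new log, the baseline history is untouched by the branch, and both branches share the same `recombine` revision. That sharing is what makes a fork cheap in this toy model.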

What you can fork at
  • After recombine: vary mutation or corruption while keeping V·D·J fixed
  • After mutate: vary corruption while keeping the mutated genome fixed
  • After any specific pass: revision_after("pass_name") gives you the IR immediately after that pass
What this gives you
  • Single-variable isolation: no upstream RNG drift between conditions
  • Time savings: skip the phases that aren't changing
  • Causal attribution: any downstream difference is attributable to the changed knob
PART 02

The fork pattern.

Run once to materialize the IR. Pick a revision. Build a new PassPlan that starts from that revision instead of an empty IR. Apply only the passes downstream of your fork point.

Fork after recombination
import GenAIRR as ga
from GenAIRR._engine import PassPlan, CompiledSimulator

# baseline run (recombine only)
base = (
    ga.Experiment.on("human_igh")
      .recombine()
      .run(n=100, seed=42)
)

# snapshot the post-recombination IR for record 0
recombined = base[0].revision_after("assemble_segment.j")

# condition 1: S5F mutation
plan_a = PassPlan.from_simulation(recombined)
plan_a.push_mutate_s5f(count_pairs=[(15, 1.0)])
out_s5f = CompiledSimulator(plan_a).run(n=1, seed=100)[0]

# condition 2: uniform mutation, same starting point
plan_b = PassPlan.from_simulation(recombined)
plan_b.push_mutate_uniform(count_pairs=[(15, 1.0)])
out_uni = CompiledSimulator(plan_b).run(n=1, seed=100)[0]

# truth_v_call is identical between the two — only the SHM differs
assert out_s5f.truth_v_call == out_uni.truth_v_call
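Once two conditions share a fork point, the next step is usually to locate where their outputs differ. The helper below is a minimal sketch under stated assumptions: it works on plain strings, because this recipe does not show which attribute holds the nucleotide sequence on a record. `substitution_sites` is a hypothetical name, the two sequences are made-up placeholders, and it assumes substitution-only differences (equal lengths).

```python
from typing import List, Tuple


def substitution_sites(a: str, b: str) -> List[Tuple[int, str, str]]:
    """Return (position, base_in_a, base_in_b) for every mismatching position."""
    if len(a) != len(b):
        # Indels would shift positions; this sketch only handles substitutions.
        raise ValueError("sequences must have equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Placeholder sequences standing in for the two conditions' outputs.
seq_s5f = "CAGGTGCAGCTGGTG"
seq_uni = "CAGGTACAGCTGCTG"

sites = substitution_sites(seq_s5f, seq_uni)
for pos, base_a, base_b in sites:
    print(f"position {pos}: {base_a} vs {base_b}")
```

Since both conditions start from the same recombined record, every site this reports comes from the downstream pass that was swapped, not from a different V·D·J draw.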
PART 03

Sweeping a parameter.

The same fork point can drive a parameter sweep — same recombination, every value of a single knob you want to study. Useful for the kind of ablation plot that goes into a paper figure.

SHM rate sweep on a single recombination
# 1 recombination, 5 mutation rates — all sharing V·D·J
records = []
for mu in [5, 10, 15, 20, 25]:
    plan = PassPlan.from_simulation(recombined)
    plan.push_mutate_s5f(count_pairs=[(mu, 1.0)])
    out = CompiledSimulator(plan).run(n=1, seed=100)[0]
    records.append({
        "mu":        mu,
        "v_id":      out.final_simulation().v_identity,
        "junction":  out.final_simulation().junction_aa,
    })
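To turn a sweep into a figure, it helps to write the collected rows to a flat file first. The sketch below uses only the standard library's `csv.DictWriter`. The `records` list is a stand-in with placeholder values, not real simulation output, and `records_to_tsv` is a hypothetical helper name.

```python
import csv
import io
from typing import Dict, List


def records_to_tsv(records: List[Dict[str, object]]) -> str:
    """Serialize a list of row dicts (all with the same keys) to tab-separated text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(records[0].keys()),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


# Placeholder rows shaped like the sweep output above; the values are invented.
records = [
    {"mu": 5, "v_id": 0.98, "junction": "CARDYW"},
    {"mu": 10, "v_id": 0.96, "junction": "CARDYW"},
]

tsv_text = records_to_tsv(records)
print(tsv_text)
```

Keeping one row per swept value, with the knob in its own column, lets a plotting tool read the file directly.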
Related recipes

Where to next.

B · 02 · Compare two SHM models →

A standardized A/B harness built on the same fork pattern.

Concept · Persistent IR →

The architecture that makes forking cheap — every pass writes a revision, none mutate in place.