AlignAIR

AlignAIR is an AGPL-3 licensed sequence alignment tool specifically designed for Adaptive Immune Receptor Repertoire (AIRR) sequences. AlignAIR v2.0 features a unified architecture that dynamically supports both single-chain and multi-chain analysis with seamless GenAIRR integration.

Overview

AlignAIR v2.0 is a major architectural revision that combines its sequence alignment algorithms with a unified, modular interface.

Features

Installation Options

To install the latest stable release of AlignAIR, use:

pip install AlignAIR

To install the latest development version from the GitHub repository:

git clone https://github.com/MuteJester/AlignAIR.git
cd AlignAIR
pip install .

Quick Start Guide

  1. Input Preparation: Ensure your input data is in a compatible format (e.g., FASTA or CSV with sequences).
  2. Choose Configuration: Select appropriate GenAIRR dataconfig(s) for your receptor type(s):
    • Single chain: --genairr-dataconfig=HUMAN_IGH_OGRDB
    • Multi-chain: --genairr-dataconfig=HUMAN_IGK_OGRDB,HUMAN_IGL_OGRDB
  3. Running AlignAIR: Use the unified CLI interface:
     python app.py run --model-dir=model_path \
                      --genairr-dataconfig=HUMAN_IGH_OGRDB \
                      --sequences=my_sequences.csv \
                      --save-path=results/
    
  4. Results: AlignAIR automatically detects single vs. multi-chain scenarios and adapts accordingly.
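
For scripted or batch runs, the command from step 3 can be assembled programmatically. The following is a minimal sketch, not part of AlignAIR itself: the helper name `build_alignair_command` is hypothetical, and it relies only on the flags shown above, joining several dataconfigs with a comma as in the multi-chain example.

```python
def build_alignair_command(model_dir, dataconfigs, sequences, save_path):
    """Build the argument list for `python app.py run` (hypothetical helper).

    `dataconfigs` is a list of GenAIRR dataconfig names; one entry gives a
    single-chain run, several entries give a multi-chain run.
    """
    if not dataconfigs:
        raise ValueError("at least one GenAIRR dataconfig is required")
    return [
        "python", "app.py", "run",
        f"--model-dir={model_dir}",
        # Multiple dataconfigs are passed as one comma-separated value.
        "--genairr-dataconfig=" + ",".join(dataconfigs),
        f"--sequences={sequences}",
        f"--save-path={save_path}",
    ]


single = build_alignair_command(
    "model_path", ["HUMAN_IGH_OGRDB"], "my_sequences.csv", "results/"
)
multi = build_alignair_command(
    "model_path", ["HUMAN_IGK_OGRDB", "HUMAN_IGL_OGRDB"],
    "my_sequences.csv", "results/",
)
print(" ".join(single))
print(" ".join(multi))
```

The returned list can be handed to `subprocess.run(cmd, check=True)` from the repository root, which avoids shell quoting problems when paths contain spaces.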

Docker Support

AlignAIR provides a Docker image to ensure a consistent runtime environment. To use AlignAIR with Docker:

  1. Pull the Docker Image:
     docker pull thomask90/alignair:latest
    
  2. Run the Container (entrypoint style):
     docker run -it --rm \
       -v $(pwd):/data \
       -v $(pwd)/results:/downloads \
       thomask90/alignair:latest run \
       --model-dir=/app/pretrained_models/IGH_S5F_576 \
       --genairr-dataconfig=HUMAN_IGH_OGRDB \
       --sequences=/data/my_sequences.fasta \
       --save-path=/downloads/
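
In the container command, the two `-v` mounts map the current directory to `/data` and its `results` subdirectory to `/downloads`, so the paths given to the CLI are container paths, not host paths. The sketch below illustrates that mapping; the helper name `docker_alignair_command` and its parameters are hypothetical, and it uses only the image name, mounts, and flags shown above.

```python
from pathlib import Path, PurePosixPath


def docker_alignair_command(host_dir, sequences_name, model_dir, dataconfig,
                            image="thomask90/alignair:latest"):
    """Build a `docker run` argument list (hypothetical helper).

    `host_dir` is mounted at /data, and `host_dir/results` at /downloads,
    mirroring the command shown above.
    """
    host_dir = Path(host_dir).resolve()
    results_dir = host_dir / "results"
    # The input file is addressed by its path inside the container.
    container_sequences = PurePosixPath("/data") / sequences_name
    return [
        "docker", "run", "-it", "--rm",
        "-v", f"{host_dir}:/data",
        "-v", f"{results_dir}:/downloads",
        image, "run",
        f"--model-dir={model_dir}",
        f"--genairr-dataconfig={dataconfig}",
        f"--sequences={container_sequences}",
        "--save-path=/downloads/",
    ]


docker_cmd = docker_alignair_command(
    ".", "my_sequences.fasta",
    "/app/pretrained_models/IGH_S5F_576", "HUMAN_IGH_OGRDB",
)
print(" ".join(docker_cmd))
```

Results written to `/downloads/` inside the container appear in `results/` on the host. When the command is launched from a non-interactive script, the `-it` flags may need to be dropped, since `-t` expects a terminal.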

Documentation Resources

Development and Contributions

AlignAIR is an open-source project, and contributions from the community are welcome.

Publications and References

The detailed methodology and performance benchmarks are discussed in the main manuscript and its supplementary documentation.