SeqChain

Pure functions. Strong opinions.

Composable operations

Load, scan, annotate, interpret. Each step takes Regions and returns Regions.

```
regions = regex_map(sequences, preset.pam)
guides = interpret_guides(regions, sequences, preset)
annotated = annotate_locus(guide, gene_index, chrom_len)
```

CRISPR guide design

Genome in, ranked guide library out. SpCas9 and Cas12a with circular genome support.
```
seqchain design genome.gb --preset spcas9
```
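As a rough illustration of what scanning a circular genome involves, the sketch below finds forward-strand SpCas9 sites (a 20-nt protospacer followed by an NGG PAM) and lets a site wrap across the origin. The function name `find_guides`, its return shape, and the forward-strand-only restriction are assumptions made for this example; they are not SeqChain's actual API.

```python
import re

# 20-nt protospacer followed by an NGG PAM. The lookahead makes the match
# zero-width, so overlapping sites are all reported.
SITE = re.compile(r"(?=([ACGT]{20})([ACGT]GG))")
SITE_LEN = 23


def find_guides(seq, circular=False):
    """Return (start, protospacer, pam) for forward-strand sites (hypothetical helper)."""
    seq = seq.upper()
    n = len(seq)
    # For a circular genome, append the first SITE_LEN - 1 bases so that a
    # site starting near the end can continue past the origin.
    search = seq + seq[: SITE_LEN - 1] if circular else seq
    hits = []
    for match in SITE.finditer(search):
        if match.start() >= n:  # guard: never report a site that starts in the padding
            break
        hits.append((match.start(), match.group(1), match.group(2)))
    return hits


genome = "GG" + "A" * 20 + "T"
linear_hits = find_guides(genome)
circular_hits = find_guides(genome, circular=True)
```

In this toy genome the only NGG PAM straddles the origin, so the linear scan finds nothing and the circular scan finds one site starting at position 2. A real design step would also scan the reverse strand and score the candidates.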
Barcode counting

Self-discovering read structure. Paired-end aware. FASTQ to per-barcode counts.
```
seqchain count barcodes.fasta reads.fastq
```
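One simple way to think about discovering read structure is to let the reads vote on where the barcode sits, then count at the winning offset. The sketch below does exactly that for single-end reads held in memory; `discover_offset` and `count_barcodes` are hypothetical names, and the voting heuristic is an assumption for illustration rather than a description of how SeqChain does it.

```python
from collections import Counter


def discover_offset(reads, barcodes, length):
    """Vote for the read position where known barcodes occur most often."""
    votes = Counter()
    for read in reads:
        for i in range(len(read) - length + 1):
            if read[i : i + length] in barcodes:
                votes[i] += 1
    if not votes:
        raise ValueError("no known barcode found in any read")
    return votes.most_common(1)[0][0]


def count_barcodes(reads, barcodes):
    """Return (offset, counts); assumes all barcodes share one length."""
    barcodes = set(barcodes)
    length = len(next(iter(barcodes)))
    reads = list(reads)  # iterated twice: once to vote, once to count
    offset = discover_offset(reads, barcodes, length)
    counts = {bc: 0 for bc in barcodes}
    for read in reads:
        candidate = read[offset : offset + length]
        if candidate in counts:
            counts[candidate] += 1
    return offset, counts


reads = ["GGACGTCC", "GGTTGACC", "GGACGTAA", "GGCCCCAA"]
offset, counts = count_barcodes(reads, ["ACGT", "TTGA"])
```

Here the barcode is found at offset 2, with two reads for ACGT and one for TTGA; the fourth read carries no known barcode and is skipped.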
Tn-seq profiling

Raw reads to insertion counts. Himar1 and Tn5, Bowtie and BWA mappers.
```
seqchain tnseq --reads1 R1.fq.gz --ref genome.fna
```
Chromatin annotation

Truth-table state assignment over signal tracks. YAML-configurable mark combinations.
```
annotate_chromatin(config, signals, chrom_lengths)
```
Differential comparison

Align two tracks by coordinate, compute fold change with normalization.
```
interval_fold_change(control, treated)
```
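To make the idea concrete, here is a minimal sketch of a coordinate-aligned fold change. It assumes each track is a dict keyed by `(chrom, start, end)`, normalizes by scaling the treated track to the control track's total signal, and reports log2 ratios. The name `sketch_fold_change`, the dict representation, and the normalization choice are all assumptions; `interval_fold_change` itself may work differently.

```python
import math


def sketch_fold_change(control, treated, pseudocount=1.0):
    """log2(treated / control) per shared interval after total-signal scaling."""
    control_total = sum(control.values())
    treated_total = sum(treated.values())
    if control_total <= 0 or treated_total <= 0:
        raise ValueError("both tracks need a positive total signal")
    # Scale treated so both tracks have the same total signal.
    scale = control_total / treated_total
    result = {}
    # Align by coordinate: only intervals present in both tracks are compared.
    for interval in sorted(control.keys() & treated.keys()):
        c = control[interval] + pseudocount
        t = treated[interval] * scale + pseudocount
        result[interval] = math.log2(t / c)
    return result


control = {("chr1", 0, 100): 10, ("chr1", 100, 200): 40}
treated = {("chr1", 0, 100): 80, ("chr1", 100, 200): 20, ("chr1", 200, 300): 0}
# pseudocount=0.0 keeps the arithmetic easy to follow; it is only safe here
# because every shared interval has a nonzero value in both tracks.
changes = sketch_fold_change(control, treated, pseudocount=0.0)
```

With these inputs the treated track is scaled by 0.5, giving a log2 fold change of 2.0 for the first interval and -2.0 for the second. The interval present only in the treated track is dropped from the comparison.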

How it works

Everything before binarize() is signal-specific: each mark type has its own distribution and needs its own threshold strategy. Everything after it is composable machinery that operates on binary present/absent tracks. The output is domain-specific: state labels such as acetylated, destabilized, and SAGA_acetylated come from your preset YAML, which encodes the biological model that makes the analysis meaningful.
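The split between thresholding and state assignment can be sketched in a few lines. In the example below, `binarize` here is a plain fixed-threshold stand-in, `assign_states` is a hypothetical helper, and the mark names `mark_a` and `mark_b` and their mapping to labels are placeholders; a real preset would supply its own thresholds and truth table.

```python
def binarize(values, threshold):
    """Turn a per-position signal into present/absent calls (fixed threshold)."""
    return [value >= threshold for value in values]


def assign_states(binary_tracks, truth_table, default="unassigned"):
    """Look up each position's combination of present marks in the truth table."""
    names = list(binary_tracks)
    length = len(binary_tracks[names[0]])
    if any(len(binary_tracks[name]) != length for name in names):
        raise ValueError("all binary tracks must have the same length")
    states = []
    for i in range(length):
        present = frozenset(name for name in names if binary_tracks[name][i])
        states.append(truth_table.get(present, default))
    return states


# Signal-specific part: each mark gets its own threshold.
tracks = {
    "mark_a": binarize([0.1, 2.0, 3.0, 0.0], threshold=1.0),
    "mark_b": binarize([5.0, 1.0, 9.0, 0.0], threshold=4.0),
}
# Domain-specific part: which combinations mean what (placeholder mapping).
truth_table = {
    frozenset({"mark_a"}): "acetylated",
    frozenset({"mark_a", "mark_b"}): "destabilized",
}
states = assign_states(tracks, truth_table)
```

Positions whose mark combination is not listed in the table fall back to the default label, so the four positions here come out as unassigned, acetylated, destabilized, unassigned.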

Region

The universal coordinate record. A CRISPR cut site, a gene boundary, and a barcode hit are all Regions with different tags. Frozen dataclass: operations produce new Regions, never mutate.
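A frozen dataclass with a "return a modified copy" helper captures the behaviour described here. The sketch below uses a stand-in class, `ToyRegion`; its field names and the `with_tag` method are assumptions for illustration, not the actual Region definition.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace
from typing import Mapping


@dataclass(frozen=True)
class ToyRegion:
    """Stand-in for a coordinate record; field names are hypothetical."""

    chrom: str
    start: int
    end: int
    strand: str = "+"
    tags: Mapping[str, str] = field(default_factory=dict)

    def with_tag(self, key, value):
        # Build a new record with one extra tag; the original is untouched.
        return replace(self, tags={**self.tags, key: value})


site = ToyRegion("chr1", 100, 123)
cut_site = site.with_tag("kind", "cut_site")
gene = ToyRegion("chr1", 0, 900, tags={"kind": "gene"})

try:
    site.start = 5  # frozen: attribute assignment is rejected
except FrozenInstanceError:
    mutation_blocked = True
```

The same record type describes a cut site and a gene boundary; only the tags differ, and tagging never alters the original record.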

Track

Protocol container with three implementations. IntervalTrack for peaks and genes, SignalTrack for per-base BigWig data, TableTrack for counts and scores.
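The sketch below shows how one structural protocol can sit in front of different storage layouts. The `query` method, the `Toy*` class names, and their constructors are assumptions made for this example; only two of the three track kinds are sketched, and a table-backed track would follow the same pattern.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class ToyTrack(Protocol):
    """Hypothetical protocol: anything with a matching query() method qualifies."""

    def query(self, chrom: str, start: int, end: int) -> list: ...


class ToyIntervalTrack:
    """Discrete features such as peaks or genes, stored as (chrom, start, end, name)."""

    def __init__(self, intervals):
        self._intervals = list(intervals)

    def query(self, chrom, start, end):
        # Half-open overlap test against every stored interval.
        return [
            iv for iv in self._intervals
            if iv[0] == chrom and iv[1] < end and iv[2] > start
        ]


class ToySignalTrack:
    """Per-base values, stored as one list per chromosome."""

    def __init__(self, values_by_chrom):
        self._values = values_by_chrom

    def query(self, chrom, start, end):
        return list(self._values.get(chrom, [])[start:end])


genes = ToyIntervalTrack([("chr1", 0, 100, "geneA"), ("chr1", 200, 300, "geneB")])
signal = ToySignalTrack({"chr1": [0.0, 1.0, 2.0, 3.0]})
```

Neither class inherits from `ToyTrack`, yet both pass `isinstance(..., ToyTrack)` because `runtime_checkable` only checks that the method exists.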

Operations

Protocol-typed steps: map, annotate, filter, discover, quantify, score. Each has a protocol and pluggable implementations. Swap aligners, scorers, or annotators without changing the code that uses them.
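Swapping one implementation for another might look like the sketch below, which uses scoring as the example step. The `ToyScorer` protocol, the two scorers, and `rank` are hypothetical; they show the pattern, not SeqChain's actual scoring interface.

```python
from typing import Protocol


class ToyScorer(Protocol):
    """Hypothetical protocol for a scoring step."""

    def score(self, sequence: str) -> float: ...


class GCContentScorer:
    """Score by the fraction of G and C bases."""

    def score(self, sequence):
        return (sequence.count("G") + sequence.count("C")) / len(sequence)


class HomopolymerPenaltyScorer:
    """Penalize the longest run of a single repeated base."""

    def score(self, sequence):
        longest = run = 0
        previous = None
        for base in sequence:
            run = run + 1 if base == previous else 1
            longest = max(longest, run)
            previous = base
        return -float(longest)


def rank(sequences, scorer: ToyScorer):
    """Calling code depends only on the protocol, never on a concrete scorer."""
    return sorted(sequences, key=scorer.score, reverse=True)


by_gc = rank(["ATAT", "GGCC", "GCAT"], GCContentScorer())
by_runs = rank(["AAAT", "ACGT", "GGCC"], HomopolymerPenaltyScorer())
```

`rank` is unchanged between the two calls; only the injected scorer differs.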

Recipes

Domain workflows that compose operations. CRISPR guide design, barcode counting, Tn-seq profiling, chromatin annotation. All dependencies injected via __init__.

No class hierarchies. Objects are wired together through dependency injection, in the OLOO (objects linked to other objects) style. Any object with the right methods satisfies the protocol.
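A minimal sketch of this wiring, assuming a recipe that needs a scanner and a scorer: both are passed to `__init__`, and nothing inherits from anything. `ToyRecipe`, `KmerScanner`, and `GCScorer` are hypothetical names invented for the example.

```python
from types import SimpleNamespace


class KmerScanner:
    """Enumerate every k-mer in a sequence."""

    def __init__(self, k):
        self.k = k

    def scan(self, sequence):
        return [sequence[i : i + self.k] for i in range(len(sequence) - self.k + 1)]


class GCScorer:
    """Score a candidate by its GC fraction."""

    def score(self, candidate):
        return (candidate.count("G") + candidate.count("C")) / len(candidate)


class ToyRecipe:
    """Hypothetical workflow: scan for candidates, then rank them."""

    def __init__(self, scanner, scorer):
        # Dependencies arrive fully built; the recipe never constructs them.
        self._scanner = scanner
        self._scorer = scorer

    def run(self, sequence):
        candidates = self._scanner.scan(sequence)
        return sorted(candidates, key=self._scorer.score, reverse=True)


ranked = ToyRecipe(KmerScanner(2), GCScorer()).run("ATGC")

# Duck typing: ad hoc objects with the right methods work just as well,
# which makes it easy to substitute stubs in tests.
stub_scanner = SimpleNamespace(scan=lambda sequence: ["AA", "CC"])
stub_scorer = SimpleNamespace(score=lambda candidate: candidate.count("C"))
stubbed = ToyRecipe(stub_scanner, stub_scorer).run("ignored")
```

The recipe runs unchanged with real components or with `SimpleNamespace` stubs, because it relies only on `scan` and `score` being present.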