-
Composable operations
Load, scan, annotate, interpret. Each step takes Regions and returns Regions.
-
CRISPR guide design
Genome in, ranked guide library out. SpCas9 and Cas12a with circular genome support.
-
Barcode counting
Self-discovering read structure. Paired-end aware. FASTQ to per-barcode counts.
-
Tn-seq profiling
Raw reads to insertion counts. Himar1 and Tn5 transposons; Bowtie and BWA mappers.
-
Chromatin annotation
Truth-table state assignment over signal tracks. YAML-configurable mark combinations.
-
Differential comparison
Align two tracks by coordinate, compute fold change with normalization.
How it works
Everything before binarize() is signal-specific: each mark type has its own distribution and needs its own threshold strategy. Everything after is composable machinery that operates on binary present/absent tracks. The output is domain-specific: state labels such as acetylated, destabilized, and SAGA_acetylated come from your preset YAML, which encodes the biological model that makes the analysis meaningful.
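A minimal sketch of the split around binarize(), not the library's actual implementation. The function signatures, the thresholds, the two example marks, and the labels "unmarked" and "depleted" are assumptions made for illustration; in practice the table would come from your preset YAML.

```python
from typing import Dict, List, Sequence, Tuple


def binarize(signal: Sequence[float], threshold: float) -> List[bool]:
    """Signal-specific step: a bin is 'present' when it reaches the threshold."""
    return [value >= threshold for value in signal]


# Hypothetical truth table: (acetylation present, nucleosome present) -> state.
# The mark combination and its biological reading are placeholders.
TRUTH_TABLE: Dict[Tuple[bool, bool], str] = {
    (True, True): "acetylated",
    (True, False): "destabilized",
    (False, True): "unmarked",
    (False, False): "depleted",
}


def assign_states(
    tracks: Sequence[List[bool]], table: Dict[Tuple[bool, ...], str]
) -> List[str]:
    """Generic step: look up each bin's present/absent combination in the table."""
    return [table[combo] for combo in zip(*tracks)]


# Each mark gets its own threshold; everything after that is shared machinery.
acetyl = binarize([0.2, 3.1, 4.0, 0.1], threshold=1.0)
nucleosome = binarize([5.0, 6.2, 0.3, 0.4], threshold=2.0)
states = assign_states([acetyl, nucleosome], TRUTH_TABLE)
print(states)
```

Only binarize() knows about signal values here. Swapping the table changes the biological interpretation without touching the lookup code.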
Region
The universal coordinate record. A CRISPR cut site, a gene boundary, and a barcode hit are all Regions with different tags. Frozen dataclass: operations produce new Regions and never mutate existing ones.
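To illustrate the frozen-dataclass idea, here is a sketch under stated assumptions. The field names (chrom, start, end, strand, tags) and the with_tags helper are hypothetical and may not match the real Region class.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace
from typing import Mapping


@dataclass(frozen=True)
class Region:
    """Hypothetical coordinate record; field names are illustrative only."""

    chrom: str
    start: int
    end: int
    strand: str = "+"
    tags: Mapping[str, str] = field(default_factory=dict)

    def with_tags(self, **new_tags: str) -> "Region":
        """Return a new Region with extra tags; the original is left untouched."""
        return replace(self, tags={**self.tags, **new_tags})


cut = Region("chrI", 1200, 1223, tags={"kind": "crispr_cut_site"})
annotated = cut.with_tags(gene="gene_a")

print(cut.tags)        # the original keeps only its own tag
print(annotated.tags)  # the copy carries both tags

try:
    cut.start = 0      # attribute assignment on a frozen dataclass raises
except FrozenInstanceError:
    print("Region is immutable")
```

Because every operation returns a new record, earlier results can be reused or compared without defensive copying.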
Track
Protocol container with three implementations. IntervalTrack for peaks and genes, SignalTrack for per-base BigWig data, TableTrack for counts and scores.
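One way the three containers could share a protocol is sketched below. The records() method, the constructor arguments, and the tuple used in place of Region are assumptions for this example, not the library's real interface.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Protocol, Tuple, runtime_checkable

# Simplified stand-in for Region: (chrom, start, end, value).
Record = Tuple[str, int, int, float]


@runtime_checkable
class Track(Protocol):
    """Hypothetical protocol: anything that can yield coordinate records."""

    def records(self) -> Iterator[Record]: ...


@dataclass
class IntervalTrack:
    intervals: List[Tuple[str, int, int]]  # peaks or genes

    def records(self) -> Iterator[Record]:
        for chrom, start, end in self.intervals:
            yield (chrom, start, end, 1.0)


@dataclass
class SignalTrack:
    chrom: str
    values: List[float]  # one value per base, starting at position 0

    def records(self) -> Iterator[Record]:
        for pos, value in enumerate(self.values):
            yield (self.chrom, pos, pos + 1, value)


@dataclass
class TableTrack:
    rows: Dict[Tuple[str, int, int], float]  # counts or scores keyed by coordinate

    def records(self) -> Iterator[Record]:
        for (chrom, start, end), value in self.rows.items():
            yield (chrom, start, end, value)


def total(track: Track) -> float:
    """Works on any Track implementation without knowing which one it is."""
    return sum(record[3] for record in track.records())


tracks = [
    IntervalTrack([("chrI", 10, 20)]),
    SignalTrack("chrI", [0.5, 1.5]),
    TableTrack({("chrI", 0, 100): 7.0}),
]
totals = [total(track) for track in tracks]
print(totals)
```

None of the three classes inherits from Track; each satisfies it simply by having a records() method.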
Operations
Protocol-typed steps: map, annotate, filter, discover, quantify, score. Each has a protocol and pluggable implementations. Swap aligners, scorers, or annotators without changing the code that uses them.
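The swap described above might look like the following sketch. The Scorer protocol, both scorer classes, and the rank function are hypothetical names invented for this example; the scoring rules are deliberately simple and are not real guide-scoring models.

```python
from typing import List, Protocol, Sequence


class Scorer(Protocol):
    """Hypothetical protocol for the 'score' step."""

    def score(self, sequence: str) -> float: ...


class GCScorer:
    """Scores a sequence by its GC fraction."""

    def score(self, sequence: str) -> float:
        if not sequence:
            return 0.0
        return sum(base in "GC" for base in sequence.upper()) / len(sequence)


class LongestRunScorer:
    """Penalizes homopolymer runs: the score is minus the longest run length."""

    def score(self, sequence: str) -> float:
        longest = run = 0
        previous = ""
        for base in sequence.upper():
            run = run + 1 if base == previous else 1
            longest = max(longest, run)
            previous = base
        return -float(longest)


def rank(sequences: Sequence[str], scorer: Scorer) -> List[str]:
    """Calling code depends only on the protocol, not on a concrete scorer."""
    return sorted(sequences, key=scorer.score, reverse=True)


guides = ["GGGGAT", "GCGCGC", "AATTAT"]
by_gc = rank(guides, GCScorer())
by_run = rank(guides, LongestRunScorer())
print(by_gc)
print(by_run)
```

The rank function is unchanged between the two calls; only the injected scorer differs.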
Recipes
Domain workflows that compose operations. CRISPR guide design, barcode counting, Tn-seq profiling, chromatin annotation. All dependencies injected via __init__.
No class hierarchies. Objects wired together through dependency injection (OLOO). Any object with the right methods satisfies the protocol.
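As an illustration of wiring a recipe through __init__, here is a sketch assuming hypothetical Discoverer and Quantifier protocols. The class names and the fixed-offset barcode extraction are placeholders; the real barcode recipe discovers read structure itself rather than using a fixed offset.

```python
from collections import Counter
from typing import Dict, Iterable, Protocol


class Discoverer(Protocol):
    def discover(self, read: str) -> str: ...


class Quantifier(Protocol):
    def quantify(self, barcodes: Iterable[str]) -> Dict[str, int]: ...


class FixedOffsetDiscoverer:
    """Placeholder discoverer: the barcode sits at a known offset in each read."""

    def __init__(self, start: int, length: int) -> None:
        self.start = start
        self.length = length

    def discover(self, read: str) -> str:
        return read[self.start : self.start + self.length]


class ExactCounter:
    """Counts identical barcodes with no error correction."""

    def quantify(self, barcodes: Iterable[str]) -> Dict[str, int]:
        return dict(Counter(barcodes))


class BarcodeCountingRecipe:
    """Composes two operations; both are supplied by the caller."""

    def __init__(self, discoverer: Discoverer, quantifier: Quantifier) -> None:
        self.discoverer = discoverer
        self.quantifier = quantifier

    def run(self, reads: Iterable[str]) -> Dict[str, int]:
        barcodes = (self.discoverer.discover(read) for read in reads)
        return self.quantifier.quantify(barcodes)


recipe = BarcodeCountingRecipe(
    discoverer=FixedOffsetDiscoverer(start=2, length=4),
    quantifier=ExactCounter(),
)
counts = recipe.run(["AAACGTTT", "AAACGTGG", "AATTGCTT"])
print(counts)
```

The recipe never constructs its own collaborators, so a test can pass in stub objects with the same two methods and no base class.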