Alignment

Pair regions across two tracks for comparison. Two modes:

  • exact — match by region name (e.g., gene names, barcode IDs)
  • overlap — bin the genome and compare signal in each bin

Alignment yields AlignedPair objects, each recording which of the two tracks contained the region and the region's value in each track.
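Matching by region name can be pictured as an outer join on the names. The sketch below is not the library's implementation: it uses plain dictionaries in place of Track objects, `pair_by_name` is a hypothetical helper, and the assumption that unmatched names are kept (rather than dropped) is inferred from the `missing`, `in_a`, and `in_b` fields described on this page.

```python
import math
from typing import Iterator


def pair_by_name(
    a: dict[str, float],
    b: dict[str, float],
    missing: float = float("nan"),
) -> Iterator[tuple[str, float, float, bool, bool]]:
    """Hypothetical helper: outer-join two name -> value mappings."""
    # Visit every name seen in either mapping; sorting keeps the order stable.
    for name in sorted(a.keys() | b.keys()):
        in_a, in_b = name in a, name in b
        # Fill the absent side with the `missing` placeholder.
        yield name, a.get(name, missing), b.get(name, missing), in_a, in_b


pairs = list(
    pair_by_name({"geneA": 5.0, "geneB": 2.0}, {"geneB": 4.0, "geneC": 1.0})
)
```

Here `geneB` is paired with both values, while `geneA` and `geneC` each carry NaN on the side where they are absent.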

AlignedPair dataclass

AlignedPair(region: Region, value_a: float, value_b: float, in_a: bool = True, in_b: bool = True)

A single paired observation from two Tracks.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| region | Region | The coordinate or key this pair refers to. | required |
| value_a | float | Observation from track a. | required |
| value_b | float | Observation from track b. | required |
| in_a | bool | Whether the region was present in track a. | True |
| in_b | bool | Whether the region was present in track b. | True |

Examples:

>>> AlignedPair(Region("chr1", 100, 200), 5.0, 10.0, True, True)
AlignedPair(region=Region(...), value_a=5.0, value_b=10.0, in_a=True, in_b=True)
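The `in_a` and `in_b` flags make it possible to separate shared regions from regions unique to one track. The following is a minimal sketch under stated assumptions: `PairSketch` is a stand-in that mirrors the fields listed above, and a plain string replaces `Region`, whose real constructor is not reproduced here.

```python
import math
from dataclasses import dataclass


@dataclass
class PairSketch:
    """Stand-in mirroring AlignedPair's fields; region is a plain string."""

    region: str
    value_a: float
    value_b: float
    in_a: bool = True
    in_b: bool = True


pairs = [
    PairSketch("chr1:100-200", 5.0, 10.0),
    PairSketch("chr1:300-400", 3.0, float("nan"), in_b=False),
    PairSketch("chr2:50-150", float("nan"), 8.0, in_a=False),
]

# Keep only regions observed in both tracks before computing a ratio.
shared = [p for p in pairs if p.in_a and p.in_b]
log2_fc = {p.region: math.log2(p.value_b / p.value_a) for p in shared}

# Regions unique to one track can be reported separately.
only_a = [p.region for p in pairs if p.in_a and not p.in_b]
only_b = [p.region for p in pairs if p.in_b and not p.in_a]
```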

align_tracks

align_tracks(a: Track, b: Track, *, missing: float = float('nan'), mode: Literal['exact', 'overlap'] = 'exact', bin_size: int = 200, chrom_sizes: dict[str, int] | None = None) -> Iterator[AlignedPair]

Align two Tracks and yield paired observations.

Auto-detects the track types and dispatches to the appropriate alignment mode. Both tracks must be the same type.

Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| a | Track | First track. | required |
| b | Track | Second track. | required |
| missing | float | Value to use when a region exists in one track but not the other. Defaults to NaN so missing data propagates through arithmetic rather than silently producing results from imputed zeros. | float('nan') |
| mode | Literal['exact', 'overlap'] | IntervalTrack alignment mode. "exact" (default) pairs only regions with identical boundaries. "overlap" uses the union of all boundaries and queries overlap-weighted signal (ragged boundaries produce redundant pairs). Ignored for TableTrack and SignalTrack alignment. | 'exact' |
| bin_size | int | Window size for SignalTrack alignment. Defaults to 200. | 200 |
| chrom_sizes | dict[str, int] \| None | Chromosome sizes for SignalTrack alignment. Required when aligning SignalTracks. | None |
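To illustrate how `bin_size` and `chrom_sizes` could together define the windows compared during SignalTrack alignment, here is a rough sketch. `iter_bins` is a hypothetical helper, not part of the library; the 0-based half-open coordinates and the clipping of the final partial window are assumptions, since this page does not say how the library handles chromosome ends.

```python
from typing import Iterator


def iter_bins(
    chrom_sizes: dict[str, int], bin_size: int = 200
) -> Iterator[tuple[str, int, int]]:
    """Hypothetical helper: yield (chrom, start, end) half-open windows."""
    if bin_size <= 0:
        raise ValueError("bin_size must be positive")
    for chrom, size in chrom_sizes.items():
        for start in range(0, size, bin_size):
            # Assumption: clip the final window to the chromosome length.
            yield chrom, start, min(start + bin_size, size)


bins = list(iter_bins({"chr1": 500, "chr2": 150}, bin_size=200))
```

With these inputs, `chr1` is split into three windows (the last one 100 bp long) and `chr2` into a single 150 bp window.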

Returns:

| Type | Description |
|---|---|
| Iterator[AlignedPair] | Iterator of AlignedPair objects. |
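Pairs consumed from the iterator carry the `missing` value wherever a region was absent from one track. The short example below uses plain floats rather than real AlignedPair objects to show why a NaN placeholder keeps gaps visible in downstream arithmetic, whereas an imputed zero would not.

```python
import math

# Values for one region that is present in track a but absent from track b.
value_a = 5.0
missing_nan = float("nan")
missing_zero = 0.0

# With NaN, the difference is NaN, so the gap stays visible downstream.
diff_nan = value_a - missing_nan

# With an imputed zero, the result looks like a real measurement.
diff_zero = value_a - missing_zero

# NaN-aware aggregation: skip differences that could not be computed.
diffs = [1.0, diff_nan, -3.0]
valid = [d for d in diffs if not math.isnan(d)]
mean_diff = sum(valid) / len(valid)
```

Filtering with `math.isnan` (or using the `in_a` and `in_b` flags) before aggregating keeps the summary restricted to regions measured in both tracks.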

Raises:

| Type | Description |
|---|---|
| TypeError | If track types don't match or aren't supported. |
| ValueError | If chrom_sizes is missing for SignalTrack alignment or if mode is not "exact" or "overlap". |

Examples:

>>> list(align_tracks(table_a, table_b))
[AlignedPair(...), ...]

Source code in src/seqchain/compare/align.py
def align_tracks(
    a: Track,
    b: Track,
    *,
    missing: float = float("nan"),
    mode: Literal["exact", "overlap"] = "exact",
    bin_size: int = 200,
    chrom_sizes: dict[str, int] | None = None,
) -> Iterator[AlignedPair]:
    """Align two Tracks and yield paired observations.

    Auto-detects the track types and dispatches to the appropriate
    alignment mode. Both tracks must be the same type.

    Args:
        a: First track.
        b: Second track.
        missing: Value to use when a region exists in one track
            but not the other. Defaults to ``NaN`` so missing data
            propagates through arithmetic rather than silently
            producing results from imputed zeros.
        mode: IntervalTrack alignment mode. ``"exact"`` (default)
            pairs only regions with identical boundaries.
            ``"overlap"`` uses the union of all boundaries and
            queries overlap-weighted signal (ragged boundaries
            produce redundant pairs). Ignored for TableTrack and
            SignalTrack alignment.
        bin_size: Window size for SignalTrack alignment.
            Defaults to ``200``.
        chrom_sizes: Chromosome sizes for SignalTrack alignment.
            Required when aligning SignalTracks.

    Returns:
        Iterator of `AlignedPair` objects.

    Raises:
        TypeError: If track types don't match or aren't supported.
        ValueError: If chrom_sizes is missing for SignalTrack alignment
            or if mode is not ``"exact"`` or ``"overlap"``.

    Examples:
        >>> list(align_tracks(table_a, table_b))
        [AlignedPair(...), ...]
    """
    if isinstance(a, IntervalTrack) and isinstance(b, IntervalTrack):
        yield from _align_interval_tracks(a, b, missing=missing, mode=mode)
    elif isinstance(a, TableTrack) and isinstance(b, TableTrack):
        yield from _align_table_tracks(a, b, missing=missing)
    elif isinstance(a, SignalTrack) and isinstance(b, SignalTrack):
        if chrom_sizes is None:
            raise ValueError(
                "chrom_sizes is required for SignalTrack alignment"
            )
        yield from _align_signal_tracks(
            a, b, missing=missing, bin_size=bin_size,
            chrom_sizes=chrom_sizes,
        )
    else:
        raise TypeError(
            f"Cannot align {type(a).__name__} with {type(b).__name__}. "
            "Both tracks must be the same type "
            "(IntervalTrack, TableTrack, or SignalTrack)."
        )
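
Because the function body above uses `yield from`, `align_tracks` is a generator function, so in Python its checks run only once iteration begins, not at call time. The sketch below demonstrates this with `align_sketch`, a hypothetical stand-in that has the same check-then-yield shape; it does not call the library.

```python
from typing import Iterator


def align_sketch(a: object, b: object) -> Iterator[tuple[object, object]]:
    """Stand-in generator with the same check-then-yield shape."""
    if type(a) is not type(b):
        raise TypeError(
            f"Cannot align {type(a).__name__} with {type(b).__name__}."
        )
    yield a, b


# Calling a generator function only creates the generator; no check runs yet.
gen = align_sketch(1, "x")

# The TypeError surfaces on the first attempt to pull a value.
try:
    next(gen)
    raised = False
    message = ""
except TypeError as exc:
    raised = True
    message = str(exc)
```

In practice this means a `try`/`except` around the call alone would not catch the error; it needs to wrap the loop or the `list(...)` call that consumes the iterator.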