Region

The universal coordinate record that flows through every pipeline step. Frozen (immutable) so pipeline steps produce new Regions via dataclasses.replace() rather than mutating in place.

Region lives at the package top level (seqchain.region) rather than inside any layer because every layer depends on it: primitives use it, tracks contain it, operations produce it, transforms modify it, compare aligns it, and the API serializes it. It's a frozen dataclass with zero dependencies beyond stdlib — pure data that every other abstraction is defined in terms of, without pulling any of them in.

Region dataclass

Region(chrom: str, start: int, end: int, strand: str = '.', score: float = 0.0, name: str = '', tags: dict = dict())

One element on a Track: a peak, a gene, a hit, a barcode locus.

Replaces the old Hit/Feature/AnnotatedHit proliferation — all are Regions with different tags. Frozen (immutable) so pipeline steps produce new Regions via dataclasses.replace() rather than mutating in place.
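
A minimal sketch of what the frozen design means in practice. The Region below is a stand-in that mirrors the documented signature, not the seqchain source; in particular, the field(default_factory=dict) default for tags is an assumption about how the dict() default is implemented.

```python
from dataclasses import FrozenInstanceError, dataclass, field, replace


# Stand-in that mirrors the documented signature; not the seqchain source.
@dataclass(frozen=True)
class Region:
    chrom: str
    start: int
    end: int
    strand: str = "."
    score: float = 0.0
    name: str = ""
    tags: dict = field(default_factory=dict)  # assumed default factory


peak = Region("chr1", 100, 200, name="peak1")

# A pipeline step derives a new Region instead of editing the old one.
shifted = replace(peak, start=150, end=250)

# Direct assignment on a frozen dataclass raises FrozenInstanceError.
try:
    peak.start = 150
    assignment_blocked = False
except FrozenInstanceError:
    assignment_blocked = True

print(peak.start, shifted.start, assignment_blocked)
```

The original record is unchanged after the step runs, so earlier pipeline stages can keep a reference to it without defensive copying.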

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| chrom | str | Chromosome or contig name. | required |
| start | int | Start position (0-based, inclusive). | required |
| end | int | End position (0-based, exclusive). | required |
| strand | str | Strand indicator ("+", "-", or "."). Defaults to ".". | '.' |
| score | float | Numeric score. Defaults to 0.0. | 0.0 |
| name | str | Display name or identifier. Defaults to "". | '' |
| tags | dict | Arbitrary key-value metadata. Domain-specific data lives here rather than as extra fields. | dict() |

Examples:

>>> Region("chr1", 100, 200, strand="+", name="geneA")
Region(chrom='chr1', start=100, end=200, strand='+', score=0.0, name='geneA', tags={})
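
The tags field is where a peak, a gene, and a hit differ from one another. The sketch below uses a stand-in Region (same assumptions as the documented signature, not the seqchain source) and hypothetical tag keys ("kind", "nearest_gene") chosen only for illustration. Because frozen=True blocks attribute assignment but does not freeze the contents of a dict, the sketch builds a new dict and passes it through dataclasses.replace() rather than updating tags in place.

```python
from dataclasses import dataclass, field, replace


# Stand-in that mirrors the documented signature; not the seqchain source.
@dataclass(frozen=True)
class Region:
    chrom: str
    start: int
    end: int
    strand: str = "."
    score: float = 0.0
    name: str = ""
    tags: dict = field(default_factory=dict)  # assumed default factory


# "kind" and "nearest_gene" are hypothetical keys, not a seqchain convention.
peak = Region("chr1", 100, 200, score=8.5, tags={"kind": "peak"})

# Merge into a fresh dict so the original Region's tags stay untouched.
annotated = replace(peak, tags={**peak.tags, "nearest_gene": "geneA"})

print(peak.tags)
print(annotated.tags)
```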

length property

length: int

Interval length in base pairs.

Returns:

| Type | Description |
| --- | --- |
| int | end - start. |

Examples:

>>> Region("chr1", 100, 200).length
100
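
Because start is inclusive and end is exclusive, length needs no +1 correction and back-to-back intervals share no base. A small sketch with a stand-in Region reduced to the coordinate fields (an illustration, not the seqchain source):

```python
from dataclasses import dataclass


# Reduced stand-in with only the coordinate fields; not the seqchain source.
@dataclass(frozen=True)
class Region:
    chrom: str
    start: int
    end: int

    @property
    def length(self) -> int:
        return self.end - self.start


left = Region("chr1", 100, 200)
right = Region("chr1", 200, 300)

# Half-open intervals that touch at 200 overlap by zero bases.
overlap = max(0, min(left.end, right.end) - max(left.start, right.start))

# The same interval in 1-based, inclusive coordinates covers 101..200.
one_based = (left.start + 1, left.end)

print(left.length + right.length, overlap, one_based)
```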

center property

center: int

Integer midpoint of the interval.

Returns:

| Type | Description |
| --- | --- |
| int | (start + end) // 2. |

Examples:

>>> Region("chr1", 100, 200).center
150
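
Floor division means center always lands on a base inside the interval. For an odd-length interval it is the exact middle base; for an even-length interval it is the first base of the right half. A sketch with a stand-in Region reduced to the coordinate fields (an illustration, not the seqchain source):

```python
from dataclasses import dataclass


# Reduced stand-in with only the coordinate fields; not the seqchain source.
@dataclass(frozen=True)
class Region:
    chrom: str
    start: int
    end: int

    @property
    def center(self) -> int:
        return (self.start + self.end) // 2


even = Region("chr1", 100, 200)    # 100 bases: 100..199
odd = Region("chr1", 100, 201)     # 101 bases: 100..200
single = Region("chr1", 10, 11)    # 1 base: 10

print(even.center, odd.center, single.center)
```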