Coordinates
Pure functions for genomic coordinate math: circular genome wrapping, interval overlap, distance calculation, strand-aware offset, and promoter region extraction.
Normalization
normalize
Wrap a position into the range [0, length).
When circular is False, pos is returned unchanged: the caller
is on a linear chromosome, so wrapping does not apply.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pos | int | Genomic position (may be negative or >= length). | required |
| length | int | Chromosome or sequence length. | required |
| circular | bool | Whether to apply modular wrapping. Defaults to True. | True |
Returns:
| Type | Description |
|---|---|
| int | Position wrapped to [0, length). |
Examples:
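The wrapping rule described above can be illustrated with a small stand-alone sketch. It is not taken from the library source: the name normalize_sketch is hypothetical, and a positive length is assumed.

```python
from __future__ import annotations


def normalize_sketch(pos: int, length: int, circular: bool = True) -> int:
    """Wrap pos into [0, length) when circular; otherwise return it unchanged."""
    if not circular:
        # Linear chromosome: wrapping is not applicable.
        return pos
    # For a positive modulus, Python's % always returns a non-negative
    # result, so negative positions wrap around from the end.
    return pos % length


print(normalize_sketch(-1, 100))                   # 99
print(normalize_sketch(105, 100))                  # 5
print(normalize_sketch(105, 100, circular=False))  # 105
```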
Source code in src/seqchain/primitives/coordinates.py
Overlap & distance
interval_overlap
interval_overlap(a_start: int, a_end: int, b_start: int, b_end: int, chrom_length: int | None = None) -> int
Compute the overlap in bp between two intervals.
In linear mode (chrom_length is None), performs a standard
interval overlap calculation.
In circular mode (chrom_length given), coordinates are first
normalised with % chrom_length and origin-wrapping features
(where start > end after normalisation) are handled. This
replicates the logic of legacy get_overlap from targets.py.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| a_start | int | Start of the first interval (the "target"). | required |
| a_end | int | End of the first interval. | required |
| b_start | int | Start of the second interval (the "feature"). | required |
| b_end | int | End of the second interval. | required |
| chrom_length | int \| None | Chromosome length for circular mode. | None |
Returns:
| Type | Description |
|---|---|
| int | Overlap in base pairs. |
Examples:
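One way to picture the linear and circular modes is the sketch below. It is an illustration under stated assumptions, not the legacy get_overlap logic itself: intervals are treated as half-open, an interval whose start equals its end after normalisation is treated as empty, and the names interval_overlap_sketch and _segments are hypothetical.

```python
from __future__ import annotations


def _segments(start: int, end: int, length: int) -> list[tuple[int, int]]:
    """Split a possibly origin-wrapping interval into linear pieces."""
    start %= length
    end %= length
    if start <= end:
        return [(start, end)]
    # start > end after normalisation: the interval crosses the origin.
    return [(start, length), (0, end)]


def interval_overlap_sketch(a_start: int, a_end: int, b_start: int, b_end: int,
                            chrom_length: int | None = None) -> int:
    if chrom_length is None:
        # Linear mode: standard half-open interval overlap.
        return max(0, min(a_end, b_end) - max(a_start, b_start))
    total = 0
    # Circular mode: sum the overlap of every pair of linear pieces.
    for s1, e1 in _segments(a_start, a_end, chrom_length):
        for s2, e2 in _segments(b_start, b_end, chrom_length):
            total += max(0, min(e1, e2) - max(s1, s2))
    return total


print(interval_overlap_sketch(10, 20, 15, 30))       # 5
print(interval_overlap_sketch(10, 20, 20, 30))       # 0
print(interval_overlap_sketch(95, 105, 0, 10, 100))  # 5
```

Splitting a wrapping interval into two linear pieces keeps the circular case a thin layer over the linear formula.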
distance_to_feature
distance_to_feature(pos: int, feat_start: int, feat_end: int, chrom_length: int | None = None) -> int
Compute the shortest distance from a position to a feature.
Returns 0 when pos is inside [feat_start, feat_end).
When chrom_length is given, the shortest circular-genome distance
is returned (wrapping around the origin if that path is shorter).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pos | int | Query position. | required |
| feat_start | int | Feature start (inclusive). | required |
| feat_end | int | Feature end (exclusive). | required |
| chrom_length | int \| None | Chromosome length for circular mode. | None |
Returns:
| Type | Description |
|---|---|
| int | Distance in base pairs (always >= 0). |
Examples:
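The sketch below illustrates the shortest-distance idea in both modes. It rests on assumptions the text does not confirm: distance is measured to the nearest base inside the feature (so a position equal to feat_end is at distance 1), all coordinates already lie in [0, chrom_length), and the feature itself does not cross the origin. The name distance_to_feature_sketch is hypothetical.

```python
from __future__ import annotations


def distance_to_feature_sketch(pos: int, feat_start: int, feat_end: int,
                               chrom_length: int | None = None) -> int:
    if feat_start <= pos < feat_end:
        return 0  # inside the half-open feature
    last = feat_end - 1  # last base covered by the feature
    upstream_of_feature = pos < feat_start
    # Direct path along the linear coordinate axis.
    direct = feat_start - pos if upstream_of_feature else pos - last
    if chrom_length is None:
        return direct
    # Alternative path that goes around the origin of a circular genome.
    if upstream_of_feature:
        wrapped = pos + chrom_length - last
    else:
        wrapped = feat_start + chrom_length - pos
    return min(direct, wrapped)


print(distance_to_feature_sketch(15, 10, 20))         # 0
print(distance_to_feature_sketch(5, 10, 20))          # 5
print(distance_to_feature_sketch(990, 10, 20, 1000))  # 20
```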
relative_position
Compute the fractional position within a feature.
Returns 0.0 at feat_start and 1.0 at feat_end. Values
outside [0, 1] indicate the position is outside the feature.
When chrom_length is positive and the feature uses virtual
coordinates (feat_end > chrom_length), positions in the wrapped
portion (pos < feat_start) are shifted by chrom_length so the
fraction is computed correctly.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| pos | int | Query position. | required |
| feat_start | int | Feature start. | required |
| feat_end | int | Feature end. | required |
| chrom_length | int | Chromosome length for origin-wrapping features. | 0 |
Returns:
| Type | Description |
|---|---|
| float | Fraction in the range … zero-length features. |
Examples:
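The fraction and the virtual-coordinate shift can be sketched as follows. This is an illustration only: the name relative_position_sketch is hypothetical, and returning 0.0 for a zero-length feature is an assumption, since the documented behaviour for that case is not recoverable from the text above.

```python
from __future__ import annotations


def relative_position_sketch(pos: int, feat_start: int, feat_end: int,
                             chrom_length: int = 0) -> float:
    span = feat_end - feat_start
    if span == 0:
        # Assumption: avoid division by zero by returning 0.0.
        return 0.0
    # Feature stored in virtual coordinates (end beyond the chromosome):
    # positions in the wrapped portion are shifted by one chromosome length.
    if chrom_length > 0 and feat_end > chrom_length and pos < feat_start:
        pos += chrom_length
    return (pos - feat_start) / span


print(relative_position_sketch(150, 100, 200))       # 0.5
print(relative_position_sketch(250, 100, 200))       # 1.5
print(relative_position_sketch(25, 950, 1050, 1000)) # 0.75
```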
offset_in_feature
offset_in_feature(target_start: int, target_end: int, feat_start: int, feat_end: int, strand: str | int, chrom_length: int) -> int
Compute a strand-aware offset of a target within a genomic feature.
Replicates the logic of legacy get_offset from targets.py.
All four coordinates are normalised with % chrom_length before
the calculation, and origin-wrapping genes (where feat_start >
feat_end after normalisation) are handled correctly.
For forward-strand features the offset is the distance from the feature start to the target start. For reverse-strand features it is the distance from the target end to the feature end.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| target_start | int | Target start position. | required |
| target_end | int | Target end position. | required |
| feat_start | int | Feature (gene) start position. | required |
| feat_end | int | Feature (gene) end position. | required |
| strand | str \| int | Strand indicator. Accepted values: | required |
| chrom_length | int | Chromosome length for coordinate normalisation. | required |
Returns:
| Type | Description |
|---|---|
| int | Signed offset in base pairs. |
Examples:
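A rough sketch of the strand-aware offset follows. It does not reproduce the legacy get_offset logic, which is not shown here. The accepted strand values ("+", "-", 1, -1), the way a wrapping gene is unrolled, and the names offset_in_feature_sketch and _is_forward are all assumptions made for illustration.

```python
from __future__ import annotations


def _is_forward(strand: str | int) -> bool:
    # Assumed strand vocabulary; the real accepted values may differ.
    if strand in ("+", 1):
        return True
    if strand in ("-", -1):
        return False
    raise ValueError(f"unrecognized strand: {strand!r}")


def offset_in_feature_sketch(target_start: int, target_end: int,
                             feat_start: int, feat_end: int,
                             strand: str | int, chrom_length: int) -> int:
    forward = _is_forward(strand)
    # Normalise all four coordinates onto the chromosome.
    ts, te = target_start % chrom_length, target_end % chrom_length
    fs, fe = feat_start % chrom_length, feat_end % chrom_length
    if fs > fe:
        # Origin-wrapping gene: unroll the wrapped part past chrom_length,
        # moving target coordinates that fall inside it along with it.
        if ts < fe:
            ts += chrom_length
        if te <= fe:
            te += chrom_length
        fe += chrom_length
    # Forward: feature start -> target start. Reverse: target end -> feature end.
    return ts - fs if forward else fe - te


print(offset_in_feature_sketch(150, 170, 100, 400, "+", 1000))  # 50
print(offset_in_feature_sketch(150, 170, 100, 400, "-", 1000))  # 230
print(offset_in_feature_sketch(10, 30, 950, 1050, "+", 1000))   # 60
```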
Raises:
| Type | Description |
|---|---|
| ValueError | If strand is not a recognized strand indicator. |
Region geometry
expand_region
expand_region(start: int, end: int, upstream: int, downstream: int, chrom_length: int | None = None) -> tuple[int, int]
Expand an interval by the given number of bases on each side.
In linear mode the start is clamped to 0 (no negative coordinates). In circular mode coordinates wrap via modulo.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| start | int | Interval start. | required |
| end | int | Interval end. | required |
| upstream | int | Bases to add before start. | required |
| downstream | int | Bases to add after end. | required |
| chrom_length | int \| None | Chromosome length for circular wrapping. | None |
Returns:
| Type | Description |
|---|---|
| tuple[int, int] | |
Examples:
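The clamping and wrapping behaviour described above can be sketched as below. The name expand_region_sketch is hypothetical, and the code is an illustration written from the description rather than the library's implementation.

```python
from __future__ import annotations


def expand_region_sketch(start: int, end: int, upstream: int, downstream: int,
                         chrom_length: int | None = None) -> tuple[int, int]:
    new_start = start - upstream
    new_end = end + downstream
    if chrom_length is None:
        # Linear mode: only the start is clamped, since the chromosome
        # length (and so an upper bound for the end) is unknown.
        return max(0, new_start), new_end
    # Circular mode: both coordinates wrap via modulo.
    return new_start % chrom_length, new_end % chrom_length


print(expand_region_sketch(100, 200, 50, 50))       # (50, 250)
print(expand_region_sketch(10, 200, 50, 50))        # (0, 250)
print(expand_region_sketch(10, 200, 50, 50, 1000))  # (960, 250)
```

In the last call the returned start is greater than the end, which signals a region that crosses the origin.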
promoter_region
promoter_region(gene_start: int, gene_end: int, strand: str | int, upstream: int = 500) -> tuple[int, int]
Derive promoter coordinates from gene boundaries and strand.
The promoter is the region of upstream bp immediately before the transcription start site. For forward-strand genes that is upstream of gene_start; for reverse-strand genes it is downstream of gene_end.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gene_start | int | Gene start position (0-based). | required |
| gene_end | int | Gene end position (exclusive). | required |
| strand | str \| int | Strand indicator. Accepted values: | required |
| upstream | int | Promoter length in bp. Defaults to 500. | 500 |
Returns:
| Type | Description |
|---|---|
| tuple[int, int] | … negative for genes near position 0; use … |
Examples:
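A small sketch of the promoter rule, written from the description above. The strand vocabulary ("+", "-", 1, -1) and the names promoter_region_sketch and _is_forward are assumptions for illustration, not the library's confirmed API.

```python
from __future__ import annotations


def _is_forward(strand: str | int) -> bool:
    # Assumed strand vocabulary; the real accepted values may differ.
    if strand in ("+", 1):
        return True
    if strand in ("-", -1):
        return False
    raise ValueError(f"unrecognized strand: {strand!r}")


def promoter_region_sketch(gene_start: int, gene_end: int, strand: str | int,
                           upstream: int = 500) -> tuple[int, int]:
    if _is_forward(strand):
        # Forward strand: the promoter sits just before gene_start.
        return gene_start - upstream, gene_start
    # Reverse strand: transcription starts at gene_end, so the promoter
    # lies just after it in genome coordinates.
    return gene_end, gene_end + upstream


print(promoter_region_sketch(1000, 2000, "+"))  # (500, 1000)
print(promoter_region_sketch(1000, 2000, "-"))  # (2000, 2500)
print(promoter_region_sketch(100, 2000, "+"))   # (-400, 100)
```

The last call shows the negative start that arises for a forward-strand gene near position 0.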
Raises:
| Type | Description |
|---|---|
| ValueError | If strand is not a recognized strand indicator. |
terminator_region
terminator_region(gene_start: int, gene_end: int, strand: str | int, downstream: int = 500) -> tuple[int, int]
Derive terminator coordinates from gene boundaries and strand.
The terminator is the region of downstream bp immediately after the transcription stop site. For forward-strand genes that is downstream of gene_end; for reverse-strand genes it is upstream of gene_start.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| gene_start | int | Gene start position (0-based). | required |
| gene_end | int | Gene end position (exclusive). | required |
| strand | str \| int | Strand indicator. Accepted values: | required |
| downstream | int | Terminator length in bp. Defaults to 500. | 500 |
Returns:
| Type | Description |
|---|---|
| tuple[int, int] | … negative for reverse-strand genes near position 0; use … |
Examples:
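The terminator rule mirrors the promoter rule with the strands swapped, as this sketch illustrates. As before, the strand vocabulary ("+", "-", 1, -1) and the names terminator_region_sketch and _is_forward are assumptions, not the library's confirmed API.

```python
from __future__ import annotations


def _is_forward(strand: str | int) -> bool:
    # Assumed strand vocabulary; the real accepted values may differ.
    if strand in ("+", 1):
        return True
    if strand in ("-", -1):
        return False
    raise ValueError(f"unrecognized strand: {strand!r}")


def terminator_region_sketch(gene_start: int, gene_end: int, strand: str | int,
                             downstream: int = 500) -> tuple[int, int]:
    if _is_forward(strand):
        # Forward strand: transcription stops at gene_end.
        return gene_end, gene_end + downstream
    # Reverse strand: transcription stops at gene_start, so the terminator
    # lies before it in genome coordinates and may go negative.
    return gene_start - downstream, gene_start


print(terminator_region_sketch(1000, 2000, "+"))  # (2000, 2500)
print(terminator_region_sketch(1000, 2000, "-"))  # (500, 1000)
print(terminator_region_sketch(100, 2000, "-"))   # (-400, 100)
```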
Raises:
| Type | Description |
|---|---|
| ValueError | If strand is not a recognized strand indicator. |