Chromatin Annotation
Truth-table lookup over binned signal tracks. Given a set of named signal tracks and a YAML-configured set of state definitions (mark present/absent), assigns a chromatin state label to each genomic bin.
States describe mark combinations only — no gene names, no functional interpretation. Gene features are a separate annotation layer.
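The truth-table lookup can be illustrated with a minimal, self-contained sketch. It assumes that the first state in priority order whose requirements all hold wins, and that a bin matching nothing receives the default label. `classify_bin` and the `STATES` list are hypothetical names used only for illustration; they are not part of the library.

```python
from typing import Literal

Requirement = Literal["present", "absent"]

# Ordered (name, requirements) pairs; earlier entries take priority.
# These example states are illustrative, not a shipped configuration.
STATES: list[tuple[str, dict[str, Requirement]]] = [
    ("open_active", {"ATAC": "present", "H3K4me3": "present"}),
    ("open", {"ATAC": "present"}),
]


def classify_bin(calls: dict[str, bool], default: str = "unclassified") -> str:
    """Return the first state whose requirements all match the bin's calls."""
    for name, requirements in STATES:
        if all(
            calls[track] == (want == "present")
            for track, want in requirements.items()
        ):
            return name
    return default


# One dict of present/absent calls per genomic bin.
bins = [
    {"ATAC": True, "H3K4me3": True},
    {"ATAC": True, "H3K4me3": False},
    {"ATAC": False, "H3K4me3": True},
]
labels = [classify_bin(b) for b in bins]
print(labels)
```

Tracks not mentioned in a state's requirements are ignored by that state in this sketch, which is why the second bin still matches `open`.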
Configuration
TrackThreshold (dataclass)
Per-track threshold for present/absent classification.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Track name (must match a key in the tracks dict). | required |
| `threshold` | `float \| None` | Signal value above which the track is "present". | `None` |
Examples:
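As a rough illustration of how a per-track threshold turns a signal value into a present/absent call, the sketch below defines a local stand-in dataclass with the two documented fields. The helper `is_present` is hypothetical, and treating a value equal to the threshold as "absent" is an assumption drawn from the wording "above which".

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TrackThreshold:
    """Local stand-in mirroring the documented fields, not the library class."""

    name: str
    threshold: float | None = None


def is_present(signal: float, track: TrackThreshold) -> bool:
    """Hypothetical helper: call a bin 'present' when signal exceeds the threshold."""
    if track.threshold is None:
        raise ValueError(f"track {track.name!r} has no fixed threshold")
    return signal > track.threshold


atac = TrackThreshold("ATAC", 2.0)
calls = [is_present(value, atac) for value in (0.5, 2.0, 3.1)]
print(calls)
```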
StateDefinition (dataclass)
A state defined by track presence/absence requirements.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | State label (e.g. …). | required |
| `requirements` | `dict[str, Literal['present', 'absent']]` | Map of track name to `'present'` or `'absent'`. | required |
Examples:
>>> StateDefinition("open_active", {"ATAC": "present", "H3K4me3": "present"})
StateDefinition(name='open_active', requirements={'ATAC': 'present', 'H3K4me3': 'present'})
ChromatinConfig (dataclass)
ChromatinConfig(organism: str, bin_size: int, thresholds: list[TrackThreshold], states: list[StateDefinition], default_state: str = 'unclassified')
Configuration for chromatin state annotation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `organism` | `str` | Organism name. | required |
| `bin_size` | `int` | Bin size in bp for genome walking. | required |
| `thresholds` | `list[TrackThreshold]` | Per-track thresholds for present/absent calls. | required |
| `states` | `list[StateDefinition]` | Ordered list of state definitions (priority order). | required |
| `default_state` | `str` | Label for windows matching no defined state. | `'unclassified'` |
Examples:
>>> config = ChromatinConfig(
... organism="S. cerevisiae",
... bin_size=200,
... thresholds=[TrackThreshold("ATAC", 2.0)],
... states=[StateDefinition("open", {"ATAC": "present"})],
... default_state="unclassified",
... )
chromatin_config_from_dict
Build a ChromatinConfig from a parsed YAML dict.
Tracks with a missing or null threshold value are accepted; in that case the threshold must be provided at runtime via a threshold function.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `dict` | Dictionary with chromatin config fields. | required |
Returns:
| Type | Description |
|---|---|
| `ChromatinConfig` | A `ChromatinConfig`. |
Examples:
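The exact dictionary layout is not shown here, so the following is only a sketch under stated assumptions: the top-level keys mirror the `ChromatinConfig` fields, `thresholds` and `states` are lists of mappings, and a track entry may omit `threshold`. The dataclasses are local stand-ins and `config_from_dict_sketch` is a hypothetical helper, not the library function.

```python
from __future__ import annotations

from dataclasses import dataclass


# Local stand-ins mirroring the documented dataclass fields.
@dataclass
class TrackThreshold:
    name: str
    threshold: float | None = None


@dataclass
class StateDefinition:
    name: str
    requirements: dict[str, str]


@dataclass
class ChromatinConfig:
    organism: str
    bin_size: int
    thresholds: list[TrackThreshold]
    states: list[StateDefinition]
    default_state: str = "unclassified"


def config_from_dict_sketch(data: dict) -> ChromatinConfig:
    """Hypothetical parser; the key names below are assumptions."""
    thresholds = [
        # .get() yields None when the threshold key is missing or null.
        TrackThreshold(entry["name"], entry.get("threshold"))
        for entry in data["thresholds"]
    ]
    states = [
        StateDefinition(entry["name"], dict(entry["requirements"]))
        for entry in data["states"]
    ]
    return ChromatinConfig(
        organism=data["organism"],
        bin_size=int(data["bin_size"]),
        thresholds=thresholds,
        states=states,
        default_state=data.get("default_state", "unclassified"),
    )


# What a parsed YAML document might look like under these assumptions.
data = {
    "organism": "S. cerevisiae",
    "bin_size": 200,
    "thresholds": [
        {"name": "ATAC", "threshold": 2.0},
        {"name": "H3K4me3"},  # no threshold: must be resolved at runtime
    ],
    "states": [
        {"name": "open_active",
         "requirements": {"ATAC": "present", "H3K4me3": "present"}},
        {"name": "open", "requirements": {"ATAC": "present"}},
    ],
}
config = config_from_dict_sketch(data)
print(config.thresholds)
```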
Source code in src/seqchain/operations/annotate/chromatin.py
Function
annotate_chromatin
annotate_chromatin(config: ChromatinConfig, tracks: dict[str, Track], chroms: dict[str, int], threshold_fn: PredicateFactory | None = None) -> Iterator[Region]
Annotate all chromosomes and yield state-labeled Regions.
Each input track is wrapped in a BinaryAdapter that applies the per-mark threshold on-the-fly via signal_at(). No intermediate track is materialized. The output is the lazy iterator from classify_truth_table(), yielding one Region at a time.
When a track's threshold is None in the config, threshold_fn is called to compute a predicate from the track's scores. If both are None, raises ValueError.
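The fallback order described above (fixed threshold first, then `threshold_fn`, otherwise an error) might look roughly like the sketch below. The shape of `PredicateFactory` is an assumption (a callable taking a sequence of scores and returning a predicate on a single value), and `resolve_predicate` and `above_median` are hypothetical names used only for illustration.

```python
from __future__ import annotations

from statistics import median
from typing import Callable, Sequence

Predicate = Callable[[float], bool]
# Assumed shape: scores in, predicate out.
PredicateFactory = Callable[[Sequence[float]], Predicate]


def resolve_predicate(
    track_name: str,
    threshold: float | None,
    scores: Sequence[float],
    threshold_fn: PredicateFactory | None = None,
) -> Predicate:
    """Pick the present/absent predicate for one track."""
    if threshold is not None:
        # A fixed config threshold takes precedence.
        return lambda value: value > threshold
    if threshold_fn is not None:
        # Otherwise derive the predicate from the track's own scores.
        return threshold_fn(scores)
    raise ValueError(f"track {track_name!r} has no threshold and no threshold_fn")


def above_median(scores: Sequence[float]) -> Predicate:
    """Example factory: 'present' means strictly above the median score."""
    cutoff = median(scores)
    return lambda value: value > cutoff


fixed = resolve_predicate("ATAC", 2.0, [0.1, 5.0])
adaptive = resolve_predicate("H3K4me3", None, [1.0, 2.0, 9.0], above_median)
print(fixed(3.0), adaptive(5.0), adaptive(2.0))
```

The median-based factory is only one possible choice; any function with the assumed shape could be passed instead.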
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `ChromatinConfig` | A `ChromatinConfig`. | required |
| `tracks` | `dict[str, Track]` | Named input tracks keyed by track name. | required |
| `chroms` | `dict[str, int]` | Chromosome name to size mapping. | required |
| `threshold_fn` | `PredicateFactory \| None` | Optional predicate factory that computes a predicate from scores. Called when a track's config threshold is `None`. | `None` |
Yields:
| Type | Description |
|---|---|
| `Region` | State-labeled Regions with … |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If a track has no threshold and no threshold_fn. |
Examples:
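The lazy, per-bin flow can be sketched end to end without the library. Everything below is a stand-in: `ListTrack`, `BinaryAdapterSketch`, and `annotate_sketch` are hypothetical, the `signal_at(chrom, start, end)` signature is an assumption, and regions are represented as plain tuples. Whether the real function merges adjacent bins with the same label is not stated, so this sketch yields one tuple per bin.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterator


@dataclass
class ListTrack:
    """Hypothetical track: one score per fixed-size bin, per chromosome."""

    scores: dict[str, list[float]]
    bin_size: int

    def signal_at(self, chrom: str, start: int, end: int) -> float:
        return self.scores[chrom][start // self.bin_size]


@dataclass
class BinaryAdapterSketch:
    """Applies a predicate at query time; nothing is precomputed."""

    track: ListTrack
    predicate: Callable[[float], bool]

    def present_at(self, chrom: str, start: int, end: int) -> bool:
        return self.predicate(self.track.signal_at(chrom, start, end))


def annotate_sketch(
    adapters: dict[str, BinaryAdapterSketch],
    states: list[tuple[str, dict[str, str]]],
    chroms: dict[str, int],
    bin_size: int,
    default: str = "unclassified",
) -> Iterator[tuple[str, int, int, str]]:
    """Walk each chromosome bin by bin and yield (chrom, start, end, label)."""
    for chrom, size in chroms.items():
        for start in range(0, size, bin_size):
            end = min(start + bin_size, size)  # last bin may be shorter
            label = default
            for name, requirements in states:
                if all(
                    adapters[t].present_at(chrom, start, end) == (want == "present")
                    for t, want in requirements.items()
                ):
                    label = name
                    break  # first match in priority order wins (assumed)
            yield (chrom, start, end, label)


track = ListTrack({"chrI": [0.5, 3.0, 4.0]}, bin_size=200)
adapters = {"ATAC": BinaryAdapterSketch(track, lambda v: v > 2.0)}
states = [("open", {"ATAC": "present"})]

regions = annotate_sketch(adapters, states, {"chrI": 500}, bin_size=200)
first = next(regions)   # only the first bin has been evaluated so far
rest = list(regions)    # consume the remaining bins
print(first, rest)
```

Because `annotate_sketch` is a generator, callers can stream labels into a writer or stop early without scanning the whole genome.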