FASTA
Stdlib-only FASTA parsing — no BioPython required.
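
    from seqchain.io.fasta import iter_fasta, load_chrom_sizes

    # Stream records one at a time as (name, sequence) pairs.
    for name, seq in iter_fasta("reference.fa"):
        print(f"{name}: {len(seq)} bp")

    # Collect the sizes of all records in the file.
    sizes = load_chrom_sizes("reference.fa")
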
iter_fasta
Iterate over records in a FASTA file, yielding (name, sequence) pairs.
The parser uses only the standard library (no BioPython required) and
supports plain and gzip-compressed (.gz) files. Sequences are returned
as uppercase strings.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path` | Path to a FASTA file. | required |
Yields:
| Type | Description |
|---|---|
| `tuple[str, str]` | A (name, sequence) pair for each record. |
Examples:
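To illustrate the behaviour described above (gzip support, uppercase sequences, one (name, sequence) pair per record), here is a minimal stand-in written against the standard library only. It is a sketch, not the code in src/seqchain/io/fasta.py: the name `iter_fasta_sketch`, the detection of gzip input by the `.gz` suffix, and the choice to take the first whitespace-delimited token of the header as the record name are all assumptions made for illustration.

```python
import gzip
import tempfile
from pathlib import Path


def iter_fasta_sketch(path):
    """Hypothetical stand-in for iter_fasta: yield (name, sequence) pairs."""
    path = Path(path)
    # Assumption: gzip input is recognised by its .gz suffix.
    opener = gzip.open if path.suffix == ".gz" else open
    name, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous record, if any.
                if name is not None:
                    yield name, "".join(chunks).upper()
                # Assumption: the name is the first token after ">".
                header = line[1:].split()
                name = header[0] if header else ""
                chunks = []
            elif name is not None:
                chunks.append(line)
    # Emit the final record once the file is exhausted.
    if name is not None:
        yield name, "".join(chunks).upper()


# Write a small gzip-compressed FASTA file and read it back.
with tempfile.TemporaryDirectory() as tmp:
    fasta_path = Path(tmp) / "toy.fa.gz"
    with gzip.open(fasta_path, "wt") as out:
        out.write(">chr1 toy record\nacgt\nACGT\n>chr2\nggcc\n")
    records = list(iter_fasta_sketch(fasta_path))

print(records)  # [('chr1', 'ACGTACGT'), ('chr2', 'GGCC')]
```

Because the function is a generator, only one record is held in memory at a time, which matters for large reference genomes.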
Source code in src/seqchain/io/fasta.py
load_chrom_sizes
Load chromosome sizes from a FASTA file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `path` | `str \| Path` | Path to a FASTA file. | required |
Returns:
| Type | Description |
|---|---|
| `dict[str, int]` | Dict mapping each sequence name to its length in bp. |
Raises:
| Type | Description |
|---|---|
| `FileNotFoundError` | If the file does not exist. |
Examples:
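The return value and the error case can be illustrated with a small stdlib-only stand-in. This is a sketch under stated assumptions, not the seqchain implementation: `chrom_sizes_sketch` is a hypothetical name, it reads plain-text files only, and it takes the first whitespace-delimited token of each header as the sequence name.

```python
import tempfile
from pathlib import Path


def chrom_sizes_sketch(path):
    """Hypothetical stand-in for load_chrom_sizes: map name to length."""
    sizes = {}
    name = None
    # open() itself raises FileNotFoundError when the path does not exist.
    with open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # Assumption: the name is the first token after ">".
                header = line[1:].split()
                name = header[0] if header else ""
                sizes[name] = 0
            elif line and name is not None:
                # Sequence lines add to the length of the current record.
                sizes[name] += len(line)
    return sizes


with tempfile.TemporaryDirectory() as tmp:
    fasta_path = Path(tmp) / "toy.fa"
    fasta_path.write_text(">chr1\nACGT\nACGT\n>chr2\nGGCC\n")
    sizes = chrom_sizes_sketch(fasta_path)

    # A missing file surfaces as FileNotFoundError.
    try:
        chrom_sizes_sketch(Path(tmp) / "missing.fa")
    except FileNotFoundError:
        missing_raised = True
    else:
        missing_raised = False

print(sizes)  # {'chr1': 8, 'chr2': 4}
```

Summing line lengths avoids building each full sequence in memory when only the sizes are needed.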