Jul 20, 2023

Efficient Pythonic random access to FASTA subsequences

FASTA is a text format for exchanging genetic sequence information, whether a partial sequence or the entire genome of an organism.

Samtools provides a function “faidx” (FAsta InDeX) that creates a small flat index file “.fai”, allowing fast random access to any subsequence in the indexed FASTA file while loading only a minimal amount of the file into memory. The pyfaidx Python module implements pure Python classes for indexing, retrieval, and in-place modification of FASTA files using a samtools-compatible index. The pyfaidx module is API-compatible with the pygr seqdb module. A command-line script “faidx” is installed alongside the pyfaidx module and facilitates complex manipulation of FASTA files without any programming knowledge.
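To make the idea of a flat index concrete, here is a minimal standard-library sketch of how such an index can turn a base coordinate into a byte offset. It is not pyfaidx's actual implementation: the helper names `build_index` and `fetch` are hypothetical. The four stored values mirror the LENGTH, OFFSET, LINEBASES and LINEWIDTH columns of a “.fai” file, and the sketch assumes every line of a record except the last has the same width.

```python
import os
import tempfile


def build_index(fasta_path):
    """Return {name: (length, offset, linebases, linewidth)} for each record.

    Hypothetical helper: the tuple mirrors the numeric columns of a .fai file.
    """
    index = {}
    with open(fasta_path, "rb") as handle:
        name = None
        length = offset = linebases = linewidth = 0
        while True:
            line = handle.readline()
            if not line or line.startswith(b">"):
                if name is not None:
                    index[name] = (length, offset, linebases, linewidth)
                if not line:  # end of file
                    break
                name = line[1:].split()[0].decode()
                length = linebases = linewidth = 0
                offset = handle.tell()  # first byte of this record's sequence
                continue
            bases = len(line.rstrip(b"\r\n"))
            if linebases == 0:
                # Assumption: all full lines in a record share this width.
                linebases, linewidth = bases, len(line)
            length += bases
    return index


def fetch(fasta_path, index, name, start, end):
    """Return bases [start, end) (0-based, end-exclusive) of record `name`."""
    length, offset, linebases, linewidth = index[name]
    start, end = max(0, start), min(end, length)
    if start >= end:
        return ""
    # Map base coordinates to byte positions, accounting for newline bytes.
    first = offset + (start // linebases) * linewidth + start % linebases
    last = offset + (end // linebases) * linewidth + end % linebases
    with open(fasta_path, "rb") as handle:
        handle.seek(first)  # jump straight to the slice; nothing before it is read
        chunk = handle.read(last - first)
    return chunk.replace(b"\n", b"").replace(b"\r", b"").decode()


# Usage on a tiny, made-up FASTA file.
with tempfile.NamedTemporaryFile("wb", suffix=".fa", delete=False) as tmp:
    tmp.write(b">chr1 demo\nACGTACGT\nGGCC\n>chr2\nTTTTAAAA\nCC\n")
demo_index = build_index(tmp.name)
print(fetch(tmp.name, demo_index, "chr1", 6, 10))  # GTGG
os.remove(tmp.name)
```

Only the bytes covering the requested slice are read from disk, which is why an index of this kind keeps memory use low even for very large files.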

