Jul 20, 2023

Python module for reading, manipulating and writing genomic data sets

bx-python is a python library and associated set of scripts to allow for rapid implementation of genome scale analyses. The library contains a variety of useful modules, but the particular strengths are

Classes for reading and working with genome-scale multiple local alignments in MAF, AXT, and LAV formats. Generic data structure for indexing on disk files that contain blocks of data associated with intervals on various sequences used, for example, to provide random access to individual alignments in huge files; optomized for use over network filesystems. Data structures for working with intervals on sequences

  • “Binned bitsets” which act just like chromosome sized bit arrays, but lazily allocate regions and allow large blocks of all set or all unset bits to be stored compactly
  • “Intersecter” for performing fast intersection tests that preserve both query and target intervals and associated annotation

Checkout these related ports:
  • Wise - Intelligent algorithms for DNA searches
  • Wfa2-lib - Exact gap-affine algorithm using homology to accelerate alignment
  • Vt - Discovers short variants from Next Generation Sequencing data
  • Vsearch - Versatile open-source tool for metagenomics
  • Viennarna - Alignment tools for the structural analysis of RNA
  • Velvet - Sequence assembler for very short reads
  • Vcftools - Tools for working with VCF genomics files
  • Vcflib - C++ library and CLI tools for parsing and manipulating VCF files
  • Vcf2hap - Generate .hap file from VCF for haplohseq
  • Vcf-split - Split a multi-sample VCF into single-sample VCFs
  • Unikmer - Toolkit for nucleic acid k-mer analysis, set operations on k-mers
  • Unanimity - Pacific Biosciences consensus library and applications
  • Ugene - Integrated bioinformatics toolkit
  • Ucsc-userapps - Command line tools from the UCSC Genome Browser project
  • Trimmomatic - Flexible read trimming tool for Illumina NGS data