Jul 20, 2023

C++ library and CLI tools for parsing and manipulating VCF files

The Variant Call Format (VCF) is a flat-file, tab-delimited textual format intended to concisely describe reference-indexed variations between individuals. VCF provides a common interchange format for the description of variation in individuals and populations of samples, and has become the de facto standard reporting format for a wide array of genomic variant detectors.
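
To make "tab-delimited" concrete, here is a minimal sketch of splitting one VCF data line into its leading fixed columns (CHROM, POS, ID, REF, ALT). It uses only the C++ standard library and is not vcflib's API: the names `VcfSite`, `split`, and `parse_site` are hypothetical, chosen for illustration, and the sketch skips header lines, missing-value handling, and the per-sample columns.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical record type for illustration only; not a vcflib class.
struct VcfSite {
    std::string chrom;              // CHROM: reference sequence name
    long pos = 0;                   // POS: 1-based position on the reference
    std::string id;                 // ID: variant identifier, "." if none
    std::string ref;                // REF: reference allele
    std::vector<std::string> alts;  // ALT: comma-separated alternate alleles
};

// Split `text` on a single-character delimiter.
std::vector<std::string> split(const std::string& text, char delim) {
    std::vector<std::string> parts;
    std::stringstream stream(text);
    std::string item;
    while (std::getline(stream, item, delim)) {
        parts.push_back(item);
    }
    return parts;
}

// Parse the first five tab-separated columns of a VCF data line.
// Throws std::invalid_argument if fewer than five columns are present.
VcfSite parse_site(const std::string& line) {
    const std::vector<std::string> cols = split(line, '\t');
    if (cols.size() < 5) {
        throw std::invalid_argument("expected at least 5 tab-separated columns");
    }
    VcfSite site;
    site.chrom = cols[0];
    site.pos = std::stol(cols[1]);
    site.id = cols[2];
    site.ref = cols[3];
    site.alts = split(cols[4], ',');
    return site;
}
```

Calling `parse_site` on a line such as `20<TAB>14370<TAB>rs6054257<TAB>G<TAB>A,T<TAB>...` would yield a site on chromosome `20` at position `14370` with two alternate alleles. A real parser, vcflib included, must also handle the header, the QUAL, FILTER, and INFO columns, and the genotype fields.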

vcflib provides methods to manipulate and interpret sequence variation as it can be described by VCF. It is both:

  • an API for parsing and operating on records of genomic variation as it can be described by the VCF format, and
  • a collection of command-line utilities for executing complex manipulations on VCF files.

The API itself provides a quick and extremely permissive method to read and write VCF files. Extensions and applications of the library provided in the included utilities (*.cpp) comprise the vast bulk of the library’s utility for most users.

Check out these related ports:
  • Wise - Intelligent algorithms for DNA searches
  • Wfa2-lib - Exact gap-affine algorithm using homology to accelerate alignment
  • Vt - Discovers short variants from Next Generation Sequencing data
  • Vsearch - Versatile open-source tool for metagenomics
  • Viennarna - Alignment tools for the structural analysis of RNA
  • Velvet - Sequence assembler for very short reads
  • Vcftools - Tools for working with VCF genomics files
  • Vcf2hap - Generate .hap file from VCF for haplohseq
  • Vcf-split - Split a multi-sample VCF into single-sample VCFs
  • Unikmer - Toolkit for nucleic acid k-mer analysis, set operations on k-mers
  • Unanimity - Pacific Biosciences consensus library and applications
  • Ugene - Integrated bioinformatics toolkit
  • Ucsc-userapps - Command line tools from the UCSC Genome Browser project
  • Trimmomatic - Flexible read trimming tool for Illumina NGS data
  • Trimadap - Trim adapter sequences from Illumina data using heuristic rules