Prodigal

Jul 20, 2023

Protein-coding gene prediction for prokaryotic genomes

Fast, reliable protein-coding gene prediction for prokaryotic genomes.

Features

  • Predicts protein-coding genes: Prodigal provides fast, accurate protein-coding gene predictions in GFF3, GenBank, or Sequin table format.
  • Handles draft genomes and metagenomes: Prodigal runs smoothly on finished genomes, draft genomes, and metagenomes.
  • Runs quickly: Prodigal analyzes the E. coli K-12 genome in 10 seconds on a modern MacBook Pro.
  • Runs unsupervised: Prodigal is an unsupervised machine learning algorithm. It needs no training data; instead, it automatically learns the properties of the genome from the sequence itself, including RBS motif usage, start codon usage, and coding statistics.
  • Handles gaps and partial genes: The user can specify whether Prodigal should build genes across runs of N's, as well as how to handle genes at the edges of contigs.
  • Identifies translation initiation sites: Prodigal predicts the correct translation initiation site for most genes, and can output information about every potential start site in the genome, including confidence score and RBS motif.
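
As a rough illustration of how the features above map onto a command line, here is a minimal Python sketch that assembles a Prodigal invocation. The helper names (build_prodigal_command, run_prodigal) and the file names are hypothetical. The flags used (-i, -o, -f, -p meta, -m, -c, -s) are taken from the Prodigal 2.6 help text as I understand it and may differ in other versions, so check `prodigal -h` for the installed port before relying on them.

```python
import subprocess
from typing import List, Optional

# Output formats named in the feature list: GenBank, GFF3, Sequin table.
OUTPUT_FORMATS = {"gbk", "gff", "sqn"}


def build_prodigal_command(
    input_fasta: str,
    output_path: str,
    output_format: str = "gff",
    metagenome: bool = False,
    mask_n_runs: bool = False,
    closed_ends: bool = False,
    starts_path: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for the prodigal executable (hypothetical helper)."""
    if output_format not in OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")

    cmd = ["prodigal", "-i", input_fasta, "-o", output_path, "-f", output_format]
    if metagenome:
        # Assumed flag: metagenomic mode for short or mixed-origin contigs.
        cmd += ["-p", "meta"]
    if mask_n_runs:
        # Assumed flag: treat runs of N as masked, so genes are not built across them.
        cmd.append("-m")
    if closed_ends:
        # Assumed flag: do not allow genes to run off the edges of contigs.
        cmd.append("-c")
    if starts_path is not None:
        # Assumed flag: write every potential start site, with scores, to this file.
        cmd += ["-s", starts_path]
    return cmd


def run_prodigal(cmd: List[str]) -> None:
    """Run the assembled command; requires prodigal to be installed and on PATH."""
    subprocess.run(cmd, check=True)
```

For a finished genome, something like run_prodigal(build_prodigal_command("genome.fna", "genes.gff")) would be the simplest call; no training data is passed because Prodigal learns its parameters from the input sequence.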


Check out these related ports:
  • Wise - Intelligent algorithms for DNA searches
  • Wfa2-lib - Exact gap-affine algorithm using homology to accelerate alignment
  • Vt - Discovers short variants from Next Generation Sequencing data
  • Vsearch - Versatile open-source tool for metagenomics
  • Viennarna - Alignment tools for the structural analysis of RNA
  • Velvet - Sequence assembler for very short reads
  • Vcftools - Tools for working with VCF genomics files
  • Vcflib - C++ library and CLI tools for parsing and manipulating VCF files
  • Vcf2hap - Generate .hap file from VCF for haplohseq
  • Vcf-split - Split a multi-sample VCF into single-sample VCFs
  • Unikmer - Toolkit for nucleic acid k-mer analysis, set operations on k-mers
  • Unanimity - Pacific Biosciences consensus library and applications
  • Ugene - Integrated bioinformatics toolkit
  • Ucsc-userapps - Command line tools from the UCSC Genome Browser project
  • Trimmomatic - Flexible read trimming tool for Illumina NGS data