NVBIO

NVBIO is a library of reusable components designed by NVIDIA to accelerate bioinformatics applications using CUDA. Though it is specifically designed to unleash the power of NVIDIA GPUs, most of its components are completely cross-platform and can be used both from host C++ and device CUDA code.
The purpose of NVBIO is twofold. It can serve as a solid basis for building new, modern applications targeting GPUs: by deferring their core computations to a library, such applications automatically and transparently benefit from new advances in GPU computing. It can also serve as example material for designing novel bioinformatics algorithms for massively parallel architectures.
Additionally, NVBIO contains a suite of applications built on top of it, including a re-engineered implementation of the famous Bowtie2 short read aligner. Unlike many prototypes, nvBowtie is an attempt to build an industrial-strength aligner, reproducing most of Bowtie2's original features as well as adding a few more, such as efficient support for direct BAM (and soon CRAM) output.
Similarly, it contains the fastest BWT builders for strings and string-sets available to date. NVBIO can in fact compute the BWT of individual strings containing several billion characters, or of string-sets containing up to ~100 billion symbols, at an unprecedented speed of roughly 80M symbols/s on a single Tesla K40.
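
For readers unfamiliar with the transform itself, here is a minimal host-side sketch of what a BWT computes. It is not NVBIO's algorithm or API: `naive_bwt` is a hypothetical helper that simply sorts all rotations of the input, which is far too slow for the string sizes mentioned above and is meant only to illustrate the definition. It assumes the input ends with a unique sentinel `$` that sorts before every other symbol.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical illustration, NOT NVBIO's API or algorithm.
// Computes the BWT by sorting all rotations of `text` and taking the last
// column. Assumes `text` ends with a unique sentinel '$' that compares
// smaller than every other symbol. Worst-case cost is O(n^2 log n).
std::string naive_bwt(const std::string& text)
{
    const std::size_t n = text.size();

    // rot[i] is the starting offset of the i-th rotation.
    std::vector<std::size_t> rot(n);
    std::iota(rot.begin(), rot.end(), std::size_t(0));

    // Sort the rotations lexicographically, comparing symbol by symbol.
    std::sort(rot.begin(), rot.end(), [&](std::size_t a, std::size_t b) {
        for (std::size_t k = 0; k < n; ++k) {
            const char ca = text[(a + k) % n];
            const char cb = text[(b + k) % n];
            if (ca != cb) return ca < cb;
        }
        return false; // identical rotations
    });

    // The BWT is the symbol preceding each sorted rotation.
    std::string out(n, '\0');
    for (std::size_t i = 0; i < n; ++i)
        out[i] = text[(rot[i] + n - 1) % n];
    return out;
}
```

Under these assumptions, calling `naive_bwt("banana$")` yields `annb$aa`. Production builders avoid materializing rotations altogether and rely on suffix sorting instead.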

Features

Here's a short list of some of the most noteworthy features of NVBIO:
  • many strings and string-sets representations, including packed string encodings for arbitrary alphabets
  • FM-index and Bidirectional FM-index construction and lookup algorithms on arbitrary strings and compile-time alphabets
  • novel, state-of-the-art parallel suffix sorting and BWT construction algorithms for very large strings and string sets, on arbitrary alphabets
  • Wavelet Trees and other succinct data structures for large alphabet strings
  • efficient parallel k-Maximal Extension Match filters
  • novel, state-of-the-art parallel Q-Gram (or k-mer) Index construction and lookup
  • Dynamic Programming Alignment with parallel implementations of most scoring algorithms (e.g. Edit-Distance, Smith-Waterman, Gotoh,...) with support for global, semi-global or local alignment, in both banded and full-matrix modes, on arbitrary string types
  • parallel Bloom Filter construction and lookup
  • state-of-the-art support for many parallel primitives, like sorting, reduction, prefix-sum, stream compaction, run-length encoding, etc
  • state-of-the-art I/O of common sequencing formats (e.g. FASTA, FASTQ, BAM), with transparent, parallel compression/decompression
Importantly, all of the above algorithms and data structures have both CPU and GPU implementations, easily selected through template specialization.
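
To make the idea of a packed string encoding concrete, the following is a minimal sketch of a 2-bit DNA string in plain host C++. The class `PackedDna` and its methods are hypothetical names introduced for illustration and are not part of NVBIO; the library's own packed streams are more general (arbitrary alphabets and symbol widths). The sketch assumes the input contains only the symbols A, C, G and T.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical illustration, NOT NVBIO's API.
// Stores a DNA string at 2 bits per symbol, 16 symbols per 32-bit word.
class PackedDna
{
public:
    explicit PackedDna(const std::string& s)
        : size_(s.size()), words_((s.size() + 15) / 16, 0u)
    {
        for (std::size_t i = 0; i < s.size(); ++i)
            words_[i / 16] |= encode(s[i]) << (2 * (i % 16));
    }

    // Decode the i-th symbol back to its character.
    char operator[](std::size_t i) const
    {
        assert(i < size_);
        const std::uint32_t code = (words_[i / 16] >> (2 * (i % 16))) & 3u;
        return "ACGT"[code];
    }

    std::size_t size() const { return size_; }             // number of symbols
    std::size_t word_count() const { return words_.size(); } // 32-bit words used

private:
    // Map a symbol to its 2-bit code; assumes the ACGT alphabet only.
    static std::uint32_t encode(char c)
    {
        switch (c) {
            case 'A': return 0u;
            case 'C': return 1u;
            case 'G': return 2u;
            case 'T': return 3u;
        }
        assert(false && "symbol outside the ACGT alphabet");
        return 0u;
    }

    std::size_t size_;
    std::vector<std::uint32_t> words_;
};
```

With this layout a 17-symbol read occupies two 32-bit words instead of 17 bytes, which is the kind of saving that matters when whole read sets must fit in GPU memory.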

Links

   Browse or fork NVBIO at GitHub
   The NVBIO users forum
   Download NVBIO 1.1

Documentation

Documentation for the NVBIO suite can be found here:
  • NVBIO - the NVBIO library
  • nvBowtie - a re-engineered implementation of the famous Bowtie2 short read aligner
  • nvBWT - a tool to perform BWT-based reference indexing
  • nvSSA - a tool to build auxiliary Sampled Suffix Arrays needed for reference indexing
  • nvFM-server - a shared memory FM-index server
  • nvSetBWT - a tool to perform BWT-based read indexing
  • nvLighter - a re-engineered implementation of the Lighter error corrector

Recent News

13/2/2015
NVBIO 1.1.50 (dev)
  • New Features:
    • Added nvLighter, a re-engineered implementation of the Lighter error corrector.
    • Added support for sequencing data output in many formats (.txt, FASTA, FASTQ, with optional gzip and LZ4 compression).
13/2/2015
NVBIO 1.1
24/1/2015
NVBIO 1.0
  • New Features:
    • Major stability, specificity and sensitivity improvements to nvBowtie
    • Added support for finding and reporting discordant alignments
    • Improved direct BAM output speed by over an order of magnitude with transparent parallel compression
13/1/2015
NVBIO 0.9.99
  • New Features:
    • Made nvBowtie's command line compatible with Bowtie2
    • Added many missing features to nvBowtie, including support for:
      • trimming (-3/--trim3, -5/--trim5)
      • forward/reverse alignment (--nofw, --norc)
      • presets (--fast, --very-fast, --sensitive, --very-sensitive)
20/12/2014
NVBIO 0.9.98
  • New Features:
    • Added multi-GPU support to nvBowtie, together with many accuracy and sensitivity improvements
    • Released a new version of nvSetBWT based on our freshly developed massively parallel algorithm: http://arxiv.org/pdf/1410.0562.pdf
15/09/2014
NVBIO 0.9.97
  • New Features:
    • Added many Bowtie2-compatible command line options to nvBowtie
    • Added --ungapped-mates option to nvBowtie, which doubles speed at a minor cost in sensitivity (<0.2%)
  • Bug Fixes:
    • Fixed several memory-corruption bugs in nvBowtie's paired-end alignment
26/05/2014
NVBIO 0.9.90
  • New Features:
    • Added a new Alphabets module for managing different alphabets.
    • Added a new Sequence IO module for managing all kinds of sequence data; unlike the old io::ReadData, the containers in this module support various types of alphabet encodings, and can be used both for loading reads as well as for reading or mapping large reference data.
    • Rewrote the FM-Index IO module; the new io::FMIndexData offers more uniform interfaces, and unlike its predecessor no longer holds any reference data, which can now be separately loaded through the new Sequence IO module.
    • Rewrote the Batch Alignment Schedulers, adding support for a new OpenMP host backend.
16/05/2014
NVBIO 0.9.7
  • New Features:
    • Simplified the interfaces for Packed Streams
    • Added a suite of easy-to-use and highly efficient host / device parallel primitives
    • More uniform handling of nvbio::vector views
  • Bug Fixes:
    • Fixed option parsing bug in nvBWT.
08/05/2014
NVBIO 0.9.6
  • New Features:
    • Added a set of step-by-step introductory tutorials
    • Sped up MEM filtering
15/04/2014
NVBIO 0.9.5
  • New Features:
    • Added a whole new Q-Gram Module for q-gram indexing and q-gram counting.
    • Added a banded Myers bit-vector edit-distance algorithm running just short of 1 TCUPS!
    • Added parallel primitives for simple and efficient seed extraction from strings and string-sets
    • Added parallel FM-index and MEM filters
    • Further improved GPU BWT construction throughput by ~20%.
    • Added a set of examples showing how to build q-gram and FM-index based all-mappers in a few lines of code, as well as how to extract all MEMs from a set of reads
  • Bug Fixes:
    • Fixed access violation bug in ModernGPU's SegmentedSort (thanks Sean Baxter), used at the core of our suffix sorting routines
    • Fixed FMIndexDataDevice constructor bug when loading the reverse index

Dependencies

NVBIO depends on the following external libraries:

Requirements

NVBIO has been designed for GPUs supporting at least CUDA's Compute Capability 3.5. Due to the high memory requirements typical of bioinformatics algorithms, Tesla K20, K20x or K40 GPUs are recommended.

Licensing

NVBIO has been developed by NVIDIA Corporation and is licensed under BSD.

Contributors

The main contributors of NVBIO are Jacopo Pantaleoni and Nuno Subtil.