NVBIO is a library of reusable components designed by NVIDIA to accelerate bioinformatics applications using CUDA. Though it is specifically designed to unleash the power of NVIDIA GPUs, most of its components are completely cross-platform and can be used from both host C++ and device CUDA code.
The purpose of NVBIO is twofold. First, it is a solid basis for building new, modern applications targeting GPUs: by deferring their core computations to a library, such applications automatically and transparently benefit from new advances in GPU computing. Second, it serves as example material for designing novel bioinformatics algorithms for massively parallel architectures.
Additionally, NVBIO contains a suite of applications built on top of it, including nvBowtie, a re-engineered implementation of the famous Bowtie2 short read aligner. Unlike many prototypes, nvBowtie is an attempt to build an industrial-strength aligner, reproducing most of Bowtie2's original features as well as adding a few more, such as efficient support for direct BAM (and soon CRAM) output.
Similarly, NVBIO contains the fastest Burrows-Wheeler transform (BWT) builders for strings and string-sets available to date. It can compute the BWT of individual strings containing several billion characters, or of string-sets containing up to ~100 billion symbols, at an unprecedented speed of roughly 80M symbols/s on a single Tesla K40.
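For readers unfamiliar with the transform, here is a minimal and deliberately naive host-side sketch of the BWT of a single string. It is not NVBIO's algorithm or API: the function name `naive_bwt` and the `'$'` sentinel are assumptions made for illustration, and the suffix sort used here is far too slow for the input sizes quoted above.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Naive BWT (hypothetical helper, not part of NVBIO): sort all suffixes of
// text + '$' and emit the character that precedes each sorted suffix.
// Assumption: '$' does not occur in `text` and compares lower than every
// symbol in it (true for upper- and lower-case letters in ASCII).
std::string naive_bwt(const std::string& text)
{
    const std::string s = text + '$';
    const std::size_t n = s.size();

    // Suffix array: suffix start positions ordered lexicographically.
    std::vector<std::size_t> sa(n);
    std::iota(sa.begin(), sa.end(), std::size_t{0});
    std::sort(sa.begin(), sa.end(), [&s](std::size_t a, std::size_t b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });

    // BWT[i] is the character just before suffix sa[i], wrapping around
    // so that the suffix starting at 0 is preceded by the sentinel.
    std::string bwt(n, '\0');
    for (std::size_t i = 0; i < n; ++i)
        bwt[i] = s[(sa[i] + n - 1) % n];
    return bwt;
}
```

Calling `naive_bwt("banana")` yields `"annb$aa"`. A production builder replaces the comparison sort with a scalable suffix-sorting scheme, which is where a GPU implementation earns its throughput.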
Here's a short list of some of the most noteworthy features of NVBIO:
Novel, state-of-the-art parallel Q-Gram (or k-mer) Index construction and lookup
Dynamic Programming Alignment with parallel implementations of most scoring algorithms (e.g. Edit-Distance, Smith-Waterman, Gotoh, ...) with support for global, semi-global or local alignment, in both banded and full-matrix modes, on arbitrary string types
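To make the two features above more concrete, the following is a small sequential C++ sketch of the underlying ideas: a sort-based q-gram index over a DNA string, and a full-matrix Smith-Waterman local alignment score. Every name here (`QGramIndex`, `build_qgram_index`, `lookup_qgram`, `smith_waterman_score`) and the scoring defaults are hypothetical and chosen for illustration only; none of this is NVBIO's API, and the library's own implementations are parallel rather than sequential.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Map a DNA character to a 2-bit code; anything else maps to 4 (invalid).
inline std::uint32_t dna_code(char c)
{
    switch (c) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default:  return 4;
    }
}

// Hypothetical index layout: (packed q-gram, text position) pairs,
// sorted by q-gram so that lookups reduce to a binary search.
struct QGramIndex {
    unsigned q = 0;
    std::vector<std::pair<std::uint32_t, std::uint32_t>> entries;
};

// Index every q-gram of `text` (1 <= q <= 16); windows containing a
// non-ACGT character are skipped.
QGramIndex build_qgram_index(const std::string& text, unsigned q)
{
    QGramIndex index;
    index.q = q;
    if (q == 0 || q > 16 || text.size() < q) return index;
    for (std::size_t i = 0; i + q <= text.size(); ++i) {
        std::uint32_t code = 0;
        bool valid = true;
        for (unsigned j = 0; j < q && valid; ++j) {
            const std::uint32_t c = dna_code(text[i + j]);
            valid = (c < 4);
            code = (code << 2) | (c & 3u);
        }
        if (valid) index.entries.emplace_back(code, static_cast<std::uint32_t>(i));
    }
    std::sort(index.entries.begin(), index.entries.end());
    return index;
}

// Return all text positions of `qgram`, in increasing order.
std::vector<std::uint32_t> lookup_qgram(const QGramIndex& index, const std::string& qgram)
{
    std::vector<std::uint32_t> hits;
    if (index.q == 0 || qgram.size() != index.q) return hits;
    std::uint32_t code = 0;
    for (char ch : qgram) {
        const std::uint32_t c = dna_code(ch);
        if (c > 3) return hits;
        code = (code << 2) | c;
    }
    auto it = std::lower_bound(index.entries.begin(), index.entries.end(),
                               std::make_pair(code, std::uint32_t{0}));
    for (; it != index.entries.end() && it->first == code; ++it)
        hits.push_back(it->second);
    return hits;
}

// Smith-Waterman local alignment score with a linear gap penalty,
// computed over the full matrix using two rolling rows.
int smith_waterman_score(const std::string& a, const std::string& b,
                         int match = 2, int mismatch = -1, int gap = -2)
{
    std::vector<int> prev(b.size() + 1, 0), curr(b.size() + 1, 0);
    int best = 0;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = 0;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            const int diag = prev[j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            const int up   = prev[j] + gap;
            const int left = curr[j - 1] + gap;
            curr[j] = std::max({0, diag, up, left});  // clamp at 0: local alignment
            best = std::max(best, curr[j]);
        }
        std::swap(prev, curr);
    }
    return best;
}
```

For example, indexing `"ACGTACGT"` with `q = 4` and looking up `"ACGT"` returns positions 0 and 4, and `smith_waterman_score("ACGT", "ACGT")` returns 8 with the default scores. The index is built by sorting because a sort maps well onto parallel hardware. A banded aligner would restrict the inner loop to cells near the main diagonal.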
Added many Bowtie2-compatible command line options to nvBowtie
Added --ungapped-mates option to nvBowtie, which doubles speed at a minor cost in sensitivity (<0.2%)
Fixed several memory-corruption bugs in nvBowtie's paired-end alignment
Added a new Alphabets module for managing different alphabets.
Added a new Sequence IO module for managing all kinds of sequence data; unlike the old io::ReadData, the containers in this module support various types of alphabet encodings, and can be used both for loading reads as well as for reading or mapping large reference data.
Rewrote the FM-Index IO module; the new io::FMIndexData offers more uniform interfaces, and unlike its predecessor no longer holds any reference data, which can now be separately loaded through the new Sequence IO module.