NVBIO
Sequence Data Input

Detailed Description

This module contains a series of classes to load and represent read streams. A read stream is an object implementing a simple interface, SequenceDataStream, which allows streaming through a file or other set of reads in batches. Each batch is represented in memory by an object inheriting from SequenceData. Several kinds of SequenceData containers are available to keep the reads in host RAM or in CUDA device memory. Additionally, the same container can be viewed through different SequenceData Views, which reinterpret the base arrays as arrays of different types, e.g. to perform vector loads or use LDG.
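The batch-streaming pattern described above can be sketched without the library. The following is a minimal stand-in, not NVBIO code: `ToyBatch`, `ToyStream` and `count_bases` are hypothetical names invented for illustration, their member functions only mirror the names of the `next`, `append` and `skip` utilities listed on this page with simplified signatures, and the convention that each call returns the number of reads processed (0 at end of stream) is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a SequenceData container: one batch of reads in host memory.
struct ToyBatch {
    std::vector<std::string> reads;
};

// Hypothetical stand-in for a read stream: hands out reads in batches.
class ToyStream {
public:
    explicit ToyStream(std::vector<std::string> reads) : reads_(std::move(reads)) {}

    // Replace the contents of `batch` with up to `batch_size` reads.
    // Assumed convention: returns the number of reads loaded, 0 at end of stream.
    int next(ToyBatch* batch, std::uint32_t batch_size) {
        batch->reads.clear();
        return append(batch, batch_size);
    }

    // Append up to `batch_size` reads to `batch` without clearing it first.
    int append(ToyBatch* batch, std::uint32_t batch_size) {
        const std::size_t n = std::min<std::size_t>(batch_size, reads_.size() - pos_);
        for (std::size_t i = 0; i < n; ++i) batch->reads.push_back(reads_[pos_ + i]);
        pos_ += n;
        return static_cast<int>(n);
    }

    // Advance past up to `batch_size` reads without storing them.
    int skip(std::uint32_t batch_size) {
        const std::size_t n = std::min<std::size_t>(batch_size, reads_.size() - pos_);
        pos_ += n;
        return static_cast<int>(n);
    }

private:
    std::vector<std::string> reads_;
    std::size_t pos_ = 0;
};

// Stream through all remaining reads in fixed-size batches and count the bases seen.
std::size_t count_bases(ToyStream& stream, std::uint32_t batch_size) {
    assert(batch_size > 0);
    ToyBatch batch;
    std::size_t total = 0;
    while (stream.next(&batch, batch_size) > 0) {
        for (const std::string& r : batch.reads) total += r.size();
    }
    return total;
}
```

The point of the pattern is that only one batch is resident at a time, so the consumer's memory footprint is bounded by the batch size rather than by the size of the input.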

Modules

 SequenceData Views
 
 SequenceIODetail
 

Classes

struct  nvbio::io::SequenceDataInfo
 
struct  nvbio::io::SequenceData
 
struct  nvbio::io::SequenceDataStorage< system_tag >
 
struct  nvbio::io::SequenceDataInputStream
 
struct  nvbio::io::SequenceDataOutputStream
 
struct  nvbio::io::SequenceDataAccess< SEQUENCE_ALPHABET_T, SequenceDataT >
 
struct  nvbio::io::SequenceDataEdit< SEQUENCE_ALPHABET_T, SequenceDataT >
 
struct  nvbio::io::SequenceDataMMAPServer
 
struct  nvbio::io::SequenceDataMMAP
 
struct  nvbio::io::SequenceDataTraits< SEQUENCE_ALPHABET >
 

Typedefs

typedef SequenceDataStorage< host_tag > nvbio::io::SequenceDataHost
 a SequenceData object stored in host memory
 
typedef SequenceDataStorage< device_tag > nvbio::io::SequenceDataDevice
 a SequenceData object stored in device memory
 
typedef SequenceDataInputStream nvbio::io::SequenceDataStream
 

Enumerations

enum  nvbio::io::QualityEncoding { nvbio::io::Phred = 0, nvbio::io::Phred33 = 1, nvbio::io::Phred64 = 2, nvbio::io::Solexa = 3 }
 
enum  nvbio::io::SequenceEncoding { nvbio::io::FORWARD = 0x0001, nvbio::io::REVERSE = 0x0002, nvbio::io::FORWARD_COMPLEMENT = 0x0004, nvbio::io::REVERSE_COMPLEMENT = 0x0008 }
 
enum  nvbio::io::SequenceFlags { nvbio::io::SEQUENCE_DATA = 0x0001, nvbio::io::SEQUENCE_QUALS = 0x0002, nvbio::io::SEQUENCE_NAMES = 0x0004 }
 
enum  nvbio::io::PairedEndPolicy { nvbio::io::PE_POLICY_FF = 0, nvbio::io::PE_POLICY_FR = 1, nvbio::io::PE_POLICY_RF = 2, nvbio::io::PE_POLICY_RR = 3 }
 

Functions

NVBIO_FORCEINLINE NVBIO_HOST_DEVICE bool nvbio::io::operator== (const SequenceDataInfo &op1, const SequenceDataInfo &op2)
 
NVBIO_FORCEINLINE NVBIO_HOST_DEVICE bool nvbio::io::operator!= (const SequenceDataInfo &op1, const SequenceDataInfo &op2)
 
template<Alphabet ALPHABET, typename SequenceDataT >
SequenceDataAccess< ALPHABET, SequenceDataT > nvbio::io::make_access (const SequenceDataT &data)
 
SequenceDataMMAP * nvbio::io::map_sequence_file (const char *sequence_file_name)
 
int next (const Alphabet alphabet, SequenceDataHost *data, SequenceDataInputStream *stream, const uint32 batch_size, const uint32 batch_bps=uint32(-1))
 
int append (const Alphabet alphabet, SequenceDataHost *data, SequenceDataInputStream *stream, const uint32 batch_size, const uint32 batch_bps=uint32(-1))
 
int skip (SequenceDataInputStream *stream, const uint32 batch_size)
 
SequenceDataInputStream * open_sequence_file (const char *sequence_file_name, const QualityEncoding qualities=Phred33, const uint32 max_seqs=uint32(-1), const uint32 max_sequence_len=uint32(-1), const SequenceEncoding flags=FORWARD, const uint32 trim3=0, const uint32 trim5=0)
 
SequenceDataOutputStream * open_output_sequence_file (const char *sequence_file_name, const char *compression)
 

Typedef Documentation

typedef SequenceDataStorage<device_tag> nvbio::io::SequenceDataDevice

a SequenceData object stored in device memory

Definition at line 593 of file sequence.h.

typedef SequenceDataStorage<host_tag> nvbio::io::SequenceDataHost

a SequenceData object stored in host memory

Definition at line 592 of file sequence.h.

typedef SequenceDataInputStream nvbio::io::SequenceDataStream

legacy typedef

Definition at line 619 of file sequence.h.

Enumeration Type Documentation

enum nvbio::io::PairedEndPolicy

Enumerator
PE_POLICY_FF 
PE_POLICY_FR 
PE_POLICY_RF 
PE_POLICY_RR 

Definition at line 190 of file sequence.h.

enum nvbio::io::QualityEncoding

Enumerator
Phred 

phred quality

Phred33 

phred quality + 33

Phred64 

phred quality + 64

Solexa 

Solexa quality.

Definition at line 163 of file sequence.h.
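To make the offsets above concrete, here is a minimal sketch of converting one raw quality byte into a Phred score under each encoding. It is not the library's decoder: `to_phred` is a hypothetical helper, the enum is a local copy of the values listed on this page, and the Solexa branch assumes the score is stored with a +64 offset and uses the standard Solexa-to-Phred conversion, Q_phred = 10 log10(10^(Q_solexa/10) + 1).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Local copy of the enumerator values listed above, so the sketch compiles on its own.
enum QualityEncoding { Phred = 0, Phred33 = 1, Phred64 = 2, Solexa = 3 };

// Hypothetical helper: convert one raw quality byte into a Phred score.
std::uint8_t to_phred(std::uint8_t raw, QualityEncoding enc) {
    switch (enc) {
    case Phred:
        return raw;                                   // already a Phred score
    case Phred33:
        assert(raw >= 33);
        return static_cast<std::uint8_t>(raw - 33);   // remove the +33 offset
    case Phred64:
        assert(raw >= 64);
        return static_cast<std::uint8_t>(raw - 64);   // remove the +64 offset
    case Solexa: {
        // Assumption: the Solexa score is stored as score + 64 and may be negative.
        const double q_solexa = static_cast<double>(raw) - 64.0;
        const double q_phred  = 10.0 * std::log10(std::pow(10.0, q_solexa / 10.0) + 1.0);
        return static_cast<std::uint8_t>(std::lround(q_phred));
    }
    }
    return 0;  // unreachable for valid enumerators
}
```

For example, the character `'I'` (ASCII 73) under `Phred33` and the character `'h'` (ASCII 104) under `Phred64` both decode to a Phred score of 40.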

enum nvbio::io::SequenceEncoding

Enumerator
FORWARD 
REVERSE 
FORWARD_COMPLEMENT 
REVERSE_COMPLEMENT 

Definition at line 172 of file sequence.h.

enum nvbio::io::SequenceFlags

Enumerator
SEQUENCE_DATA 
SEQUENCE_QUALS 
SEQUENCE_NAMES 

Definition at line 181 of file sequence.h.

Function Documentation

int append ( const Alphabet  alphabet,
SequenceDataHost *  data,
SequenceDataInputStream *  stream,
const uint32  batch_size,
const uint32  batch_bps = uint32(-1) 
)
related

utility method to append the next batch from a SequenceDataInputStream

Definition at line 504 of file sequence_encoder.cpp.

template<Alphabet ALPHABET, typename SequenceDataT >
SequenceDataAccess<ALPHABET,SequenceDataT> nvbio::io::make_access ( const SequenceDataT &  data)

Definition at line 207 of file sequence_access.h.

SequenceDataMMAP * nvbio::io::map_sequence_file ( const char *  sequence_file_name)

map a sequence file into mapped system memory

Parameters
sequence_file_name    the file to open
Returns
a heap-allocated SequenceDataMMAP object on success, NULL otherwise

Definition at line 113 of file sequence_mmap.cpp.
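A factory that returns a heap-allocated object or NULL leaves both the null check and the cleanup to the caller. The sketch below shows one common way to handle that contract in C++. It uses stand-ins rather than the library: `ToyMapped`, `toy_map_file` and `open_mapped` are hypothetical, and the assumption that the returned object is released with `delete` is not confirmed by this page.

```cpp
#include <cassert>
#include <cstring>
#include <memory>

// Hypothetical stand-in for the mapped object.
struct ToyMapped {
    const char* name;
};

// Hypothetical factory with the same contract as documented above:
// a heap-allocated object on success, NULL otherwise.
ToyMapped* toy_map_file(const char* file_name) {
    if (file_name == nullptr || std::strlen(file_name) == 0) return nullptr;
    return new ToyMapped{file_name};
}

// Wrap the raw pointer so it is released automatically (assumes `delete` is the right deleter).
std::unique_ptr<ToyMapped> open_mapped(const char* file_name) {
    return std::unique_ptr<ToyMapped>(toy_map_file(file_name));
}
```

A caller would then test the result before use, for example `if (auto m = open_mapped("reads")) { /* use m->name */ }`.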

int next ( const Alphabet  alphabet,
SequenceDataHost *  data,
SequenceDataInputStream *  stream,
const uint32  batch_size,
const uint32  batch_bps = uint32(-1) 
)
related

utility method to get the next batch from a SequenceDataInputStream

Definition at line 455 of file sequence_encoder.cpp.

SequenceDataOutputStream * open_output_sequence_file ( const char *  sequence_file_name,
const char *  compression 
)
related

factory method to open a read file for writing

Parameters
sequence_file_name    the file to open
compression           compression options

Definition at line 276 of file sequence_priv.cpp.

SequenceDataInputStream * open_sequence_file ( const char *  sequence_file_name,
const QualityEncoding  qualities = Phred33,
const uint32  max_seqs = uint32(-1),
const uint32  max_sequence_len = uint32(-1),
const SequenceEncoding  flags = FORWARD,
const uint32  trim3 = 0,
const uint32  trim5 = 0 
)
related

factory method to open a read file

Parameters
sequence_file_name    the file to open
qualities             the encoding of the qualities
max_seqs              maximum number of reads to input
max_sequence_len      maximum read length; longer reads will be truncated
flags                 a set of flags indicating which strands to encode in the batch for each read. For example, passing FORWARD | REVERSE_COMPLEMENT will result in a stream containing both the forward and reverse-complemented strands.

Definition at line 85 of file sequence_priv.cpp.
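The effect of combining strand flags can be illustrated on a plain string. This is a sketch of the idea only, not the library's encoder: `encode_strands` and its helpers are hypothetical, the enum is a local copy of the `SequenceEncoding` values listed on this page, the order in which strands are emitted is an assumption, and the flags are taken as a plain `uint32_t` so that `FORWARD | REVERSE_COMPLEMENT` can be passed without a cast.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Local copy of the enumerator values listed on this page.
enum SequenceEncoding : std::uint32_t {
    FORWARD            = 0x0001,
    REVERSE            = 0x0002,
    FORWARD_COMPLEMENT = 0x0004,
    REVERSE_COMPLEMENT = 0x0008
};

// Complement a single DNA base; anything outside ACGT maps to 'N'.
char complement(char b) {
    switch (b) {
    case 'A': return 'T';
    case 'C': return 'G';
    case 'G': return 'C';
    case 'T': return 'A';
    default:  return 'N';
    }
}

std::string complemented(std::string s) {
    for (char& c : s) c = complement(c);
    return s;
}

std::string reversed(std::string s) {
    std::reverse(s.begin(), s.end());
    return s;
}

// Hypothetical helper: emit one string per strand bit set in `flags`.
// Assumed order: forward, reverse, forward-complement, reverse-complement.
std::vector<std::string> encode_strands(const std::string& read, std::uint32_t flags) {
    assert((flags & ~0x000Fu) == 0 && "unknown strand bit");
    std::vector<std::string> out;
    if (flags & FORWARD)            out.push_back(read);
    if (flags & REVERSE)            out.push_back(reversed(read));
    if (flags & FORWARD_COMPLEMENT) out.push_back(complemented(read));
    if (flags & REVERSE_COMPLEMENT) out.push_back(reversed(complemented(read)));
    return out;
}
```

With the read `AACG`, passing `FORWARD | REVERSE_COMPLEMENT` yields two strands in this sketch: `AACG` and `CGTT`.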

NVBIO_FORCEINLINE NVBIO_HOST_DEVICE bool nvbio::io::operator!= ( const SequenceDataInfo &  op1,
const SequenceDataInfo &  op2 
)

comparison operator for SequenceDataInfo

Definition at line 263 of file sequence.h.

NVBIO_FORCEINLINE NVBIO_HOST_DEVICE bool nvbio::io::operator== ( const SequenceDataInfo &  op1,
const SequenceDataInfo &  op2 
)

comparison operator for SequenceDataInfo

Definition at line 244 of file sequence.h.

int skip ( SequenceDataInputStream *  stream,
const uint32  batch_size 
)
related

utility method to skip a batch from a SequenceDataInputStream