NVBIO
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Groups Pages
Public Methods | Static Public Methods | Public Members | List of all members
nvbio::cuda::CompressionSort Struct Reference

Detailed description

A sorting enactor for sorting strings or string suffixes using Iterative Compression Sorting - an algorithm inspired by Tero Karras and Timo Aila's Flexible Parallel Sorting through Iterative Key Compression.
The algorithm is essentially a parallel MSD radix sort, where the digits are words obtained packing as many symbols as possible into a 32-bit word. At each radix pass, the sorter prunes completely sorted strings, and keeps sorting ties using Sean Baxter's fast segmented sort algorithm available inside his Modern-GPU library.

Definition at line 131 of file compression_sort.h.

#include <compression_sort.h>

Public Methods

 CompressionSort (mgpu::ContextPtr _mgpu)
 
template<typename string_type , typename output_iterator , typename delay_list_type >
void sort (const typename string_type::index_type string_len, const string_type string, const uint32 n_suffixes, output_iterator d_suffixes, const uint32 delay_min_threshold, const uint32 delay_max_threshold, delay_list_type &delay_list)
 
template<typename set_type , typename input_iterator , typename output_iterator , typename delay_list_type >
void sort (set_type set, const uint32 n_strings, const uint32 max_words, input_iterator d_input, output_iterator d_output, const uint32 delay_threshold, delay_list_type &delay_list, const uint32 slice_size=8u)
 
void reserve (const uint32 n)
 
uint64 allocated_device_memory () const
 

Static Public Methods

static uint64 needed_device_memory (const uint32 n)
 

Public Members

float extract_time
 timing stats More...
 
float radixsort_time
 timing stats More...
 
float stablesort_time
 timing stats More...
 
float compress_time
 timing stats More...
 
float compact_time
 timing stats More...
 
float copy_time
 timing stats More...
 
float scatter_time
 timing stats More...
 

Constructor & Destructor Documentation

nvbio::cuda::CompressionSort::CompressionSort ( mgpu::ContextPtr  _mgpu)
inline

constructor

Definition at line 135 of file compression_sort.h.

Member Function Documentation

uint64 nvbio::cuda::CompressionSort::allocated_device_memory ( ) const
inline

return the amount of used device memory

Definition at line 255 of file compression_sort.h.

static uint64 nvbio::cuda::CompressionSort::needed_device_memory ( const uint32  n)
inlinestatic

return the amount of needed device memory

Definition at line 241 of file compression_sort.h.

void nvbio::cuda::CompressionSort::reserve ( const uint32  n)
inline

reserve enough storage for sorting n strings/suffixes

Definition at line 220 of file compression_sort.h.

template<typename string_type , typename output_iterator , typename delay_list_type >
void nvbio::cuda::CompressionSort::sort ( const typename string_type::index_type  string_len,
const string_type  string,
const uint32  n_suffixes,
output_iterator  d_suffixes,
const uint32  delay_min_threshold,
const uint32  delay_max_threshold,
delay_list_type &  delay_list 
)

Sort a given batch of suffixes using Iterative Compression Sorting - an algorithm inspired by Tero Karras and Timo Aila's Flexible Parallel Sorting through Iterative Key Compression. The sorter will attempt to sort all suffixes based on at least d and at most D words (each formed by packing as many characters as possible in a 32-bit word). If after d words there are ties which need to be solved, and the suffixes forming ties are less than 1/1000 of the input, those suffixes will be saved in a delay list. Analogously, after D words have been sorted, any ties, no matter how many, will be saved to the delay list.

Parameters
string_lenstring length
stringstring iterator
n_suffixesnumber of suffixes to sort
d_indicesdevice vector of input suffixes
d_suffixesdevice vector of output suffixes
delay_min_thresholdminimum number of words d used for sorting, before delaying is enabled
delay_max_thresholdmaximum number of words D used for sorting
delay_listthe output delay list

All the other parameters are temporary device buffers

Sufsort

Definition at line 301 of file compression_sort.h.

template<typename set_type , typename input_iterator , typename output_iterator , typename delay_list_type >
void nvbio::cuda::CompressionSort::sort ( set_type  set,
const uint32  n_strings,
const uint32  max_words,
input_iterator  d_input,
output_iterator  d_output,
const uint32  delay_threshold,
delay_list_type &  delay_list,
const uint32  slice_size = 8u 
)

Sort a given batch of strings using Iterative Compression Sorting - an algorithm inspired by Tero Karras and Timo Aila's "Flexible Parallel Sorting through Iterative Key Compression".

Template Parameters
set_typea set of "word" strings, needs to provide the following interface:
interface WordStringSet
{
// prepare to extract a "slice" of words from a given subset of the strings
void init_slice(
const uint32 n_strings, // number of strings in the slice
const uint32* strings, // a device pointer to the subset of strings in the slice (if NULL, the range [0,n_strings) is implied)
const uint32 word_range_begin, // the beginning of the range of words that will be required
const uint32 word_range_end); // the end of the range of words that will be required
// extract the given word from the given subset of strings
void extract(
const uint32 n_strings, // number of strings in the slice
const uint32* strings, // a device pointer to the subset of strings in the slice (if NULL, the range [0,n_strings) is implied)
const uint32 word_idx, // the index of the word to be extraced from each string
const uint32 word_range_begin, // the beginning of the range of words defining the slice we are operating on
const uint32 word_range_end, // the end of the range of words defining the slice we are operating on
uint32* out_words); // the output set of words extracted from the given strings
};
Parameters
setthe set of items to sort
n_stringsthe number of items to sort
d_inputdevice vector of input indices
d_outputdevice vector of output indices

All the other parameters are temporary device buffers

Definition at line 586 of file compression_sort.h.

Member Data Documentation

float nvbio::cuda::CompressionSort::compact_time

timing stats

Definition at line 272 of file compression_sort.h.

float nvbio::cuda::CompressionSort::compress_time

timing stats

Definition at line 271 of file compression_sort.h.

float nvbio::cuda::CompressionSort::copy_time

timing stats

Definition at line 273 of file compression_sort.h.

float nvbio::cuda::CompressionSort::extract_time

timing stats

Definition at line 268 of file compression_sort.h.

float nvbio::cuda::CompressionSort::radixsort_time

timing stats

Definition at line 269 of file compression_sort.h.

float nvbio::cuda::CompressionSort::scatter_time

timing stats

Definition at line 274 of file compression_sort.h.

float nvbio::cuda::CompressionSort::stablesort_time

timing stats

Definition at line 270 of file compression_sort.h.


The documentation for this struct was generated from the following file: