NVBIO
Implements a multi-pass WorkQueue that uses thrust::copy_if to compact continuations between kernel launches (see Work-Queues). It maintains input ordering in the thread assignment, and hence memory access coherence, but has potentially much higher continuation overhead than the Ordered queue when the queue capacity is low. At high queue capacities (e.g. millions of elements), the kernel launch overheads are amortized.
Definition at line 56 of file work_queue_multipass.h.
#include <work_queue_multipass.h>
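The multi-pass scheme described above can be illustrated with a small host-side model. This is only a sketch under stated assumptions, not the NVBIO implementation: `std::copy_if` stands in for `thrust::copy_if` (both are stable, so surviving units keep their relative order), a plain loop stands in for the kernel launch, and `ToyUnit`, `PassStats` and `consume_multipass` are hypothetical names introduced for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical work unit: needs `remaining` more passes before it finishes.
struct ToyUnit {
    std::uint32_t id;         // position of the unit in the input stream
    std::uint32_t remaining;  // steps of work still to do

    // Do one step of work; return true if a continuation is needed.
    bool run() {
        if (remaining > 0) --remaining;
        return remaining > 0;
    }
};

// Hypothetical statistics collected by the model.
struct PassStats {
    std::uint32_t passes = 0;      // number of simulated kernel launches
    bool order_preserved = true;   // queue stayed in input order on every pass
};

// Host-side model of a multi-pass queue: run, compact, refill, repeat.
// steps[i] is the number of passes that input unit i needs.
PassStats consume_multipass(const std::vector<std::uint32_t>& steps,
                            const std::size_t capacity)
{
    assert(capacity > 0);  // a zero-capacity queue could never make progress

    PassStats stats;
    std::vector<ToyUnit> queue, next;
    std::size_t fetched = 0;

    while (fetched < steps.size() || !queue.empty()) {
        // Refill free slots with fresh units, placed after the continuations.
        while (queue.size() < capacity && fetched < steps.size()) {
            queue.push_back(ToyUnit{static_cast<std::uint32_t>(fetched), steps[fetched]});
            ++fetched;
        }

        // Slot i maps to thread i, so a sorted queue means coherent assignment.
        stats.order_preserved = stats.order_preserved &&
            std::is_sorted(queue.begin(), queue.end(),
                           [](const ToyUnit& a, const ToyUnit& b) { return a.id < b.id; });

        // Simulated kernel launch: every queued unit runs one step.
        for (ToyUnit& u : queue) u.run();
        ++stats.passes;

        // Stable compaction: keep only the units that need a continuation.
        next.clear();
        std::copy_if(queue.begin(), queue.end(), std::back_inserter(next),
                     [](const ToyUnit& u) { return u.remaining > 0; });
        queue.swap(next);
    }
    return stats;
}
```

In this model, an input of six units needing {3, 1, 2, 1, 1, 2} steps takes 3 passes with a capacity of 6 and 6 passes with a capacity of 2, while the queue stays in input order in both cases. Each pass corresponds to one launch plus one compaction, which is why a larger capacity spreads that fixed cost over more work.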
Classes
struct Context
Public Types
typedef WorkUnitT WorkUnit
Public Methods
WorkQueue ()
void set_capacity (const uint32 capacity)
void set_separate_loads (const bool flag)
template<typename WorkStream> void consume (const WorkStream stream, WorkQueueStats *stats=NULL)
template<typename WorkStream, typename WorkMover> void consume (const WorkStream stream, const WorkMover mover, WorkQueueStats *stats=NULL)
typedef WorkUnitT nvbio::cuda::WorkQueue< MultiPassQueueTag, WorkUnitT, BLOCKDIM >::WorkUnit |
Definition at line 61 of file work_queue_multipass.h.
WorkQueue ()
inline
Constructor.
Definition at line 65 of file work_queue_multipass.h.
inline
Consume a stream of work units.
Definition at line 78 of file work_queue_multipass.h.
void set_capacity (const uint32 capacity)
inline
Set the queue capacity.
Definition at line 69 of file work_queue_multipass.h.
void set_separate_loads (const bool flag)
inline
Enable separate loads.
Definition at line 73 of file work_queue_multipass.h.