Fermat
An efficient warp-synchronous atomic adder that adds a compile-time constant to, or subtracts it from, a shared integer.
Given a pointer to an integer (e.g. one representing a "pool"), this class lets the threads in a warp add (allocate) or subtract (deallocate) a compile-time constant N in a predicated fashion: each thread contributes either zero or exactly N, depending on its predicate.
#include <warp_atomics.h>
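The predicated, warp-wide update described above can be illustrated with a small self-contained sketch. This is not Fermat's implementation: `sketch_static_add`, `alloc_kernel` and `run_alloc_demo` are hypothetical names, the block is assumed to be one-dimensional, and the meaning given to `result` (the old pool value plus N times the thread's rank among the predicated lanes) is an assumption about what "output result" stands for. The idea is that the warp votes on the predicate, one lane issues a single atomic for the whole warp, and the old value is broadcast back.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical stand-in for a warp-aggregated predicated add (assumption, not Fermat code).
// Every lane that is active at the call site must reach this function.
template <uint32_t N>
__device__ __forceinline__ void sketch_static_add(bool p, uint32_t* dest, uint32_t* result)
{
    const unsigned active = __activemask();            // lanes executing this call
    const unsigned votes  = __ballot_sync(active, p);  // lanes whose predicate is true
    if (votes == 0) return;                            // uniform exit: nothing to add

    const unsigned lane   = threadIdx.x & 31u;         // lane index, assuming a 1D block
    const int      leader = __ffs(votes) - 1;          // lowest lane with p == true

    uint32_t base = 0;
    if ((int)lane == leader)
        base = atomicAdd(dest, N * __popc(votes));     // one atomic for the whole warp
    base = __shfl_sync(active, base, leader);          // broadcast the old pool value

    // Number of predicated lanes below this one.
    const uint32_t rank = __popc(votes & ((1u << lane) - 1u));
    if (p) *result = base + N * rank;                  // assumed meaning of "result"
}

// Even-numbered threads allocate 4 entries each; the others keep a sentinel.
__global__ void alloc_kernel(uint32_t* pool, uint32_t* offsets)
{
    const uint32_t tid = blockIdx.x * blockDim.x + threadIdx.x;
    const bool p = (tid % 2u) == 0u;
    uint32_t slot = 0xFFFFFFFFu;
    sketch_static_add<4>(p, pool, &slot);
    offsets[tid] = slot;
}

// Launches a single warp, copies the 32 offsets back and returns the final pool value.
uint32_t run_alloc_demo(uint32_t* host_offsets)
{
    uint32_t* pool = nullptr;
    uint32_t* offsets = nullptr;
    cudaMalloc(&pool, sizeof(uint32_t));
    cudaMalloc(&offsets, 32 * sizeof(uint32_t));
    cudaMemset(pool, 0, sizeof(uint32_t));
    alloc_kernel<<<1, 32>>>(pool, offsets);
    uint32_t pool_value = 0;
    cudaMemcpy(&pool_value, pool, sizeof(uint32_t), cudaMemcpyDeviceToHost);
    cudaMemcpy(host_offsets, offsets, 32 * sizeof(uint32_t), cudaMemcpyDeviceToHost);
    cudaFree(pool);
    cudaFree(offsets);
    return pool_value;
}
```

With 16 predicated lanes and N = 4, the sketch issues one atomic of 64 instead of 16 separate atomics, which is the usual motivation for aggregating at warp level. The predicate is evaluated by every thread, and no thread returns early before the call, so that all lanes take part in the vote.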
Public Methods
__device__ __forceinline__ | warp_static_atomic (uint32 *pool) |
template<uint32 N> __device__ __forceinline__ void | add (bool p) |
template<uint32 N> __device__ __forceinline__ void | sub (bool p) |
template<uint32 N> __device__ __forceinline__ void | add (bool p, uint32 *result) |
template<uint32 N> __device__ __forceinline__ void | sub (bool p, uint32 *result) |
Static Public Methods
template<uint32 N> __device__ static __forceinline__ void | static_add (bool p, uint32 *dest) |
template<uint32 N> __device__ static __forceinline__ void | static_sub (bool p, uint32 *dest) |
template<uint32 N> __device__ static __forceinline__ void | static_add (bool p, uint32 *dest, uint32 *result) |
template<uint32 N> __device__ static __forceinline__ void | static_sub (bool p, uint32 *dest, uint32 *result) |
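__device__ __forceinline__ warp_static_atomic (uint32 *pool) | inline |
Stateful constructor: stores the pointer to the shared integer (the pool) on which the non-static add and sub methods operate.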
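As a rough illustration of the stateful form, the sketch below wraps a pool pointer in a small struct and exposes a fire-and-forget `add`. It is a minimal sketch under stated assumptions, not Fermat's code: `sketch_warp_adder`, `count_kernel` and `run_count_demo` are hypothetical names, and the block is assumed to be one-dimensional.

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical stateful wrapper (assumption, not Fermat code): the pool pointer is
// captured once, so later calls only need the predicate.
struct sketch_warp_adder
{
    __device__ __forceinline__ explicit sketch_warp_adder(uint32_t* pool) : m_pool(pool) {}

    // Fire-and-forget: each lane with p == true contributes N; nothing is returned.
    template <uint32_t N>
    __device__ __forceinline__ void add(bool p)
    {
        const unsigned active = __activemask();            // lanes executing this call
        const unsigned votes  = __ballot_sync(active, p);  // lanes whose predicate is true
        const unsigned lane   = threadIdx.x & 31u;         // lane index, assuming a 1D block
        if (votes != 0 && (int)lane == __ffs(votes) - 1)
            atomicAdd(m_pool, N * __popc(votes));          // one atomic for the whole warp
    }

    uint32_t* m_pool;
};

// Threads below `threshold` each add 2 to the pool.
__global__ void count_kernel(uint32_t* pool, uint32_t threshold)
{
    sketch_warp_adder adder(pool);
    adder.add<2>(threadIdx.x < threshold);
}

// Launches a single warp and returns the final pool value.
uint32_t run_count_demo(uint32_t threshold)
{
    uint32_t* pool = nullptr;
    cudaMalloc(&pool, sizeof(uint32_t));
    cudaMemset(pool, 0, sizeof(uint32_t));
    count_kernel<<<1, 32>>>(pool, threshold);
    uint32_t value = 0;
    cudaMemcpy(&value, pool, sizeof(uint32_t), cudaMemcpyDeviceToHost);
    cudaFree(pool);
    return value;
}
```

Because no lane needs the previous pool value, this variant skips the broadcast step entirely, which is presumably what "without waiting for the result" refers to in the member descriptions.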
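template<uint32 N> __device__ __forceinline__ void add (bool p) | inline |
Adds either zero or exactly N per thread to a shared value, without waiting for the result. Useful for allocating N entries from a common pool.
N | number of elements to allocate per thread (template parameter) |
p | allocation predicate |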
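template<uint32 N> __device__ __forceinline__ void add (bool p, uint32 *result) | inline |
Adds either zero or exactly N per thread to a shared value and returns the result. Useful for allocating N entries from a common pool.
N | number of elements to allocate per thread (template parameter) |
p | allocation predicate |
result | output result |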
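template<uint32 N> __device__ static __forceinline__ void static_add (bool p, uint32 *dest) | inline, static |
Adds either zero or exactly N per thread to a shared value, without waiting for the result. Useful for allocating N entries from a common pool.
p | allocation predicate |
dest | the destination of the atomic |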
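template<uint32 N> __device__ static __forceinline__ void static_add (bool p, uint32 *dest, uint32 *result) | inline, static |
Adds either zero or exactly N per thread to a shared value and returns the result. Useful for allocating N entries from a common pool.
p | allocation predicate |
dest | the destination of the atomic |
result | output result |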
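template<uint32 N> __device__ static __forceinline__ void static_sub (bool p, uint32 *dest) | inline, static |
Subtracts either zero or exactly N per thread from a shared value, without waiting for the result. Useful for deallocating N entries from a common pool.
p | deallocation predicate |
dest | the destination of the atomic |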
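The deallocation direction can be sketched the same way. The example below is a hedged illustration rather than the library's code: `sketch_static_sub`, `release_kernel` and `run_release_demo` are hypothetical names, the block is assumed to be one-dimensional, and no guard against unsigned underflow is included (subtracting more than the pool holds would wrap around).

```cuda
#include <cstdint>
#include <cuda_runtime.h>

// Hypothetical stand-in for a predicated warp-wide subtraction (assumption, not Fermat code).
template <uint32_t N>
__device__ __forceinline__ void sketch_static_sub(bool p, uint32_t* dest)
{
    const unsigned active = __activemask();            // lanes executing this call
    const unsigned votes  = __ballot_sync(active, p);  // lanes whose predicate is true
    const unsigned lane   = threadIdx.x & 31u;         // lane index, assuming a 1D block
    if (votes != 0 && (int)lane == __ffs(votes) - 1)
        atomicSub(dest, N * __popc(votes));            // one atomic for the whole warp
}

// Every fourth thread releases 3 entries back to the pool counter.
__global__ void release_kernel(uint32_t* pool)
{
    const bool p = (threadIdx.x % 4u) == 0u;
    sketch_static_sub<3>(p, pool);
}

// Launches a single warp on a pool holding `initial` and returns the final value.
uint32_t run_release_demo(uint32_t initial)
{
    uint32_t* pool = nullptr;
    cudaMalloc(&pool, sizeof(uint32_t));
    cudaMemcpy(pool, &initial, sizeof(uint32_t), cudaMemcpyHostToDevice);
    release_kernel<<<1, 32>>>(pool);
    uint32_t value = 0;
    cudaMemcpy(&value, pool, sizeof(uint32_t), cudaMemcpyDeviceToHost);
    cudaFree(pool);
    return value;
}
```

Here 8 of the 32 lanes have a true predicate, so the warp subtracts 24 in a single atomic.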
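template<uint32 N> __device__ static __forceinline__ void static_sub (bool p, uint32 *dest, uint32 *result) | inline, static |
Subtracts either zero or exactly N per thread from a shared value and returns the result. Useful for deallocating N entries from a common pool.
p | deallocation predicate |
dest | the destination of the atomic |
result | output result |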
template<uint32 N> __device__ __forceinline__ void sub (bool p) | inline |
Subtracts either zero or exactly N per thread from a shared value, without waiting for the result. Useful for deallocating N entries from a common pool.
p | deallocation predicate |
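template<uint32 N> __device__ __forceinline__ void sub (bool p, uint32 *result) | inline |
Subtracts either zero or exactly N per thread from a shared value and returns the result. Useful for deallocating N entries from a common pool.
p | deallocation predicate |
result | output result |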