CUB
 All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Groups
Public Methods | Static Public Members | List of all members
cub::CachingDeviceAllocator Struct Reference

Detailed description

A simple caching allocator for device memory allocations.

Overview
The allocator is thread-safe and stream-safe and is capable of managing cached device allocations on multiple devices. It behaves as follows:
  • Allocations from the allocator are associated with an active_stream. Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated with during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.
  • Allocations are categorized and cached by bin size. A new allocation request of a given size will only consider cached allocations within the corresponding bin.
  • Bin limits progress geometrically in accordance with the growth factor bin_growth provided during construction. Unused device allocations within a larger bin cache are not reused for allocation requests that categorize to smaller bin sizes.
  • Allocation requests below (bin_growth ^ min_bin) are rounded up to (bin_growth ^ min_bin).
  • Allocations above (bin_growth ^ max_bin) are not rounded up to the nearest bin and are simply freed when they are deallocated instead of being returned to a bin-cache.
  • If the total storage of cached allocations on a given device will exceed max_cached_bytes, allocations for that device are simply freed when they are deallocated instead of being returned to their bin-cache.
For example, the default-constructed CachingDeviceAllocator is configured with:
  • bin_growth = 8
  • min_bin = 3
  • max_bin = 7
  • max_cached_bytes = 6MB - 1B
which delineates five bin-sizes: 512B, 4KB, 32KB, 256KB, and 2MB and sets a maximum of 6,291,455 cached bytes per device
Examples:
example_device_partition_flagged.cu, example_device_partition_if.cu, example_device_radix_sort.cu, example_device_reduce.cu, example_device_scan.cu, example_device_select_flagged.cu, example_device_select_if.cu, and example_device_select_unique.cu.

Public Methods

 CachingDeviceAllocator (unsigned int bin_growth, unsigned int min_bin=1, unsigned int max_bin=INVALID_BIN, size_t max_cached_bytes=INVALID_SIZE, bool skip_cleanup=false, bool debug=false)
 Constructor. More...
 
 CachingDeviceAllocator (bool skip_cleanup=false, bool debug=false)
 Default constructor. More...
 
cudaError_t SetMaxCachedBytes (size_t max_cached_bytes)
 Sets the limit on the number bytes this allocator is allowed to cache per device. More...
 
cudaError_t DeviceAllocate (int device, void **d_ptr, size_t bytes, cudaStream_t active_stream=0)
 Provides a suitable allocation of device memory for the given size on the specified device. More...
 
cudaError_t DeviceAllocate (void **d_ptr, size_t bytes, cudaStream_t active_stream=0)
 Provides a suitable allocation of device memory for the given size on the current device. More...
 
cudaError_t DeviceFree (int device, void *d_ptr)
 Frees a live allocation of device memory on the specified device, returning it to the allocator. More...
 
cudaError_t DeviceFree (void *d_ptr)
 Frees a live allocation of device memory on the current device, returning it to the allocator. More...
 
cudaError_t FreeAllCached ()
 Frees all cached device allocations on all devices.
 
virtual ~CachingDeviceAllocator ()
 Destructor.
 

Static Public Members

static const unsigned int INVALID_BIN = (unsigned int) -1
 Out-of-bounds bin.
 
static const size_t INVALID_SIZE = (size_t) -1
 Invalid size.
 

Constructor & Destructor Documentation

cub::CachingDeviceAllocator::CachingDeviceAllocator ( unsigned int  bin_growth,
unsigned int  min_bin = 1,
unsigned int  max_bin = INVALID_BIN,
size_t  max_cached_bytes = INVALID_SIZE,
bool  skip_cleanup = false,
bool  debug = false 
)
inline

Constructor.

Parameters
bin_growthGeometric growth factor for bin-sizes
min_binMinimum bin (default is bin_growth ^ 1)
max_binMaximum bin (default is no max bin)
max_cached_bytesMaximum aggregate cached bytes per device (default is no limit)
skip_cleanupWhether or not to skip a call to FreeAllCached() when the destructor is called (default is to deallocate)
debugWhether or not to print (de)allocation events to stdout (default is no stderr output)
cub::CachingDeviceAllocator::CachingDeviceAllocator ( bool  skip_cleanup = false,
bool  debug = false 
)
inline

Default constructor.

Configured with:

  • bin_growth = 8
  • min_bin = 3
  • max_bin = 7
  • max_cached_bytes = (bin_growth ^ max_bin) * 3) - 1 = 6,291,455 bytes

which delineates five bin-sizes: 512B, 4KB, 32KB, 256KB, and 2MB and sets a maximum of 6,291,455 cached bytes per device

Member Function Documentation

cudaError_t cub::CachingDeviceAllocator::SetMaxCachedBytes ( size_t  max_cached_bytes)
inline

Sets the limit on the number bytes this allocator is allowed to cache per device.

Changing the ceiling of cached bytes does not cause any allocations (in-use or cached-in-reserve) to be freed. See FreeAllCached().

cudaError_t cub::CachingDeviceAllocator::DeviceAllocate ( int  device,
void **  d_ptr,
size_t  bytes,
cudaStream_t  active_stream = 0 
)
inline

Provides a suitable allocation of device memory for the given size on the specified device.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated with during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.

Parameters
[in]deviceDevice on which to place the allocation
[out]d_ptrReference to pointer to the allocation
[in]bytesMinimum number of bytes for the allocation
[in]active_streamThe stream to be associated with this allocation
cudaError_t cub::CachingDeviceAllocator::DeviceAllocate ( void **  d_ptr,
size_t  bytes,
cudaStream_t  active_stream = 0 
)
inline

Provides a suitable allocation of device memory for the given size on the current device.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated with during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.

Parameters
[out]d_ptrReference to pointer to the allocation
[in]bytesMinimum number of bytes for the allocation
[in]active_streamThe stream to be associated with this allocation
cudaError_t cub::CachingDeviceAllocator::DeviceFree ( int  device,
void *  d_ptr 
)
inline

Frees a live allocation of device memory on the specified device, returning it to the allocator.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated with during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.

cudaError_t cub::CachingDeviceAllocator::DeviceFree ( void *  d_ptr)
inline

Frees a live allocation of device memory on the current device, returning it to the allocator.

Once freed, the allocation becomes available immediately for reuse within the active_stream with which it was associated with during allocation, and it becomes available for reuse within other streams when all prior work submitted to active_stream has completed.


The documentation for this struct was generated from the following file: