All Classes Namespaces Files Functions Variables Typedefs Enumerations Enumerator Friends Macros Groups Pages
Bloom Filters
Bloom filters are a probabilistic data-structure useful to conservatively represent membership in a set using less than 1 bit per set item.
NVBIO provides two simple generic classes to incrementally build and work with standard and blocked Bloom filters both on the host and the device:


// populate a filter in parallel
template <typename bloom_filter_type, typename T>
__global__ void populate_kernel(const uint32 N, const T* vector, bloom_filter_type filter)
const uint32 i = threadIdx.x + blockIdx.x * blockDim.x;
if (i < N)
filter.insert( vector[i] );
bool bloom_test()
// build a set of 1M random integers
const uint32 N = 1000000;
// fill it up
for (uint32 i = 0; i < N; ++i)
h_vector[i] = rand();
// copy it to the device
nvbio::vector<device_tag,uint32> d_vector = h_vector;
// construct an empty Bloom filter
typedef bloom_filter<2,hash_functor1,hash_functor2,uint32*> bloom_filter_type;
const uint32 filter_words = N; // let's use 32-bits per input item;
// NOTE: this is still a lot less than 1-bit for each
// of the 4+ billion integers out there...
nvbio::vector<device_tag,uint32> d_filter_storage( filter_words, 0u );
bloom_filter_type d_filter(
filter_words * 32,
plain_view( d_filter_storage ) );
// and populate it
populate_kernel<<<util::divide_ri(N,128),128>>>( N, plain_view( d_vector ), d_filter );
// copy the filter back on the host
nvbio::vector<host_tag,uint32> h_filter_storage = filter_storage;
bloom_filter_type h_filter(
filter_words * 32,
plain_view( h_filter_storage ) );
// and now ask The Question: is 42 in there?
return h_filter.has( 42 );