>> Real-time Parallel Hashing on the GPU

This work has been superseded by our newer hash table design;
download my dissertation for the new one.

Overview of our construction using a voxelized Lucy model, colored by
mapping x, y, and z coordinates to red, green, and blue respectively (far left).
The 3.5 million voxels (left) are inputted as 32-bit keys into our hash table.
Keys are placed into buckets of ≤ 512 keys (center), averaging 409 per bucket.
Each bucket then builds a cuckoo hash with three subtables and stores them in a
larger structure with 5 million entries (right).
The close-ups follow the progress of a single bucket, showing the keys allocated to it (center) and
each of its completed cuckoo subtables (right).
Finding any key requires checking only three possible locations.

Timings on random data, comparing hash table construction and retrieval versus radix sort and binary search. Retrievals search the structures for all of the inputted keys in randomized order. These timings were collected using CUDA 2.2.