Title Parallel Lossless Data Compression on the GPU (In Proceedings)
in Innovative Parallel Computing
Author(s) Ritesh A Patel, Yao Zhang, Jason Mak, Andrew Davidson, John D. Owens
Keyword(s) GPU Computing, Lossless Data Compression, Burrows-Wheeler Transform
Year May 2012
Location San Jose, CA
Abstract We present parallel algorithms and implementations of a bzip2-like lossless data compression scheme for GPU architectures. Our approach parallelizes three main stages in the bzip2 compression pipeline: Burrows-Wheeler transform (BWT), move-to-front transform (MTF), and Huffman coding. In particular, we utilize a two-level hierarchical sort for BWT, design a novel scan-based parallel MTF algorithm, and implement a parallel reduction scheme to build the Huffman tree. For each algorithm, we perform detailed performance analysis, discuss its strengths and weaknesses, and suggest future directions for improvements. Overall, our GPU implementation is dominated by BWT performance and is 2.78x slower than bzip2, with BWT and MTF-Huffman respectively 2.89x and 1.34x slower on average.
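For readers unfamiliar with the three stages named in the abstract, the following is a minimal serial reference sketch in Python. It is not the paper's GPU implementation: it uses a naive rotation sort in place of the two-level hierarchical sort, a sequential list update in place of the scan-based MTF, and a heap-based merge in place of the parallel reduction. The function names `bwt`, `mtf`, and `huffman_code_lengths` are hypothetical and chosen only for illustration.

```python
import heapq
from collections import Counter


def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive BWT: sort all rotations, return the last column and the
    row index of the original string (needed to invert the transform)."""
    n = len(data)
    if n == 0:
        return b"", 0
    doubled = data + data  # rotation i is doubled[i:i + n]
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    last_column = bytes(doubled[i + n - 1] for i in order)
    return last_column, order.index(0)


def mtf(data: bytes) -> list[int]:
    """Sequential move-to-front: emit each byte's current position in the
    symbol list, then move that byte to the front."""
    symbols = list(range(256))
    out = []
    for b in data:
        idx = symbols.index(b)
        out.append(idx)
        symbols.pop(idx)
        symbols.insert(0, b)
    return out


def huffman_code_lengths(values: list[int]) -> dict[int, int]:
    """Serial Huffman construction: repeatedly merge the two lightest
    subtrees; each merge adds one bit to every symbol inside them."""
    freq = Counter(values)
    if len(freq) == 1:
        return {next(iter(freq)): 1}
    # The integer tie-breaker keeps heap comparisons off the symbol lists.
    heap = [(w, i, [s]) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return lengths


last, row = bwt(b"banana")         # last == b"nnbaaa", row == 3
ranks = mtf(last)                  # runs of equal bytes become zeros
lengths = huffman_code_lengths(ranks)
```

The sketch chains the stages in the order the abstract lists them: BWT groups similar contexts so that equal bytes cluster, MTF turns those clusters into small integers, and Huffman coding assigns the shortest codes to the most frequent values.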