High-Performance Spatial Data Compression for Scientific Applications

Abstract
We implement an efficient data compression algorithm that reduces the memory footprint of spatial datasets generated during scientific simulations. Storing these datasets at regular intervals is typically needed for checkpoint/restart or for post-processing purposes. Our lossy compression approach, codenamed HLRcompress (https://gitlab.mis.mpg.de/rok/HLRcompress), combines a hierarchical low-rank approximation technique with binary compression. This novel hybrid method is agnostic to the particular domain of application. We study the impact of HLRcompress on accuracy using synthetic datasets to demonstrate the software's capabilities, including robustness and versatility. We assess different algebraic compression methods and report performance results on various parallel architectures. We then integrate HLRcompress into the workflow of a direct numerical simulation solver for turbulent combustion on distributed-memory systems. We compress the snapshots generated during time integration using accuracy thresholds for each individual chemical species, without degrading the practical accuracy of the overall pressure and temperature. Finally, we compare against state-of-the-art compression software. On average, our implementation achieves greater than 100-fold compression relative to the original size of the datasets.
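
As a rough illustration of the idea described in the abstract (low-rank approximation under an accuracy threshold, followed by a byte-level compression stage), the following Python sketch may help. It is not the HLRcompress API and makes several simplifying assumptions: it uses a flat tiling of fixed-size blocks rather than a true hierarchical partition, a truncated SVD as the algebraic compressor, and a float32 downcast plus zlib as a stand-in for the binary compression stage. The function names, the synthetic field, the block size, and the tolerance are all hypothetical.

```python
import zlib
import numpy as np


def truncated_svd(block, rel_tol):
    """Return factors (U, V) with ||block - U @ V||_F <= rel_tol * ||block||_F."""
    u, s, vt = np.linalg.svd(block, full_matrices=False)
    # tail[k] is the Frobenius error when only the first k singular values are kept.
    tail = np.sqrt(np.append(np.cumsum((s ** 2)[::-1])[::-1], 0.0))
    rank = int(np.argmax(tail <= rel_tol * tail[0]))  # smallest admissible rank
    return u[:, :rank] * s[:rank], vt[:rank, :]


def compress(field, block_size, rel_tol):
    """Compress each tile: low-rank factors, float32 downcast, then zlib (assumed stand-in)."""
    blocks = {}
    n_rows, n_cols = field.shape
    for i in range(0, n_rows, block_size):
        for j in range(0, n_cols, block_size):
            tile = field[i:i + block_size, j:j + block_size]
            u, v = truncated_svd(tile, rel_tol)
            raw = u.astype(np.float32).tobytes() + v.astype(np.float32).tobytes()
            blocks[(i, j)] = (u.shape, v.shape, zlib.compress(raw))
    return blocks


def decompress(blocks, shape):
    """Rebuild the field by multiplying the stored factors of every tile."""
    field = np.zeros(shape)
    for (i, j), (u_shape, v_shape, payload) in blocks.items():
        raw = np.frombuffer(zlib.decompress(payload), dtype=np.float32)
        split = u_shape[0] * u_shape[1]
        u = raw[:split].reshape(u_shape).astype(np.float64)
        v = raw[split:].reshape(v_shape).astype(np.float64)
        field[i:i + u_shape[0], j:j + v_shape[1]] = u @ v
    return field


# Hypothetical smooth 2D field standing in for a simulation snapshot.
x = np.linspace(0.0, 1.0, 256)
X, Y = np.meshgrid(x, x, indexing="ij")
field = np.exp(-((X - 0.5) ** 2 + (Y - 0.5) ** 2) / 0.05) + np.sin(6.0 * X * Y)

blocks = compress(field, block_size=64, rel_tol=1e-4)
restored = decompress(blocks, field.shape)

rel_err = np.linalg.norm(field - restored) / np.linalg.norm(field)
ratio = field.nbytes / sum(len(payload) for _, _, payload in blocks.values())
print(f"relative error: {rel_err:.2e}, compression ratio: {ratio:.1f}")
```

Because each tile meets the relative tolerance in the Frobenius norm, the whole field meets it as well, up to the small extra error from the float32 downcast. A per-quantity tolerance, such as the per-species thresholds mentioned in the abstract, would correspond in this sketch to calling `compress` with a different `rel_tol` for each field.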

Citation
Kriemann, R., Ltaief, H., Luong, M. B., Pérez, F. E. H., Im, H. G., & Keyes, D. (2022). High-Performance Spatial Data Compression for Scientific Applications. Lecture Notes in Computer Science, 403–418. https://doi.org/10.1007/978-3-031-12597-3_25

Acknowledgements
For computer time, this research used the Shaheen-2 supercomputer hosted at the Supercomputing Laboratory at KAUST.

Publisher
Springer International Publishing

DOI
10.1007/978-3-031-12597-3_25

Additional Links
https://link.springer.com/10.1007/978-3-031-12597-3_25
