Abstract
We introduce a hierarchical sparse matrix representation (HiSparse) tailored for the graphics processing unit (GPU). The representation adapts to the local nonzero pattern at all levels of the hierarchy and uses reduced bit length for addressing the entries. This allows a smaller memory footprint than standard formats. Executing algorithms on a hierarchical structure on the GPU usually entails significant synchronization and management overhead or slowdowns due to diverging execution paths and memory access patterns. We address these issues by means of a dynamic scheduling strategy specifically designed for executing algorithms on top of a hierarchical matrix on the GPU. The evaluation of our implementation of basic linear algebra routines suggests that our hierarchical format is competitive with highly optimized standard libraries and significantly outperforms them in the case of transpose matrix operations. The results point towards the viability of hierarchical matrix formats on massively parallel devices such as the GPU.
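The idea of addressing entries with reduced bit length inside a hierarchy can be illustrated with a small CPU-side sketch. This is not the HiSparse format or its GPU scheduler: the single level of square tiles, the tile edge `TILE`, and the helper names `to_tiles`, `spmv`, `spmv_transpose`, and `index_bits` are all assumptions made for illustration. Each entry stores only its offsets inside a tile, so a 16×16 tile needs 4 bits per offset instead of a full 32-bit index.

```python
from collections import defaultdict

TILE = 16  # hypothetical tile edge; local offsets then fit in 4 bits each


def to_tiles(entries, tile=TILE):
    """Group (row, col, value) triples by the tile that contains them.

    Each entry keeps only its offsets inside the tile, which are < tile.
    """
    tiles = defaultdict(list)
    for r, c, v in entries:
        tiles[(r // tile, c // tile)].append((r % tile, c % tile, v))
    return dict(tiles)


def spmv(tiles, x, n_rows, tile=TILE):
    """Compute y = A x by walking the tiles."""
    y = [0.0] * n_rows
    for (tr, tc), local in tiles.items():
        for lr, lc, v in local:
            y[tr * tile + lr] += v * x[tc * tile + lc]
    return y


def spmv_transpose(tiles, x, n_cols, tile=TILE):
    """Compute y = A^T x from the same tiles by swapping row and column roles."""
    y = [0.0] * n_cols
    for (tr, tc), local in tiles.items():
        for lr, lc, v in local:
            y[tc * tile + lc] += v * x[tr * tile + lr]
    return y


def index_bits(tiles, tile=TILE, word=32):
    """Return index storage in bits as (tiled layout, plain COO layout)."""
    local_bits = 2 * (tile - 1).bit_length()  # row + column offset in a tile
    n_entries = sum(len(local) for local in tiles.values())
    tiled = len(tiles) * 2 * word + n_entries * local_bits
    coo = n_entries * 2 * word
    return tiled, coo


# A 32x32 example matrix with four nonzeros spread over three tiles.
entries = [(0, 0, 1.0), (1, 17, 2.0), (18, 17, 3.0), (18, 18, 4.0)]
tiles = to_tiles(entries)
ones = [1.0] * 32
y = spmv(tiles, ones, 32)
yt = spmv_transpose(tiles, ones, 32)
tiled_bits, coo_bits = index_bits(tiles)
```

In this toy layout the transpose product reads the same tiles as the forward product, and the index saving grows with the number of nonzeros per tile. Whether either property carries over to the authors' GPU implementation is not something this sketch can show.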
Original language | English |
---|---|
Title | ICS '17: Proceedings of the International Conference on Supercomputing |
Place of publication | New York, NY, USA |
Publisher | ACM SIGWEB |
Pages | 1-10 |
ISBN (Print) | 978-1-4503-5020-4 |
DOIs | |
Publication status | Published - 2017 |
Externally published | Yes |
Event | International Conference on Supercomputing: ICS 2017 - Chicago, United States Duration: 14 June 2017 → 16 June 2017 |
Conference
Conference | International Conference on Supercomputing |
---|---|
Short title | ICS '17 |
Country/Territory | United States |
City | Chicago |
Period | 14/06/17 → 16/06/17 |