Dynamic Scheduling for Efficient Hierarchical Sparse Matrix Operations on the GPU

Andreas Derler; Rhaleb Zayer; Hans-Peter Seidel; Markus Steinberger

doi:10.1145/3079079.3079085

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

We introduce a hierarchical sparse matrix representation (HiSparse) tailored for the graphics processing unit (GPU). The representation adapts to the local nonzero pattern at all levels of the hierarchy and uses reduced bit length for addressing the entries. This allows a smaller memory footprint than standard formats. Executing algorithms on a hierarchical structure on the GPU usually entails significant synchronization and management overhead or slowdowns due to diverging execution paths and memory access patterns. We address these issues by means of a dynamic scheduling strategy specifically designed for executing algorithms on top of a hierarchical matrix on the GPU. The evaluation of our implementation of basic linear algebra routines suggests that our hierarchical format is competitive with highly optimized standard libraries and significantly outperforms them in the case of transpose matrix operations. The results point towards the viability of hierarchical matrix formats on massively parallel devices such as the GPU.

Original language	English
Title of host publication	ICS '17: Proceedings of the International Conference on Supercomputing
Place of Publication	New York, NY, USA
Publisher	ACM SIGWEB
Pages	1-10
ISBN (Print)	978-1-4503-5020-4
DOIs	https://doi.org/10.1145/3079079.3079085
Publication status	Published - 2017
Externally published	Yes
Event	International Conference on Supercomputing: ICS 2017 - Chicago, United States Duration: 14 Jun 2017 → 16 Jun 2017

Conference

Conference	International Conference on Supercomputing
Abbreviated title	ICS '17
Country/Territory	United States
City	Chicago
Period	14/06/17 → 16/06/17

Keywords

GPU, hierarchical, linear algebra, sparse matrix

Access to Document

10.1145/3079079.3079085

Cite this

Derler, A, Zayer, R, Seidel, H-P & Steinberger, M 2017, Dynamic Scheduling for Efficient Hierarchical Sparse Matrix Operations on the GPU. in ICS '17: Proceedings of the International Conference on Supercomputing., 7, ACM SIGWEB, New York, NY, USA, pp. 1-10, International Conference on Supercomputing, Chicago, Illinois, United States, 14/06/17. https://doi.org/10.1145/3079079.3079085

@inproceedings{81deed778c9e44b9acb0b16e4d482e09,
title = "Dynamic Scheduling for Efficient Hierarchical Sparse Matrix Operations on the GPU",
abstract = "We introduce a hierarchical sparse matrix representation (HiSparse) tailored for the graphics processing unit (GPU). The representation adapts to the local nonzero pattern at all levels of the hierarchy and uses reduced bit length for addressing the entries. This allows a smaller memory footprint than standard formats. Executing algorithms on a hierarchical structure on the GPU usually entails significant synchronization and management overhead or slowdowns due to diverging execution paths and memory access patterns. We address these issues by means of a dynamic scheduling strategy specifically designed for executing algorithms on top of a hierarchical matrix on the GPU. The evaluation of our implementation of basic linear algebra routines suggests that our hierarchical format is competitive with highly optimized standard libraries and significantly outperforms them in the case of transpose matrix operations. The results point towards the viability of hierarchical matrix formats on massively parallel devices such as the GPU.",
keywords = "GPU, hierarchical, linear algebra, sparse matrix",
author = "Andreas Derler and Rhaleb Zayer and Hans-Peter Seidel and Markus Steinberger",
year = "2017",
doi = "10.1145/3079079.3079085",
language = "English",
isbn = "978-1-4503-5020-4",
pages = "1--10",
booktitle = "ICS '17: Proceedings of the International Conference on Supercomputing",
publisher = "ACM SIGWEB",
note = "International Conference on Supercomputing: ICS 2017, ICS '17; Conference date: 14-06-2017 through 16-06-2017",
}

TY - GEN

T1 - Dynamic Scheduling for Efficient Hierarchical Sparse Matrix Operations on the GPU

AU - Derler, Andreas

AU - Zayer, Rhaleb

AU - Seidel, Hans-Peter

AU - Steinberger, Markus

PY - 2017

Y1 - 2017

N2 - We introduce a hierarchical sparse matrix representation (HiSparse) tailored for the graphics processing unit (GPU). The representation adapts to the local nonzero pattern at all levels of the hierarchy and uses reduced bit length for addressing the entries. This allows a smaller memory footprint than standard formats. Executing algorithms on a hierarchical structure on the GPU usually entails significant synchronization and management overhead or slowdowns due to diverging execution paths and memory access patterns. We address these issues by means of a dynamic scheduling strategy specifically designed for executing algorithms on top of a hierarchical matrix on the GPU. The evaluation of our implementation of basic linear algebra routines suggests that our hierarchical format is competitive with highly optimized standard libraries and significantly outperforms them in the case of transpose matrix operations. The results point towards the viability of hierarchical matrix formats on massively parallel devices such as the GPU.

AB - We introduce a hierarchical sparse matrix representation (HiSparse) tailored for the graphics processing unit (GPU). The representation adapts to the local nonzero pattern at all levels of the hierarchy and uses reduced bit length for addressing the entries. This allows a smaller memory footprint than standard formats. Executing algorithms on a hierarchical structure on the GPU usually entails significant synchronization and management overhead or slowdowns due to diverging execution paths and memory access patterns. We address these issues by means of a dynamic scheduling strategy specifically designed for executing algorithms on top of a hierarchical matrix on the GPU. The evaluation of our implementation of basic linear algebra routines suggests that our hierarchical format is competitive with highly optimized standard libraries and significantly outperforms them in the case of transpose matrix operations. The results point towards the viability of hierarchical matrix formats on massively parallel devices such as the GPU.

KW - GPU, hierarchical, linear algebra, sparse matrix

U2 - 10.1145/3079079.3079085

DO - 10.1145/3079079.3079085

M3 - Conference paper

SN - 978-1-4503-5020-4

SP - 1

EP - 10

BT - ICS '17: Proceedings of the International Conference on Supercomputing

PB - ACM SIGWEB

CY - New York, NY, USA

T2 - International Conference on Supercomputing

Y2 - 14 June 2017 through 16 June 2017

ER -
