Adaptive sparse matrix-matrix multiplication on the GPU

Martin Winter, Daniel Mlakar, Rhaleb Zayer, Hans-Peter Seidel, Markus Steinberger

Publication: Chapter in book/report/conference proceedings › Conference contribution › Research › Peer-reviewed

Abstract

In the ongoing efforts targeting the vectorization of linear algebra primitives, sparse matrix-matrix multiplication (SpGEMM) has received considerably less attention than sparse matrix-vector multiplication (SpMV). While both are equally important, this disparity can be attributed mainly to the additional formidable challenges raised by SpGEMM.
In this paper, we present a dynamic approach for addressing SpGEMM on the GPU. Our approach works directly on the standard compressed sparse rows (CSR) data format. In comparison to previous SpGEMM implementations, our approach guarantees a homogeneous, load-balanced access pattern to the first input matrix and improves memory access to the second input matrix. It adaptively re-purposes GPU threads during execution and maximizes the time during which the efficient on-chip scratchpad memory can be used. Adhering to a completely deterministic scheduling pattern …
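For orientation only: the sketch below is a minimal, CPU-only illustration of the compressed sparse rows (CSR) layout the abstract refers to, together with a plain Gustavson-style row-by-row SpGEMM. It is not the adaptive, load-balanced GPU scheme the paper proposes (thread re-purposing and scratchpad management are omitted), and the names in the code (Csr, spgemm) are invented for this example.

```cpp
// Minimal CSR container and a Gustavson-style row-by-row SpGEMM (C = A * B).
// CPU-only illustration; the paper's GPU load balancing, thread re-purposing
// and scratchpad usage are not modelled here.
#include <cstdio>
#include <map>
#include <vector>

struct Csr {
    int rows, cols;
    std::vector<int> rowPtr;     // size rows + 1, offsets into col/val
    std::vector<int> col;        // column index of each non-zero
    std::vector<double> val;     // value of each non-zero
};

// Row i of C is the weighted sum of the rows of B selected by row i of A.
Csr spgemm(const Csr& A, const Csr& B) {
    Csr C{A.rows, B.cols, std::vector<int>(A.rows + 1, 0), {}, {}};
    for (int i = 0; i < A.rows; ++i) {
        std::map<int, double> acc;                  // accumulator for row i of C
        for (int a = A.rowPtr[i]; a < A.rowPtr[i + 1]; ++a) {
            int k = A.col[a];
            double av = A.val[a];
            for (int b = B.rowPtr[k]; b < B.rowPtr[k + 1]; ++b)
                acc[B.col[b]] += av * B.val[b];     // C(i,j) += A(i,k) * B(k,j), j = B.col[b]
        }
        for (const auto& kv : acc) {                // flush row i, columns already sorted
            C.col.push_back(kv.first);
            C.val.push_back(kv.second);
        }
        C.rowPtr[i + 1] = static_cast<int>(C.col.size());
    }
    return C;
}

int main() {
    // A = [1 0 2; 0 3 0],  B = [1 0; 0 1; 4 0]  ->  C = [9 0; 0 3]
    Csr A{2, 3, {0, 2, 3}, {0, 2, 1}, {1.0, 2.0, 3.0}};
    Csr B{3, 2, {0, 1, 2, 3}, {0, 1, 0}, {1.0, 1.0, 4.0}};
    Csr C = spgemm(A, B);
    for (int i = 0; i < C.rows; ++i)
        for (int n = C.rowPtr[i]; n < C.rowPtr[i + 1]; ++n)
            std::printf("C(%d,%d) = %g\n", i, C.col[n], C.val[n]);
    return 0;
}
```

The row-wise formulation shown here is the common baseline for CSR-based SpGEMM; the paper's contribution concerns how such per-row work is scheduled and balanced across GPU threads, which this sketch does not attempt to reproduce.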
Original language: English
Title: PPoPP '19, Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming
Place of publication: New York, NY
Publisher: Association for Computing Machinery
Pages: 68-81
Number of pages: 14
ISBN (Print): 978-1-4503-6225-2
DOI: 10.1145/3293883.3295701
Publication status: Published - 2019
Event: 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming - Washington, DC, United States
Duration: 16 Feb 2019 - 20 Feb 2019

Conference

Conference: 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Short title: PPoPP '19
Country: United States
City: Washington, DC
Period: 16/02/19 - 20/02/19

Cite this

Winter, M., Mlakar, D., Zayer, R., Seidel, H-P., & Steinberger, M. (2019). Adaptive sparse matrix-matrix multiplication on the GPU. In PPoPP '19, Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming (pp. 68-81). New York, NY: Association for Computing Machinery. https://doi.org/10.1145/3293883.3295701
