Adaptive sparse matrix-matrix multiplication on the GPU

Martin Winter, Daniel Mlakar, Rhaleb Zayer, Hans-Peter Seidel, Markus Steinberger

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Abstract

In the ongoing efforts targeting the vectorization of linear algebra primitives, sparse matrix-matrix multiplication (SpGEMM) has received considerably less attention than sparse matrix-vector multiplication (SpMV). While both are equally important, this disparity can be attributed mainly to the additional formidable challenges raised by SpGEMM.
In this paper, we present a dynamic approach for addressing SpGEMM on the GPU. Our approach works directly on the standard compressed sparse rows (CSR) data format. In comparison to previous SpGEMM implementations, our approach guarantees a homogeneous, load-balanced access pattern to the first input matrix and improves memory access to the second input matrix. It adaptively re-purposes GPU threads during execution and maximizes the time for which efficient on-chip scratchpad memory can be used. Adhering to a completely deterministic scheduling pattern …
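For context, the classical row-wise (Gustavson-style) formulation of SpGEMM over CSR inputs, which CSR-based GPU approaches such as the one in this paper build upon, can be sketched as follows. This is a plain sequential baseline for illustration only; it does not reflect the paper's load-balancing or thread re-purposing scheme, and all function and variable names here are illustrative.

```python
# Baseline row-wise SpGEMM (Gustavson's algorithm) on CSR matrices.
# Each output row of C = A * B is formed by scaling and accumulating
# the rows of B selected by the nonzeros of the corresponding row of A.

def spgemm_csr(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val):
    """Multiply two CSR matrices; returns the product (ptr, idx, val) in CSR."""
    c_ptr, c_idx, c_val = [0], [], []
    for row in range(len(a_ptr) - 1):
        acc = {}  # sparse accumulator for the current output row
        for k in range(a_ptr[row], a_ptr[row + 1]):
            col_a, v_a = a_idx[k], a_val[k]
            # Scale row `col_a` of B by v_a and merge it into the accumulator.
            for j in range(b_ptr[col_a], b_ptr[col_a + 1]):
                acc[b_idx[j]] = acc.get(b_idx[j], 0.0) + v_a * b_val[j]
        for col in sorted(acc):  # emit the row in column order
            c_idx.append(col)
            c_val.append(acc[col])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val
```

The accumulator dictionary stands in for the on-chip scratchpad that GPU implementations use; managing its limited capacity across rows of wildly varying density is exactly the kind of challenge the paper's adaptive scheduling addresses.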
Original language: English
Title of host publication: PPoPP '19, Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming
Place of Publication: New York, NY
Publisher: Association for Computing Machinery
Pages: 68-81
Number of pages: 14
ISBN (Print): 978-1-4503-6225-2
DOIs
Publication status: Published - 2019
Event: 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming - Washington, DC, United States
Duration: 16 Feb 2019 – 20 Feb 2019

Conference

Conference: 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Abbreviated title: PPoPP '19
Country: United States
City: Washington, DC
Period: 16/02/19 – 20/02/19


Cite this

Winter, M., Mlakar, D., Zayer, R., Seidel, H-P., & Steinberger, M. (2019). Adaptive sparse matrix-matrix multiplication on the GPU. In PPoPP '19, Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming (pp. 68-81). New York, NY: Association for Computing Machinery. https://doi.org/10.1145/3293883.3295701