Adaptive sparse matrix-matrix multiplication on the GPU

Martin Winter, Daniel Mlakar, Rhaleb Zayer, Hans-Peter Seidel, Markus Steinberger

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Abstract

In the ongoing efforts targeting the vectorization of linear algebra primitives, sparse matrix-matrix multiplication (SpGEMM) has received considerably less attention than sparse matrix-vector multiplication (SpMV). While both are equally important, this disparity can be attributed mainly to the additional formidable challenges raised by SpGEMM.
In this paper, we present a dynamic approach for addressing SpGEMM on the GPU. Our approach works directly on the standard compressed sparse rows (CSR) data format. In comparison to previous SpGEMM implementations, our approach guarantees a homogeneous, load-balanced access pattern to the first input matrix and improves memory access to the second input matrix. It adaptively re-purposes GPU threads during execution and maximizes the time for which efficient on-chip scratchpad memory can be used. Adhering to a completely deterministic scheduling pattern …
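For context, the classical row-wise (Gustavson-style) formulation of SpGEMM over CSR inputs, which CSR-based GPU approaches such as the one in this paper build upon, can be sketched as follows. This is a plain sequential baseline for illustration only; it does not reflect the paper's load-balancing or thread re-purposing scheme, and all function and variable names here are illustrative.

```python
# Baseline row-wise SpGEMM (Gustavson's algorithm) on CSR matrices.
# Each output row of C = A * B is formed by scaling and accumulating
# the rows of B selected by the nonzeros of the corresponding row of A.

def spgemm_csr(a_ptr, a_idx, a_val, b_ptr, b_idx, b_val):
    """Multiply two CSR matrices; returns the product (ptr, idx, val) in CSR."""
    c_ptr, c_idx, c_val = [0], [], []
    for row in range(len(a_ptr) - 1):
        acc = {}  # sparse accumulator for the current output row
        for k in range(a_ptr[row], a_ptr[row + 1]):
            col_a, v_a = a_idx[k], a_val[k]
            # Scale row `col_a` of B by v_a and merge it into the accumulator.
            for j in range(b_ptr[col_a], b_ptr[col_a + 1]):
                acc[b_idx[j]] = acc.get(b_idx[j], 0.0) + v_a * b_val[j]
        for col in sorted(acc):  # emit the row in column order
            c_idx.append(col)
            c_val.append(acc[col])
        c_ptr.append(len(c_idx))
    return c_ptr, c_idx, c_val
```

The accumulator dictionary stands in for the on-chip scratchpad that GPU implementations use; managing its limited capacity across rows of wildly varying density is exactly the kind of challenge the paper's adaptive scheduling addresses.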
Original language: English
Title of host publication: PPoPP '19, Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming
Place of Publication: New York, NY
Publisher: Association for Computing Machinery
Pages: 68-81
Number of pages: 14
ISBN (Print): 978-1-4503-6225-2
DOIs
Publication status: Published - 2019
Event: 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming - Washington, DC, United States
Duration: 16 Feb 2019 – 20 Feb 2019

Conference

Conference: 24th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
Abbreviated title: PPoPP '19
Country: United States
City: Washington, DC
Period: 16/02/19 – 20/02/19


Cite this

Winter, M., Mlakar, D., Zayer, R., Seidel, H-P., & Steinberger, M. (2019). Adaptive sparse matrix-matrix multiplication on the GPU. In PPoPP '19, Proceedings of the 24th Symposium on Principles and Practice of Parallel Programming (pp. 68-81). New York, NY: Association for Computing Machinery. https://doi.org/10.1145/3293883.3295701