MNC: Structure-Exploiting Sparsity Estimation for Matrix Expressions

Johanna Sommer, Matthias Boehm, Alexandre V. Evfimievski, Berthold Reinwald, Peter J. Haas

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

Efficiently computing linear algebra expressions is central to machine learning (ML) systems. Most systems support sparse formats and operations because sparse matrices are ubiquitous and their dense representation can cause prohibitive overheads. Estimating the sparsity of intermediates, however, remains a key challenge when generating execution plans or performing sparse operations. These sparsity estimates are used for cost and memory estimates, format decisions, and result allocation. Existing estimators tend to focus on matrix products only, and struggle to attain good accuracy with low estimation overhead. However, a key observation is that real-world sparse matrices commonly exhibit structural properties such as a single non-zero per row, or columns with varying sparsity. In this paper, we introduce MNC (Matrix Non-zero Count), a remarkably simple, count-based matrix synopsis that exploits these structural properties for efficient, accurate, and general sparsity estimation. We describe estimators and sketch propagation for realistic linear algebra expressions. Our experiments - on a new estimation benchmark called SparsEst - show that the MNC estimator yields good accuracy with very low overhead. This behavior makes MNC practical and broadly applicable in ML systems.
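The idea of a count-based synopsis can be illustrated with a small example. The abstract does not specify the contents of the MNC sketch, so this is only a minimal sketch under assumptions, not the paper's estimator: it assumes the synopsis keeps per-row and per-column non-zero counts, and it uses elementary counting to estimate the fraction of non-zero cells in a product A·B. The function names `nnz_counts` and `estimate_product_sparsity` are hypothetical. When A has at most one non-zero per row (the structural property the abstract mentions), or B has at most one non-zero per column, the count-based result is exact; otherwise the code falls back to a simple upper bound.

```python
def nnz_counts(matrix):
    """Return (row_counts, col_counts) of non-zeros for a list-of-lists matrix."""
    n_cols = len(matrix[0]) if matrix else 0
    row_counts = [sum(1 for v in row if v != 0) for row in matrix]
    col_counts = [sum(1 for row in matrix if row[j] != 0) for j in range(n_cols)]
    return row_counts, col_counts


def estimate_product_sparsity(a, b):
    """Estimate the fraction of non-zero cells in a @ b from count vectors only.

    Assumption: the synopsis is just the row/column non-zero counts.
    """
    a_rows, a_cols = nnz_counts(a)
    b_rows, b_cols = nnz_counts(b)
    cells = len(a) * len(b[0])  # number of cells in the output matrix

    # Each non-zero in column k of a pairs with each non-zero in row k of b.
    # Every non-zero output cell is produced by at least one such pair,
    # so the pair count is an upper bound on nnz(a @ b).
    pairs = sum(ca * rb for ca, rb in zip(a_cols, b_rows))

    if max(a_rows, default=0) <= 1 or max(b_cols, default=0) <= 1:
        # With at most one non-zero per row of a (or per column of b),
        # no two pairs hit the same output cell, so the count is exact.
        return pairs / cells

    # Otherwise pairs may collide; report the (capped) upper bound.
    return min(1.0, pairs / cells)


# Usage: a has exactly one non-zero per row, so the estimate is exact.
a = [[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
b = [[1, 1], [0, 1], [0, 0]]
print(estimate_product_sparsity(a, b))  # 0.5
```

The exact branch reads only the two count vectors and never touches the matrix entries during estimation, which is what makes a count-based synopsis cheap to use. The fallback here is deliberately crude and is not taken from the paper.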
Original language: English
Title of host publication: SIGMOD
Pages: 1607-1623
ISBN (Electronic): 978-1-4503-5643-5
Publication status: Published - 2019
Event: SIGMOD '19 - Amsterdam, Netherlands
Duration: 30 Jun 2019 to 5 Jul 2019

Conference

Conference: SIGMOD '19
Country/Territory: Netherlands
City: Amsterdam
Period: 30/06/19 to 5/07/19
