Compressed linear algebra for declarative large-scale machine learning

Ahmed Elgohary; Matthias Boehm; Peter J. Haas; Frederick R. Reiss; Berthold Reinwald

doi:10.1145/3318221

Institute of Interactive Systems and Data Science (7060)

Publication: Contribution to journal › Article › peer-review

Abstract

Large-scale Machine Learning (ML) algorithms are often iterative, using repeated read-only data access and I/O-bound matrix-vector multiplications. Hence, it is crucial for performance to fit the data into single-node or distributed main memory to enable fast matrix-vector operations. General-purpose compression struggles to achieve both good compression ratios and fast decompression for block-wise uncompressed operations. Therefore, we introduce Compressed Linear Algebra (CLA) for lossless matrix compression. CLA encodes matrices with lightweight, value-based compression techniques and executes linear algebra operations directly on the compressed representations. We contribute effective column compression schemes, cache-conscious operations, and an efficient sampling-based compression algorithm. Our experiments show good compression ratios and operations performance close to the uncompressed case, which enables fitting larger datasets into available memory. We thereby obtain significant end-to-end performance improvements.
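To make the idea of operating directly on a compressed representation concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the encoding formats, so the sketch assumes a simple value-based scheme in which each column stores its distinct nonzero values together with the row indices where they occur. The function names (`compress_column`, `compress_matrix`, `matvec_compressed`, `matvec_dense`) are hypothetical and chosen for illustration only.

```python
from collections import defaultdict


def compress_column(column):
    """Map each distinct nonzero value to the list of rows where it occurs.

    Assumed encoding for illustration; not taken from the paper.
    """
    offsets = defaultdict(list)
    for row, value in enumerate(column):
        if value != 0:
            offsets[value].append(row)
    return dict(offsets)


def compress_matrix(rows):
    """Compress a dense row-major matrix column by column."""
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    columns = [
        compress_column([rows[i][j] for i in range(n_rows)])
        for j in range(n_cols)
    ]
    return n_rows, columns


def matvec_compressed(compressed, v):
    """Compute M @ v on the compressed form, without decompressing M."""
    n_rows, columns = compressed
    result = [0.0] * n_rows
    for j, offsets in enumerate(columns):
        for value, row_ids in offsets.items():
            # One multiplication per distinct value, reused for all its rows.
            contribution = value * v[j]
            for row in row_ids:
                result[row] += contribution
    return result


def matvec_dense(rows, v):
    """Reference uncompressed matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in rows]
```

Under this assumed encoding, a column with few distinct values needs one multiplication per distinct value rather than one per row, which hints at why value-based compression can keep operation cost near the uncompressed case.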

Original language	English
Pages (from-to)	83-91
Journal	Communications of the ACM
Volume	62
Issue number	5
DOIs	https://doi.org/10.1145/3318221
Publication status	Published - 2019

Cite this

@article{85c737603541411293dabd1bdea6c857,

title = "Compressed linear algebra for declarative large-scale machine learning",

abstract = "Large-scale Machine Learning (ML) algorithms are often iterative, using repeated read-only data access and I/O-bound matrix-vector multiplications. Hence, it is crucial for performance to fit the data into single-node or distributed main memory to enable fast matrix-vector operations. General-purpose compression struggles to achieve both good compression ratios and fast decompression for block-wise uncompressed operations. Therefore, we introduce Compressed Linear Algebra (CLA) for lossless matrix compression. CLA encodes matrices with lightweight, value-based compression techniques and executes linear algebra operations directly on the compressed representations. We contribute effective column compression schemes, cache-conscious operations, and an efficient sampling-based compression algorithm. Our experiments show good compression ratios and operations performance close to the uncompressed case, which enables fitting larger datasets into available memory. We thereby obtain significant end-to-end performance improvements.",

author = "Ahmed Elgohary and Matthias Boehm and Haas, {Peter J.} and Reiss, {Frederick R.} and Berthold Reinwald",

year = "2019",

doi = "10.1145/3318221",

language = "English",

volume = "62",

pages = "83--91",

journal = "Communications of the ACM",

issn = "0001-0782",

publisher = "Association for Computing Machinery",

number = "5",

}

TY - JOUR

T1 - Compressed linear algebra for declarative large-scale machine learning

AU - Elgohary, Ahmed

AU - Boehm, Matthias

AU - Haas, Peter J.

AU - Reiss, Frederick R.

AU - Reinwald, Berthold

PY - 2019

Y1 - 2019

AB - Large-scale Machine Learning (ML) algorithms are often iterative, using repeated read-only data access and I/O-bound matrix-vector multiplications. Hence, it is crucial for performance to fit the data into single-node or distributed main memory to enable fast matrix-vector operations. General-purpose compression struggles to achieve both good compression ratios and fast decompression for block-wise uncompressed operations. Therefore, we introduce Compressed Linear Algebra (CLA) for lossless matrix compression. CLA encodes matrices with lightweight, value-based compression techniques and executes linear algebra operations directly on the compressed representations. We contribute effective column compression schemes, cache-conscious operations, and an efficient sampling-based compression algorithm. Our experiments show good compression ratios and operations performance close to the uncompressed case, which enables fitting larger datasets into available memory. We thereby obtain significant end-to-end performance improvements.

U2 - 10.1145/3318221

DO - 10.1145/3318221

M3 - Article

SN - 0001-0782

VL - 62

SP - 83

EP - 91

JO - Communications of the ACM

JF - Communications of the ACM

IS - 5

ER -
