Towards Real-Time Single-Channel Singing-Voice Separation with Pruned Multi-Scaled DenseNets

Markus Huber, Günther Schindler, Wolfgang Roth, Holger Fröning, Christian Schörkhuber, Franz Pernkopf

Publication: Chapter in book/report/conference proceedings; conference contribution; peer-reviewed

Abstract

Modern musical source separation systems based on deep neural networks reach unprecedented levels of separation quality. However, harnessing the power of these large-scale models in typical audio production environments, which frequently offer only limited computing resources while demanding real-time processing, remains challenging. We extend the multi-scaled DenseNet in several aspects to facilitate real-time source separation scenarios. Specifically, we reduce the computational requirements by inferring Mel-scaled masks and decrease the model size via effective use of bottleneck layers, while improving performance using a deep clustering objective. In addition, we are able to further increase the model efficiency by applying parameterized structured pruning of convolutional weights without any significant impact on the separation performance. We significantly reduce the model size and increase the computational efficiency by a factor of 1.6 and 4.3, respectively, while maintaining the separation performance.
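The structured pruning mentioned in the abstract can be illustrated with a minimal sketch. It assumes each convolutional filter has one scalar structure parameter and that a filter is removed entirely when the magnitude of that parameter falls below a threshold. The names `prune_filters`, `count_params`, `alphas`, and `threshold`, as well as the toy values, are hypothetical; the abstract does not specify how the authors parameterize or train the pruning, so this is not their procedure.

```python
def prune_filters(filters, alphas, threshold):
    """Keep only filters whose structure parameter has magnitude >= threshold.

    filters: list of flattened filters, one list of weights per output channel.
    alphas: one scalar per filter (hypothetical structure parameter).
    The surviving alpha is folded into its filter's weights.
    """
    kept = []
    for weights, alpha in zip(filters, alphas):
        if abs(alpha) >= threshold:
            kept.append([alpha * w for w in weights])
    return kept


def count_params(filters):
    """Total number of weights across all filters."""
    return sum(len(f) for f in filters)


# Toy layer: 8 filters, each with 4 input channels and a 3x3 kernel (36 weights).
filters = [[1.0] * 36 for _ in range(8)]
alphas = [0.9, 0.01, -0.7, 0.0, 0.5, -0.02, 1.2, 0.03]  # assumed values

pruned = prune_filters(filters, alphas, threshold=0.1)
size_ratio = count_params(filters) / count_params(pruned)
print(len(pruned), size_ratio)
```

Because whole filters are removed rather than individual weights, the remaining layer stays dense and needs no sparse-matrix support at inference time. The ratio in this toy layer is arbitrary and unrelated to the factors reported in the abstract.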

Original language: English
Title of host publication: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
Pages: 806-810
Number of pages: 5
ISBN (electronic): 9781509066315
Publication status: Published - May 2020
Event: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing: ICASSP 2020 - Virtual, Barcelona, Spain
Duration: 4 May 2020 to 8 May 2020

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2020-May
ISSN (Print): 1520-6149

Conference

Conference: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing
Abbreviated title: ICASSP 2020
Country/Territory: Spain
City: Virtual, Barcelona
Period: 4/05/20 to 8/05/20

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
