Towards Real-Time Single-Channel Singing-Voice Separation with Pruned Multi-Scaled DenseNets

Markus Huber, Günther Schindler, Wolfgang Roth, Holger Fröning, Christian Schörkhuber, Franz Pernkopf

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

Modern musical source separation systems based on deep neural networks reach unprecedented levels of separation quality. However, harnessing the power of these large-scale models in typical audio production environments, which frequently offer only limited computing resources while demanding real-time processing, remains challenging. We extend the multi-scaled DenseNet in several aspects to facilitate real-time source separation scenarios. Specifically, we reduce the computational requirements by inferring Mel-scaled masks and decrease the model size via effective use of bottleneck layers, while improving performance using a deep clustering objective. In addition, we are able to further increase the model efficiency by applying parameterized structured pruning of convolutional weights without any significant impact on the separation performance. We significantly reduce the model size and increase the computational efficiency by a factor of 1.6 and 4.3, respectively, while maintaining the separation performance.

Original language: English
Title of host publication: 2020 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2020 - Proceedings
Pages: 806-810
Number of pages: 5
ISBN (Electronic): 9781509066315
Publication status: Published - May 2020
Event: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing: ICASSP 2020 - Virtual, Barcelona, Spain
Duration: 4 May 2020 to 8 May 2020

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 2020-May
ISSN (Print): 1520-6149

Conference

Conference: 2020 IEEE International Conference on Acoustics, Speech and Signal Processing
Abbreviated title: ICASSP 2020
Country/Territory: Spain
City: Virtual, Barcelona
Period: 4/05/20 to 8/05/20

Keywords

  • Multi-scaled DenseNet
  • Musical Source Separation
  • Parameterized Structured Pruning
  • Real-time

ASJC Scopus subject areas

  • Software
  • Signal Processing
  • Electrical and Electronic Engineering
