Single-channel speech enhancement with correlated spectral components: Limits-potential

Pejman Mowlaee*, Johannes K.W. Stahl

*Corresponding author for this work

Research output: Contribution to journal › Article

Abstract

In this paper, we investigate single-channel speech enhancement algorithms that operate in the short-time Fourier transform (STFT) domain and take into account dependencies across frequency. Once inter-frequency dependencies are allowed, the minimum mean square error (MMSE) optimal estimates of the STFT expansion coefficients are, in general, functions of complex-valued covariance matrices. These covariance matrices are not known a priori and have to be estimated from the observed data. This work analyzes how this estimation requirement affects the respective single-channel speech enhancement algorithms. We propose a statistical model that circumvents the need to estimate complex-valued second-order statistics and derive a linear multidimensional short-time spectral amplitude estimator that is motivated by the assumptions of this model. Further, we provide empirical evidence for the assumptions that form the basis of the model. We evaluate the potential of taking into account inter-frequency dependencies for single-channel speech enhancement and subsequently compare the estimator resulting from the proposed statistical model to relevant benchmark methods. The results indicate that estimators that consider inter-frequency dependencies are capable of pushing the limits of standard approaches in terms of joint speech quality and intelligibility improvement when the second-order statistics are estimated from isolated speech data. The proposed linear multidimensional short-time spectral amplitude estimator preserves this trend in fully blind scenarios.
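To illustrate why inter-frequency dependencies lead to estimators that depend on full complex-valued covariance matrices, the following minimal sketch contrasts the textbook multidimensional Wiener filter, W = C_s (C_s + C_n)^{-1}, with the conventional per-bin Wiener gain that uses only the diagonal. This is not the amplitude estimator proposed in the paper. The covariances are assumed known, and the function names and toy values (four bins, correlation 0.8, phase slope 0.5, noise variance 0.5) are purely illustrative assumptions.

```python
import numpy as np

def full_wiener_gain(cov_s, cov_n):
    """Gain matrix W = C_s (C_s + C_n)^{-1} of the linear MMSE estimator
    for y = s + n with uncorrelated speech s and noise n."""
    return cov_s @ np.linalg.inv(cov_s + cov_n)

def diagonal_wiener_gain(cov_s, cov_n):
    """Per-bin Wiener gain that ignores all inter-frequency terms."""
    ps = np.real(np.diag(cov_s))  # speech power per frequency bin
    pn = np.real(np.diag(cov_n))  # noise power per frequency bin
    return np.diag(ps / (ps + pn))

def linear_mse(W, cov_s, cov_n):
    """Theoretical total MSE E||s - W y||^2 of a linear estimator W."""
    E = np.eye(W.shape[0]) - W
    return np.real(np.trace(E @ cov_s @ E.conj().T + W @ cov_n @ W.conj().T))

# Toy example (illustrative values only): Hermitian speech covariance with
# complex-valued correlation between neighboring bins, plus white noise.
K = 4
idx = np.arange(K)
lag = idx[:, None] - idx[None, :]
cov_s = 0.8 ** np.abs(lag) * np.exp(1j * 0.5 * lag)
cov_n = 0.5 * np.eye(K)

W_full = full_wiener_gain(cov_s, cov_n)
W_diag = diagonal_wiener_gain(cov_s, cov_n)
mse_full = linear_mse(W_full, cov_s, cov_n)
mse_diag = linear_mse(W_diag, cov_s, cov_n)
print(f"MSE with full covariance: {mse_full:.4f}")
print(f"MSE with diagonal only:   {mse_diag:.4f}")
```

With known covariances the full-matrix gain can never do worse than the diagonal one in this linear MSE sense. In practice the off-diagonal, complex-valued entries must be estimated from noisy data, which is the difficulty the abstract highlights.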

Original language: English
Pages (from-to): 58-69
Number of pages: 12
Journal: Speech Communication
Volume: 121
Publication status: Published - Aug 2020

Keywords

  • Inter-frequency dependency
  • Multidimensional Wiener filter
  • Noise reduction
  • Speech enhancement
  • Speech intelligibility

ASJC Scopus subject areas

  • Software
  • Modelling and Simulation
  • Communication
  • Language and Linguistics
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
