Single-channel speech enhancement with correlated spectral components: Limits-potential

Pejman Mowlaee*, Johannes K.W. Stahl

*Corresponding author for this work

Publication: Contribution to journal, Article, peer-reviewed

Abstract

In this paper, we investigate single-channel speech enhancement algorithms that operate in the short-time Fourier transform domain and take into account dependencies with respect to frequency. As a result of allowing for inter-frequency dependencies, the minimum mean square error optimal estimates of the short-time Fourier transform expansion coefficients are functions of complex-valued covariance matrices in general. The covariance matrices are not known a priori and have to be estimated from the observed data. This work is dedicated to analyzing how this affects the respective single-channel speech enhancement algorithms. We propose a statistical model that circumvents the need to estimate complex-valued second-order statistics and derive a linear multidimensional short-time spectral amplitude estimator that is motivated by these assumptions. Further, we provide empirical evidence for the assumptions that form the basis of this model. We evaluate the potential of taking into account inter-frequency dependencies for single-channel speech enhancement and subsequently compare the estimator resulting from the proposed statistical model to relevant benchmark methods. The results indicate that estimators that consider inter-frequency dependencies are capable of pushing the limits of standard approaches in terms of joint speech quality and intelligibility improvement when the second-order statistics are estimated from isolated speech data. The proposed linear multidimensional short-time spectral amplitude estimator preserves this trend in fully blind scenarios.
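
To illustrate why inter-frequency dependencies bring full covariance matrices into the estimate, the following is a minimal sketch of the generic textbook linear MMSE estimator for one STFT frame. It is not the estimator proposed in the paper. The function name `lmmse_estimate` and its arguments are hypothetical, and the sketch assumes zero-mean speech and noise that are mutually uncorrelated, with both covariance matrices known.

```python
import numpy as np

def lmmse_estimate(y, cov_speech, cov_noise):
    """Generic linear MMSE estimate of clean STFT coefficients for one frame.

    y          : (K,) complex noisy STFT coefficients across K frequency bins
    cov_speech : (K, K) Hermitian speech covariance matrix (assumed known)
    cov_noise  : (K, K) Hermitian noise covariance matrix (assumed known)
    """
    # Assumption: speech and noise are uncorrelated, so covariances add.
    cov_noisy = cov_speech + cov_noise
    # s_hat = C_s (C_s + C_n)^{-1} y, computed with a linear solve
    # instead of an explicit matrix inverse.
    return cov_speech @ np.linalg.solve(cov_noisy, y)
```

With diagonal covariance matrices, that is, with no inter-frequency dependencies, this reduces to the familiar per-bin Wiener gain. The off-diagonal, generally complex-valued entries are what must additionally be estimated once dependencies across frequency are allowed.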

Original language: English
Pages (from-to): 58-69
Number of pages: 12
Journal: Speech Communication
Volume: 121
Publication status: Published - Aug 2020

ASJC Scopus subject areas

  • Software
  • Modeling and Simulation
  • Communication
  • Language and Linguistics
  • Linguistics and Language
  • Computer Vision and Pattern Recognition
  • Computer Science Applications
