Binaural Codebook-Based Speech Enhancement with Atomic Speech Presence Probability

Sean U.N. Wood, Johannes K.W. Stahl, Pejman Mowlaee

Research output: Contribution to journal, Article, Research, peer-review

Abstract

In this work, we present a universal codebook-based speech enhancement framework that relies on a single codebook to encode both speech and noise components. The atomic speech presence probability (ASPP) is defined as the probability that a given codebook atom encodes speech at a given point in time. We develop ASPP estimators based on binaural cues, including the interaural phase and level differences (IPD and ILD) and the interaural coherence magnitude (ICM), as well as a combined version leveraging the full interaural transfer function (ITF). We evaluate the performance of the resulting ASPP-based speech enhancement algorithms on binaural mixtures of reverberant speech and real-world noise. The proposed approach improves both objective speech quality and intelligibility over a wide range of input SNRs, as measured with PESQ and binaural STOI metrics, outperforming two binaural speech enhancement benchmark methods. We show that the proposed ITF-based ASPP approach strikes a good balance between binaural noise reduction and binaural cue preservation.
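For readers unfamiliar with the binaural cues named in the abstract, the following is a minimal sketch of how ILD, IPD, and ICM are commonly computed from left- and right-ear short-time Fourier transform coefficients. It uses textbook definitions and is not the ASPP estimators of the paper. The function names, the left-over-right ITF convention, and the small regularization constant `eps` are assumptions made for illustration.

```python
import cmath
import math


def interaural_cues(x_left, x_right, eps=1e-12):
    """Return (ILD in dB, IPD in radians) for one time-frequency bin.

    x_left and x_right are complex STFT coefficients of the two ear signals.
    The ITF is taken here as left over right; that convention is an assumption.
    """
    itf = x_left / (x_right + eps)              # interaural transfer function
    ild_db = 20.0 * math.log10(abs(itf) + eps)  # level difference from |ITF|
    ipd_rad = cmath.phase(itf)                  # phase difference from arg(ITF)
    return ild_db, ipd_rad


def coherence_magnitude(left_frames, right_frames, eps=1e-12):
    """Interaural coherence magnitude for one frequency bin.

    The expectations in the usual definition are approximated by sums over
    the supplied frames (an assumption; recursive smoothing is also common).
    """
    cross = sum(l * r.conjugate() for l, r in zip(left_frames, right_frames))
    p_left = sum(abs(l) ** 2 for l in left_frames)
    p_right = sum(abs(r) ** 2 for r in right_frames)
    return abs(cross) / (math.sqrt(p_left * p_right) + eps)


if __name__ == "__main__":
    # Hypothetical bin: the left ear is twice as loud and leads by 45 degrees.
    right = 1.0 + 0.0j
    left = 2.0 * cmath.exp(1j * math.pi / 4) * right
    print(interaural_cues(left, right))
    # A fixed interaural relation across frames gives coherence close to 1.
    print(coherence_magnitude([left, 3 * left], [right, 3 * right]))
```

A coherence magnitude near 1 indicates a single dominant directional source in that bin, while diffuse noise or strong reverberation tends to push it lower, which is why such a cue can help separate speech from noise.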

Original language: English
Article number: 8811601
Pages (from-to): 2150-2161
Number of pages: 12
Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Volume: 27
Issue number: 12
Publication status: Published - 1 Dec 2019

Keywords

  • atomic speech presence probability
  • binaural speech enhancement
  • interaural transfer function
  • nonnegative matrix factorization

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Acoustics and Ultrasonics
  • Computational Mathematics
  • Electrical and Electronic Engineering
