Deep Beamforming and Data Augmentation for Robust Speech Recognition: Results of the 4th CHiME Challenge

Tobias Schrank, Lukas Pfeifenberger, Matthias Zöhrer, Johannes Stahl, Pejman Mowlaee Beikzadehmahaleh, Franz Pernkopf

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

Robust automatic speech recognition in adverse environments is a challenging task. We address the multi-channel tracks of the 4th CHiME challenge [1] by proposing a deep eigenvector beamformer as front-end. To train the acoustic models, we propose to supplement the beamformed data with the noisy audio streams of the individual microphones provided in the real set. Furthermore, we perform data augmentation by modulating the amplitude and time-scale of the audio. Our proposed system achieves a word error rate of 4.22% on the real development data and 8.98% on the real evaluation data for the 6-channel track, and 6.45% and 13.69%, respectively, for the 2-channel track.
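
The abstract does not say how the amplitude and time-scale modulation is implemented. As a rough illustration only, the Python sketch below applies a random gain and a random time stretch to a waveform. The function names, the gain and rate ranges, and the use of plain linear-interpolation resampling (which also shifts pitch) are all assumptions made for this example, not the authors' method.

```python
import numpy as np


def scale_amplitude(signal, gain):
    """Multiply the waveform by a constant gain (hypothetical helper)."""
    return np.asarray(signal, dtype=np.float64) * gain


def stretch_time(signal, rate):
    """Change the duration by resampling with linear interpolation.

    rate > 1 shortens the signal, rate < 1 lengthens it. This simple
    stand-in also shifts pitch; it is an assumption for illustration.
    """
    signal = np.asarray(signal, dtype=np.float64)
    n_out = max(1, int(round(len(signal) / rate)))
    old_positions = np.arange(len(signal))
    new_positions = np.linspace(0.0, len(signal) - 1, n_out)
    return np.interp(new_positions, old_positions, signal)


def augment(signal, rng, gain_range=(0.5, 1.5), rate_range=(0.9, 1.1)):
    """Draw a random gain and stretch rate, then apply both.

    The default ranges are placeholders, not values from the paper.
    """
    gain = rng.uniform(*gain_range)
    rate = rng.uniform(*rate_range)
    return stretch_time(scale_amplitude(signal, gain), rate)
```

Calling `augment(waveform, np.random.default_rng(0))` several times on the same utterance would yield differently scaled and stretched copies that could be added to a training set alongside the original.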
Original language: English
Title of host publication: CHiME 4 Workshop
Publication status: Published - 2016
