Deep Beamforming and Data Augmentation for Robust Speech Recognition: Results of the 4th CHiME Challenge

Tobias Schrank, Lukas Pfeifenberger, Matthias Zöhrer, Johannes Stahl, Pejman Mowlaee Beikzadehmahaleh, Franz Pernkopf

Publication: Contribution to book/report/conference proceedings › Conference paper › Peer-reviewed

Abstract

Robust automatic speech recognition in adverse environments is a challenging task. We address the multi-channel tracks of the 4th CHiME challenge [1] by proposing a deep eigenvector beamformer as the front-end. To train the acoustic models, we supplement the beamformed data with the noisy audio streams of the individual microphones provided in the real set. Furthermore, we perform data augmentation by modulating the amplitude and time scale of the audio. Our proposed system achieves a word error rate of 4.22% on the real development data and 8.98% on the real evaluation data for the 6-channel track, and 6.45% and 13.69%, respectively, for the 2-channel track.
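
The abstract does not say how the amplitude and time-scale modulation is implemented. As a rough illustration only, the sketch below shows one simple way such an augmentation could look: a random constant gain followed by linear-interpolation resampling. The function names, the gain and rate ranges, and the choice of plain resampling (which also shifts pitch) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np


def scale_amplitude(signal, gain):
    """Multiply the waveform by a constant gain factor."""
    return np.asarray(signal, dtype=np.float64) * gain


def scale_time(signal, rate):
    """Change the duration by `rate` using linear-interpolation resampling.

    rate > 1 shortens the signal (faster), rate < 1 lengthens it (slower).
    Plain resampling also shifts the pitch; this is a simplification.
    """
    signal = np.asarray(signal, dtype=np.float64)
    if rate <= 0:
        raise ValueError("rate must be positive")
    if signal.size == 0:
        return signal
    n_out = max(1, int(round(len(signal) / rate)))
    old_positions = np.arange(len(signal))
    new_positions = np.linspace(0, len(signal) - 1, n_out)
    return np.interp(new_positions, old_positions, signal)


def augment(signal, rng, gain_range=(0.5, 1.5), rate_range=(0.9, 1.1)):
    """Apply a random gain and a random time-scale factor (hypothetical ranges)."""
    gain = rng.uniform(*gain_range)
    rate = rng.uniform(*rate_range)
    return scale_time(scale_amplitude(signal, gain), rate)


# Example: augment one second of a 440 Hz tone sampled at 16 kHz.
rng = np.random.default_rng(0)
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 440.0 * t)
augmented = augment(tone, rng)
```

Each call to `augment` draws a new gain and rate, so repeated calls on the same utterance yield different training variants. A pitch-preserving time-stretch could be substituted for `scale_time` if the pitch shift is undesirable.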
Original language: English
Title: CHiME 4 Workshop
Publication status: Published - 2016
