Deep Beamforming and Data Augmentation for Robust Speech Recognition: Results of the 4th CHiME Challenge

Tobias Schrank; Lukas Pfeifenberger; Matthias Zöhrer; Johannes Stahl; Pejman Mowlaee Beikzadehmahaleh; Franz Pernkopf

Institute of Signal Processing and Speech Communication (4420)

Research output: Chapter in Book/Report/Conference proceeding › Conference paper › peer-review

Abstract

Robust automatic speech recognition in adverse environments is a challenging task. We address the 4th CHiME challenge [1] multi-channel tracks by proposing a deep eigenvector beamformer as front-end. To train the acoustic models, we propose to supplement the beamformed data by the noisy audio streams of the individual microphones provided in the real set. Furthermore, we perform data augmentation by modulating the amplitude and time-scale of the audio. Our proposed system achieves a word error rate of 4.22% on the real development and 8.98% on the real evaluation data for 6-channels and 6.45% and 13.69% for 2-channels, respectively.

Original language	English
Title of host publication	CHiME 4 Workshop
Publication status	Published - 2016

Cite this

@inproceedings{fe24b54ba85d49faaf15da237a9ae19f,

title = "Deep Beamforming and Data Augmentation for Robust Speech Recognition: Results of the 4th CHiME Challenge",

abstract = "Robust automatic speech recognition in adverse environments is a challenging task. We address the 4th CHiME challenge [1] multi-channel tracks by proposing a deep eigenvector beamformer as front-end. To train the acoustic models, we propose to supplement the beamformed data by the noisy audio streams of the individual microphones provided in the real set. Furthermore, we perform data augmentation by modulating the amplitude and time-scale of the audio. Our proposed system achieves a word error rate of 4.22% on the real development and 8.98% on the real evaluation data for 6-channels and 6.45% and 13.69% for 2-channels, respectively.",

author = "Tobias Schrank and Lukas Pfeifenberger and Matthias Z{\"o}hrer and Johannes Stahl and {Mowlaee Beikzadehmahaleh}, Pejman and Franz Pernkopf",

year = "2016",

language = "English",

booktitle = "CHiME 4 Workshop",

}

TY - GEN

T1 - Deep Beamforming and Data Augmentation for Robust Speech Recognition: Results of the 4th CHiME Challenge

AU - Schrank, Tobias

AU - Pfeifenberger, Lukas

AU - Zöhrer, Matthias

AU - Stahl, Johannes

AU - Mowlaee Beikzadehmahaleh, Pejman

AU - Pernkopf, Franz

PY - 2016

Y1 - 2016

N2 - Robust automatic speech recognition in adverse environments is a challenging task. We address the 4th CHiME challenge [1] multi-channel tracks by proposing a deep eigenvector beamformer as front-end. To train the acoustic models, we propose to supplement the beamformed data by the noisy audio streams of the individual microphones provided in the real set. Furthermore, we perform data augmentation by modulating the amplitude and time-scale of the audio. Our proposed system achieves a word error rate of 4.22% on the real development and 8.98% on the real evaluation data for 6-channels and 6.45% and 13.69% for 2-channels, respectively.

AB - Robust automatic speech recognition in adverse environments is a challenging task. We address the 4th CHiME challenge [1] multi-channel tracks by proposing a deep eigenvector beamformer as front-end. To train the acoustic models, we propose to supplement the beamformed data by the noisy audio streams of the individual microphones provided in the real set. Furthermore, we perform data augmentation by modulating the amplitude and time-scale of the audio. Our proposed system achieves a word error rate of 4.22% on the real development and 8.98% on the real evaluation data for 6-channels and 6.45% and 13.69% for 2-channels, respectively.

M3 - Conference paper

BT - CHiME 4 Workshop

ER -
