Automatic speech recognition for conversational speech, or: What we can learn from human talk in interaction

Schuppler, B. (Speaker)

Institute of Signal Processing and Speech Communication (4420)

Activity: Talk or presentation › Invited talk › Science to science

Description

In the last decade, conversational speech has received a lot of attention among speech scientists. On the one hand, accurate automatic speech recognition (ASR) systems are essential for conversational dialogue systems, as these become more interactional and social rather than solely transactional. On the other hand, linguists study natural conversations because they reveal insights into how speech processing works that complement those from controlled experiments. Investigating conversational speech, however, requires not only applying existing methods to new data, but also developing new categories and new modeling techniques and including new knowledge sources. Whereas traditional models are trained on either text or acoustic information, I propose language models that incorporate information on the phonetic variation of words (i.e., pronunciation variation and prosody) and relate this information to the semantic context of the conversation and to the communicative functions in the conversation. This approach to language modeling is in line with the theoretical model proposed by Hawkins and Smith (2001), in which the perceptual system accesses meaning from speech by using the most salient sensory information from any combination of levels/layers of formal linguistic analysis. The overall aim of my research is to create cross-layer models for conversational speech. In this talk, I will illustrate general challenges for ASR with conversational speech, present results from my recent and ongoing projects on pronunciation and prosody modeling, and discuss directions for future research.

Period	31 Oct 2019
Held at	Brno University of Technology, Czech Republic
Degree of Recognition	Regional

Documents & Links

http://vgs-it.fit.vutbr.cz/2019/10/08/barbara-schuppler-automatic-speech-recognition-for-conversational-speech-or-what-we-can-learn-from-human-talk-in-interaction/

Related content

Publications

On the use of acoustic features for automatic disambiguation of homophones in spontaneous German

Introduction, or: why rethink reduction?

Prosodic Effects on Plosive Duration in German and Austrian German

Automatic detection of prosodic boundaries in two varieties of German

Prizes

Elise Richter

Projects

FWF - CLCS_2 - Cross-layer prosodic models for conversational speech