The DIRHA project addresses the development of voice-enabled automated home environments based on distant-speech interaction in different languages. A distributed microphone network is installed in the rooms of a house to selectively monitor the acoustic and speech activity observable in any space and, where appropriate, to run a spoken dialogue session with a given user in order to implement a service or to give access to appliances and other devices. The multi-microphone front-end is based on arrays of analog microphones or digital Micro Electro-Mechanical Systems (MEMS) microphones. The targeted system analyses the multi-space acoustic scene coherently, processing in parallel the simultaneous activities that occur in different rooms and, where necessary, supporting concurrent interaction with users who may be speaking in different areas of the house.
These very challenging objectives require advances in different scientific and technical fields. Based on the given network of microphone arrays, multi-microphone front-end processing includes, among others, tasks such as speaker localization, acoustic echo cancellation, speech enhancement, and acoustic event segmentation and classification. Robust technologies for distant-speech recognition and for speaker identification and verification are then necessary. Effective solutions for language modeling in the selected languages, speech understanding, and concurrent management of spoken dialogue interaction, together with the user interface and the integration of the resulting technological components, will also be fundamental to the implementation of the proposed smart home interface. The final prototype will be integrated in an automated home and evaluated by real users.