Atypical Speech

EURASIP Journal on Audio, Speech, and Music Processing, 2010, 90 pp.
One of the most important aspects of spoken language is its large degree of variability. Variability in speech has many different sources, for instance, changes in the acoustic environment or transmission channel, differences between speakers, and differences between speaking styles. Successful speech processing systems typically combine several different means to cope with the unwanted variability of the input signal. In the last two decades, considerable progress has been made in the areas of feature normalization, speaker-independent and speaker-adaptive acoustic modeling, and robust estimation methods for statistical language models. This has led to many useful applications of speech processing, such as spoken dialogue systems connected to the telephone network, medical dictation, broadcast news transcription, and spoken destination entry for in-car navigation systems. Unfortunately, the algorithms used in current systems for robust modeling, speaker normalization, and adaptation have many limitations, in particular for speech that deviates significantly from the data in the training corpus. Atypical speakers such as nonnative speakers, children, or members of the elderly population still lead to much higher error rates in state-of-the-art speech recognizers than normal, or typical, adult native speakers.
This significantly limits the practical applications of automatic speech processing. For instance, a spoken dialogue system should be able to understand any user, even if he or she belongs to the elderly population. Furthermore, the system should be able to react appropriately if the user’s emotional state changes. Software for computer-aided language learning needs to be able to cope with nonnative speech.
As past research has concentrated more on typical speech than on atypical speech, some important questions in this area are still largely unanswered. For instance, there is no good definition of the term atypical speech yet. The articles we present in this special issue investigate speech from speakers with disabilities, nonnative speech, children’s speech, speech from the elderly, speech with emotional content, and singing. For many types of variability, the reasons for the increased error rates are still unknown. Furthermore, it is unclear whether the error rates could be reduced by collecting adequate amounts of training (or adaptation) data or whether novel processing methods have to be developed. We hope that the papers in this special issue help to advance toward answers to these questions. The majority of the articles analyze the influence of atypical speech on automatic speech recognition performance in great detail and propose and evaluate different methods to reduce the error rates for atypical speech. Two papers investigate how different voice qualities can be distinguished automatically.
Atypical Speech
On the Impact of Children’s Emotional Speech on Acoustic and Language Models
Ageing Voices: The Effect of Changes in Voice Parameters on ASR Performance
Automatic Recognition of Lyrics in Singing
Exploring the Effect of Differences in the Acoustic Correlates of Adults’ and Children’s Speech in the Context of Automatic Speech Recognition
Optimizing Automatic Speech Recognition for Low-Proficient Non-Native Speakers
Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer
Analysis of the Roles and the Dynamics of Breathy and Whispery Voice Qualities in Dialogue Speech

Author(s): Stemmer G., Nöth E., Parsa V. (eds.)

Language: English
Tags: Computer science and computer engineering; Media data processing; Audio processing; Speech processing