Improvements in Speech Synthesis

Publisher: John Wiley, 2002, 407 pp.
Making machines speak like humans is a dream that is slowly coming to fruition. When the first automatic computer voices emerged from their laboratories twenty years ago, their robotic sound quality severely curtailed their general use. But now after a long period of maturation, synthetic speech is beginning to reach an initial level of acceptability. Some systems are so good that one even wonders if the recording was authentic or manufactured.
The effort to get to this point has been considerable. A variety of quite different technologies had to be developed, perfected and examined in depth, requiring skills and interdisciplinary efforts in mathematics, signal processing, linguistics, statistics, phonetics and several other fields. The current compendium of research on speech synthesis is quite representative of this effort, in that it presents work in signal processing as well as in linguistics and the phonetic sciences, performed with the explicit goal of arriving at a greater degree of naturalness in synthesised speech.
But more than just describing the status quo, the current volume points the way to the future. The researchers assembled here generally concur that the current, increasingly healthy state of speech synthesis is by no means the end of a technological development, but rather an excellent starting point. A great deal more work is still needed to bring much greater variety and flexibility to our synthetic voices, so that they can be used in a much wider set of everyday applications. That is what the current volume traces out in some detail.
Work in signal processing is perhaps the most crucial for the further success of speech synthesis, since it lays the theoretical and technological foundation for developments to come. But right behind follows more extensive research on prosody and styles of speech, work which will trace out the types of voices that will be appropriate to a variety of contexts. And finally, work on the increasingly standardised user interfaces in the form of system options and text mark-up is making it possible to open speech synthesis to a wide variety of non-specialist
The research published here emerges from the four-year European COST 258 project, which served primarily to assemble the authors of this volume in a set of twice-yearly meetings from 1997 to 2001.
Part 1 Issues in Signal Generation
Towards Greater Naturalness
Towards More Versatile Signal Generation Systems
Parametric Harmonic + Noise Model
COST 258 Signal Generation Test Array
Concatenative Text-to-Speech Synthesis Based on Sinusoidal Modelling
Shape Invariant Pitch & Time-Scale Modification of Speech Based on Harmonic Model
Concatenative Speech Synthesis using SRELP
Part 2 Issues in Prosody
Prosody in Synthetic Speech
State-of-the-Art Summary of European Synthetic Prosody R&D
Modelling F0 in Various Romance Languages
Acoustic Characterisation of Tonic Syllable in Portuguese
Prosodic Parameters of Synthetic Czech
MFGI, a Linguistically Motivated Quantitative Model of German Prosody
Improvements in Modelling F0 Contour for Different Types of Intonation Units in Slovene
Representing Speech Rhythm
Phonetic & Timing Considerations in Swiss High German TTS System
Corpus-Based Development of Prosodic Models across 6 Languages
Vowel Reduction in German Read Speech
Part 3 Issues in Styles of Speech
Variability & Speaking Styles in Speech Synthesis
Auditory Analysis of Prosody of Fast & Slow Speech Styles in English, Dutch & German
Automatic Prosody Modelling of Galician & its Application to Spanish
Reduction & Assimilatory Processes in Conversational French Speech: Implications for Speech Synthesis
Acoustic Patterns of Emotions
Role of Pitch & Tempo in Spanish Emotional Speech
Voice Quality & Synthesis of Affect
Prosodic Parameters of "Fun" Speaking Style
Dynamics of Glottal Source Signal
Nonlinear Rhythmic Component in Various Styles of Speech
Part 4 Issues in Segmentation & Mark-Up
Issues in Segmentation & Mark-Up
Use & Potential of Extensible Mark-Up (XML) in Speech Generation
Mark-Up for Speech Synthesis
Automatic Analysis of Prosody for Multi-Lingual Speech Corpora
Automatic Speech Segmentation Based on Alignment with Text-to-Speech System
Using COST 249 Reference Speech Recogniser for Automatic Speech Segmentation
Part 5 Future Challenges
Future Challenges
Towards Naturalness or Challenge of Subjectiveness
Synthesis within Multi-Modal Systems
Multi-Modal Speech Synthesis Tool applied to Audio-Visual Prosody
Interface Design for Speech Synthesis Systems

Author(s): Keller E., Bailly G., Monaghan A., Terken J., Huckvale M. (eds.)

Language: English
Tags: Computer science and engineering; Media data processing; Audio processing; Speech processing