Automatic Speech Translation: Fundamental Technology for Future Cross-Language Communications


Automatic Speech Translation introduces recent results of Japanese research and development in speech translation and speech recognition. Topics covered include: fundamental concepts of speech recognition; speech pattern representation; phoneme-based HMM phoneme recognition; continuous speech recognition; speaker adaptation; speaker-independent speech recognition; utterance analysis, utterance transfer, and utterance generation; contextual processing; speech synthesis; and an experimental system of speech translation. This book presents the complicated technological aspects of machine translation and speech recognition, and outlines the future directions of this rapidly developing area of technology.

Author(s): Akira Kurematsu, Tsuyoshi Morimoto
Series: Japanese Technology Reviews
Publisher: CRC Press
Year: 2023

Language: English
Pages: 131
City: Boca Raton

Cover
Half Title
Title Page
Copyright Page
Table of Contents
Preface to the Series
Preface
1. Introduction to Speech Translation
1.1. Introduction
1.2. Configuration of Automatic Speech Translation Systems
1.3. Requirements for an Automatic Speech Translation System
1.4. History in Brief
2. Speech Recognition
2.1. Introduction
2.2. Fundamental Concepts of Speech Recognition
2.2.1 The Concept of Speech Recognition
2.2.2 The Problem of Speech Recognition
2.3. Speech Pattern Representation
2.3.1 Characteristics of the Japanese Speech Signal
2.3.2 Representation of Acoustical Patterns of Speech
2.3.3 Signal Processing and Speech Analysis Methods
2.3.4 Discrete Representation of the Speech Pattern
2.3.5 Speech Units
2.4. Phoneme-Based HMM Phoneme Recognition
2.4.1 The Hidden Markov Model
2.4.2 Discrete HMM Phoneme Model
2.4.3 Continuous-Mixture HMM
2.4.4 Hidden Markov Network
2.4.5 Successive State Splitting Algorithm
2.5. Continuous Speech Recognition
2.5.1 Approach to Large-Vocabulary Continuous Speech Recognition
2.5.2 Measure of Task Complexity
2.6. HMM-LR Continuous Recognition
2.6.1 Outline of HMM-LR
2.6.2 Speech Recognition Using Context-free Grammar
2.6.3 LR Parsing
2.6.4 Generalized LR Parsing
2.6.5 Operation of HMM-LR Speech Recognition
2.6.6 Japanese Speech Recognition System by HMM-LR
2.6.7 Sentence Recognition Using Two-level LR Parsing
2.7. Speaker Adaptation
2.7.1 Speaker Adaptation by Vector Quantization
2.7.2 Speaker Adaptation by Vector Field Smoothing
2.7.3 Speaker Adaptation Based on Vector Field Smoothing with Continuous Mixture Density HMM
2.7.4 Speaker Adaptation of Hidden Markov Network
2.8. Speaker-Independent Speech Recognition
2.9. Performance Score of Continuous Speech Recognition
3. Language Translation of Spoken Language
3.1. Problems in Spoken Language Translation
3.2. Intention Translation
3.3. Unification-Based Utterance Analysis
3.3.1 Basic Concept of Parsing
3.3.2 Unification-Based Utterance Analysis
3.3.3 HPSG Style Grammar for Spoken Japanese
3.3.4 Syntactic and Semantic Constraints
3.3.5 Parsing Based on Unification Grammar
3.3.6 Resolution of Zero-Pronoun
3.3.7 Ruling out the Erroneous Spoken Input
3.3.8 Experimental Results
3.4. Utterance Transfer
3.4.1 The Transfer Process
3.4.2 The Use of Domain and Language Knowledge
3.5. Utterance Generation
3.5.1 Language Generation Based on Feature Structure
3.5.2 Knowledge Representation of Phrase Description
3.5.3 Generation Algorithm
3.5.4 Towards Efficient Generation
3.6. Contextual Processing Based on Dialogue Interpretation
3.6.1 Plan Recognition
3.6.2 Dialogue Interpretation
3.6.3 Contextual Processing
3.7. New Approach for Language Translation
3.7.1 Example-Based Language Translation
4. Speech Synthesis
4.1. Introduction
4.2. Speech Synthesis by Rule
4.2.1 Outline of Speech Synthesis by Rule
4.2.2 More Natural Sounding Speech
4.3. Speech Synthesis Using a Non-uniform Speech Unit
4.3.1 ATR ν-Talk System
4.3.2 Selection of Appropriate Speech Unit
4.3.3 Phonetic Tagging of Speech Data Set
4.3.4 Unit Combination Design
4.3.5 Unit Segment Reduction
4.4. Prosody Control
4.4.1 Segmental Duration Control
4.4.2 Amplitude Control
4.4.3 Fundamental Frequency Control
4.5. Voice Conversion
4.5.1 Voice Conversion Based on Vector Quantization
4.5.2 Generation of Mapping Codebook
4.5.3 Experiment on Voice Conversion
4.5.4 Cross-language Voice Conversion
5. Experimental System of Speech Translation
5.1. ASURA
5.1.1 Speech Recognition
5.1.2 Spoken Language Translation
5.1.3 Performance
5.2. International Joint Experiment on Interpreting Telephony
5.3. Intertalker
5.3.1 Overview
5.3.2 Speech Recognition
5.3.3 Language Translation
5.3.4 Speech Synthesis
5.3.5 Performance
6. Future Directions
6.1. Introduction
6.2. Future Directions of Speech Translation
6.2.1 Recognition of Spontaneous Speech
6.2.2 Prosody Extraction and Control in Speech Processing
6.2.3 Translation of Colloquial Utterance
6.2.4 Integrated Control of Speech and Language Processing
6.2.5 Mechanism of Spontaneous Speech Interpretation
6.3. International Cooperation
References
Index