Dialogue systems are a very appealing technology with an extraordinary future. Spoken, Multilingual and Multimodal Dialogue Systems: Development and Assessment addresses the great demand for information about the development of advanced dialogue systems that combine speech with other modalities under a multilingual framework. It aims to give a systematic overview of dialogue systems and recent advances in the practical application of spoken dialogue systems.

Spoken Dialogue Systems are computer-based systems developed to provide information and carry out simple tasks using speech as the interaction mode. Examples include travel information and reservation, weather forecast information, directory information and product ordering. Multimodal Dialogue Systems aim to overcome the limitations of spoken dialogue systems, which use speech as the only means of communication, while Multilingual Systems allow interaction with users who speak different languages.

Presents a clear snapshot of the structure of a standard dialogue system by addressing its key components in the context of multilingual and multimodal interaction and the assessment of spoken, multilingual and multimodal systems.
Describes the development and evaluation of these systems, in addition to the fundamentals of the technologies employed.
Highlights recent advances in the practical application of spoken dialogue systems.

This comprehensive overview is a must for graduate students and academics in the fields of speech recognition, speech synthesis, speech processing, language, and human–computer interaction technology. It will also prove to be a valuable resource to system developers working in these areas.
Author(s): Ramon Lopez Cozar Delgado, Masahiro Araki
Edition: 1
Year: 2005
Language: English
Pages: 272
Tags: Computer science and computing; Media data processing; Audio processing; Speech processing
Spoken, Multilingual and Multimodal Dialogue Systems
Contents
Preface
1.1 Human-Computer Interaction and Speech Processing
1.2 Spoken Dialogue Systems
1.2.1 Technological Precedents
1.3 Multimodal Dialogue Systems
1.5 Dialogue Systems Referenced in This Book
1.6 Area Organisation and Research Directions
1.7 Overview of the Book
1.8 Further Reading
2.1 Input Interface
2.1.1 Automatic Speech Recognition
2.1.2 Natural Language Processing
2.1.3 Face Localisation and Tracking
2.1.4 Gaze Tracking
2.1.5 Lip-reading Recognition
2.1.6 Gesture Recognition
2.1.7 Handwriting Recognition
2.2.1 Multimodal Data Fusion
2.2.2 Multimodal Data Storage
2.2.4 Task Module
2.2.5 Database Module
2.2.6 Response Generation
2.3.1 Graphic Generation
2.3.2 Natural Language Generation
2.3.3 Speech Synthesis
2.4 Summary
2.5 Further Reading
3.1.1 In Terms of System Input
3.1.2 In Terms of System Processing
3.1.3 In Terms of System Output
3.2.1 Development Techniques
3.2.2 Data Fusion
3.2.3 Architectures of Multimodal Systems
3.2.4 Animated Agents
3.2.5 Research Trends
3.3 Summary
3.4 Further Reading
4.1.1 Consideration of Alternatives in Multilingual Dialogue Systems
4.1.2 Interlingua Approach
4.1.3 Semantic Frame Conversion Approach
4.1.4 Dialogue-Control Centred Approach
4.2.1 MIT Voyager System
4.2.2 MIT Jupiter System
4.2.3 KIT System
4.3 Multilingual Dialogue Systems Based on Web Applications
4.3.2 Dialogue Systems Based on Web Applications
4.3.3 Multilingual Dialogue Systems Based on the MVC Framework
4.3.4 Implementation of Multilingual Voice Portals
4.5 Further Reading
5.1.1 Annotation of Spoken Dialogue Corpora
5.1.2 Annotation of Multimodal Dialogue Corpora
5.2 Dialogue Modelling
5.2.1 State-Transition Networks
5.2.2 Plans
5.3.1 Interaction Strategies
5.3.2 Confirmation Strategies
5.4.1 Interaction Complexity
5.4.2 Confirmations
5.4.3 Social and Emotional Dialogue
5.4.4 Contextual Information
5.4.5 User References
5.4.6 Response Generation
5.5.1 Reference Resolution in Multilingual Dialogue Systems
5.5.2 Ambiguity of Speech Acts in Multilingual Dialogue Systems
5.5.3 Differences in the Interactive Behaviour of Multilingual Dialogue Systems
5.6.1 Dialogue Task Classification
5.6.2 Task Modification in Each Task Class
5.7 Summary
5.8 Further Reading
6.1.1 Tools to Develop System Modules
6.1.2 Web-Oriented Standards and Tools for Spoken Dialogue Systems
6.1.3 Internet Portals
6.2.1 Web-Oriented Multimodal Dialogue
6.2.2 Face and Body Animation
6.2.3 System Development Tools
6.2.4 Multimodal Annotation Tools
6.4 Further Reading
7.1 Overview of Evaluation Techniques
7.1.1 Classification of Evaluation Techniques
7.2.1 Subsystem-Level Evaluation
7.2.2 End-to-End Evaluation
7.2.3 Dialogue Processing Evaluation
7.2.4 System-to-System Automatic Evaluation
7.3 Evaluation of Multimodal Dialogue Systems
7.3.1 System-Level Evaluation
7.3.2 Subsystem-Level Evaluation
7.3.3 Evaluation of Multimodal Data Fusion
7.3.4 Evaluation of Animated Agents
7.4 Summary
7.5 Further Reading
Appendix A Basic Tutorial on VoiceXML
Appendix B Multimodal Databases
Appendix C Coding Schemes for Multimodal Resources
Appendix D URLs of Interest
Appendix E List of Abbreviations
References
Index