Simulating Conversations for the Prediction of Speech Quality

This book presents a novel approach to predicting speech quality: simulating conversations and basing the prediction on the interactions of two simulated interlocutors. The author describes the setup of a simulation environment capable of simulating human dialogue at the speech level. The impact of delay and bursty packet loss on VoIP conversations is investigated and modeled for use in the simulation. Based on parameters extracted from simulated conversations, the author proposes extensions to the E-model, a parametric model standardized by the International Telecommunication Union, to predict the quality of the simulated conversations. The author shows that predictions based on the simulated conversations outperform models that rely on the transmission parameters alone.

Author(s): Thilo Michael
Series: T-Labs Series in Telecommunication Services
Publisher: Springer-T-Labs
Year: 2023

Language: English
Pages: 156
City: Berlin

Preface
Acknowledgments
Contents
Acronyms
1 Introduction
1.1 Motivation
1.2 Objective and Research Questions
1.3 Structure of This Book
2 Fundamentals
2.1 Speech Transmission
2.2 Speech Quality and Assessment
2.3 Conversational Quality
2.3.1 Standardized Conversation Tests
Short Conversation Test
Random Number Verification Task
2.3.2 Multidimensional Conversation Quality
2.3.3 Delay and Interactivity
2.3.4 Parametric Conversation Analysis
State Probabilities and Sojourn Times
Speaker Alternation Rate
Interruptions and Double Talk
Pauses
Conversational Temperature
2.4 Parametric Quality Prediction
2.4.1 E-Model
Narrowband E-Model
Wideband E-Model
Fullband E-Model
2.5 Signal-Based Quality Prediction
2.6 Hybrid Quality Prediction Models
2.6.1 Objective Conversational Speech Quality Model
2.6.2 Instrumental Diagnostic Conversational Quality
Listening Quality
Speaking Quality
Interaction Quality
Conversational Quality
2.7 Packet Loss and Understandability
2.8 Turn-Taking
2.9 Simulation of Dialogue
2.10 Incremental Dialogue Systems
3 Simulation Architecture
3.1 Retico Incremental Processing Framework
3.2 Simulation Datasets
3.2.1 SMISS Dataset
3.2.2 CONVSIM Dataset
3.2.3 UWS Dataset
3.3 Incremental Simulation Network
3.3.1 Speech Recognition and Natural Language Understanding
3.3.2 End-of-Turn Detection
3.3.3 Language Generation and Speech Synthesis
3.3.4 Speech Dispatching
3.3.5 Turn-Taking Dialogue Manager
Dialogue Management
Turn-Taking
3.3.6 Data Logging
3.3.7 Simulated Telephone Network
3.4 Evaluation of the Simulation Architecture
3.4.1 Dialogue Act Evaluation
3.4.2 Interactivity Evaluation
3.4.3 Simulation Performance
3.5 Summary
4 Simulating Interactivity and Delay
4.1 Simulating Turn-Taking in Conversations with Varying Interactivity
4.1.1 Turn-Taking on a Conversation Level
4.1.2 Modeling Turn-Taking on the Interaction Level
4.1.3 Evaluation of the Turn-Taking Model
4.2 Turn-Taking in Conversations with Delay
4.2.1 Impact of Delay on Conversations
4.2.2 Performance of the Turn-Taking Model
4.2.3 Adaptations of Turn-Taking for Delay
4.2.4 Evaluation of the Adapted Turn-Taking Model
4.3 Summary
5 Simulating Conversation Disruptions and Packet Loss
5.1 Interactivity in Conversations with Packet Loss
5.2 Disruptions in Conversations with Packet Loss
5.3 Simulating Conversations with Bursty Packet Loss
5.3.1 Modeling Conversation Disruptions in a Simulation
5.3.2 Modeling Turn-Taking in a Simulation with Packet Loss
5.4 Evaluation of Simulations with Disruptions and Packet Loss
5.5 Summary
6 Conversational Quality Predictions
6.1 Predicting Quality of Conversations with Delay
6.1.1 E-Model Extension for Interactivity and Delay
6.1.2 Quality Prediction from Interactivity Parameters
6.2 Predicting Quality of Conversations with Packet Loss
6.2.1 Bursty Packet Loss E-Model Extension
6.2.2 Conversation Disruptions and Quality
6.2.3 Interaction Between Delay and Packet Loss
6.3 Predicting Quality from Simulations with Delay
6.3.1 Prediction from Interactivity Parameters of Simulations
6.3.2 Prediction from Extended E-Model
6.4 Predicting Quality from Simulations with Packet Loss and Delay
6.5 Summary
7 Conclusions and Future Work
7.1 Future Work
A Short Conversation Test (SCT)
B Random Number Verification (RNV) Task
C Agenda of Simulated Agents
References
Index