This volume presents an interdisciplinary approach to the study of second language prosody and computer modeling. It addresses the importance of prosody’s role in communication, bridging the gap between applied linguistics and computer science.
The book illustrates the growing importance of the relationship between automated speech recognition systems and language learning assessment in light of new technologies, and it shows how the study of prosody in this context can offer innovative insights into the computerized processing of natural discourse. It gives detailed accounts of the methods of analysis and computer models used and demonstrates how these models can be applied to L2 discourse analysis to predict real-world language use. Kang, Johnson, and Kermad also use these frameworks as a jumping-off point from which to propose new models of second language prosody and future directions for prosodic computer modeling more generally.
Making the case for the use of naturalistic data for real-world applications in empirical research, this volume will foster interdisciplinary dialogue among students and researchers in applied linguistics, speech communication, speech science, and computer engineering.
Author(s): Okim Kang, David O. Johnson, Alyssa Kermad
Series: Routledge Studies in Applied Linguistics
Edition: 1
Publisher: Routledge
Year: 2021
Language: English
Pages: 188
Cover
Half Title
Series Information
Title Page
Copyright Page
Table of Contents
Figures
Tables
Introduction
Organization of the Book
Part I Linguistic Foundations of Prosody
1 Overview of Prosody
1.1 What Is Prosody?
1.2 The Role of Prosody in Discourse
1.3 History of Prosodic Approaches
1.4 The British Tradition
1.5 The American Tradition
1.6 Summary
2 Frameworks of Prosody
2.1 Two Prosodic Frameworks
2.2 David Brazil’s Framework
Tone Unit
Context of Interaction
Prominence
Tone
Key and Termination
2.3 Janet Pierrehumbert’s and Julia Hirschberg’s Prosodic Framework
2.4 Summary
3 Prosodic Analyses of Natural Speech
3.1 Second Language (L2) Prosody
3.2 Segmental Properties in Discourse
3.3 Measuring Segmental Properties
3.3.1 Measuring Segmental Accuracy
3.3.2 Measuring Vowel Space
3.3.3 Measuring Vowel Duration
3.3.4 Measuring Voice Onset Time
3.4 Fluency in Discourse
3.5 Measuring Fluency
3.6 Word Stress in Discourse
3.7 Measuring Word Stress
3.8 Sentence Prominence in Discourse
3.9 Measuring Sentence Prominence
3.10 Pitch and Intonation in Discourse
3.11 Measuring Pitch and Intonation
3.12 Proficiency and Intelligibility
3.13 Summary
Part II Computer Applications of Prosody
4 Computerized Systems for Syllabification
4.1 Syllables and Automatic Syllabification
4.2 Machine Learning
4.3 Acoustic Algorithms for Syllabification
4.4 Phonetic Algorithms for Syllabification
4.4.1 Rule-Based Phonetic Algorithms
4.4.2 Data-Driven Phonetic Algorithms
4.5 Data-Driven Phonetic Syllabification Algorithm Implementations
4.5.1 Corpora
4.5.1.1 TIMIT Corpus
4.5.1.2 Boston University Radio News Corpus (BURNC)
4.5.2 Converting Audio Files to Noisy Phonetic Sequences
4.5.3 Syllable Alignment Error
4.5.4 Syllabification-By-Grouping
4.5.5 Sonority Scale
4.5.6 Syllabification By HMM
4.5.7 Syllabification By K-Means Clustering
4.5.8 Syllabification By Genetic Algorithm
4.5.9 Comparison of Syllabification Algorithms
4.6 Summary
5 Computerized Systems for Measuring Suprasegmental Features
5.1 Prominent Syllables
5.2 Pitch Contour Models
5.2.1 TILT Pitch Contour Model
5.2.2 Bézier Pitch Contour Model
5.2.3 Quantized Contour Model (QCM) Pitch Contour Model
5.2.4 4-Point Pitch Contour Model
5.3 Algorithms for Detecting Suprasegmental Features of the ToBI Model
5.3.1 ToBI (Tones and Break Indices) Labeling Scheme
5.3.2 Supervised Machine Learning Algorithms
5.3.3 Unsupervised Machine Learning Algorithms
5.3.4 Summary of Algorithms for Detecting Suprasegmental Features of the ToBI Model
5.4 Algorithms for Detecting Suprasegmental Features Motivated By Brazil’s Model
5.4.1 Algorithms for Detecting Prominent Syllables
5.4.2 Algorithms for Detecting Tone Choice
5.4.3 Algorithms for Detecting Tone Unit
5.4.4 Algorithms for Detecting Relative Pitch
5.4.5 Summary of Algorithms for Detecting Suprasegmental Features of Brazil’s Model
5.5 Algorithms for Calculating Suprasegmental Measures
5.6 Summary
Note
6 Computer Models for Predicting Oral Proficiency and Intelligibility
6.1 Kang and Johnson Computer Model for Automatically Scoring Oral Proficiency
6.1.1 Cambridge English Language Assessment (CELA) Corpus
6.1.2 Step 1: Translate the Sound Recording Into Phones and Silent Pauses
6.1.3 Step 2: Partition the Phones and Silent Pauses Into Tone Units
6.1.4 Step 3: Syllabify the Phones
6.1.5 Step 4: Locate the Filled Pauses
6.1.6 Step 5: Identify the Prominent Syllables
6.1.7 Step 6: Determine the Tone Choice
6.1.8 Step 7: Calculate the Relative Pitch
6.1.9 Step 8: Compute Suprasegmental Measures
6.1.10 Step 9: Estimate Oral Proficiency Score
6.2 Zechner et al.’s (2009) Multiple-Regression Model for Automatically Scoring Oral Proficiency
6.3 Zechner et al.’s (2009) Classification and Regression Trees (CART) Model for Automatically Scoring Oral Proficiency
6.4 Linear Regression Model for Automatically Scoring Oral Proficiency
6.5 Automated Evaluation of Non-Native English Pronunciation Quality
6.6 Johnson and Kang Computer Model for Automatically Scoring Intelligibility
6.6.1 World Englishes Speech Corpus
6.6.2 Computer Model for Predicting Intelligibility Scores
6.7 Comparison of Feature Selection Methods for Automated Speech Analysis Applications
6.7.1 Corpus
6.7.2 Feature Sets
6.7.3 Proficiency Score Predictions
6.8 Summary
Note
Part III The Future of Prosody Models
7 Future Research and Applications
7.1 Future Research and Directions
7.2 Critical Issues in ASR-Based Applications
7.3 Future Applications of Prosodic Models
7.4 Summary
Note
Useful Resources
References
Index