An in-depth treatment of algorithms and standards for perceptual coding of high-fidelity audio, this self-contained reference surveys all aspects of the field. Coverage includes signal processing and perceptual (psychoacoustic) fundamentals, relevant research and signal models, standardization and applications, and performance measures and perceptual measurement systems. It includes a comprehensive bibliography with over 600 references, computer exercises, and MATLAB-based projects for use in EE multimedia, computer science, and DSP courses. Supplementary material, such as wave files and MATLAB programs and workspaces that students can use to solve some of the numerical problems and computer exercises in the book, is available from an FTP site at ftp://ftp.wiley.com/public/sci_tech_med/audio_signal
Author(s): Andreas Spanias, Ted Painter, Venkatraman Atti
Publisher: Wiley
Year: 2007
Language: English
Pages: 486
AUDIO SIGNAL PROCESSING AND CODING......Page 3
CONTENTS......Page 9
PREFACE......Page 17
1.1 Historical Perspective......Page 23
1.2 A General Perceptual Audio Coding Architecture......Page 26
1.3 Audio Coder Attributes......Page 27
1.3.3 Complexity......Page 28
1.4 Types of Audio Coders – An Overview......Page 29
1.5 Organization of the Book......Page 30
1.6 Notational Conventions......Page 31
Computer Exercises......Page 33
2.2 Spectra of Analog Signals......Page 35
2.3 Review of Convolution and Filtering......Page 38
2.4 Uniform Sampling......Page 39
2.5.1 Transforms for Discrete-Time Signals......Page 42
2.5.2 The Discrete and the Fast Fourier Transform......Page 44
2.5.4 The Short-Time Fourier Transform......Page 45
2.6 Difference Equations and Digital Filters......Page 47
2.7 The Transfer and the Frequency Response Functions......Page 49
2.7.1 Poles, Zeros, and Frequency Response......Page 51
2.7.2 Examples of Digital Filters for Audio Applications......Page 52
2.8.1 Down-sampling by an Integer......Page 55
2.8.2 Up-sampling by an Integer......Page 57
2.8.4 Quadrature Mirror Filter Banks......Page 58
2.9 Discrete-Time Random Signals......Page 61
2.9.1 Random Signals Processed by LTI Digital Filters......Page 64
2.10 Summary......Page 66
Problems......Page 67
Computer Exercises......Page 69
3.1 Introduction......Page 73
3.1.1 The Quantization–Bit Allocation–Entropy Coding Module......Page 74
3.2 Density Functions and Quantization......Page 75
3.3.1 Uniform Quantization......Page 76
3.3.2 Nonuniform Quantization......Page 79
3.3.3 Differential PCM......Page 81
3.4 Vector Quantization......Page 84
3.4.1 Structured VQ......Page 86
3.4.2 Split-VQ......Page 89
3.4.3 Conjugate-Structure VQ......Page 91
3.5 Bit-Allocation Algorithms......Page 92
3.6 Entropy Coding......Page 96
3.6.1 Huffman Coding......Page 99
3.6.2 Rice Coding......Page 103
3.6.3 Golomb Coding......Page 104
3.6.4 Arithmetic Coding......Page 105
Problems......Page 107
Computer Exercises......Page 108
4.1 Introduction......Page 113
4.2 LP-Based Source-System Modeling for Speech......Page 114
4.3 Short-Term Linear Prediction......Page 116
4.3.1 Long-Term Prediction......Page 117
4.4 Open-Loop Analysis-Synthesis Linear Prediction......Page 118
4.5 Analysis-by-Synthesis Linear Prediction......Page 119
4.5.1 Code-Excited Linear Prediction Algorithms......Page 122
4.6.1 Wideband Speech Coding......Page 124
4.6.2 Wideband Audio Coding......Page 126
4.7 Summary......Page 128
Problems......Page 129
Computer Exercises......Page 130
5.1 Introduction......Page 135
5.2 Absolute Threshold of Hearing......Page 136
5.3 Critical Bands......Page 137
5.4 Simultaneous Masking, Masking Asymmetry, and the Spread of Masking......Page 142
5.4.1 Noise-Masking-Tone......Page 145
5.4.4 Asymmetry of Masking......Page 146
5.4.5 The Spread of Masking......Page 147
5.5 Nonsimultaneous Masking......Page 149
5.6 Perceptual Entropy......Page 150
5.7 Example Codec Perceptual Model: ISO/IEC 11172-3 (MPEG-1) Psychoacoustic Model 1......Page 152
5.7.2 Step 2: Identification of Tonal and Noise Maskers......Page 153
5.7.3 Step 3: Decimation and Reorganization of Maskers......Page 157
5.7.4 Step 4: Calculation of Individual Masking Thresholds......Page 158
5.8 Perceptual Bit Allocation......Page 160
Problems......Page 162
Computer Exercises......Page 163
6.1 Introduction......Page 167
6.2 Analysis-Synthesis Framework for M-band Filter Banks......Page 168
6.3 Filter Banks for Audio Coding: Design Considerations......Page 170
6.3.2 The Role of Frequency Resolution in Perceptual Bit Allocation......Page 171
6.3.3 The Role of Time Resolution in Perceptual Bit Allocation......Page 172
6.4 Quadrature Mirror and Conjugate Quadrature Filters......Page 177
6.5 Tree-Structured QMF and CQF M-band Banks......Page 178
6.6 Cosine Modulated “Pseudo QMF” M-band Banks......Page 182
6.7 Cosine Modulated Perfect Reconstruction (PR) M-band Banks and the Modified Discrete Cosine Transform (MDCT)......Page 185
6.7.2 MDCT Window Design......Page 187
6.7.3 Example MDCT Windows (Prototype FIR Filters)......Page 189
6.8 Discrete Fourier and Discrete Cosine Transform......Page 200
6.9 Pre-echo Distortion......Page 202
6.10.2 Window Switching......Page 204
6.10.3 Hybrid, Switched Filter Banks......Page 206
6.10.5 Temporal Noise Shaping......Page 207
6.11 Summary......Page 208
Problems......Page 210
Computer Exercises......Page 213
7.1 Introduction......Page 217
7.2 Optimum Coding in the Frequency Domain......Page 218
7.3 Perceptual Transform Coder......Page 219
7.3.1 PXFM......Page 220
7.3.2 SEPXFM......Page 221
7.4 Brandenburg-Johnston Hybrid Coder......Page 222
7.5.2 CNET MDCT Coder 1......Page 223
7.5.3 CNET MDCT Coder 2......Page 224
7.6 Adaptive Spectral Entropy Coding......Page 225
7.7 Differential Perceptual Audio Coder......Page 226
7.8 DFT Noise Substitution......Page 227
7.9 DCT with Vector Quantization......Page 228
7.10 MDCT with Vector Quantization......Page 229
Problems......Page 230
Computer Exercises......Page 232
8.1 Introduction......Page 233
8.1.1 Subband Algorithms......Page 234
8.2 DWT and Discrete Wavelet Packet Transform (DWPT)......Page 236
8.3.1 DWPT Coder with Globally Adapted Daubechies Analysis Wavelet......Page 240
8.3.2 Scalable DWPT Coder with Adaptive Tree Structure......Page 242
8.3.4 DWPT Coder with Adaptive Tree Structure and Locally Adapted Analysis Wavelet......Page 245
8.3.5 DWPT Coder with Perceptually Optimized Synthesis Wavelets......Page 246
8.4.1 Switched Nonuniform Filter Bank Cascade......Page 248
8.5 Hybrid WP and Adapted WP/Sinusoidal Algorithms......Page 249
8.5.1 Hybrid Sinusoidal/Classical DWPT Coder......Page 250
8.5.2 Hybrid Sinusoidal/M-band DWPT Coder......Page 251
8.5.3 Hybrid Sinusoidal/DWPT Coder with WP Tree Structure Adaptation (ARCO)......Page 252
8.6 Subband Coding with Hybrid Filter Bank/CELP Algorithms......Page 255
8.6.1 Hybrid Subband/CELP Algorithm for Low-Delay Applications......Page 256
8.6.2 Hybrid Subband/CELP Algorithm for Low-Complexity Applications......Page 257
Problems......Page 259
Computer Exercise......Page 262
9.1 Introduction......Page 263
9.2.1 Sinusoidal Analysis and Parameter Tracking......Page 264
9.2.2 Sinusoidal Synthesis and Parameter Interpolation......Page 267
9.3 Analysis/Synthesis Audio Codec (ASAC)......Page 269
9.3.3 ASAC Bit Allocation, Quantization, Encoding, and Scalability......Page 270
9.4 Harmonic and Individual Lines Plus Noise Coder (HILN)......Page 271
9.4.1 HILN Sinusoidal Analysis-by-Synthesis......Page 272
9.5 FM Synthesis......Page 273
9.5.2 Perceptual Audio Coding Using an FM Synthesis Model......Page 274
9.6 The Sines + Transients + Noise (STN) Model......Page 276
9.7 Hybrid Sinusoidal Coders......Page 277
9.7.1 Hybrid Sinusoidal-MDCT Algorithm......Page 278
9.7.2 Hybrid Sinusoidal-Vocoder Algorithm......Page 279
Problems......Page 280
Computer Exercises......Page 281
10.1 Introduction......Page 285
10.2.1 MIDI Synthesizer......Page 286
10.2.3 MIDI Applications......Page 288
10.3.1 The Evolution of Surround Sound......Page 289
10.3.3 The ITU-R BS.775 5.1-Channel Configuration......Page 290
10.4 MPEG Audio Standards......Page 292
10.4.1 MPEG-1 Audio (ISO/IEC 11172-3)......Page 297
10.4.2 MPEG-2 BC/LSF (ISO/IEC-13818-3)......Page 301
10.4.3 MPEG-2 NBC/AAC (ISO/IEC-13818-7)......Page 305
10.4.4 MPEG-4 Audio (ISO/IEC 14496-3)......Page 311
10.4.5 MPEG-7 Audio (ISO/IEC 15938-4)......Page 331
10.4.6 MPEG-21 Framework (ISO/IEC-21000)......Page 339
10.5 Adaptive Transform Acoustic Coding (ATRAC)......Page 341
10.6.1 Perceptual Audio Coder (PAC)......Page 343
10.6.3 Multichannel PAC (MPAC)......Page 345
10.7.1 Dolby AC-2, AC-2A......Page 347
10.7.2 Dolby AC-3/Dolby Digital/Dolby SR·D......Page 349
10.8 Audio Processing Technology APT-x100......Page 357
10.9.1 Framing and Subband Analysis......Page 360
10.9.3 ADPCM – Differential Subband Coding......Page 361
10.9.4 Bit Allocation, Quantization, and Multiplexing......Page 363
Computer Exercise......Page 364
11.1 Introduction......Page 365
11.2 Lossless Audio Coding (L²AC)......Page 366
11.2.1 L²AC Principles......Page 367
11.2.2 L²AC Algorithms......Page 368
11.3 DVD-Audio......Page 378
11.4 Super-Audio CD (SACD)......Page 380
11.4.2 Sigma-Delta Modulators (SDM)......Page 384
11.4.3 Direct Stream Digital (DSD) Encoding......Page 386
11.5 Digital Audio Watermarking......Page 390
11.5.1 Background......Page 392
11.5.2 A Generic Architecture for DAW......Page 396
11.5.3 DAW Schemes – Attributes......Page 399
11.6 Summary of Commercial Applications......Page 400
Computer Exercise......Page 404
12.1 Introduction......Page 405
12.2 Subjective Quality Measures......Page 406
12.3 Confounding Factors in Subjective Evaluations......Page 408
12.4 Subjective Evaluations of Two-Channel Standardized Codecs......Page 409
12.5 Subjective Evaluations of 5.1-Channel Standardized Codecs......Page 410
12.6 Subjective Evaluations Using Perceptual Measurement Systems......Page 411
12.6.2 NSE Perceptual Measurement Schemes......Page 412
12.7 Algorithms for Perceptual Measurement......Page 413
12.7.1 Example: Perceptual Audio Quality Measure (PAQM)......Page 414
12.7.2 Example: Noise-to-Mask Ratio (NMR)......Page 418
12.7.3 Example: Objective Audio Signal Evaluation (OASE)......Page 421
12.8 ITU-R BS.1387 and ITU-T P.861: Standards for Perceptual Quality Measurement......Page 423
12.9 Research Directions for Perceptual Codec Quality Measures......Page 424
REFERENCES......Page 427
INDEX......Page 481