An in-depth treatment of algorithms and standards for perceptual coding of high-fidelity audio, this self-contained reference surveys all aspects of the field. Coverage includes signal processing and perceptual (psychoacoustic) fundamentals, relevant research and signal models, standardization and applications, and performance measures and perceptual measurement systems. It includes a comprehensive bibliography with over 600 references, computer exercises, and MATLAB-based projects for use in EE multimedia, computer science, and DSP courses. Supplementary material, such as wave files and the MATLAB programs and workspaces that students need to solve some of the numerical problems and computer exercises in the book, is available from the FTP site at ftp://ftp.wiley.com/public/sci_tech_med/audio_signal
Author(s): Andreas Spanias, Ted Painter, Venkatraman Atti
Edition: 1
Year: 2007
Language: English
Pages: 487
AUDIO SIGNAL PROCESSING AND CODING......Page 4
CONTENTS......Page 10
PREFACE......Page 18
1.1 Historical Perspective......Page 24
1.2 A General Perceptual Audio Coding Architecture......Page 27
1.3 Audio Coder Attributes......Page 28
1.3.3 Complexity......Page 29
1.4 Types of Audio Coders – An Overview......Page 30
1.5 Organization of the Book......Page 31
1.6 Notational Conventions......Page 32
Computer Exercises......Page 34
2.2 Spectra of Analog Signals......Page 36
2.3 Review of Convolution and Filtering......Page 39
2.4 Uniform Sampling......Page 40
2.5.1 Transforms for Discrete-Time Signals......Page 43
2.5.2 The Discrete and the Fast Fourier Transform......Page 45
2.5.4 The Short-Time Fourier Transform......Page 46
2.6 Difference Equations and Digital Filters......Page 48
2.7 The Transfer and the Frequency Response Functions......Page 50
2.7.1 Poles, Zeros, and Frequency Response......Page 52
2.7.2 Examples of Digital Filters for Audio Applications......Page 53
2.8.1 Down-sampling by an Integer......Page 56
2.8.2 Up-sampling by an Integer......Page 58
2.8.4 Quadrature Mirror Filter Banks......Page 59
2.9 Discrete-Time Random Signals......Page 62
2.9.1 Random Signals Processed by LTI Digital Filters......Page 65
2.10 Summary......Page 67
Problems......Page 68
Computer Exercises......Page 70
3.1 Introduction......Page 74
3.1.1 The Quantization–Bit Allocation–Entropy Coding Module......Page 75
3.2 Density Functions and Quantization......Page 76
3.3.1 Uniform Quantization......Page 77
3.3.2 Nonuniform Quantization......Page 80
3.3.3 Differential PCM......Page 82
3.4 Vector Quantization......Page 85
3.4.1 Structured VQ......Page 87
3.4.2 Split-VQ......Page 90
3.4.3 Conjugate-Structure VQ......Page 92
3.5 Bit-Allocation Algorithms......Page 93
3.6 Entropy Coding......Page 97
3.6.1 Huffman Coding......Page 100
3.6.2 Rice Coding......Page 104
3.6.3 Golomb Coding......Page 105
3.6.4 Arithmetic Coding......Page 106
Problems......Page 108
Computer Exercises......Page 109
4.1 Introduction......Page 114
4.2 LP-Based Source-System Modeling for Speech......Page 115
4.3 Short-Term Linear Prediction......Page 117
4.3.1 Long-Term Prediction......Page 118
4.4 Open-Loop Analysis-Synthesis Linear Prediction......Page 119
4.5 Analysis-by-Synthesis Linear Prediction......Page 120
4.5.1 Code-Excited Linear Prediction Algorithms......Page 123
4.6.1 Wideband Speech Coding......Page 125
4.6.2 Wideband Audio Coding......Page 127
4.7 Summary......Page 129
Problems......Page 130
Computer Exercises......Page 131
5.1 Introduction......Page 136
5.2 Absolute Threshold of Hearing......Page 137
5.3 Critical Bands......Page 138
5.4 Simultaneous Masking, Masking Asymmetry, and the Spread of Masking......Page 143
5.4.1 Noise-Masking-Tone......Page 146
5.4.4 Asymmetry of Masking......Page 147
5.4.5 The Spread of Masking......Page 148
5.5 Nonsimultaneous Masking......Page 150
5.6 Perceptual Entropy......Page 151
5.7 Example Codec Perceptual Model: ISO/IEC 11172-3 (MPEG-1) Psychoacoustic Model 1......Page 153
5.7.2 Step 2: Identification of Tonal and Noise Maskers......Page 154
5.7.3 Step 3: Decimation and Reorganization of Maskers......Page 158
5.7.4 Step 4: Calculation of Individual Masking Thresholds......Page 159
5.8 Perceptual Bit Allocation......Page 161
Problems......Page 163
Computer Exercises......Page 164
6.1 Introduction......Page 168
6.2 Analysis-Synthesis Framework for M-band Filter Banks......Page 169
6.3 Filter Banks for Audio Coding: Design Considerations......Page 171
6.3.2 The Role of Frequency Resolution in Perceptual Bit Allocation......Page 172
6.3.3 The Role of Time Resolution in Perceptual Bit Allocation......Page 173
6.4 Quadrature Mirror and Conjugate Quadrature Filters......Page 178
6.5 Tree-Structured QMF and CQF M-band Banks......Page 179
6.6 Cosine Modulated “Pseudo QMF” M-band Banks......Page 183
6.7 Cosine Modulated Perfect Reconstruction (PR) M-band Banks and the Modified Discrete Cosine Transform (MDCT)......Page 186
6.7.2 MDCT Window Design......Page 188
6.7.3 Example MDCT Windows (Prototype FIR Filters)......Page 190
6.8 Discrete Fourier and Discrete Cosine Transform......Page 201
6.9 Pre-echo Distortion......Page 203
6.10.2 Window Switching......Page 205
6.10.3 Hybrid, Switched Filter Banks......Page 207
6.10.5 Temporal Noise Shaping......Page 208
6.11 Summary......Page 209
Problems......Page 211
Computer Exercises......Page 214
7.1 Introduction......Page 218
7.2 Optimum Coding in the Frequency Domain......Page 219
7.3 Perceptual Transform Coder......Page 220
7.3.1 PXFM......Page 221
7.3.2 SEPXFM......Page 222
7.4 Brandenburg-Johnston Hybrid Coder......Page 223
7.5.2 CNET MDCT Coder 1......Page 224
7.5.3 CNET MDCT Coder 2......Page 225
7.6 Adaptive Spectral Entropy Coding......Page 226
7.7 Differential Perceptual Audio Coder......Page 227
7.8 DFT Noise Substitution......Page 228
7.9 DCT with Vector Quantization......Page 229
7.10 MDCT with Vector Quantization......Page 230
Problems......Page 231
Computer Exercises......Page 233
8.1 Introduction......Page 234
8.1.1 Subband Algorithms......Page 235
8.2 DWT and Discrete Wavelet Packet Transform (DWPT)......Page 237
8.3.1 DWPT Coder with Globally Adapted Daubechies Analysis Wavelet......Page 241
8.3.2 Scalable DWPT Coder with Adaptive Tree Structure......Page 243
8.3.4 DWPT Coder with Adaptive Tree Structure and Locally Adapted Analysis Wavelet......Page 246
8.3.5 DWPT Coder with Perceptually Optimized Synthesis Wavelets......Page 247
8.4.1 Switched Nonuniform Filter Bank Cascade......Page 249
8.5 Hybrid WP and Adapted WP/Sinusoidal Algorithms......Page 250
8.5.1 Hybrid Sinusoidal/Classical DWPT Coder......Page 251
8.5.2 Hybrid Sinusoidal/M-band DWPT Coder......Page 252
8.5.3 Hybrid Sinusoidal/DWPT Coder with WP Tree Structure Adaptation (ARCO)......Page 253
8.6 Subband Coding with Hybrid Filter Bank/CELP Algorithms......Page 256
8.6.1 Hybrid Subband/CELP Algorithm for Low-Delay Applications......Page 257
8.6.2 Hybrid Subband/CELP Algorithm for Low-Complexity Applications......Page 258
Problems......Page 260
Computer Exercise......Page 263
9.1 Introduction......Page 264
9.2.1 Sinusoidal Analysis and Parameter Tracking......Page 265
9.2.2 Sinusoidal Synthesis and Parameter Interpolation......Page 268
9.3 Analysis/Synthesis Audio Codec (ASAC)......Page 270
9.3.3 ASAC Bit Allocation, Quantization, Encoding, and Scalability......Page 271
9.4 Harmonic and Individual Lines Plus Noise Coder (HILN)......Page 272
9.4.1 HILN Sinusoidal Analysis-by-Synthesis......Page 273
9.5 FM Synthesis......Page 274
9.5.2 Perceptual Audio Coding Using an FM Synthesis Model......Page 275
9.6 The Sines + Transients + Noise (STN) Model......Page 277
9.7 Hybrid Sinusoidal Coders......Page 278
9.7.1 Hybrid Sinusoidal-MDCT Algorithm......Page 279
9.7.2 Hybrid Sinusoidal-Vocoder Algorithm......Page 280
Problems......Page 281
Computer Exercises......Page 282
10.1 Introduction......Page 286
10.2.1 MIDI Synthesizer......Page 287
10.2.3 MIDI Applications......Page 289
10.3.1 The Evolution of Surround Sound......Page 290
10.3.3 The ITU-R BS.775 5.1-Channel Configuration......Page 291
10.4 MPEG Audio Standards......Page 293
10.4.1 MPEG-1 Audio (ISO/IEC 11172-3)......Page 298
10.4.2 MPEG-2 BC/LSF (ISO/IEC-13818-3)......Page 302
10.4.3 MPEG-2 NBC/AAC (ISO/IEC-13818-7)......Page 306
10.4.4 MPEG-4 Audio (ISO/IEC 14496-3)......Page 312
10.4.5 MPEG-7 Audio (ISO/IEC 15938-4)......Page 332
10.4.6 MPEG-21 Framework (ISO/IEC-21000)......Page 340
10.5 Adaptive Transform Acoustic Coding (ATRAC)......Page 342
10.6.1 Perceptual Audio Coder (PAC)......Page 344
10.6.3 Multichannel PAC (MPAC)......Page 346
10.7.1 Dolby AC-2, AC-2A......Page 348
10.7.2 Dolby AC-3/Dolby Digital/Dolby SR · D......Page 350
10.8 Audio Processing Technology APT-x100......Page 358
10.9.1 Framing and Subband Analysis......Page 361
10.9.3 ADPCM – Differential Subband Coding......Page 362
10.9.4 Bit Allocation, Quantization, and Multiplexing......Page 364
Computer Exercise......Page 365
11.1 Introduction......Page 366
11.2 Lossless Audio Coding (L²AC)......Page 367
11.2.1 L²AC Principles......Page 368
11.2.2 L²AC Algorithms......Page 369
11.3 DVD-Audio......Page 379
11.4 Super-Audio CD (SACD)......Page 381
11.4.2 Sigma-Delta Modulators (SDM)......Page 385
11.4.3 Direct Stream Digital (DSD) Encoding......Page 387
11.5 Digital Audio Watermarking......Page 391
11.5.1 Background......Page 393
11.5.2 A Generic Architecture for DAW......Page 397
11.5.3 DAW Schemes – Attributes......Page 400
11.6 Summary of Commercial Applications......Page 401
Computer Exercise......Page 405
12.1 Introduction......Page 406
12.2 Subjective Quality Measures......Page 407
12.3 Confounding Factors in Subjective Evaluations......Page 409
12.4 Subjective Evaluations of Two-Channel Standardized Codecs......Page 410
12.5 Subjective Evaluations of 5.1-Channel Standardized Codecs......Page 411
12.6 Subjective Evaluations Using Perceptual Measurement Systems......Page 412
12.6.2 NSE Perceptual Measurement Schemes......Page 413
12.7 Algorithms for Perceptual Measurement......Page 414
12.7.1 Example: Perceptual Audio Quality Measure (PAQM)......Page 415
12.7.2 Example: Noise-to-Mask Ratio (NMR)......Page 419
12.7.3 Example: Objective Audio Signal Evaluation (OASE)......Page 422
12.8 ITU-R BS.1387 and ITU-T P.861: Standards for Perceptual Quality Measurement......Page 424
12.9 Research Directions for Perceptual Codec Quality Measures......Page 425
REFERENCES......Page 428
INDEX......Page 482