Audio Signal Processing and Coding

This book splits the difference between a purely academic and a practical approach. It talks a great deal about the history of various audio representations and includes some derivations, but it also works practical numerical examples into the narrative to help explain the audio concepts. The exercises at the end of each chapter are on the practical side, stressing numerical problems and MATLAB computer exercises over pure derivations.
The first six chapters cover the basics that you need to know to understand or implement audio coding schemes. Chapter 2 reviews basic signal processing concepts associated with audio coding. Chapter 3 provides introductory material on waveform quantization and entropy coding schemes. Key topics in that chapter include scalar quantization, uniform and nonuniform quantization, pulse code modulation (PCM), differential PCM, adaptive DPCM, vector quantization, bit allocation schemes, and entropy coding techniques such as Huffman, Rice, and arithmetic methods. Chapter 4 covers linear prediction and its application in narrowband and wideband coding. Chapter 5 describes psychoacoustic principles and presents Johnston's notion of perceptual entropy as a measure of the fundamental limit of transparent compression for audio. Chapter 6, on filter bank design issues and algorithms, places particular emphasis on the modified discrete cosine transform (MDCT), which is widely used in perceptual audio coding algorithms. The chapter also addresses pre-echo artifacts and control strategies.
Chapters 7, 8, and 9 review established and emerging techniques for transparent coding of FM- and CD-quality audio signals, including several algorithms that have become international standards. Transform coding methodologies are described in Chapter 7, subband algorithms in Chapter 8, and sinusoidal algorithms in Chapter 9. Chapter 10 discusses the standardization activities in audio coding. It describes coding standards and products such as the ISO/IEC MPEG family, with details on popular standards such as the MP3 and MPEG-4 AAC algorithms. Chapter 11 focuses on lossless audio coding and digital audio watermarking techniques. In particular, SHORTEN, the DVD algorithm, MUSICompress, AudioPaK, and other such coding schemes are described in detail. Chapter 12 covers subjective quality measures for perceptual codecs, including the five-point absolute and differential subjective quality scales. A set of subjective benchmarks is provided for the various standards in both stereophonic and multichannel modes so that algorithms can be compared more easily.
If you've never been exposed to the subjects discussed in Chapters 1 through 6, you'll find this book rough going, since entire books have been written on subjects that this book covers in just one chapter each. However, I think it is a good review, and a good way for those accustomed to looking at these problems from a purely mathematical perspective to see them from the viewpoint of audio processing and coding and to see problems solved using MATLAB. Chapters 7 through 9 are very good at presenting and illustrating the various algorithms, but the level of detail seems to drop off in the final three chapters, starting with the sections on the MPEG standards in Chapter 10. This book is good for background and reference, but don't expect to be able to decode or encode anything based solely on what's presented here.

Author(s): Andreas Spanias, Ted Painter, Venkatraman Atti
Publisher: Wiley-Interscience
Year: 2007

Language: English
Pages: 486

AUDIO SIGNAL PROCESSING AND CODING
CONTENTS
PREFACE
1.1 Historical Perspective
1.2 A General Perceptual Audio Coding Architecture
1.3 Audio Coder Attributes
1.3.3 Complexity
1.4 Types of Audio Coders – An Overview
1.5 Organization of the Book
1.6 Notational Conventions
Computer Exercises
2.2 Spectra of Analog Signals
2.3 Review of Convolution and Filtering
2.4 Uniform Sampling
2.5.1 Transforms for Discrete-Time Signals
2.5.2 The Discrete and the Fast Fourier Transform
2.5.4 The Short-Time Fourier Transform
2.6 Difference Equations and Digital Filters
2.7 The Transfer and the Frequency Response Functions
2.7.1 Poles, Zeros, and Frequency Response
2.7.2 Examples of Digital Filters for Audio Applications
2.8.1 Down-sampling by an Integer
2.8.2 Up-sampling by an Integer
2.8.4 Quadrature Mirror Filter Banks
2.9 Discrete-Time Random Signals
2.9.1 Random Signals Processed by LTI Digital Filters
2.10 Summary
Problems
Computer Exercises
3.1 Introduction
3.1.1 The Quantization–Bit Allocation–Entropy Coding Module
3.2 Density Functions and Quantization
3.3.1 Uniform Quantization
3.3.2 Nonuniform Quantization
3.3.3 Differential PCM
3.4 Vector Quantization
3.4.1 Structured VQ
3.4.2 Split-VQ
3.4.3 Conjugate-Structure VQ
3.5 Bit-Allocation Algorithms
3.6 Entropy Coding
3.6.1 Huffman Coding
3.6.2 Rice Coding
3.6.3 Golomb Coding
3.6.4 Arithmetic Coding
Problems
Computer Exercises
4.1 Introduction
4.2 LP-Based Source-System Modeling for Speech
4.3 Short-Term Linear Prediction
4.3.1 Long-Term Prediction
4.4 Open-Loop Analysis-Synthesis Linear Prediction
4.5 Analysis-by-Synthesis Linear Prediction
4.5.1 Code-Excited Linear Prediction Algorithms
4.6.1 Wideband Speech Coding
4.6.2 Wideband Audio Coding
4.7 Summary
Problems
Computer Exercises
5.1 Introduction
5.2 Absolute Threshold of Hearing
5.3 Critical Bands
5.4 Simultaneous Masking, Masking Asymmetry, and the Spread of Masking
5.4.1 Noise-Masking-Tone
5.4.4 Asymmetry of Masking
5.4.5 The Spread of Masking
5.5 Nonsimultaneous Masking
5.6 Perceptual Entropy
5.7 Example Codec Perceptual Model: ISO/IEC 11172-3 (MPEG-1) Psychoacoustic Model 1
5.7.2 Step 2: Identification of Tonal and Noise Maskers
5.7.3 Step 3: Decimation and Reorganization of Maskers
5.7.4 Step 4: Calculation of Individual Masking Thresholds
5.8 Perceptual Bit Allocation
Problems
Computer Exercises
6.1 Introduction
6.2 Analysis-Synthesis Framework for M-band Filter Banks
6.3 Filter Banks for Audio Coding: Design Considerations
6.3.2 The Role of Frequency Resolution in Perceptual Bit Allocation
6.3.3 The Role of Time Resolution in Perceptual Bit Allocation
6.4 Quadrature Mirror and Conjugate Quadrature Filters
6.5 Tree-Structured QMF and CQF M-band Banks
6.6 Cosine Modulated “Pseudo QMF” M-band Banks
6.7 Cosine Modulated Perfect Reconstruction (PR) M-band Banks and the Modified Discrete Cosine Transform (MDCT)
6.7.2 MDCT Window Design
6.7.3 Example MDCT Windows (Prototype FIR Filters)
6.8 Discrete Fourier and Discrete Cosine Transform
6.9 Pre-echo Distortion
6.10.2 Window Switching
6.10.3 Hybrid, Switched Filter Banks
6.10.5 Temporal Noise Shaping
6.11 Summary
Problems
Computer Exercises
7.1 Introduction
7.2 Optimum Coding in the Frequency Domain
7.3 Perceptual Transform Coder
7.3.1 PXFM
7.3.2 SEPXFM
7.4 Brandenburg-Johnston Hybrid Coder
7.5.2 CNET MDCT Coder 1
7.5.3 CNET MDCT Coder 2
7.6 Adaptive Spectral Entropy Coding
7.7 Differential Perceptual Audio Coder
7.8 DFT Noise Substitution
7.9 DCT with Vector Quantization
7.10 MDCT with Vector Quantization
Problems
Computer Exercises
8.1 Introduction
8.1.1 Subband Algorithms
8.2 DWT and Discrete Wavelet Packet Transform (DWPT)
8.3.1 DWPT Coder with Globally Adapted Daubechies Analysis Wavelet
8.3.2 Scalable DWPT Coder with Adaptive Tree Structure
8.3.4 DWPT Coder with Adaptive Tree Structure and Locally Adapted Analysis Wavelet
8.3.5 DWPT Coder with Perceptually Optimized Synthesis Wavelets
8.4.1 Switched Nonuniform Filter Bank Cascade
8.5 Hybrid WP and Adapted WP/Sinusoidal Algorithms
8.5.1 Hybrid Sinusoidal/Classical DWPT Coder
8.5.2 Hybrid Sinusoidal/M-band DWPT Coder
8.5.3 Hybrid Sinusoidal/DWPT Coder with WP Tree Structure Adaptation (ARCO)
8.6 Subband Coding with Hybrid Filter Bank/CELP Algorithms
8.6.1 Hybrid Subband/CELP Algorithm for Low-Delay Applications
8.6.2 Hybrid Subband/CELP Algorithm for Low-Complexity Applications
Problems
Computer Exercise
9.1 Introduction
9.2.1 Sinusoidal Analysis and Parameter Tracking
9.2.2 Sinusoidal Synthesis and Parameter Interpolation
9.3 Analysis/Synthesis Audio Codec (ASAC)
9.3.3 ASAC Bit Allocation, Quantization, Encoding, and Scalability
9.4 Harmonic and Individual Lines Plus Noise Coder (HILN)
9.4.1 HILN Sinusoidal Analysis-by-Synthesis
9.5 FM Synthesis
9.5.2 Perceptual Audio Coding Using an FM Synthesis Model
9.6 The Sines + Transients + Noise (STN) Model
9.7 Hybrid Sinusoidal Coders
9.7.1 Hybrid Sinusoidal-MDCT Algorithm
9.7.2 Hybrid Sinusoidal-Vocoder Algorithm
Problems
Computer Exercises
10.1 Introduction
10.2.1 MIDI Synthesizer
10.2.3 MIDI Applications
10.3.1 The Evolution of Surround Sound
10.3.3 The ITU-R BS.775 5.1-Channel Configuration
10.4 MPEG Audio Standards
10.4.1 MPEG-1 Audio (ISO/IEC 11172-3)
10.4.2 MPEG-2 BC/LSF (ISO/IEC-13818-3)
10.4.3 MPEG-2 NBC/AAC (ISO/IEC-13818-7)
10.4.4 MPEG-4 Audio (ISO/IEC 14496-3)
10.4.5 MPEG-7 Audio (ISO/IEC 15938-4)
10.4.6 MPEG-21 Framework (ISO/IEC-21000)
10.5 Adaptive Transform Acoustic Coding (ATRAC)
10.6.1 Perceptual Audio Coder (PAC)
10.6.3 Multichannel PAC (MPAC)
10.7.1 Dolby AC-2, AC-2A
10.7.2 Dolby AC-3/Dolby Digital/Dolby SR·D
10.8 Audio Processing Technology APT-x100
10.9.1 Framing and Subband Analysis
10.9.3 ADPCM – Differential Subband Coding
10.9.4 Bit Allocation, Quantization, and Multiplexing
Computer Exercise
11.1 Introduction
11.2 Lossless Audio Coding (L²AC)
11.2.1 L²AC Principles
11.2.2 L²AC Algorithms
11.3 DVD-Audio
11.4 Super-Audio CD (SACD)
11.4.2 Sigma-Delta Modulators (SDM)
11.4.3 Direct Stream Digital (DSD) Encoding
11.5 Digital Audio Watermarking
11.5.1 Background
11.5.2 A Generic Architecture for DAW
11.5.3 DAW Schemes – Attributes
11.6 Summary of Commercial Applications
Computer Exercise
12.1 Introduction
12.2 Subjective Quality Measures
12.3 Confounding Factors in Subjective Evaluations
12.4 Subjective Evaluations of Two-Channel Standardized Codecs
12.5 Subjective Evaluations of 5.1-Channel Standardized Codecs
12.6 Subjective Evaluations Using Perceptual Measurement Systems
12.6.2 NSE Perceptual Measurement Schemes
12.7 Algorithms for Perceptual Measurement
12.7.1 Example: Perceptual Audio Quality Measure (PAQM)
12.7.2 Example: Noise-to-Mask Ratio (NMR)
12.7.3 Example: Objective Audio Signal Evaluation (OASE)
12.8 ITU-R BS.1387 and ITU-T P.861: Standards for Perceptual Quality Measurement
12.9 Research Directions for Perceptual Codec Quality Measures
REFERENCES
INDEX