A probabilistic theory of pattern recognition

This document was uploaded by one of our users. The uploader already confirmed that they had the permission to publish it. If you are author/publisher or own the copyright of this documents, please report to us by using this DMCA report form.

Simply click on the Download Book button.

Yes, Book downloads on Ebookily are 100% Free.

Sometimes the book is free on Amazon As well, so go ahead and hit "Search on Amazon"

Author(s): Luc Devroye, Laszlo Györfi, Gabor Lugosi
Series: Stochastic Modelling and Applied Probability
Publisher: Springer
Year: 1996

Language: English
Pages: 654
Tags: Информатика и вычислительная техника;Искусственный интеллект;Распознавание образов;

Cover......Page 1
Series: Applications of Mathematics Volume 31......Page 2
A Probabilistic Theory of Pattern Recognition......Page 4
Copyright - ISBN: 0387946187......Page 5
Preface......Page 6
Contents......Page 10
1 Introduction......Page 18
2.1 The Bayes Problem......Page 26
2.2 A Simple Example......Page 28
2.3 Another Simple Example......Page 29
2.4 Other Formulas for the Bayes Risk......Page 31
2.5 Plug-In Decisions......Page 32
2.6 Bayes Error Versus Dimension......Page 34
Problems and Exercises......Page 35
3.1 Measuring Discriminatory Information......Page 38
3.3 The Nearest Neighbor Error......Page 39
3.4 The Bhattacharyya Affinity......Page 40
3.5 Entropy......Page 42
3.6 Jeffreys' Divergence......Page 44
3.7 F-Errors......Page 45
3.8 The Mahalanobis Distance......Page 47
3.9 f-Divergences......Page 48
Problems and Exercises......Page 52
4 Linear Discrimination......Page 56
4.1 Univariate Discrimination and Stoller Splits......Page 57
4.2 Linear Discriminants......Page 61
4.3 The Fisher Linear Discriminant......Page 63
4.4 The Normal Distribution......Page 64
4.5 Empirical Risk Minimization......Page 66
4.6 Minimizing Other Criteria......Page 71
Problems and Exercises......Page 73
5.1 Introduction......Page 78
5.2 Notation and Simple Asymptotics......Page 80
5.3 Proof of Stone's Lemma......Page 83
5.4 The Asymptotic Probability of Error......Page 86
5.5 The Asymptotic Error Probability of Weighted Nearest Neighbor Rules......Page 88
5.6 k-Nearest Neighbor Rules: Even k......Page 91
5.7 Inequalities for the Probability of Error......Page 92
5.8 Behavior When L* Is Small......Page 95
5.9 Nearest Neighbor Rules When L* = 0......Page 97
5.11 The (k, l)-Nearest Neighbor Rule......Page 98
Problems and Exercises......Page 100
6.1 Universal Consistency......Page 108
6.2 Classification and Regression Estimation......Page 109
6.3 Partitioning Rules......Page 111
6.4 The Histogram Rule......Page 112
6.5 Stone's Theorem......Page 114
6.6 The k-Nearest Neighbor Rule......Page 117
6.7 Classification Is Easier Than Regression Function Estimation......Page 118
6.8 Smart Rules......Page 123
Problems and Exercises......Page 124
7.1 Finite Training Sequence......Page 128
7.2 Slow Rates......Page 130
Problems and Exercises......Page 135
8.1 Error Counting......Page 138
8.2 Hoeffding's Inequality......Page 139
8.3 Error Estimation Without Testing Data......Page 141
8.4 Selecting Classifiers......Page 142
8.5 Estimating the Bayes Error......Page 145
Problems and Exercises......Page 146
9.1 The Method of Bounded Differences......Page 150
9.2 Strong Universal Consistency......Page 155
Problems and Exercises......Page 159
10 Kernel Rules......Page 164
10.1 Consistency......Page 166
10.2 Proof of the Consistency Theorem......Page 170
10.3 Potential Function Rules......Page 176
Problems and Exercises......Page 178
11 Consistency of the k-Nearest Neighbor Rule......Page 186
11.1 Strong Consistency......Page 187
11.2 Breaking Distance Ties......Page 191
11.3 Recursive Methods......Page 193
11.4 Scale-Invariant Rules......Page 194
11.5 Weighted Nearest Neighbor Rules......Page 195
11.6 Rotation-Invariant Rules......Page 196
11.7 Relabeling Rules......Page 197
Problems and Exercises......Page 199
12.1 Empirical Error Minimization......Page 204
12.2 Fingering......Page 208
12.3 The Glivenko-Cantelli Theorem......Page 209
12.4 Uniform Deviations of Relative Frequencies from Probabilities......Page 213
12.5 Classifier Selection......Page 216
12.6 Sample Complexity......Page 218
12.7 The Zero-Error Case......Page 219
12.8 Extensions......Page 223
Problems and Exercises......Page 225
13.1 Shatter Coefficients and VC Dimension......Page 232
13.2 Shatter Coefficients of Some Classes......Page 236
13.3 Linear and Generalized Linear Discrimination Rules......Page 241
13.4 Convex Sets and Monotone Layers......Page 243
Problems and Exercises......Page 246
14 Lower Bounds for Empirical Classifier Selection......Page 250
14.2 The Case L_c = 0......Page 251
14.3 Classes with Infinite VC Dimension......Page 255
14.4 The Case L_c > 0......Page 256
14.5 Sample Complexity......Page 262
Problems and Exercises......Page 264
15.1 Maximum Likelihood: The Formats......Page 266
15.2 The Maximum Likelihood Method: Regression Format......Page 267
15.3 Consistency......Page 270
15.4 Examples......Page 273
15.5 Classical Maximum Likelihood: Distribution Format......Page 277
Problems and Exercises......Page 278
16 Parametric Classification......Page 280
16.1 Example: Exponential Families......Page 283
16.2 Standard Plug-In Rules......Page 284
16.3 Minimum Distance Estimates......Page 287
16.4 Empirical Error Minimization......Page 292
Problems and Exercises......Page 293
17 Generalized Linear Discrimination......Page 296
17.1 Fourier Series Classification......Page 297
17.2 Generalized Linear Classification......Page 302
Problems and Exercises......Page 304
18 Complexity Regularization......Page 306
18.1 Structural Risk Minimization......Page 307
18.3 Simple Empirical Covering......Page 314
Problems and Exercises......Page 317
19.1 Condensed Nearest Neighbor Rules......Page 320
19.3 Sieves and Prototypes......Page 326
Problems and Exercises......Page 329
20 Tree Classifiers......Page 332
20.1 Invariance......Page 335
20.2 Trees with the X-Property......Page 336
20.3 Balanced Search Trees......Page 339
20.4 Binary Search Trees......Page 343
20.5 The Chronological k-d Tree......Page 345
20.6 The Deep k-d Tree......Page 349
20.7 Quadtrees......Page 350
20.8 Best Possible Perpendicular Splits......Page 351
20.9 Splitting Criteria Based on Impurity Functions......Page 353
20.10 A Consistent Splitting Criterion......Page 357
20.11 BSP Trees......Page 358
20.12 Primitive Selection......Page 360
20.13 Constructing Consistent Tree Classifiers......Page 363
20.14 A Greedy Classifier......Page 365
Problems and Exercises......Page 374
21.1 Introduction......Page 380
21.2 A Vapnik-Chervonenkis Inequality for Partitions......Page 381
21.3 Consistency......Page 385
21.4 Statistically Equivalent Blocks......Page 389
21.5 Partitioning Rules Based on Clustering......Page 394
21.6 Data-Based Scaling......Page 398
Problems and Exercises......Page 400
22.1 The Holdout Estimate......Page 404
22.2 Consistency and Asymptotic Optimality......Page 406
22.3 Nearest Neighbor Rules with Automatic Scaling......Page 408
22.4 Classification Based on Clustering......Page 409
22.5 Statistically Equivalent Blocks......Page 410
22.6 Binary Tree Classifiers......Page 411
Problems and Exercises......Page 412
23.1 The Resubstitution Estimate......Page 414
23.2 Histogram Rules......Page 416
23.3 Data-Based Histograms and Rule Selection......Page 420
Problems and Exercises......Page 422
24 Deleted Estimates of the Error Probability......Page 424
24.1 A General Lower Bound......Page 425
24.2 A General Upper Bound for Deleted Estimates......Page 428
24.3 Nearest Neighbor Rules......Page 430
24.4 Kernel Rules......Page 432
24.5 Histogram Rules......Page 434
Problems and Exercises......Page 436
25 Automatic Kernel Rules......Page 440
25.1 Consistency......Page 441
25.2 Data Splitting......Page 445
25.3 Kernel Complexity......Page 448
25.4 Multiparameter Kernel Rules......Page 452
25.5 Kernels of Infinite Complexity......Page 453
25.6 On Minimizing the Apparent Error Rate......Page 456
25.7 Minimizing the Deleted Estimate......Page 458
25.8 Sieve Methods......Page 461
25.9 Squared Error Minimization......Page 462
Problems and Exercises......Page 463
26.1 Consistency......Page 468
26.2 Data Splitting......Page 469
26.3 Data Splitting for Weighted NN Rules......Page 470
26.4 Reference Data and Data Splitting......Page 471
26.5 Variable Metric NN Rules......Page 472
26.6 Selection of k Based on the Deleted Estimate......Page 474
Problems and Exercises......Page 475
27.1 Multinomial Discrimination......Page 478
27.2 Quantization......Page 481
27.3 Independent Components......Page 483
27.4 Boolean Classifiers......Page 485
27.5 Series Methods for the Hypercube......Page 487
27.6 Maximum Likelihood......Page 489
Problems and Exercises......Page 491
28.1 Definitions......Page 496
28.2 Examples: Totally Bounded Classes......Page 497
28.3 Skeleton Estimates......Page 499
28.4 Rate of Convergence......Page 502
Problems and Exercises......Page 503
29.1 Minimizing the Empirical Squared Error......Page 506
29.2 Uniform Deviations of Averages from Expectations......Page 507
29.3 Empirical Squared Error Minimization......Page 510
29.4 Proof of Theorem 29.1......Page 511
29.5 Covering Numbers and Shatter Coefficients......Page 513
29.6 Generalized Linear Classification......Page 518
Problems and Exercises......Page 522
30.1 Multilayer Perceptrons......Page 524
30.2 Arrangements......Page 528
30.3 Approximation by Neural Networks......Page 534
30.4 VC Dimension......Page 538
30.5 L_1 Error Minimization......Page 543
30.6 The Adaline and Padaline......Page 548
30.7 Polynomial Networks......Page 549
30.8 Kolmogorov-Lorentz Networks and Additive Models......Page 551
30.9 Projection Pursuit......Page 555
30.10 Radial Basis Function Networks......Page 557
Problems and Exercises......Page 559
31.1 Smoothing the Error Count......Page 566
31.2 Posterior Probability Estimates......Page 571
31.4 Bootstrap......Page 573
Problems and Exercises......Page 576
32.1 Dimensionality Reduction......Page 578
32.2 Transformations with Small Distortion......Page 584
32.3 Admissible and Sufficient Transformations......Page 586
Problems and Exercises......Page 589
A.1 Basics of Measure Theory......Page 592
A.2 The Lebesgue Integral......Page 593
A.3 Denseness Results......Page 596
A.4 Probability......Page 598
A.5 Inequalities......Page 599
A.6 Convergence of Random Variables......Page 601
A.7 Conditional Expectation......Page 602
A.8 The Binomial Distribution......Page 603
A.10 The Multinomial Distribution......Page 606
A.12 The Multivariate Normal Distribution......Page 607
Notation......Page 608
References......Page 610
Author Index......Page 636
Subject Index......Page 644