Author(s): Gerhard Tutz, Ludwig-Maximilians-Universität München
Series: Cambridge Series in Statistical and Probabilistic Mathematics
Edition: 1st edition
Publisher: Cambridge University Press
Year: 2012
Language: English
Pages: 574
Tags: Regression, Categorical Data
Cover......Page 1
Regression for Categorical Data......Page 3
Title......Page 5
Copyright......Page 6
Contents......Page 7
Preface......Page 11
1.1.1 Some Examples......Page 13
Scale Levels: Nominal and Ordinal Variables......Page 16
1.2 Organization of This Book......Page 17
1.3.1 Structured Univariate Regression......Page 18
Structuring the Influential Term......Page 19
Linear Predictor......Page 20
Categorical Explanatory Variables......Page 21
Additive Predictor......Page 22
The Link between Covariates and Response......Page 23
1.3.3 Multivariate Regression......Page 24
Structuring the Influential Term......Page 25
1.3.4 Statistical Modeling......Page 26
Binary Explanatory Variables......Page 27
Multicategorical Explanatory Variables or Factors......Page 28
1.4.2 Linear Regression in Matrix Notation......Page 30
Least-Squares Estimation......Page 31
Properties of Estimates......Page 32
1.4.4 Residuals and Hat Matrix......Page 33
Case Deletion as Diagnostic Tool......Page 34
1.4.5 Decomposition of Variance and Coefficient of Determination......Page 35
1.4.6 Testing in Multiple Linear Regression......Page 37
Submodels and the Testing of Linear Hypotheses......Page 38
1.5 Exercises......Page 39
2.1.1 Single Binary Variables......Page 41
Odds, Logits, and Odds Ratios......Page 43
Comparing Two Groups......Page 44
2.2.1 Deficiencies of Linear Models......Page 45
Binary Responses as Dichotomized Latent Variables......Page 46
Modeling the Common Distribution of a Binary and a Continuous Distribution......Page 47
Basic Form of Binary Regression Models......Page 48
2.3.2 Logit Model with Continuous Predictor......Page 49
Multivariate Predictor......Page 53
2.3.3 Logit Model with Binary Predictor......Page 54
Logit Model with (0-1)-Coding of Covariates......Page 55
Logit Model with Effect Coding......Page 56
Logit Model with (0-1)-Coding......Page 57
Logit Model with Effect Coding......Page 58
Logit Model with Several Categorical Predictors......Page 59
2.4 The Origins of the Logistic Function and the Logit Model......Page 60
2.5 Exercises......Page 61
3.1 Basic Structure......Page 63
3.2.2 Exponential Distribution......Page 65
3.2.3 Gamma-Distributed Responses......Page 66
3.2.4 Inverse Gaussian Distribution......Page 67
3.3.1 Models for Binary Data......Page 68
3.3.2 Models for Binomial Data......Page 69
3.3.3 Poisson Model for Count Data......Page 70
3.3.4 Negative Binomial Distribution......Page 71
3.4.1 Means and Variances......Page 72
3.4.2 Canonical Link......Page 73
3.5 Modeling of Grouped Data......Page 74
Log-Likelihood and Score Function......Page 75
Information Matrix......Page 77
3.7.1 The Deviance......Page 79
Analysis of Deviance......Page 81
3.7.3 Alternative Test Statistics for Linear Hypotheses......Page 83
3.8.1 The Deviance for Grouped Observations......Page 84
3.8.2 Pearson Statistic......Page 86
3.9 Computation of Maximum Likelihood Estimates......Page 87
3.10 Hat Matrix for Generalized Linear Models......Page 88
3.11 Quasi-Likelihood Modeling......Page 90
3.13 Exercises......Page 91
Chapter 4 Modeling of Binary Data......Page 93
4.1 Maximum Likelihood Estimation......Page 94
Single Binary Responses......Page 95
Grouped Data: Estimation for Binomially Distributed Responses......Page 96
Estimation Conditioned on Predictor Values......Page 97
4.2.1 The Deviance......Page 99
Deviance as Goodness-of-Fit Statistic......Page 101
4.2.2 Pearson Statistic......Page 102
Alternative Tests......Page 104
4.3.1 Residuals......Page 105
4.3.2 Hat Matrix and Influential Observations......Page 108
4.3.3 Case Deletion......Page 109
4.4.1 The Linear Predictor: Continuous Predictors, Factors, and Interactions......Page 113
The Crossing Operator......Page 115
4.4.2 Testing Components of the Linear Predictor......Page 116
4.4.3 Ordered Categorical Predictors......Page 120
4.5 Comparing Non-Nested Models......Page 125
4.6 Explanatory Value of Covariates......Page 126
4.6.1 Measures of Residual Variation......Page 127
4.6.3 Correlation-Based Measures......Page 129
4.7 Further Reading......Page 131
4.8 Exercises......Page 132
Probit Model......Page 135
Complementary Log-Log Model and Log-Log Model......Page 136
Exponential Model......Page 137
Cauchy Model......Page 138
5.1.2 Comparing Link Functions......Page 139
5.1.3 Choice between Models and Advantages of Logit Models......Page 141
Parametric Families of Link Functions......Page 142
Non-Parametric Fitting of Link Functions......Page 143
5.3 Overdispersion......Page 144
Unobserved Heterogeneity......Page 145
5.3.2 Beta-Binomial Model......Page 146
5.3.3 Generalized Estimation Functions and Quasi-Likelihood......Page 147
The Hypergeometric Distribution......Page 150
5.6 Exercises......Page 152
Chapter 6 Regularization and Variable Selection for Parametric Models......Page 155
6.1 Classical Subset Selection......Page 156
6.2 Regularization by Penalization......Page 157
6.2.1 Ridge Regression......Page 159
6.2.2 L1-Penalty: The Lasso......Page 161
The Adaptive Lasso......Page 164
Categorical Predictors and the Group Lasso......Page 165
6.2.3 The Elastic Net......Page 166
6.2.4 Alternative Estimators with Grouping Effect......Page 167
OSCAR......Page 168
Correlation-Based Penalties......Page 169
6.2.5 SCAD......Page 171
6.2.6 The Dantzig Selector......Page 174
6.3.1 Boosting for Linear Models......Page 175
6.3.2 Boosting for Generalized Linear Models......Page 177
Blockwise Boosting......Page 178
6.4 Simultaneous Selection of Link Function and Predictors......Page 182
6.5.1 Selection within Categorical Predictors......Page 185
Ordered Categories......Page 186
6.5.2 Selection of Variables Combined with Clustering of Categories......Page 187
6.5.3 Selection of Variables......Page 188
6.6 Bayesian Approach......Page 190
6.8 Exercises......Page 191
Chapter 7 Regression Analysis of Count Data......Page 193
7.1 The Poisson Distribution......Page 194
Poisson Distribution as the Law of Rare Events......Page 195
Poisson Process......Page 196
7.2 Poisson Regression Model......Page 197
Deviance and Goodness-of-Fit......Page 198
Testing of Hierarchical Models......Page 199
7.4 Poisson Regression with an Offset......Page 202
Model with Overdispersion Parameter......Page 204
Alternative Variance Functions......Page 205
7.6 Negative Binomial Model and Alternatives......Page 206
Negative Binomial Model as Gamma-Poisson-Model......Page 207
7.7 Zero-Inflated Counts......Page 210
7.8 Hurdle Models......Page 212
7.9 Further Reading......Page 215
7.10 Exercises......Page 216
Chapter 8 Multinomial Response Models......Page 219
8.1 The Multinomial Distribution......Page 221
8.2 The Multinomial Logit Model......Page 222
Side Constraints......Page 224
8.4 Structuring the Predictor......Page 227
8.5 Logit Model as Multivariate Generalized Linear Model......Page 229
8.6.1 Maximum Likelihood Estimation......Page 230
Separate Fitting of Binary Models......Page 231
Pearson Statistic......Page 232
Power-Divergence Family......Page 233
Residuals......Page 234
8.7 Multinomial Models with Hierarchically Structured Response......Page 235
Random Utility Models......Page 238
Independence from Irrelevant Alternatives......Page 240
Pair Comparison Models......Page 241
8.9 Nested Logit Model......Page 243
Ridge-Type Penalties......Page 245
8.11 Further Reading......Page 250
8.12 Exercises......Page 251
Chapter 9 Ordinal Response Models......Page 253
9.1.1 Simple Cumulative Model......Page 255
Cumulative Extreme Value Models......Page 258
Probit Model......Page 260
9.1.3 General Cumulative Models......Page 261
9.1.4 Testing the Proportional Odds Assumption......Page 263
9.2 Sequential Models......Page 264
9.2.1 Basic Model......Page 265
9.3.2 Equivalence of Cumulative and Sequential Models......Page 267
9.3.3 Cumulative Models versus Sequential Models......Page 268
9.4.2 Hierarchically Structured Models......Page 269
9.4.5 Adjacent Categories Logits......Page 272
9.5.1 Cumulative Models......Page 273
Maximum Likelihood Estimation by Binary Response Models......Page 275
9.7 Exercises......Page 277
10.1 Univariate Generalized Non-Parametric Regression......Page 281
10.1.1 Regression Splines and Basis Expansions......Page 282
Regression Splines: Truncated Power Series Basis......Page 283
Representation by B-Splines......Page 284
Alternative Basis Functions......Page 285
10.1.2 Smoothing Splines......Page 286
10.1.3 Penalized Estimation......Page 288
Penalized Splines......Page 289
Maximizing the Penalized Likelihood......Page 290
Effective Degrees of Freedom of a Smoother......Page 291
10.1.4 Local Regression......Page 293
Cross-Validation......Page 295
Likelihood-Based Approaches......Page 296
10.2.1 Expansion in Basis Functions......Page 297
10.2.2 Smoothing Splines......Page 298
10.2.3 Penalized Regression Splines......Page 299
10.2.4 Local Estimation in Two Dimensions......Page 300
10.3 Structured Additive Regression......Page 301
Penalized Regression Splines......Page 302
Selection of Smoothing Parameters......Page 303
Simultaneous Selection of Variables and Amount of Smoothing......Page 304
Univariate Smoothers and the Backfitting Algorithm......Page 306
10.3.2 Extension to Multicategorical Response......Page 310
Varying-Coefficients Models......Page 312
Continuous-by-Continuous Interactions......Page 313
Estimation......Page 314
10.3.4 Structured Additive Regression Modeling......Page 316
10.4 Functional Data and Signal Regression......Page 319
10.4.1 Functional Model for Univariate Response......Page 320
10.4.3 Penalized Signal Regression......Page 321
10.4.5 Feature Extraction in Signal Regression......Page 322
10.5 Further Reading......Page 325
10.6 Exercises......Page 326
11.1 Regression and Classification Trees......Page 329
Test-Based Splits......Page 331
Dichotomous Responses......Page 332
Multicategorical Response......Page 333
Splitting by Impurity Measures......Page 334
11.1.3 Size of a Tree......Page 336
11.1.4 Advantages and Disadvantages of Trees......Page 337
Random Forests......Page 338
Maximally Selected Statistics......Page 339
11.2 Multivariate Adaptive Regression Splines......Page 340
11.4 Exercises......Page 341
Chapter 12 The Analysis of Contingency Tables: Log-Linear and Graphical Models......Page 343
12.1 Types of Contingency Tables......Page 344
Type 3: Product-Multinomial Distribution (Treatment and Pain)......Page 345
Multinomial and Product-Multinomial Distributions......Page 346
12.2 Log-Linear Models for Two-Way Tables......Page 347
12.3 Log-Linear Models for Three-Way Tables......Page 350
Type 3: Product-Multinomial Distribution......Page 351
Type 0: Saturated Model......Page 353
Type 2: Only Two Two-Factor Interactions Contained......Page 354
Type 4: Main Effects Model......Page 356
12.5 Log-Linear and Graphical Models for Higher Dimensions......Page 357
Graphical Models......Page 358
12.6 Collapsibility......Page 360
Logit Models with Selected Response Variables......Page 361
12.8.1 Maximum Likelihood Estimates and Minimal Sufficient Statistics......Page 362
12.8.2 Testing and Goodness-of-Fit......Page 365
12.9 Model Selection and Regularization......Page 366
12.10 Mosaic Plots......Page 369
12.11 Further Reading......Page 370
12.12 Exercises......Page 371
Chapter 13 Multivariate Response Models......Page 375
Subject-Specific Models......Page 376
13.1.1 Transition Models and Response Variables with a Natural Order......Page 377
Estimation......Page 378
13.1.2 Symmetric Conditional Models......Page 380
13.2 Marginal Parametrization and Generalized Log-Linear Models......Page 382
13.3 General Marginal Models: Association as Nuisance and GEEs......Page 383
Working Correlation Matrices......Page 385
Specification by Odds Ratios......Page 387
13.3.1 Generalized Estimation Approach......Page 388
Structured Correlation Model......Page 389
Asymptotic Properties and Extensions......Page 390
Types of Covariates and Loss of Efficiency......Page 393
13.3.2 Marginal Models for Multinomial Responses......Page 394
13.3.3 Penalized GEE Approaches......Page 395
13.3.4 Generalized Additive Marginal Models......Page 396
13.4 Marginal Homogeneity......Page 397
13.4.1 Marginal Homogeneity for Dichotomous Outcome......Page 398
Likelihood Ratio Statistic and Alternatives......Page 399
13.4.2 Regression Approach to Marginal Homogeneity......Page 400
Conditional Logit Models......Page 401
Conditional Maximum Likelihood Estimation......Page 402
13.4.3 Marginal Homogeneity for Multicategorical Outcome......Page 403
13.5 Further Reading......Page 404
13.6 Exercises......Page 405
Chapter 14 Random Effects Models and Finite Mixtures......Page 407
14.1.1 Random Effects for Clustered Data......Page 408
Random Intercept Model......Page 409
14.1.2 General Linear Mixed Model......Page 410
Maximum Likelihood......Page 411
Best Linear Unbiased Prediction (BLUP)......Page 412
Logistic-Normal Model......Page 414
Probit-Normal Model and Alternatives......Page 416
14.2.3 Generalized Linear Mixed Models for Clustered Data......Page 417
Random Slopes......Page 418
14.3.1 Marginal Maximum Likelihood Estimation by Integration Techniques......Page 419
Indirect Maximization Based on the EM Algorithm......Page 421
Motivation as Posterior Mode Estimator......Page 423
Solution of the Penalized Likelihood Problem......Page 424
14.3.4 Estimation of Variances......Page 425
Error Approximation......Page 426
14.3.5 Bayesian Approaches......Page 427
Ordered Response Categories......Page 428
Mixed Multinomial Logit Model......Page 430
14.5 The Marginalized Random Effects Model......Page 431
14.7 Semiparametric Mixed Models......Page 432
14.8 Finite Mixture Models......Page 434
Estimation......Page 435
14.9 Further Reading......Page 438
14.10 Exercises......Page 439
Chapter 15 Prediction and Classification......Page 441
15.1 Basic Concepts of Prediction......Page 442
Estimated Prediction Rule......Page 443
15.1.1 Squared Error Loss......Page 444
15.1.2 Discrete Data......Page 445
Direct Prediction......Page 446
Prediction Based on Estimated Probabilities......Page 447
Loss for Univariate Discrete Response......Page 449
15.2.1 Bayes Rule and the Minimization of the Rate of Misclassification......Page 450
15.2.2 Classification with Discriminant Functions......Page 452
15.2.3 Discrimination with Normally Distributed Predictors......Page 454
15.2.4 Bayes Rule for General Loss Functions......Page 456
15.3.1 Samples and Error Rates......Page 457
Empirical Error Rates......Page 458
Receiver Operating Characteristic Curves (ROC Curves)......Page 460
Plug-In Rules......Page 463
Fisher’s Discriminant Analysis......Page 464
Quadratic Discrimination......Page 465
Regularized Discriminant Analysis......Page 466
15.4.3 Linear Separation and Support Vector Classifiers......Page 467
15.5.1 Nearest Neighborhood Methods......Page 469
15.5.2 Random Forests and Ensemble Methods......Page 470
Early Boosting Approaches: AdaBoost......Page 472
Functional Gradient Descent Boosting......Page 473
Likelihood-Based Boosting......Page 478
Multiple Category Case......Page 479
15.6.1 Feed-Forward Networks......Page 480
15.6.2 Radial Basis Function Networks......Page 482
15.7 Examples......Page 483
15.8 Variable Selection in Classification......Page 485
15.9 Prediction of Ordinal Outcomes......Page 486
15.9.1 Ordinal Response Models......Page 487
15.9.2 Aggregation over Binary Splits......Page 488
15.10 Model-Based Prediction......Page 492
15.11 Further Reading......Page 493
15.12 Exercises......Page 494
A.1.3 Negative Binomial Distribution......Page 497
A.1.6 Multinomial Distribution......Page 498
A.2.4 Gompertz or Minimum Extreme Value Distribution......Page 499
A.2.8 Dirichlet Distribution......Page 500
A.2.9 Beta Distribution......Page 501
Derivatives......Page 502
Univariate Version......Page 503
Taylor Approximation and the Asymptotic Covariance of ML Estimates......Page 504
B.3 Conditional Expectation, Distribution......Page 505
B.4 EM Algorithm......Page 506
C.1 Simplification of Penalties......Page 508
Simplifying the Penalty by Reparameterization......Page 509
C.2 Linear Constraints......Page 510
C.3 Fisher Scoring with Penalty Term......Page 511
D.1 Kullback-Leibler Distance......Page 512
D.1.2 Kullback-Leibler and Discrete Distributions......Page 513
D.1.3 Kullback-Leibler in Generalized Linear Models......Page 514
D.1.4 Decomposition......Page 515
E.1 Laplace Approximation......Page 516
E.2 Gauss-Hermite Integration......Page 517
E.2.1 Multivariate Gauss-Hermite Integration......Page 518
E.3 Inversion of Pseudo-Fisher Matrix......Page 519
List of Examples......Page 521
Bibliography......Page 525
Author Index......Page 557
Subject Index......Page 566