Praise for the Second Edition
"A must-have book for anyone expecting to do research and/or applications in categorical data analysis."
—Statistics in Medicine
"It is a total delight reading this book."
—Pharmaceutical Research
"If you do any analysis of categorical data, this is an essential desktop reference."
—Technometrics
The use of statistical methods for analyzing categorical data has increased dramatically, particularly in the biomedical and social sciences and the financial industry. Responding to new developments, this book offers a comprehensive treatment of the most important methods for categorical data analysis.
Categorical Data Analysis, Third Edition summarizes the latest methods for univariate and correlated multivariate categorical responses. Readers will find a unified generalized linear models approach that connects logistic regression and Poisson and negative binomial loglinear models for discrete data with normal regression for continuous data. This edition also features:
- An emphasis on logistic and probit regression methods for binary, ordinal, and nominal responses for independent observations and for clustered data with marginal models and random effects models
- Two new chapters on alternative methods for binary response data, including smoothing and regularization methods, classification methods such as linear discriminant analysis and classification trees, and cluster analysis
- New sections in many chapters introducing the Bayesian approach for the methods of that chapter
- More than 100 analyses of data sets and over 600 exercises
- Notes at the end of each chapter that provide references to recent research and topics not covered in the text, linked to a bibliography of more than 1,200 sources
- A supplementary website showing how to use R and SAS for all examples in the text, with information also about SPSS and Stata and with exercise solutions
Categorical Data Analysis, Third Edition is an invaluable tool for statisticians and methodologists, such as biostatisticians and researchers in the social and behavioral sciences, medicine and public health, marketing, education, finance, biological and agricultural sciences, and industrial quality control.
Author(s): Alan Agresti
Edition: 3
Publisher: Wiley
Year: 2012
Language: English
Pages: 744
Tags: Data Mining;Databases & Big Data;Computers & Technology;Probability & Statistics;Applied;Mathematics;Science & Math;Database Storage & Design;Computer Science;New, Used & Rental Textbooks;Specialty Boutique;Statistics;Science & Mathematics
Contents ... 9
Preface ... 15
CHAPTER 1 Introduction: Distributions and Inference for Categorical Data ... 19
1.1 CATEGORICAL RESPONSE DATA ... 19
1.1.1 Response-Explanatory Variable Distinction ... 20
1.1.2 Binary-Nominal-Ordinal Scale Distinction ... 20
1.1.3 Discrete-Continuous Variable Distinction ... 21
1.1.4 Quantitative-Qualitative Variable Distinction ... 21
1.1.5 Organization of Book and Online Computing Appendix ... 22
1.2 DISTRIBUTIONS FOR CATEGORICAL DATA ... 23
1.2.1 Binomial Distribution ... 23
1.2.2 Multinomial Distribution ... 24
1.2.3 Poisson Distribution ... 24
1.2.4 Overdispersion ... 25
1.2.5 Connection Between Poisson and Multinomial Distributions ... 25
1.2.6 The Chi-Squared Distribution ... 26
1.3 STATISTICAL INFERENCE FOR CATEGORICAL DATA ... 26
1.3.1 Likelihood Functions and Maximum Likelihood Estimation ... 27
1.3.2 Likelihood Function and ML Estimate for Binomial Parameter ... 27
1.3.3 Wald-Likelihood Ratio-Score Test Triad ... 28
1.3.4 Constructing Confidence Intervals by Inverting Tests ... 30
1.4 STATISTICAL INFERENCE FOR BINOMIAL PARAMETERS ... 31
1.4.1 Tests About a Binomial Parameter ... 31
1.4.2 Confidence Intervals for a Binomial Parameter ... 32
1.4.3 Example: Estimating the Proportion of Vegetarians ... 33
1.4.4 Exact Small-Sample Inference and the Mid P-Value ... 34
1.5 STATISTICAL INFERENCE FOR MULTINOMIAL PARAMETERS ... 35
1.5.1 Estimation of Multinomial Parameters ... 35
1.5.2 Pearson Chi-Squared Test of a Specified Multinomial ... 36
1.5.3 Likelihood-Ratio Chi-Squared Test of a Specified Multinomial ... 36
1.5.4 Example: Testing Mendel's Theories ... 37
1.5.5 Testing with Estimated Expected Frequencies ... 38
1.5.6 Example: Pneumonia Infections in Calves ... 38
1.5.7 Chi-Squared Theoretical Justification ... 40
1.6 BAYESIAN INFERENCE FOR BINOMIAL AND MULTINOMIAL PARAMETERS ... 40
1.6.1 The Bayesian Approach to Statistical Inference ... 40
1.6.2 Binomial Estimation: Beta and Logit-Normal Prior Distributions ... 42
1.6.3 Multinomial Estimation: Dirichlet Prior Distributions ... 43
1.6.4 Example: Estimating Vegetarianism Revisited ... 44
1.6.5 Binomial and Multinomial Estimation: Improper Priors ... 44
NOTES ... 45
EXERCISES ... 46
CHAPTER 2 Describing Contingency Tables ... 55
2.1 PROBABILITY STRUCTURE FOR CONTINGENCY TABLES ... 55
2.1.1 Contingency Tables ... 55
2.1.2 Joint/Marginal/Conditional Distributions for Contingency Tables ... 56
2.1.3 Example: Sensitivity and Specificity for Medical Diagnoses ... 57
2.1.4 Independence of Categorical Variables ... 58
2.1.5 Poisson, Binomial, and Multinomial Sampling ... 58
2.1.6 Example: Seat Belts and Auto Accident Injuries ... 59
2.1.7 Example: Case-Control Study of Cancer and Smoking ... 60
2.1.8 Types of Studies: Observational Versus Experimental ... 61
2.2 COMPARING TWO PROPORTIONS ... 61
2.2.1 Difference of Proportions ... 62
2.2.2 Relative Risk ... 62
2.2.3 Odds Ratio ... 62
2.2.4 Properties of the Odds Ratio ... 63
2.2.5 Example: Association Between Heart Attacks and Aspirin Use ... 64
2.2.6 Case-Control Studies and the Odds Ratio ... 64
2.2.7 Relationship Between Odds Ratio and Relative Risk ... 65
2.3 CONDITIONAL ASSOCIATION IN STRATIFIED 2x2 TABLES ... 65
2.3.1 Partial Tables ... 66
2.3.2 Example: Racial Characteristics and the Death Penalty ... 66
2.3.3 Conditional and Marginal Odds Ratios ... 68
2.3.4 Marginal Independence Versus Conditional Independence ... 69
2.3.5 Homogeneous Association ... 71
2.3.6 Collapsibility: Identical Conditional and Marginal Associations ... 71
2.4 MEASURING ASSOCIATION IN IxJ TABLES ... 72
2.4.1 Odds Ratios in IxJ Tables ... 72
2.4.2 Association Factors ... 73
2.4.3 Summary Measures of Association ... 74
2.4.4 Ordinal Trends: Concordant and Discordant Pairs ... 74
2.4.5 Ordinal Measure of Association: Gamma ... 75
2.4.6 Probabilistic Comparisons of Two Ordinal Distributions ... 76
2.4.7 Example: Comparing Pain Ratings After Surgery ... 77
2.4.8 Correlation for Underlying Normality ... 77
NOTES ... 78
EXERCISES ... 78
CHAPTER 3 Inference for Two-Way Contingency Tables ... 87
3.1 CONFIDENCE INTERVALS FOR ASSOCIATION PARAMETERS ... 87
3.1.1 Interval Estimation of the Odds Ratio ... 87
3.1.2 Example: Seat-Belt Use and Traffic Deaths ... 88
3.1.3 Interval Estimation of Difference of Proportions and Relative Risk ... 89
3.1.4 Example: Aspirin and Heart Attacks Revisited ... 89
3.1.5 Deriving Standard Errors with the Delta Method ... 90
3.1.6 Delta Method Applied to the Sample Logit ... 91
3.1.7 Delta Method for the Log Odds Ratio ... 91
3.1.8 Simultaneous Confidence Intervals for Multiple Comparisons ... 93
3.2 TESTING INDEPENDENCE IN TWO-WAY CONTINGENCY TABLES ... 93
3.2.1 Pearson and Likelihood-Ratio Chi-Squared Tests ... 93
3.2.2 Example: Education and Belief in God ... 95
3.2.3 Adequacy of Chi-Squared Approximations ... 95
3.2.4 Chi-Squared and Comparing Proportions in 2x2 Tables ... 96
3.2.5 Score Confidence Intervals Comparing Proportions ... 96
3.2.6 Profile Likelihood Confidence Intervals ... 97
3.3 FOLLOWING-UP CHI-SQUARED TESTS ... 98
3.3.1 Pearson Residuals and Standardized Residuals ... 98
3.3.2 Example: Education and Belief in God Revisited ... 99
3.3.3 Partitioning Chi-Squared ... 99
3.3.4 Example: Origin of Schizophrenia ... 101
3.3.5 Rules for Partitioning ... 102
3.3.6 Summarizing the Association ... 102
3.3.7 Limitations of Chi-Squared Tests ... 102
3.3.8 Why Consider Independence If It's Unlikely to Be True? ... 103
3.4 TWO-WAY TABLES WITH ORDERED CLASSIFICATIONS ... 104
3.4.1 Linear Trend Alternative to Independence ... 104
3.4.2 Example: Is Happiness Associated with Political Ideology? ... 105
3.4.3 Monotone Trend Alternatives to Independence ... 105
3.4.4 Extra Power with Ordinal Tests ... 106
3.4.5 Sensitivity to Choice of Scores ... 106
3.4.6 Example: Infant Birth Defects by Maternal Alcohol Consumption ... 107
3.4.7 Trend Tests for Ix2 and 2xJ Tables ... 108
3.4.8 Nominal-Ordinal Tables ... 108
3.5 SMALL-SAMPLE INFERENCE FOR CONTINGENCY TABLES ... 108
3.5.1 Fisher's Exact Test for 2x2 Tables ... 108
3.5.2 Example: Fisher's Tea Drinker ... 109
3.5.3 Two-Sided P-Values for Fisher's Exact Test ... 110
3.5.4 Confidence Intervals Based on Conditional Likelihood ... 110
3.5.5 Discreteness and Conservatism Issues ... 111
3.5.6 Small-Sample Unconditional Tests of Independence ... 111
3.5.7 Conditional Versus Unconditional Tests ... 112
3.6 BAYESIAN INFERENCE FOR TWO-WAY CONTINGENCY TABLES ... 114
3.6.1 Prior Distributions for Comparing Proportions in 2x2 Tables ... 114
3.6.2 Posterior Probabilities Comparing Proportions ... 115
3.6.3 Posterior Intervals for Association Parameters ... 115
3.6.4 Example: Urn Sampling Gives Highly Unbalanced Treatment Allocation ... 116
3.6.5 Highest Posterior Density Intervals ... 116
3.6.6 Testing Independence ... 117
3.6.7 Empirical Bayes and Hierarchical Bayesian Approaches ... 118
3.7 EXTENSIONS FOR MULTIWAY TABLES AND NONTABULATED RESPONSES ... 118
3.7.1 Categorical Data Need Not Be Contingency Tables ... 118
NOTES ... 119
EXERCISES ... 121
CHAPTER 4 Introduction to Generalized Linear Models ... 131
4.1 THE GENERALIZED LINEAR MODEL ... 131
4.1.1 Components of Generalized Linear Models ... 132
4.1.2 Binomial Logit Models for Binary Data ... 132
4.1.3 Poisson Loglinear Models for Count Data ... 133
4.1.4 Generalized Linear Models for Continuous Responses ... 133
4.1.5 Deviance of a GLM ... 133
4.1.6 Advantages of GLMs Versus Transforming the Data ... 134
4.2 GENERALIZED LINEAR MODELS FOR BINARY DATA ... 135
4.2.1 Linear Probability Model ... 135
4.2.2 Example: Snoring and Heart Disease ... 136
4.2.3 Logistic Regression Model ... 137
4.2.4 Binomial GLM for 2x2 Contingency Tables ... 138
4.2.5 Probit and Inverse cdf Link Functions ... 139
4.2.6 Latent Tolerance Motivation for Binary Response Models ... 140
4.3 GENERALIZED LINEAR MODELS FOR COUNTS AND RATES ... 140
4.3.1 Poisson Loglinear Models ... 141
4.3.2 Example: Horseshoe Crab Mating ... 141
4.3.3 Overdispersion for Poisson GLMs ... 144
4.3.4 Negative Binomial GLMs ... 145
4.3.5 Poisson Regression for Rates Using Offsets ... 146
4.3.6 Example: Modeling Death Rates for Heart Valve Operations ... 146
4.3.7 Poisson GLM of Independence in Two-Way Contingency Tables ... 148
4.4 MOMENTS AND LIKELIHOOD FOR GENERALIZED LINEAR MODELS ... 148
4.4.1 The Exponential Dispersion Family ... 148
4.4.2 Mean and Variance Functions for the Random Component ... 149
4.4.3 Mean and Variance Functions for Poisson and Binomial GLMs ... 150
4.4.4 Systematic Component and Link Function of a GLM ... 150
4.4.5 Likelihood Equations for a GLM ... 151
4.4.6 The Key Role of the Mean-Variance Relationship ... 152
4.4.7 Likelihood Equations for Binomial GLMs ... 152
4.4.8 Asymptotic Covariance Matrix of Model Parameter Estimators ... 153
4.4.9 Likelihood Equations and cov(β̂) for Poisson Loglinear Model ... 154
4.5 INFERENCE AND MODEL CHECKING FOR GENERALIZED LINEAR MODELS ... 154
4.5.1 Deviance and Goodness of Fit ... 154
4.5.2 Deviance for Poisson GLMs ... 155
4.5.3 Deviance for Binomial GLMs: Grouped Versus Ungrouped Data ... 155
4.5.4 Likelihood-Ratio Model Comparison Using the Deviances ... 156
4.5.5 Score Tests for Goodness of Fit and for Model Comparison ... 157
4.5.6 Residuals for GLMs ... 158
4.5.7 Covariance Matrices for Fitted Values and Residuals ... 160
4.5.8 The Bayesian Approach for GLMs ... 160
4.6 FITTING GENERALIZED LINEAR MODELS ... 161
4.6.1 Newton-Raphson Method ... 161
4.6.2 Fisher Scoring Method ... 162
4.6.3 Newton-Raphson and Fisher Scoring for Binary Data ... 163
4.6.4 ML as Iterative Reweighted Least Squares ... 164
4.6.5 Simplifications for Canonical Link Functions ... 165
4.7 QUASI-LIKELIHOOD AND GENERALIZED LINEAR MODELS ... 167
4.7.1 Mean-Variance Relationship Determines Quasi-likelihood Estimates ... 167
4.7.2 Overdispersion for Poisson GLMs and Quasi-likelihood ... 167
4.7.3 Overdispersion for Binomial GLMs and Quasi-likelihood ... 168
4.7.4 Example: Teratology Overdispersion ... 169
NOTES ... 170
EXERCISES ... 171
CHAPTER 5 Logistic Regression ... 181
5.1 INTERPRETING PARAMETERS IN LOGISTIC REGRESSION ... 181
5.1.1 Interpreting β: Odds, Probabilities, and Linear Approximations ... 182
5.1.2 Looking at the Data ... 183
5.1.3 Example: Horseshoe Crab Mating Revisited ... 184
5.1.4 Logistic Regression with Retrospective Studies ... 186
5.1.5 Logistic Regression Is Implied by Normal Explanatory Variables ... 187
5.2 INFERENCE FOR LOGISTIC REGRESSION ... 187
5.2.1 Inference About Model Parameters and Probabilities ... 187
5.2.2 Example: Inference for Horseshoe Crab Mating Data ... 188
5.2.3 Checking Goodness of Fit: Grouped and Ungrouped Data ... 189
5.2.4 Example: Model Goodness of Fit for Horseshoe Crab Data ... 190
5.2.5 Checking Goodness of Fit with Ungrouped Data by Grouping ... 190
5.2.6 Wald Inference Can Be Suboptimal ... 192
5.3 LOGISTIC MODELS WITH CATEGORICAL PREDICTORS ... 193
5.3.1 ANOVA-Type Representation of Factors ... 193
5.3.2 Indicator Variables Represent a Factor ... 193
5.3.3 Example: Alcohol and Infant Malformation Revisited ... 194
5.3.4 Linear Logit Model for Ix2 Contingency Tables ... 195
5.3.5 Cochran-Armitage Trend Test ... 195
5.3.6 Example: Alcohol and Infant Malformation Revisited ... 197
5.3.7 Using Directed Models Can Improve Inferential Power ... 197
5.3.8 Noncentral Chi-Squared Distribution and Power for Narrower Alternatives ... 198
5.3.9 Example: Skin Damage and Leprosy ... 199
5.3.10 Model Smoothing Improves Precision of Estimation ... 200
5.4 MULTIPLE LOGISTIC REGRESSION ... 200
5.4.1 Logistic Models for Multiway Contingency Tables ... 201
5.4.2 Example: AIDS and AZT Use ... 202
5.4.3 Goodness of Fit as a Likelihood-Ratio Test ... 204
5.4.4 Model Comparison by Comparing Deviances ... 205
5.4.5 Example: Horseshoe Crab Satellites Revisited ... 205
5.4.6 Quantitative Treatment of Ordinal Predictor ... 207
5.4.7 Probability-Based and Standardized Interpretations ... 208
5.4.8 Estimating an Average Causal Effect ... 209
5.5 FITTING LOGISTIC REGRESSION MODELS ... 210
5.5.1 Likelihood Equations for Logistic Regression ... 210
5.5.2 Asymptotic Covariance Matrix of Parameter Estimators ... 211
5.5.3 Distribution of Probability Estimators ... 212
5.5.4 Newton-Raphson Method Applied to Logistic Regression ... 212
NOTES ... 213
EXERCISES ... 214
CHAPTER 6 Building, Checking, and Applying Logistic Regression Models ... 225
6.1 STRATEGIES IN MODEL SELECTION ... 225
6.1.1 How Many Explanatory Variables Can Be in the Model? ... 226
6.1.2 Example: Horseshoe Crab Mating Data Revisited ... 226
6.1.3 Stepwise Procedures: Forward Selection and Backward Elimination ... 227
6.1.4 Example: Backward Elimination for Horseshoe Crab Data ... 228
6.1.5 Model Selection and the "Correct" Model ... 229
6.1.6 AIC: Minimizing Distance of the Fit from the Truth ... 230
6.1.7 Example: Using Causal Hypotheses to Guide Model Building ... 231
6.1.8 Alternative Strategies, Including Model Averaging ... 233
6.2 LOGISTIC REGRESSION DIAGNOSTICS ... 233
6.2.1 Residuals: Pearson, Deviance, and Standardized ... 233
6.2.2 Example: Heart Disease and Blood Pressure ... 234
6.2.3 Example: Admissions to Graduate School at Florida ... 236
6.2.4 Influence Diagnostics for Logistic Regression ... 238
6.3 SUMMARIZING THE PREDICTIVE POWER OF A MODEL ... 239
6.3.1 Summarizing Predictive Power: R and R-Squared Measures ... 239
6.3.2 Summarizing Predictive Power: Likelihood and Deviance Measures ... 240
6.3.3 Summarizing Predictive Power: Classification Tables ... 241
6.3.4 Summarizing Predictive Power: ROC Curves ... 242
6.3.5 Example: Evaluating Predictive Power for Horseshoe Crab Data ... 242
6.4 MANTEL-HAENSZEL AND RELATED METHODS FOR MULTIPLE 2x2 TABLES ... 243
6.4.1 Using Logistic Models to Test Conditional Independence ... 244
6.4.2 Cochran-Mantel-Haenszel Test of Conditional Independence ... 245
6.4.3 Example: Multicenter Clinical Trial Revisited ... 246
6.4.4 CMH Test Is Advantageous for Sparse Data ... 246
6.4.5 Estimation of Common Odds Ratio ... 247
6.4.6 Meta-analyses for Summarizing Multiple 2x2 Tables ... 248
6.4.7 Meta-analyses for Multiple 2x2 Tables: Difference of Proportions ... 249
6.4.8 Collapsibility and Logistic Models for Contingency Tables ... 250
6.4.9 Testing Homogeneity of Odds Ratios ... 250
6.4.10 Summarizing Heterogeneity in Odds Ratios ... 251
6.4.11 Propensity Scores in Observational Studies ... 251
6.5 DETECTING AND DEALING WITH INFINITE ESTIMATES ... 251
6.5.1 Complete or Quasi-complete Separation ... 252
6.5.2 Example: Multicenter Clinical Trial with Few Successes ... 253
6.5.3 Remedies When at Least One ML Estimate Is Infinite ... 254
6.6 SAMPLE SIZE AND POWER CONSIDERATIONS ... 255
6.6.1 Sample Size and Power for Comparing Two Proportions ... 255
6.6.2 Sample Size Determination in Logistic Regression ... 256
6.6.3 Sample Size in Multiple Logistic Regression ... 257
6.6.4 Power for Chi-Squared Tests in Contingency Tables ... 257
6.6.5 Power for Testing Conditional Independence ... 258
6.6.6 Effects of Sample Size on Model Selection and Inference ... 259
NOTES ... 259
EXERCISES ... 261
CHAPTER 7 Alternative Modeling of Binary Response Data ... 269
7.1 PROBIT AND COMPLEMENTARY LOG-LOG MODELS ... 269
7.1.1 Probit Models: Three Latent Variable Motivations ... 270
7.1.2 Probit Models: Interpreting Effects ... 270
7.1.3 Probit Model Fitting ... 271
7.1.4 Example: Modeling Flour Beetle Mortality ... 272
7.1.5 Complementary Log-Log Link Models ... 273
7.1.6 Example: Beetle Mortality Revisited ... 275
7.2 BAYESIAN INFERENCE FOR BINARY REGRESSION ... 275
7.2.1 Prior Specifications for Binary Regression Models ... 275
7.2.2 Example: Risk Factors for Endometrial Cancer Grade ... 276
7.2.3 Bayesian Logistic Regression for Retrospective Studies ... 278
7.2.4 Probability-Based Prior Specifications for Binary Regression Models ... 278
7.2.5 Example: Modeling the Probability a Trauma Patient Survives ... 279
7.2.6 Bayesian Fitting for Probit Models ... 281
7.2.7 Bayesian Model Checking for Binary Regression ... 283
7.3 CONDITIONAL LOGISTIC REGRESSION ... 283
7.3.1 Conditional Likelihood ... 283
7.3.2 Small-Sample Inference for a Logistic Regression Parameter ... 285
7.3.3 Small-Sample Conditional Inference for 2x2 Contingency Tables ... 285
7.3.4 Small-Sample Conditional Inference for Linear Logit Model ... 286
7.3.5 Small-Sample Tests of Conditional Independence in 2x2xK Tables ... 287
7.3.6 Example: Promotion Discrimination ... 287
7.3.7 Discreteness Complications of Using Exact Conditional Inference ... 288
7.4 SMOOTHING: KERNELS, PENALIZED LIKELIHOOD, GENERALIZED ADDITIVE MODELS ... 288
7.4.1 How Much Smoothing? The Variance-Bias Trade-off ... 288
7.4.2 Kernel Smoothing ... 289
7.4.3 Example: Smoothing to Portray Probability of Kyphosis ... 290
7.4.4 Nearest Neighbors Smoothing ... 290
7.4.5 Smoothing Using Penalized Likelihood Estimation ... 291
7.4.6 Why Shrink Estimates Toward 0? ... 293
7.4.7 Firth's Penalized Likelihood for Logistic Regression ... 293
7.4.8 Example: Complete Separation but Finite Logistic Estimates ... 293
7.4.9 Generalized Additive Models ... 294
7.4.10 Example: GAMs for Horseshoe Crab Mating Data ... 295
7.4.11 Advantages-Disadvantages of Various Smoothing Methods ... 295
7.5 ISSUES IN ANALYZING HIGH-DIMENSIONAL CATEGORICAL DATA ... 296
7.5.1 Issues in Selecting Explanatory Variables ... 296
7.5.2 Adjusting for Multiplicity: The Bonferroni Method ... 297
7.5.3 Adjusting for Multiplicity: The False Discovery Rate ... 298
7.5.4 Other Variable Selection Methods with High-Dimensional Data ... 299
7.5.5 Examples: High-Dimensional Applications in Genomics ... 300
7.5.6 Example: Motif Discovery for Protein Sequences ... 301
7.5.7 Example: The Netflix Prize ... 302
7.5.8 Example: Credit Scoring ... 303
NOTES ... 303
EXERCISES ... 305
CHAPTER 8 Models for Multinomial Responses ... 311
8.1 NOMINAL RESPONSES: BASELINE-CATEGORY LOGIT MODELS ... 311
8.1.1 Baseline-Category Logits ... 311
8.1.2 Example: Alligator Food Choice ... 312
8.1.3 Estimating Response Probabilities ... 314
8.1.4 Fitting Baseline-Category Logistic Models ... 315
8.1.5 Multicategory Logit Model as a Multivariate GLM ... 317
8.1.6 Multinomial Probit Models ... 317
8.1.7 Example: Effect of Menu Pricing ... 318
8.2 ORDINAL RESPONSES: CUMULATIVE LOGIT MODELS ... 319
8.2.1 Cumulative Logits ... 319
8.2.2 Proportional Odds Form of Cumulative Logit Model ... 319
8.2.3 Latent Variable Motivation for Proportional Odds Structure ... 321
8.2.4 Example: Happiness and Traumatic Events ... 322
8.2.5 Checking the Proportional Odds Assumption ... 324
8.3 ORDINAL RESPONSES: ALTERNATIVE MODELS ... 326
8.3.1 Cumulative Link Models ... 326
8.3.2 Cumulative Probit and Log-Log Models ... 326
8.3.3 Example: Happiness Revisited with Cumulative Probits ... 327
8.3.4 Adjacent-Categories Logit Models ... 327
8.3.5 Example: Happiness Revisited ... 328
8.3.6 Continuation-Ratio Logit Models ... 329
8.3.7 Example: Developmental Toxicity Study with Pregnant Mice ... 330
8.3.8 Stochastic Ordering Location Effects Versus Dispersion Effects ... 331
8.3.9 Summarizing Predictive Power of Explanatory Variables ... 332
8.4 TESTING CONDITIONAL INDEPENDENCE IN IxJxK TABLES ... 332
8.4.1 Testing Conditional Independence Using Multinomial Models ... 332
8.4.2 Example: Homosexual Marriage and Religious Fundamentalism ... 334
8.4.3 Generalized Cochran-Mantel-Haenszel Tests for IxJxK Tables ... 335
8.4.4 Example: Homosexual Marriage Revisited ... 337
8.4.5 Related Score Tests for Multinomial Logit Models ... 337
8.5 DISCRETE-CHOICE MODELS ... 338
8.5.1 Conditional Logits for Characteristics of the Choices ... 338
8.5.2 Multinomial Logit Model Expressed as Discrete-Choice Model ... 339
8.5.3 Example: Shopping Destination Choice ... 339
8.5.4 Multinomial Probit Discrete-Choice Models ... 339
8.5.5 Extensions: Nested Logit and Mixed Logit Models ... 340
8.5.6 Extensions: Discrete Choice with Ordered Categories ... 340
8.6 BAYESIAN MODELING OF MULTINOMIAL RESPONSES ... 341
8.6.1 Bayesian Fitting of Cumulative Link Models ... 341
8.6.2 Example: Cannabis Use and Mother's Age ... 342
8.6.3 Bayesian Fitting of Multinomial Logit and Probit Models ... 343
8.6.4 Example: Alligator Food Choice Revisited ... 344
NOTES ... 344
EXERCISES ... 347
CHAPTER 9 Loglinear Models for Contingency Tables ... 357
9.1 LOGLINEAR MODELS FOR TWO-WAY TABLES ... 357
9.1.1 Independence Model for a Two-Way Table ... 357
9.1.2 Interpretation of Loglinear Model Parameters ... 358
9.1.3 Saturated Model for a Two-Way Table ... 358
9.1.4 Alternative Parameter Constraints ... 359
9.1.5 Hierarchical Versus Nonhierarchical Models ... 359
9.1.6 Multinomial Models for Cell Probabilities ... 360
9.2 LOGLINEAR MODELS FOR INDEPENDENCE AND INTERACTION IN THREE-WAY TABLES ... 360
9.2.1 Types of Independence ... 360
9.2.2 Homogeneous Association and Three-Factor Interaction ... 362
9.2.3 Interpretation of Loglinear Model Parameters ... 363
9.2.4 Example: Alcohol, Cigarette, and Marijuana Use ... 364
9.3 INFERENCE FOR LOGLINEAR MODELS ... 366
9.3.1 Chi-Squared Goodness-of-Fit Tests ... 366
9.3.2 Inference about Conditional Associations ... 366
9.4 LOGLINEAR MODELS FOR HIGHER DIMENSIONS ... 368
9.4.1 Models for Four-Way Contingency Tables ... 368
9.4.2 Example: Automobile Accidents and Seat-Belt Use ... 368
9.4.3 Large Samples and Statistical Versus Practical Significance ... 370
9.4.4 Dissimilarity Index ... 370
9.5 LOGLINEAR-LOGISTIC MODEL CONNECTION ... 371
9.5.1 Using Logistic Models to Interpret Loglinear Models ... 371
9.5.2 Example: Auto Accidents and Seat-Belts Revisited ... 372
9.5.3 Equivalent Loglinear and Logistic Models ... 372
9.5.4 Example: Detecting Gene-Environment Interactions in Case-Control Studies ... 373
9.6 LOGLINEAR MODEL FITTING: LIKELIHOOD EQUATIONS AND ASYMPTOTIC DISTRIBUTIONS ... 374
9.6.1 Minimal Sufficient Statistics ... 374
9.6.2 Likelihood Equations for Loglinear Models ... 375
9.6.3 Unique ML Estimates Match Data in Sufficient Marginal Tables ... 376
9.6.4 Direct Versus Iterative Calculation of Fitted Values ... 376
9.6.5 Decomposable Models ... 377
9.6.6 Chi-Squared Goodness-of-Fit Tests ... 377
9.6.7 Covariance Matrix of ML Parameter Estimators ... 378
9.6.8 Connection Between Multinomial and Poisson Loglinear Models ... 379
9.6.9 Distribution of Probability Estimators ... 380
9.6.10 Proof of Uniqueness of ML Estimates ... 381
9.6.11 Pseudo ML for Complex Sampling Designs ... 381
9.7 LOGLINEAR MODEL FITTING: ITERATIVE METHODS AND THEIR APPLICATION ... 382
9.7.1 Newton-Raphson Method ... 382
9.7.2 Iterative Proportional Fitting ... 383
9.7.3 Comparison of IPF and Newton-Raphson Iterative Methods ... 384
9.7.4 Raking a Table: Contingency Table Standardization ... 385
NOTES ... 386
EXERCISES ... 387
CHAPTER 10 Building and Extending Loglinear Models ... 395
10.1 CONDITIONAL INDEPENDENCE GRAPHS AND COLLAPSIBILITY ... 395
10.1.1 Conditional Independence Graphs ... 395
10.1.2 Graphical Loglinear Models ... 396
10.1.3 Collapsibility in Three-Way Contingency Tables ... 397
10.1.4 Collapsibility for Multiway Tables ... 398
10.2 MODEL SELECTION AND COMPARISON ... 398
10.2.1 Considerations in Model Selection ... 398
10.2.2 Example: Model Building for Student Survey ... 399
10.2.3 Loglinear Model Comparison Statistics ... 401
10.2.4 Partitioning Chi-Squared with Model Comparisons ... 402
10.2.5 Identical Marginal and Conditional Tests of Independence ... 402
10.3 RESIDUALS FOR DETECTING CELL-SPECIFIC LACK OF FIT ... 403
10.3.1 Residuals for Loglinear Models ... 403
10.3.2 Example: Student Survey Revisited ... 403
10.3.3 Identical Loglinear and Logistic Standardized Residuals ... 404
10.4 MODELING ORDINAL ASSOCIATIONS ... 404
10.4.1 Linear-by-Linear Association Model for Two-Way Tables ... 405
10.4.2 Corresponding Logistic Model for Adjacent Responses ... 406
10.4.3 Likelihood Equations and Model Fitting ... 407
10.4.4 Example: Sex and Birth Control Opinions Revisited ... 407
10.4.5 Directed Ordinal Test of Independence ... 409
10.4.6 Row Effects and Column Effects Association Models ... 409
10.4.7 Example: Estimating Category Scores for Premarital Sex ... 410
10.4.8 Ordinal Variables in Models for Multiway Tables ... 410
10.5 GENERALIZED LOGLINEAR AND ASSOCIATION MODELS, CORRELATION MODELS, AND CORRESPONDENCE ANALYSIS ... 411
10.5.1 Generalized Loglinear Model ... 411
10.5.2 Multiplicative Row and Column Effects Model ... 412
10.5.3 Example: Mental Health and Parents' SES ... 413
10.5.4 Correlation Models ... 413
10.5.5 Correspondence Analysis ... 414
10.5.6 Model Selection and Score Choice for Ordinal Variables ... 416
10.6 EMPTY CELLS AND SPARSENESS IN MODELING CONTINGENCY TABLES ... 416
10.6.1 Empty Cells: Sampling Versus Structural Zeros ... 416
10.6.2 Existence of Estimates in Loglinear Models ... 416
10.6.3 Effects of Sparseness on X², G², and Model-Based Tests ... 418
10.6.4 Alternative Sparse Data Asymptotics ... 419
10.6.5 Adding Constants to Cells of a Contingency Table ... 419
10.7 BAYESIAN LOGLINEAR MODELING ... 419
10.7.1 Estimating Loglinear Model Parameters in Two-Way Tables ... 420
10.7.2 Example: Polarized Opinions by Political Party ... 420
10.7.3 Bayesian Loglinear Modeling of Multidimensional Tables ... 421
10.7.4 Graphical Conditional Independence Models ... 422
NOTES ... 422
EXERCISES ... 425
CHAPTER 11 Models for Matched Pairs ... 431
11.1 COMPARING DEPENDENT PROPORTIONS ... 432
11.1.2 McNemar Test Comparing Dependent Proportions ... 433
11.1.3 Example: Changes in Presidential Election Voting ... 433
11.1.4 Increased Precision with Dependent Samples ... 434
11.1.5 Small-Sample Test Comparing Dependent Proportions ... 434
11.1.6 Connection Between McNemar and Cochran-Mantel-Haenszel Tests ... 435
11.1.7 Subject-Specific and Population-Averaged (Marginal) Tables ... 436
11.2 CONDITIONAL LOGISTIC REGRESSION FOR BINARY MATCHED PAIRS ... 436
11.2.1 Subject-Specific Versus Marginal Models for Matched Pairs ... 436
11.2.2 Logistic Models with Subject-Specific Probabilities ... 437
11.2.3 Conditional ML Inference for Binary Matched Pairs ... 438
11.2.4 Random Effects in Binary Matched-Pairs Model ... 439
11.2.5 Conditional Logistic Regression for Matched Case-Control Studies ... 439
11.2.6 Conditional Logistic Regression for Matched Pairs with Multiple Predictors ... 440
11.2.7 Marginal Models and Subject-Specific Models: Extensions ... 441
11.3 MARGINAL MODELS FOR SQUARE CONTINGENCY TABLES ... 442
11.3.1 Marginal Models for Nominal Classifications ... 442
11.3.2 Example: Regional Migration ... 443
11.3.3 Marginal Models for Ordinal Classifications ... 443
11.3.4 Example: Opinions on Premarital and Extramarital Sex ... 444
11.4 SYMMETRY, QUASI-SYMMETRY, AND QUASI-INDEPENDENCE ... 444
11.4.1 Symmetry as Logistic and Loglinear Models ... 445
11.4.2 Quasi-symmetry ... 445
11.4.3 Marginal Homogeneity and Quasi-symmetry ... 447
11.4.4 Quasi-independence ... 447
11.4.5 Example: Migration Revisited ... 448
11.4.6 Ordinal Quasi-symmetry ... 449
11.4.7 Example: Premarital and Extramarital Sex Revisited ... 450
11.5 MEASURING AGREEMENT BETWEEN OBSERVERS ... 450
11.5.1 Agreement: Departures from Independence ... 451
11.5.2 Using Quasi-independence to Analyze Agreement ... 451
11.5.3 Quasi-symmetry and Agreement Modeling ... 452
11.5.4 Kappa: A Summary Measure of Agreement ... 452
11.5.5 Weighted Kappa: Quantifying Disagreement ... 453
11.5.6 Extensions to Multiple Observers ... 453
11.6 BRADLEY-TERRY MODEL FOR PAIRED PREFERENCES ... 454
11.6.1 Bradley-Terry Model ... 454
11.6.2 Example: Major League Baseball Rankings ... 454
11.6.3 Example: Home Team Advantage in Baseball ... 455
11.6.4 Bradley-Terry Model and Quasi-symmetry ... 456
11.6.5 Extensions to Ties and Ordinal Pairwise Evaluations ... 457
11.7 MARGINAL MODELS AND QUASI-SYMMETRY MODELS FOR MATCHED SETS ... 457
11.7.1 Marginal Homogeneity, Complete Symmetry, and Quasi-symmetry ... 457
11.7.2 Types of Marginal Symmetry ... 458
11.7.3 Comparing Binary Marginal Distributions in Multiway Tables ... 458
11.7.4 Example: Attitudes Toward Legalized Abortion ... 459
11.7.5 Marginal Homogeneity for a Multicategory Response ... 460
11.7.6 Wald and Generalized CMH Score Tests of Marginal Homogeneity ... 460
NOTES ... 461
EXERCISES ... 463
CHAPTER 12 Clustered Categorical Data: Marginal and Transitional Models ... 473
12.1 MARGINAL MODELING: MAXIMUM LIKELIHOOD APPROACH ... 474
12.1.1 Example: Longitudinal Study of Mental Depression ... 474
12.1.2 Modeling a Repeated Multinomial Response ... 476
12.1.3 Example: Insomnia Clinical Trial ... 476
12.1.4 ML Fitting of Marginal Logistic Models: Constraints on Cell Probabilities ... 477
12.1.5 ML Fitting of Marginal Logistic Models: Other Methods ... 479
12.2 MARGINAL MODELING: GENERALIZED ESTIMATING EQUATIONS (GEEs) APPROACH ... 480
12.2.1 Generalized Estimating Equations Methodology: Basic Ideas ... 480
12.2.2 Example: Longitudinal Mental Depression Revisited ... 481
12.2.3 Example: Multinomial GEE Approach for Insomnia Trial ... 482
12.3 QUASI-LIKELIHOOD AND ITS GEE MULTIVARIATE EXTENSION: DETAILS ... 483
12.3.1 The Univariate Quasi-likelihood Method ... 483
12.3.2 Properties of Quasi-likelihood Estimators ... 484
12.3.3 Sandwich Covariance Adjustment for Variance Misspecification ... 485
12.3.4 GEE Multivariate Methodology: Technical Details ... 486
12.3.5 Working Associations Characterized by Odds Ratios ... 488
12.3.6 GEE Approach: Multinomial Responses ... 488
12.3.7 Dealing with Missing Data ... 489
12.4 TRANSITIONAL MODELS: MARKOV CHAIN AND TIME SERIES MODELS ... 491
12.4.1 Markov Chains ... 491
12.4.2 Example: Changes in Evapotranspiration Rates ... 492
12.4.3 Transitional Models with Explanatory Variables ... 493
12.4.4 Example: Child's Respiratory Illness and Maternal Smoking ... 494
12.4.5 Example: Initial Response in Matched Pair as a Covariate ... 495
12.4.6 Transitional Models and Loglinear Conditional Models ... 496
NOTES ... 496
EXERCISES ... 497
CHAPTER 13 Clustered Categorical Data: Random Effects Models ... 507
13.1 RANDOM EFFECTS MODELING OF CLUSTERED CATEGORICAL DATA ... 507
13.1.1 Generalized Linear Mixed Model ... 508
13.1.2 Logistic GLMM with Random Intercept for Binary Matched Pairs ... 509
13.1.3 Example: Changes in Presidential Voting Revisited ... 510
13.1.4 Extension: Rasch Model and Item Response Models ... 510
13.1.5 Random Effects Versus Conditional ML Approaches ... 511
13.2 BINARY RESPONSES: LOGISTIC-NORMAL MODEL ... 512
13.2.1 Shared Random Effect Implies Nonnegative Marginal Correlations ... 512
13.2.2 Interpreting Heterogeneity in Logistic-Normal Models ... 512
13.2.3 Connections Between Random Effects Models and Marginal Models ... 513
13.2.4 Comments About GLMMs Versus Marginal Models ... 515
13.3 EXAMPLES OF RANDOM EFFECTS MODELS FOR BINARY DATA ... 516
13.3.1 Example: Small-Area Estimation of Binomial Proportions ... 516
13.3.2 Modeling Repeated Binary Responses: Attitudes About Abortion ... 518
13.3.3 Example: Longitudinal Mental Depression Study Revisited ... 520
13.3.4 Example: Capture-Recapture Prediction of Population Size ... 521
13.3.5 Example: Heterogeneity Among Multicenter Clinical Trials ... 523
13.3.6 Meta-analysis Using a Random Effects Approach ... 525
13.3.7 Alternative Formulations of Random Effects Models ... 525
13.3.8 Example: Matched Pairs with a Bivariate Binary Response ... 526
13.3.9 Time Series Models Using Autocorrelated Random Effects ... 527
13.3.10 Example: Oxford and Cambridge Annual Boat Race ... 528
13.4 RANDOM EFFECTS MODELS FOR MULTINOMIAL DATA ... 529
13.4.1 Cumulative Logit Model with Random Intercept ... 529
13.4.2 Example: Insomnia Study Revisited ... 529
13.4.3 Example: Combining Measures on Ordinal Items ... 530
13.4.4 Example: Cluster Sampling ... 531
13.4.5 Baseline-Category Logit Models with Random Effects ... 532
13.4.6 Example: Effectiveness of Housing Program ... 532
13.5 MULTILEVEL MODELING ... 533
13.5.1 Hierarchical Random Terms: Partitioning Variability ... 534
13.5.2 Example: Children's Care for an Unmarried Mother ... 534
13.6 GLMM FITTING, INFERENCE, AND PREDICTION ... 537
13.6.1 Marginal Likelihood and Maximum Likelihood Fitting ... 537
13.6.2 Gauss-Hermite Quadrature Methods for ML Fitting ... 538
13.6.3 Monte Carlo and EM Methods for ML Fitting ... 538
13.6.4 Laplace and Penalized Quasi-likelihood Approximations to ML ... 539
13.6.5 Inference for GLMM Parameters ... 540
13.6.6 Prediction Using Random Effects ... 540
13.7 BAYESIAN MULTIVARIATE CATEGORICAL MODELING ... 541
13.7.1 Marginal Homogeneity Analyses for Matched Pairs ... 541
13.7.2 Bayesian Approaches to Meta-analysis and Multicenter Trials ... 541
13.7.3 Example: Bayesian Analyses for a Multicenter Trial ... 542
13.7.4 Bayesian GLMMs and Marginal Models ... 542
NOTES ... 543
EXERCISES ... 545
CHAPTER 14 Other Mixture Models for Discrete Data ... 553
14.1 LATENT CLASS MODELS ... 553
14.1.1 Independence Given a Latent Categorical Variable ... 554
14.1.2 Fitting Latent Class Models ... 555
14.1.3 Example: Latent Class Model for Rater Agreement ... 556
14.1.4 Example: Latent Class Models for Capture-Recapture ... 558
14.1.5 Example: Latent Class Transitional Models ... 559
14.2 NONPARAMETRIC RANDOM EFFECTS MODELS ... 560
14.2.1 Logistic Models with Unspecified Random Effects Distribution ... 560
14.2.2 Example: Attitudes About Legalized Abortion ... 560
14.2.3 Example: Nonparametric Mixing of Logistic Regressions ... 561
14.2.4 Is Misspecification of Random Effects a Serious Problem? ... 561
14.2.5 Rasch Mixture Model ... 563
14.2.6 Example: Modeling Rater Agreement Revisited ... 563
14.2.7 Nonparametric Mixtures and Quasi-symmetry ... 564
14.2.8 Example: Attitudes About Legalized Abortion Revisited ... 565
14.3 BETA-BINOMIAL MODELS ... 566
14.3.1 Beta-Binomial Distribution ... 566
14.3.2 Models Using the Beta-Binomial Distribution ... 567
14.3.3 Quasi-likelihood with Beta-Binomial Type Variance ... 567
14.3.4 Example: Teratology Overdispersion Revisited ... 568
14.3.5 Conjugate Mixture Models ... 570
14.4 NEGATIVE BINOMIAL REGRESSION ... 570
14.4.1 Gamma Mixture of Poissons Is Negative Binomial ... 571
14.4.2 Negative Binomial Regression Modeling ... 571
14.4.3 Example: Frequency of Knowing Homicide Victims ... 572
14.5 POISSON REGRESSION WITH RANDOM EFFECTS ... 573
14.5.1 A Poisson GLMM ... 574
14.5.2 Marginal Model Implied by Poisson GLMM ... 574
14.5.3 Example: Homicide Victim Frequency Revisited ... 575
14.5.4 Negative Binomial Models versus Poisson GLMMs ... 575
NOTES ... 575
EXERCISES ... 576
CHAPTER 15 Non-Model-Based Classification and Clustering ... 583
15.1 CLASSIFICATION: LINEAR DISCRIMINANT ANALYSIS ... 583
15.1.1 Classification with Normally Distributed Predictors ... 583
15.1.2 Example: Horseshoe Crab Satellites Revisited ... 585
15.1.3 Multicategory Classification and Other Versions of Discriminant Analysis ... 586
15.1.4 Classification Methods for High Dimensions ... 587
15.1.5 Discriminant Analysis Versus Logistic Regression ... 587
15.2 CLASSIFICATION: TREE-STRUCTURED PREDICTION ... 588
15.2.1 Classification Trees ... 588
15.2.2 Example: Classification Tree for a Health Care Application ... 589
15.2.3 How Does the Classification Tree Grow? ... 590
15.2.4 Pruning a Tree and Checking Prediction Accuracy ... 591
15.2.5 Classification Trees Versus Logistic Regression ... 592
15.2.6 Support Vector Machines for Classification ... 593
15.3 CLUSTER ANALYSIS FOR CATEGORICAL DATA ... 594
15.3.1 Supervised Versus Unsupervised Learning ... 595
15.3.2 Measuring Dissimilarity Between Observations ... 595
15.3.3 Clustering Algorithms: Partitions and Hierarchies ... 596
15.3.4 Example: Clustering States on Election Results ... 597
NOTES ... 599
EXERCISES ... 600
CHAPTER 16 Large- and Small-Sample Theory for Multinomial Models ... 605
16.1 DELTA METHOD ... 605
16.1.1 O, o Rates of Convergence ... 606
16.1.2 Delta Method for a Function of a Random Variable ... 606
16.1.3 Delta Method for a Function of a Random Vector ... 607
16.1.4 Asymptotic Normality of Functions of Multinomial Counts ... 607
16.1.5 Delta Method for a Vector Function of a Random Vector ... 609
16.1.6 Joint Asymptotic Normality of Log Odds Ratios ... 609
16.2 ASYMPTOTIC DISTRIBUTIONS OF ESTIMATORS OF MODEL PARAMETERS AND CELL PROBABILITIES ... 610
16.2.1 Asymptotic Distribution of Model Parameter Estimator ... 610
16.2.2 Asymptotic Distribution of Cell Probability Estimators ... 611
16.2.3 Model Smoothing Is Beneficial ... 612
16.3 ASYMPTOTIC DISTRIBUTIONS OF RESIDUALS AND GOODNESS-OF-FIT STATISTICS ... 612
16.3.1 Joint Asymptotic Normality of p and π̂ ... 612
16.3.2 Asymptotic Distribution of Pearson and Standardized Residuals ... 613
16.3.3 Asymptotic Distribution of Pearson X² Statistic ... 614
16.3.4 Asymptotic Distribution of Likelihood-Ratio Statistic ... 615
16.3.5 Asymptotic Noncentral Distributions ... 616
16.4 ASYMPTOTIC DISTRIBUTIONS FOR LOGIT - LOGLINEAR MODELS ... 617
16.4.1 Asymptotic Covariance Matrices ... 617
16.4.2 Connection with Poisson Loglinear Models ... 618
16.5 SMALL-SAMPLE SIGNIFICANCE TESTS FOR CONTINGENCY TABLES ... 619
16.5.1 Exact Conditional Distribution for I×J Tables Under Independence ... 619
16.5.2 Exact Tests of Independence for I×J Tables ... 620
16.5.3 Example: Sexual Orientation and Party ID ... 620
16.6 SMALL-SAMPLE CONFIDENCE INTERVALS FOR CATEGORICAL DATA ... 621
16.6.1 Small-Sample CIs for a Binomial Parameter ... 621
16.6.2 CIs Based on Tests Using the Mid P-Value ... 623
16.6.3 Example: Proportion of Vegetarians Revisited ... 623
16.6.4 Small-Sample CIs for Odds Ratios ... 624
16.6.5 Example: Fisher's Tea Taster Revisited ... 625
16.6.6 Small-Sample CIs for Logistic Regression Parameters ... 625
16.6.7 Example: Diarrhea and an Antibiotic ... 626
16.6.8 Unconditional Small-Sample CIs for Difference of Proportions ... 627
16.7 ALTERNATIVE ESTIMATION THEORY FOR PARAMETRIC MODELS ... 628
16.7.1 Weighted Least Squares for Categorical Data ... 628
16.7.2 Inference Using the WLS Approach to Model Fitting ... 629
16.7.3 Scope of WLS Versus ML Estimation ... 630
16.7.4 Minimum Chi-Squared Estimators ... 631
16.7.5 Minimum Discrimination Information ... 632
NOTES ... 633
EXERCISES ... 634
CHAPTER 17 Historical Tour of Categorical Data Analysis ... 641
17.1 PEARSON-YULE ASSOCIATION CONTROVERSY ... 641
17.2 R. A. FISHER'S CONTRIBUTIONS ... 643
17.3 LOGISTIC REGRESSION ... 645
17.4 MULTIWAY CONTINGENCY TABLES AND LOGLINEAR MODELS ... 647
17.5 BAYESIAN METHODS FOR CATEGORICAL DATA ... 651
17.6 A LOOK FORWARD, AND BACKWARD ... 652
APPENDIX A Statistical Software for Categorical Data Analysis ... 655
References ... 661
Author Index ... 707
Subject Index ... 723