I got this book while working on an article that involved a hierarchical model with a binary dependent variable - after poking through Raudenbush/Bryk and a variety of other texts that left me frustrated. Not only did this book teach me how to properly specify and estimate the model in R, it also taught me a lot about interpreting results and presenting them graphically. I don't think I've read another book that so effectively combines theoretical and practical information while also being a relatively smooth read - the examples are clear and interesting! In addition to the extensive treatment of hierarchical models, Gelman and Hill also cover non-hierarchical OLS and ML models, plus a variety of other key stats topics. My only quibble is that the accompanying R code on Gelman's website isn't complete - but the fact that they have sample code available at all puts this far beyond most stats books. I wish I had had this book in grad school and look forward to referring to it for years to come.
Author(s): Andrew Gelman, Jennifer Hill
Edition: 1
Publisher: Cambridge University Press
Year: 2007
Language: English
Pages: 651
Tags: Computer Science and Engineering;Artificial Intelligence;Data Mining;
Cover......Page 1
Half-title......Page 3
Series-title......Page 5
Title......Page 7
Copyright......Page 8
Dedication......Page 11
Contents......Page 13
List of examples......Page 21
Preface......Page 23
Models for regression coefficients......Page 27
Labels......Page 28
Modeling correlations: forecasting presidential elections......Page 29
Small-area estimation: state-level opinions from national polls......Page 30
Social science modeling: police stops by ethnic group with variation across precincts......Page 31
Prediction......Page 32
Including predictors at two different levels......Page 33
1.4 Distinctive features of this book......Page 34
1.5 Computing......Page 35
R and S......Page 36
Data and code for examples......Page 37
Normal distribution; means and variances......Page 39
Lognormal distribution......Page 41
Sampling and measurement error models......Page 42
Standard errors for proportions......Page 43
Linear transformations......Page 44
Using simulation to compute confidence intervals for ratios, logarithms, odds ratios, logits, and other functions of estimated parameters......Page 45
Comparisons of parameters to fixed values and each other: interpreting confidence intervals as hypothesis tests......Page 46
Underdispersion......Page 47
2.5 Problems with statistical significance......Page 48
2.6 55,000 residents desperately need your help!......Page 49
2.8 Exercises......Page 52
Part 1A: Single-level regression......Page 55
For a binary predictor, the regression coefficient is the difference between the averages of the two groups......Page 57
3.2 Multiple predictors......Page 58
3.3 Interactions......Page 60
Interpreting regression coefficients in the presence of interactions......Page 62
Two ways of writing the model......Page 63
Fitting and summarizing regressions in R......Page 64
Least squares estimate of the vector of regression coefficients, β......Page 65
Residuals, ri......Page 66
Difficulties in interpreting residual standard deviation and explained variance......Page 67
Displaying a regression line as a function of one input variable......Page 68
Displaying uncertainty in the fitted regression......Page 69
Displaying using one plot for each input variable......Page 70
Assumptions of the regression model......Page 71
Prediction......Page 73
External validation......Page 74
3.9 Exercises......Page 75
4.1 Linear transformations......Page 79
Standardization using reasonable scales......Page 80
Centering by subtracting the mean of the data......Page 81
Standardizing by subtracting the mean and dividing by 2 standard deviations......Page 82
The principal components line and the regression line......Page 83
Regression to the mean......Page 84
Height and earnings example......Page 85
Why we use natural log rather than log-base-10......Page 86
Building a regression model on the log scale......Page 87
Log-log model: transforming the input and outcome variables......Page 90
Idiosyncratic transformations......Page 91
Using discrete rather than continuous predictors......Page 92
Index and indicator variables......Page 93
4.6 Building regression models for prediction......Page 94
General principles......Page 95
Predicting party identification......Page 99
4.9 Exercises......Page 100
The logistic regression model......Page 105
Evaluation at and near the mean of the data......Page 107
Interpretation of coefficients as odds ratios......Page 108
Inference......Page 109
Displaying the results of several logistic regressions......Page 110
5.3 Latent-data formulation......Page 111
5.4 Building a logistic regression model: wells in Bangladesh......Page 112
Background......Page 113
Interpreting the logistic regression coefficients......Page 115
Adding a second input variable......Page 116
Graphing the fitted model with two predictors......Page 117
5.5 Logistic regression with interactions......Page 118
Refitting the interaction model using the centered inputs......Page 119
Graphing the model with interactions......Page 120
Adding social predictors......Page 121
Standardizing predictors......Page 122
Plotting binned residuals versus inputs of interest......Page 123
Considering a log transformation......Page 124
Error rate and comparison to the null model......Page 125
Deviance......Page 126
Example: well switching in Bangladesh......Page 127
General notation for predictive comparisons......Page 129
5.8 Identifiability and separation......Page 130
5.10 Exercises......Page 131
6.1 Introduction......Page 135
Traffic accidents......Page 136
Poisson regression with an exposure input......Page 137
Example: police stops by ethnic group......Page 138
The exposure input......Page 139
Overdispersion......Page 140
Fitting the overdispersed-Poisson or negative-binomial model......Page 141
Overdispersion......Page 142
Count-data model as a special case of the binary-data model......Page 143
Probit or logit?......Page 144
The ordered multinomial logit model......Page 145
Example: storable votes......Page 146
Alternative approaches to modeling ordered categorical data......Page 149
Robit instead of logit or probit......Page 150
6.7 Building more complex generalized linear models......Page 151
Cockroaches and the zero-inflated Poisson model......Page 152
6.8 Constructive choice models......Page 153
Logistic or probit regression as a choice model in one dimension......Page 154
Logistic or probit regression as a choice model in multiple dimensions......Page 156
6.9 Bibliographic note......Page 157
6.10 Exercises......Page 158
Part 1B: Working with regression inferences......Page 161
A simple example of discrete predictive simulation......Page 163
Simulation in R using custom-made functions......Page 165
Simulation to represent predictive uncertainty......Page 166
Why do we need simulation for predictive inferences?......Page 167
Details of the simulation procedure......Page 168
Informal Bayesian inference......Page 169
Background......Page 170
Fitting the model......Page 171
Simulation for inferences and predictions of new data points......Page 172
Implementation using functions......Page 173
Logistic regression......Page 174
Compound models......Page 176
7.5 Bibliographic note......Page 177
7.6 Exercises......Page 178
8.1 Fake-data simulation......Page 181
8.2 Example: using fake-data simulation to understand residual plots......Page 183
8.3 Simulating from the fitted model and comparing to actual data......Page 184
Example: comparing data to replications from a fitted normal distribution......Page 185
Example: zeroes in count data......Page 187
Checking the overdispersed model......Page 188
Simulating replicated datasets......Page 189
Visual and numerical comparisons of replicated to actual data......Page 190
8.6 Exercises......Page 191
Hypothetical example of zero causal effect but positive predictive comparison......Page 193
Hypothetical example of positive causal effect but zero positive predictive comparison......Page 194
Formula for omitted variable bias......Page 195
The problem......Page 196
Ways of getting around the problem......Page 197
9.3 Randomized experiments......Page 198
Average causal effects and randomized experiments......Page 199
Example: showing children an educational television show......Page 200
Controlling for pre-treatment predictors......Page 201
More than two treatment levels, continuous treatments, and multiple treatment factors......Page 203
Interactions of treatment effect with pre-treatment inputs......Page 204
9.5 Observational studies......Page 207
Assumption of ignorable treatment assignment......Page 208
Judging the reasonableness of regression as a modeling approach, assuming ignorability......Page 210
Examining overlap in the Electric Company embedded observational study......Page 211
Defining a “treatment” variable......Page 212
Thought experiment: what would be an ideal randomized experiment?......Page 213
9.7 Do not control for post-treatment variables......Page 214
Hypothetical example of a binary intermediate outcome......Page 216
Regression controlling for intermediate outcomes cannot, in general, estimate “mediating” effects......Page 217
What can be estimated: principal stratification......Page 218
Intermediate outcomes in the context of observational studies......Page 219
9.10 Exercises......Page 220
Imbalance and model sensitivity......Page 225
Example: evaluating the effectiveness of high-quality child care......Page 227
Examining imbalance for several covariates......Page 228
Imbalance is not the same as lack of overlap......Page 229
Subclassification......Page 230
Average treatment effects: whom do we average over?......Page 231
Matching and subclassification......Page 232
Propensity score matching......Page 233
Computation of propensity score matches......Page 234
The propensity score as a one-number summary used to assess balance and overlap......Page 235
Other matching methods, matching on all covariates, and subclassification......Page 236
Other uses for propensity scores......Page 237
Regression discontinuity and ignorability......Page 238
Example: political ideology of congressmembers......Page 239
10.5 Estimating causal effects indirectly using instrumental variables......Page 241
Ignorability of the instrument......Page 242
Monotonicity and the exclusion restrictions......Page 243
Derivation of instrumental variables estimation with complete data (including unobserved potential outcomes)......Page 244
Local average treatment effects......Page 245
Identifiability with instrumental variables......Page 246
Two-stage least squares......Page 247
Standard errors for instrumental variables estimates......Page 248
Performing two-stage least squares automatically using the tsls function......Page 249
Continuous treatment variables or instruments......Page 250
Assessing the plausibility of the instrumental variables assumptions......Page 251
Comparisons within groups—so-called fixed effects models......Page 252
Comparisons within and between groups: difference-in-differences estimation......Page 254
10.8 Bibliographic note......Page 255
10.9 Exercises......Page 257
Part 2A: Multilevel regression......Page 261
11.2 Clustered data: child support enforcement in cities......Page 263
Studying the effectiveness of child support enforcement......Page 264
Individual- and group-level models......Page 265
Multilevel models......Page 266
Repeated measurements......Page 267
Time-series cross-sectional data......Page 269
Multilevel regression: including all J indicators......Page 270
Fixed and random effects......Page 271
Complexity of multilevel models......Page 272
11.6 Bibliographic note......Page 273
11.7 Exercises......Page 274
12.1 Notation......Page 277
12.2 Partial pooling with no predictors......Page 278
Partial-pooling estimates from a multilevel model......Page 279
Complete-pooling and no-pooling analyses for the radon data, with predictors......Page 280
Multilevel analysis......Page 282
Average regression line and individual- and group-level variances......Page 283
Classical regression as a special case......Page 284
The lmer function......Page 285
Estimated regression coefficients......Page 286
Summarizing and displaying the fitted model......Page 287
Allowing regression coefficients to vary across groups......Page 288
Combining separate local regressions......Page 289
Regression with multiple error terms......Page 290
Adding a group-level predictor to improve inference for group coefficients αj......Page 291
Interpreting the coefficient of the group-level predictor......Page 294
Partial pooling of group coefficients αj in the presence of group-level predictors......Page 295
When is multilevel modeling most effective?......Page 296
Statistical significance......Page 297
Review of prediction for classical regression......Page 298
Prediction for a new observation in an existing group......Page 299
Nonlinear predictions......Page 300
One or two groups......Page 301
12.10 Bibliographic note......Page 302
12.11 Exercises......Page 303
13.1 Varying intercepts and slopes......Page 305
Including group-level predictors......Page 306
Varying slopes as interactions......Page 308
A situation in which a constant-intercept, varying-slope model is appropriate......Page 309
13.3 Modeling multiple varying coefficients using the scaled inverse-Wishart distribution......Page 310
Including individual-level predictors whose coefficients do not vary by group......Page 311
Modeling the group-level covariance matrix using the scaled inverse-Wishart distribution......Page 312
13.4 Understanding correlations between group-level intercepts and slopes......Page 313
13.5 Non-nested models......Page 315
Example: regression of earnings on ethnicity categories, age categories, and height......Page 316
Example: a Latin square design with grouping factors and group-level predictors......Page 318
Classical models for regression coefficients......Page 319
Linear transformation and combination of inputs in a multilevel model......Page 320
Connection to factor analysis......Page 322
13.8 Bibliographic note......Page 323
13.9 Exercises......Page 324
14.1 State-level opinions from national polls......Page 327
A fuller model including non-nested factors......Page 328
Graphing the estimated model......Page 330
Using the model inferences to estimate average opinion for each state......Page 333
Comparing public opinion estimates to election outcomes......Page 335
Classical regressions of state averages and individuals......Page 336
Varying-intercept model of income and vote preference within states......Page 337
14.3 Item-response and ideal-point models......Page 340
Identifiability problems......Page 341
Adding a discrimination parameter......Page 342
An ideal-point model for Supreme Court voting......Page 343
Other generalizations......Page 345
Multilevel overdispersed binomial regression......Page 346
14.5 Bibliographic note......Page 347
14.6 Exercises......Page 348
Overdispersion as a variance component......Page 351
Multilevel Poisson regression model......Page 352
Modeling variability across precincts......Page 354
Modeling the relation of stops to previous year’s arrests......Page 355
15.2 Ordered categorical regression: storable votes......Page 357
15.3 Non-nested negative-binomial model of structure in social networks......Page 358
Background......Page 359
The overdispersed model......Page 361
Computation......Page 362
The distribution of social network sizes ai......Page 363
Relative sizes bk of subpopulations......Page 364
Overdispersion parameters ωk for subpopulations......Page 366
Analysis of residuals......Page 367
15.5 Exercises......Page 368
Part 2B: Fitting multilevel models......Page 369
16.2 Bayesian inference and prior distributions......Page 371
Varying-intercept, varying-slope model......Page 372
Noninformative prior distributions......Page 373
Setting up the data in R......Page 374
Classical no-pooling regression in R......Page 375
Calling Bugs from R......Page 376
Summarizing classical and multilevel inferences graphically......Page 378
The individual-level model......Page 379
Prior distributions......Page 380
Noninformative prior distributions......Page 381
Number of sequences and number of iterations......Page 382
Accessing the simulations......Page 384
Classical complete-pooling and no-pooling regressions......Page 385
Classical regression with multiple predictors......Page 386
16.6 Predictions for new observations and new groups......Page 387
Predicting a new unit in a new group using Bugs......Page 388
Specifying “true” parameter values......Page 389
Inference and comparison to “true” values......Page 390
Checking coverage of 50% intervals......Page 391
Data, parameters, and derived quantities......Page 392
Changing what is included in the data......Page 393
How many chains and how long to run?......Page 395
16.10 Open-ended modeling in Bugs......Page 396
Unequal variances......Page 397
Other distributional forms......Page 398
16.12 Exercises......Page 399
Simple model with no correlation between intercepts and slopes......Page 401
Scaled inverse-Wishart model......Page 402
Modeling multiple varying coefficients......Page 403
Adding unmodeled individual-level coefficients......Page 404
Multiple varying coefficients with multiple group-level predictors......Page 405
17.3 Non-nested models......Page 406
17.4 Multilevel logistic regression......Page 407
17.5 Multilevel Poisson regression......Page 408
17.6 Multilevel ordered categorical regression......Page 409
Robit regression......Page 410
17.9 Exercises......Page 411
Least squares......Page 413
Maximum likelihood......Page 414
Generalized linear models......Page 415
Two-parameter example: linear regression with two coefficients......Page 416
Informative prior distributions in a single-level regression......Page 418
Simple multilevel model with no predictors......Page 419
Individual predictors but no group-level predictors......Page 421
Multilevel regression as least squares with augmented data......Page 422
18.4 Gibbs sampler for multilevel linear models......Page 423
Gibbs sampler for a multilevel model with no predictors......Page 424
Programming the Gibbs sampler in R......Page 425
Gibbs sampler for a multilevel model with regression predictors......Page 427
18.5 Likelihood inference, Bayesian inference, and the Gibbs sampler: the case of censored data......Page 428
Naive regression estimate imputing the censoring point......Page 429
Maximum likelihood estimate using R......Page 430
Fitting the censored-data model using Bugs......Page 431
Gibbs sampler......Page 432
18.6 Metropolis algorithm for more general Bayesian computation......Page 434
The joint posterior density......Page 435
Programming in R and Umacs......Page 436
18.9 Exercises......Page 439
Getting your Bugs program to work......Page 441
Unexpected difficulties with Bugs......Page 443
Thinning output to save memory......Page 444
19.4 Redundant parameters and intentionally nonidentifiable models......Page 445
Redundant mean parameters for a simple nested model......Page 446
Example: multilevel logistic regression for survey responses......Page 448
“Adjusted” or “raw” parameters......Page 449
Solution using parameter expansion......Page 450
Example: multilevel logistic regression for survey responses......Page 451
Application to item-response and ideal-point models......Page 452
19.6 Using redundant parameters to create an informative prior distribution for multilevel variance parameters......Page 453
Noninformative prior distributions......Page 454
Example: educational testing experiments in 8 schools......Page 456
Weakly informative prior distribution for the 3-schools problem......Page 457
General comments......Page 458
19.8 Exercises......Page 460
Part 3: From data collection to model understanding to model checking......Page 461
Observational studies or experiments with unit-level or group-level treatments......Page 463
Sample size, design, and interactions......Page 464
Power calculations......Page 465
Sample size to achieve a specified probability of obtaining statistical significance......Page 466
2.8 standard errors from the comparison point......Page 467
Simple comparisons of proportions: equal sample sizes......Page 468
Simple comparisons of means......Page 469
Estimating standard deviations using results from previous studies......Page 470
Estimation of regression coefficients more generally......Page 472
Standard deviation of the mean of clustered data......Page 473
Example of a sample size calculation for cluster sampling......Page 474
Modeling a hypothetical treatment effect......Page 475
Power calculation for multilevel estimate using fake-data simulation......Page 477
20.7 Exercises......Page 480
Distinguishing between uncertainty and variability in a multilevel model......Page 483
Variability can be interesting in itself......Page 484
Example of a non-nested model......Page 485
The finite-population standard deviation is estimated more precisely than the superpopulation standard deviations......Page 486
Fixed and random effects......Page 487
21.3 Contrasts and comparisons of multilevel coefficients......Page 488
Contrasts: including an input both numerically and categorically......Page 489
Inferences for multilevel parameters defined relative to their finite-population average......Page 490
Notation and basic definition of predictive comparisons......Page 492
Problems with evaluating predictive comparisons at a central value......Page 493
General approach to defining population predictive comparisons......Page 494
Inputs that are not always active......Page 495
Applying average predictive comparisons to a single model......Page 496
Using average predictive comparisons to compare models......Page 498
21.5 R² and explained variance......Page 499
Proportion of variance explained at each level......Page 500
Connections to classical definitions......Page 501
Setting up the computations in R and Bugs......Page 502
21.6 Summarizing the amount of partial pooling......Page 503
Setting up the computation in R and Bugs......Page 505
21.7 Adding a predictor can increase the residual variance!......Page 506
A meta-analysis of a set of randomized experiments......Page 507
Multilevel model......Page 508
21.9 Bibliographic note......Page 510
21.10 Exercises......Page 511
22.1 Classical analysis of variance......Page 513
Classical ANOVA as additive data decomposition......Page 514
Classical ANOVA for model comparison......Page 515
Notation......Page 516
Generalized linear models......Page 517
A five-way factorial analysis: internet connect times......Page 518
A multilevel logistic regression: vote preference broken down by state and demographics......Page 519
One-way ANOVA: radon measurements within counties......Page 520
Two-way ANOVA: flight simulator data......Page 521
Individual-level predictors and analysis of covariance......Page 522
Group-level predictors and contrast analysis......Page 523
Crossed and nested ANOVA: a split-plot design......Page 524
Multilevel model with noninformative prior distributions for the variance parameters......Page 525
Superpopulation and finite-population standard deviations......Page 526
22.8 Exercises......Page 527
Hierarchical analysis of a paired design......Page 529
Hierarchical analysis of randomized-block and other structured designs......Page 531
Varying treatment effects......Page 532
23.3 Treatments applied at different levels......Page 533
Analysis of an educational-subsidy program......Page 534
23.4 Instrumental variables and multilevel modeling......Page 535
Group-level randomization......Page 537
23.6 Exercises......Page 538
24.1 Principles of predictive checking......Page 539
Setting up the logistic regression model and estimating its parameters......Page 541
Defining predictive replications for the dog example......Page 543
More focused model checks......Page 545
Fitting and checking a logarithmic regression model......Page 547
Fitting and checking a multilevel model with no additional learning from avoidances......Page 548
24.3 Model comparison and deviance......Page 550
Deviance and DIC in multilevel models......Page 551
24.4 Bibliographic note......Page 552
24.5 Exercises......Page 553
Missing data in R and Bugs......Page 555
25.1 Missing-data mechanisms......Page 556
Complete-case analysis......Page 557
25.3 Simple missing-data approaches that retain all the data......Page 558
25.4 Random imputation of a single variable......Page 559
Zero coding and topcoding......Page 560
Using regression predictions to perform deterministic imputation......Page 561
Random regression imputation......Page 562
Two-stage modeling to impute a variable that can be positive or zero......Page 563
Matching and hot-deck imputation......Page 564
Iterative regression imputation......Page 565
Nonignorable missing-data models......Page 566
Imputation in multilevel data structures......Page 567
25.8 Bibliographic note......Page 568
25.9 Exercises......Page 569
Appendixes......Page 571
Data subsetting......Page 573
A.4 Transformations......Page 574
A.6 Estimate causal inferences in a targeted way, not as a byproduct of a large regression......Page 575
Why to graph......Page 577
No single graph does it all......Page 578
The x and y axes......Page 579
Symbols and auxiliary lines......Page 580
Maps......Page 582
Calibration plots......Page 583
Residual plots......Page 584
A display of several time series of opinion polls......Page 585
Histograms......Page 587
B.4 Bibliographic note......Page 588
B.5 Exercises......Page 589
C.2 Fitting classical and multilevel regressions in R......Page 591
The lmer() function for multilevel modeling......Page 592
Programming in R......Page 593
Six prototype models; fitting in R......Page 594
Fitting in Stata......Page 595
Fitting in SAS......Page 596
Fitting in SPSS......Page 597
Fitting in AD Model Builder......Page 598
C.5 Bibliographic note......Page 599
References......Page 601
Author index......Page 627
Subject index......Page 633