The success of the first edition of Generalized Linear Models led to the updated Second Edition, which continues to provide a definitive, unified treatment of methods for the analysis of diverse types of data. Today, it remains popular for its clarity, richness of content, and direct relevance to agricultural, biological, health, engineering, and other applications.
The authors focus on examining the way a response variable depends on a combination of explanatory, treatment, and classification variables. They give particular emphasis to the important case where the dependence occurs through some unknown linear combination of the explanatory variables.
The Second Edition adds topics to the core of the first edition, including conditional and marginal likelihood methods, estimating equations, and models for dispersion effects and components of dispersion. The discussion of other topics (log-linear and related models, log odds-ratio regression models, multinomial response models, inverse linear and related models, quasi-likelihood functions, and model checking) was expanded and incorporates significant revisions.
Comprehension of the material requires only a knowledge of matrix theory and the basic ideas of probability theory; for the most part, the book is self-contained. With its worked examples, plentiful exercises, and topics of direct use to researchers in many disciplines, Generalized Linear Models serves as an ideal text, self-study guide, and reference.
Author(s): P. McCullagh, John A. Nelder
Series: Chapman & Hall/CRC Monographs on Statistics & Applied Probability
Edition: 2nd
Publisher: Chapman and Hall/CRC
Year: 1989
Language: English
Pages: 532
Contents ... 3
Preface to the first edition ... 12
Preface ... 14
CHAPTER 1 Introduction ... 16
1.1 Background ... 16
1.1.1 The problem of looking at data ... 18
1.1.2 Theory as pattern ... 19
1.1.3 Model fitting ... 20
1.1.4 What is a good model? ... 22
1.2 The origins of generalized linear models ... 23
1.2.1 Terminology ... 23
1.2.2 Classical linear models ... 24
1.2.3 R.A. Fisher and the design of experiments ... 25
1.2.4 Dilution assay ... 26
1.2.5 Probit analysis ... 28
1.2.6 Logit models for proportions ... 29
1.2.7 Log-linear models for counts ... 29
1.2.8 Inverse polynomials ... 31
1.2.9 Survival data ... 31
1.3 Scope of the rest of the book ... 32
1.4 Bibliographic notes ... 34
1.5 Further results and exercises 1 ... 34
CHAPTER 2 An outline of generalized linear models ... 36
2.1 Processes in model fitting ... 36
2.1.1 Model selection ... 36
2.1.2 Estimation ... 38
2.1.3 Prediction ... 40
2.2 The components of a generalized linear model ... 41
2.2.1 The generalization ... 42
2.2.3 Link functions ... 46
2.2.4 Sufficient statistics and canonical links ... 47
2.3 Measuring the goodness of fit ... 48
2.3.1 The discrepancy of a fit ... 48
2.3.2 The analysis of deviance ... 50
2.4 Residuals ... 52
2.4.1 Pearson residual ... 52
2.4.2 Anscombe residual ... 53
2.4.3 Deviance residual ... 54
2.5 An algorithm for fitting generalized linear models ... 55
2.5.1 Justification of the fitting procedure ... 56
2.6 Bibliographic notes ... 58
2.7 Further results and exercises 2 ... 59
CHAPTER 3 Models for continuous data with constant variance ... 63
3.1 Introduction ... 63
3.2 Error structure ... 64
3.3 Systematic component (linear predictor) ... 66
3.3.1 Continuous covariates ... 66
3.3.2 Qualitative covariates ... 67
3.3.3 Dummy variates ... 69
3.3.4 Mixed terms ... 70
3.4 Model formulae for linear predictors ... 71
3.4.1 Individual terms ... 71
3.4.2 The dot operator ... 71
3.4.3 The + operator ... 72
3.4.4 The crossing (*) and nesting (/) operators ... 73
3.4.5 Operators for the removal of terms ... 74
3.4.6 Exponential operator ... 75
3.5 Aliasing ... 76
3.5.1 Intrinsic aliasing with factors ... 78
3.5.2 Aliasing in a two-way cross-classification ... 80
3.5.3 Extrinsic aliasing ... 83
3.5.4 Functional relations among covariates ... 84
3.6 Estimation ... 85
3.6.1 The maximum-likelihood equations ... 85
3.6.2 Geometrical interpretation ... 86
3.6.3 Information ... 87
3.6.4 A model with two covariates ... 89
3.6.5 The information surface ... 92
3.6.6 Stability ... 93
3.7 Tables as data ... 94
3.7.1 Empty cells ... 94
3.7.2 Fused cells ... 96
3.8 Algorithms for least squares ... 96
3.8.1 Methods based on the information matrix ... 97
3.8.2 Direct decomposition methods ... 100
3.8.3 Extension to generalized linear models ... 103
3.9 Selection of covariates ... 104
3.10 Bibliographic notes ... 108
3.11 Further results and exercises 3 ... 108
CHAPTER 4 Binary data ... 113
4.1 Introduction ... 113
4.1.1 Binary responses ... 113
4.1.2 Covariate classes ... 114
4.1.3 Contingency tables ... 115
4.2 Binomial distribution ... 116
4.2.1 Genesis ... 116
4.2.2 Moments and cumulants ... 117
4.2.3 Normal limit ... 118
4.2.4 Poisson limit ... 120
4.2.5 Transformations ... 120
4.3 Models for binary responses ... 122
4.3.1 Link functions ... 122
4.3.2 Parameter interpretation ... 125
4.3.3 Retrospective sampling ... 126
4.4 Likelihood functions for binary data ... 129
4.4.1 Log likelihood for binomial data ... 129
4.4.2 Parameter estimation ... 130
4.4.3 Deviance function ... 133
4.4.4 Bias and precision of estimates ... 134
4.4.5 Sparseness ... 135
4.4.6 Extrapolation ... 137
4.5 Over-dispersion ... 139
4.5.1 Genesis ... 139
4.5.2 Parameter estimation ... 141
4.6 Example ... 143
4.6.1 Habitat preferences of lizards ... 143
4.7 Bibliographic notes ... 150
4.8 Further results and exercises 4 ... 150
CHAPTER 5 Models for polytomous data ... 164
5.1 Introduction ... 164
5.2 Measurement scales ... 165
5.2.1 General points ... 165
5.2.2 Models for ordinal scales ... 166
5.2.3 Models for interval scales ... 170
5.2.4 Models for nominal scales ... 174
5.2.5 Nested or hierarchical response scales ... 175
5.3 The multinomial distribution ... 179
5.3.1 Genesis ... 179
5.3.2 Moments and cumulants ... 180
5.3.3 Generalized inverse matrices ... 183
5.3.4 Quadratic forms ... 184
5.3.5 Marginal and conditional distributions ... 185
5.4 Likelihood functions ... 186
5.4.1 Log likelihood for multinomial responses ... 186
5.4.2 Parameter estimation ... 187
5.4.3 Deviance function ... 189
5.5 Over-dispersion ... 189
5.6 Examples ... 190
5.6.1 A cheese-tasting experiment ... 190
5.6.2 Pneumoconiosis among coalminers ... 193
5.7 Bibliographic notes ... 197
5.8 Further results and exercises 5 ... 199
CHAPTER 6 Log-linear models ... 208
6.1 Introduction ... 208
6.2 Likelihood functions ... 209
6.2.1 Poisson distribution ... 209
6.2.2 The Poisson log-likelihood function ... 212
6.2.3 Over-dispersion ... 213
6.2.4 Asymptotic theory ... 215
6.3 Examples ... 215
6.3.1 A biological assay of tuberculins ... 215
6.3.2 A study of wave damage to cargo ships ... 219
6.4 Log-linear models and multinomial response models ... 224
6.4.1 Comparison of two or more Poisson means ... 224
6.4.2 Multinomial response models ... 226
6.4.3 Summary ... 228
6.5 Multiple responses ... 229
6.5.1 Introduction ... 229
6.5.2 Independence and conditional independence ... 230
6.5.3 Canonical correlation models ... 232
6.5.4 Multivariate regression models ... 234
6.5.5 Multivariate model formulae ... 237
6.5.6 Log-linear regression models ... 238
6.5.7 Likelihood equations ... 240
6.6 Example ... 244
6.6.1 Respiratory ailments of coalminers ... 244
6.6.2 Parameter interpretation ... 248
6.7 Bibliographic notes ... 250
6.8 Further results and exercises 6 ... 251
CHAPTER 7 Conditional likelihoods ... 260
7.1 Introduction ... 260
7.2 Marginal and conditional likelihoods ... 261
7.2.1 Marginal likelihood ... 261
7.2.2 Conditional likelihood ... 263
7.2.3 Exponential-family models ... 267
7.2.4 Profile likelihood ... 269
7.3 Hypergeometric distributions ... 270
7.3.1 Central hypergeometric distribution ... 270
7.3.2 Non-central hypergeometric distribution ... 272
7.3.3 Multivariate hypergeometric distribution ... 275
7.3.4 Multivariate non-central hypergeometric distribution ... 276
7.4 Some applications involving binary data ... 277
7.4.1 Comparison of two binomial probabilities ... 277
7.4.2 Combination of information from several 2×2 tables ... 280
7.4.3 Example: Ille-et-Vilaine study of oesophageal cancer ... 282
7.5 Some applications involving polytomous data ... 285
7.5.1 Matched pairs: nominal response ... 285
7.5.2 Ordinal responses ... 288
7.5.3 Example ... 291
7.6 Bibliographic notes ... 292
7.7 Further results and exercises 7 ... 294
CHAPTER 8 Models for data with constant coefficient of variation ... 300
8.1 Introduction ... 300
8.2 The gamma distribution ... 302
8.3 Models with gamma-distributed observations ... 304
8.3.1 The variance function ... 304
8.3.2 The deviance ... 305
8.3.3 The canonical link ... 306
8.3.4 Multiplicative models: log link ... 307
8.3.5 Linear models: identity link ... 309
8.3.6 Estimation of the dispersion parameter ... 310
8.4 Examples ... 311
8.4.1 Car insurance claims ... 311
8.4.2 Clotting times of blood ... 315
8.4.3 Modelling rainfall data using two generalized linear models ... 317
8.4.4 Developmental rate of Drosophila melanogaster ... 321
8.5 Bibliographic notes ... 328
8.6 Further results and exercises 8 ... 329
CHAPTER 9 Quasi-likelihood functions ... 338
9.1 Introduction ... 338
9.2 Independent observations ... 339
9.2.1 Covariance functions ... 339
9.2.2 Construction of the quasi-likelihood function ... 340
9.2.3 Parameter estimation ... 342
9.2.4 Example: incidence of leaf-blotch on barley ... 343
9.3 Dependent observations ... 347
9.3.1 Quasi-likelihood estimating equations ... 347
9.3.2 Quasi-likelihood function ... 348
9.3.3 Example: estimation of probabilities from marginal frequencies ... 351
9.4 Optimal estimating functions ... 354
9.4.1 Introduction ... 354
9.4.2 Combination of estimating functions ... 355
9.4.3 Example: estimation for megalithic stone rings ... 358
9.5 Optimality criteria ... 362
9.6 Extended quasi-likelihood ... 364
9.7 Bibliographic notes ... 367
9.8 Further results and exercises 9 ... 367
CHAPTER 10 Joint modelling of mean and dispersion ... 372
10.1 Introduction ... 372
10.2 Model specification ... 373
10.3 Interaction between mean and dispersion effects ... 374
10.4 Extended quasi-likelihood as a criterion ... 375
10.5 Adjustments of the estimating equations ... 376
10.5.1 Adjustment for kurtosis ... 376
10.5.2 Adjustment for degrees of freedom ... 377
10.6 Joint optimum estimating equations ... 379
10.7 Example: the production of leaf-springs for trucks ... 380
10.8 Bibliographic notes ... 385
10.9 Further results and exercises 10 ... 386
CHAPTER 11 Models with additional non-linear parameters ... 387
11.1 Introduction ... 387
11.2 Parameters in the variance function ... 388
11.3 Parameters in the link function ... 390
11.3.1 One link parameter ... 390
11.3.2 More than one link parameter ... 392
11.3.3 Transformation of data vs transformation of fitted values ... 393
11.4 Non-linear parameters in the covariates ... 394
11.5 Examples ... 396
11.5.1 The effects of fertilizers on coastal Bermuda grass ... 396
11.5.2 Assay of an insecticide with a synergist ... 399
11.5.3 Mixtures of drugs ... 401
11.6 Bibliographic notes ... 404
11.7 Further results and exercises 11 ... 404
CHAPTER 12 Model checking ... 406
12.1 Introduction ... 406
12.2 Techniques in model checking ... 407
12.3 Score tests for extra parameters ... 408
12.4 Smoothing as an aid to informal checks ... 409
12.5 The raw materials of model checking ... 411
12.6 Checks for systematic departure from model ... 413
12.6.1 Informal checks using residuals ... 413
12.6.2 Checking the variance function ... 415
12.6.3 Checking the link function ... 416
12.6.4 Checking the scales of covariates ... 416
12.6.5 Checks for compound systematic discrepancies ... 418
12.7 Checks for isolated departures from the model ... 418
12.7.1 Measure of leverage ... 420
12.7.2 Measure of consistency ... 421
12.7.3 Measure of influence ... 421
12.7.4 Informal assessment of extreme values ... 422
12.7.5 Extreme points and checks for systematic discrepancies ... 423
12.8 Examples ... 424
12.8.1 Damaged carrots in an insecticide experiment ... 424
12.8.2 Minitab tree data ... 425
12.8.3 Insurance claims (continued) ... 428
12.9 A strategy for model checking? ... 429
12.10 Bibliographic notes ... 430
12.11 Further results and exercises 12 ... 431
CHAPTER 13 Models for survival data ... 434
13.1 Introduction ... 434
13.1.1 Survival functions and hazard functions ... 434
13.2 Proportional-hazards models ... 436
13.3 Estimation with a specified survival distribution ... 437
13.3.1 The exponential distribution ... 438
13.3.2 The Weibull distribution ... 438
13.3.3 The extreme-value distribution ... 439
13.4 Example: remission times for leukemia ... 440
13.5 Cox's proportional-hazards model ... 441
13.5.1 Partial likelihood ... 441
13.5.2 The treatment of ties ... 442
13.5.3 Numerical methods ... 444
13.6 Bibliographic notes ... 445
13.7 Further results and exercises 13 ... 445
CHAPTER 14 Components of dispersion ... 447
14.1 Introduction ... 447
14.2 Linear models ... 448
14.3 Non-linear models ... 449
14.4 Parameter estimation ... 452
14.5 Example: A salamander mating experiment ... 454
14.5.1 Introduction ... 454
14.5.2 Experimental procedure ... 456
14.5.3 A linear logistic model with random effects ... 459
14.5.4 Estimation of the dispersion parameters ... 463
14.6 Bibliographic notes ... 465
14.7 Further results and exercises 14 ... 467
CHAPTER 15 Further topics ... 470
15.1 Introduction ... 470
15.2 Bias adjustment ... 470
15.2.1 Models with canonical link ... 470
15.2.2 Non-canonical models ... 472
15.2.3 Example: Lizard data (continued) ... 473
15.3 Computation of Bartlett adjustments ... 474
15.3.1 General theory ... 474
15.3.2 Computation of the adjustment ... 475
15.3.3 Example: exponential regression model ... 478
15.4 Generalized additive models ... 480
15.4.1 Algorithms for fitting ... 480
15.4.2 Smoothing methods ... 481
15.4.3 Conclusions ... 482
15.5 Bibliographic notes ... 482
15.6 Further results and exercises 15 ... 482
APPENDIX A Elementary likelihood theory ... 484
Scalar parameter ... 484
Vector parameter ... 487
APPENDIX B Edgeworth series ... 489
APPENDIX C Likelihood-ratio statistics ... 491
References ... 494
Index of data sets ... 515
Author index ... 516
Subject index ... 521