This is the only book actuaries need to understand generalized linear models (GLMs) for insurance applications. GLMs are used in the insurance industry to support critical decisions. Until now, no text has introduced GLMs in this context or addressed the problems specific to insurance data. Using insurance data sets, this practical, rigorous book treats GLMs, covers all standard exponential family distributions, extends the methodology to correlated data structures, and discusses recent developments which go beyond the GLM. The issues in the book are specific to insurance data, such as model selection in the presence of large data sets and the handling of varying exposure times. Exercises and data-based practicals help readers to consolidate their skills, with solutions and data sets given on the companion website. Although the book is package-independent, SAS code and output examples feature in an appendix and on the website. In addition, R code and output for all the examples are provided on the website.
Author(s): Piet de Jong, Gillian Z. Heller
Series: International Series on Actuarial Science
Publisher: Cambridge University Press
Year: 2008
Language: English
Pages: 196
Cover
Half-title
Title
Copyright
Contents
Preface
1 Insurance data
1.1 Introduction
1.2 Types of variables
1.3 Data transformations
1.4 Data exploration
1.5 Grouping and runoff triangles
1.6 Assessing distributions
1.7 Data issues and biases
1.8 Data sets used
1.9 Outline of rest of book
2.1 Discrete and continuous random variables
2.2 Bernoulli
2.3 Binomial
2.4 Poisson
2.5 Negative binomial
2.6 Normal
2.7 Chi-square and gamma
2.8 Inverse Gaussian
2.9 Overdispersion
Exercises
3.1 Exponential family
3.2 The variance function
3.4 Standard distributions in the exponential family form
3.5 Fitting probability functions to data
Exercises
4.1 History and terminology of linear modeling
4.3 Simple linear modeling
4.4 Multiple linear modeling
4.5 The classical linear model
4.7 Weighted least squares
4.8 Grouped and ungrouped data
4.9 Transformations to normality and linearity
4.10 Categorical explanatory variables
4.11 Polynomial regression
4.12 Banding continuous explanatory variables
4.14 Collinearity
4.15 Hypothesis testing
4.16 Checks using the residuals
4.17 Checking explanatory variable specifications
4.18 Outliers
4.19 Model selection
5.1 The generalized linear model
5.2 Steps in generalized linear modeling
5.4 Offsets
5.5 Maximum likelihood estimation
5.6 Confidence intervals and prediction
5.7 Assessing fits and the deviance
5.8 Testing the significance of explanatory variables
5.9 Residuals
5.10 Further diagnostic tools
Exercises
6.1 Poisson regression
6.2 Poisson overdispersion and negative binomial regression
6.3 Quasi-likelihood
Exercises
7.1 Binary responses
7.2 Logistic regression
7.3 Application of logistic regression to vehicle insurance
7.4 Correcting for exposure
7.5 Grouped binary data
7.6 Goodness of fit for logistic regression
7.7 Categorical responses with more than two categories
7.8 Ordinal responses
7.9 Nominal responses
Exercises
8.1 Gamma regression
8.2 Inverse Gaussian regression
8.3 Tweedie regression
Exercises
9 Correlated data
9.1 Random effects
9.2 Specification of within-cluster correlation
9.3 Generalized estimating equations
Exercise
10.1 Generalized additive models
10.3 Generalized additive models for location, scale and shape
10.4 Zero-adjusted inverse Gaussian regression
10.5 A mean and dispersion model for total claim size
Exercises
Number of children: log link
Number of children: identity link
Diabetes deaths, categorical age
Diabetes deaths, cubic age
Third party claims
Third party claims
Swedish mortality, polynomial age and year
A1.3 Quasi-likelihood regression
Vehicle insurance: quadratic vehicle value
Vehicle insurance: banded vehicle value
Vehicle insurance: full model, adjusting for exposure
Vehicle insurance: logistic regression on grouped data
Proportional odds model
Partial proportional odds model
A1.6 Nominal regression
Personal injury insurance, no adjustment for quickly settled claims
Personal injury insurance, with adjustment for quickly settled claims
Runoff triangle
A1.8 Inverse Gaussian regression
A1.9 Logistic regression GLMM
A1.10 Logistic regression GEE
A1.11 Logistic regression GAM
A1.12 GAMLSS
A1.13 Zero-adjusted inverse Gaussian regression
Bibliography
Index