Introduction to Statistical Modelling and Inference

The complexity of large-scale data sets (“Big Data”) has stimulated the development of advanced
computational methods for analysing them. These methods are of two kinds: model-based methods use
probability models together with likelihood and Bayesian theory, while model-free methods require
no probability model, likelihood or Bayesian theory. The two approaches rest on different
philosophical principles of probability, espoused by the famous statisticians Ronald Fisher and
Jerzy Neyman.
Introduction to Statistical Modelling and Inference covers simple experimental and survey designs,
and probability models up to and including generalised linear (regression) models and some
extensions of these, including finite mixtures. A wide range of examples from different application
fields is also discussed and analysed. No special software is used, beyond that needed for maximum
likelihood analysis of generalised linear models. Students are expected to have a basic
mathematical background in algebra, coordinate geometry and calculus.
Features
• Probability models are developed from the shape of the sample empirical cumulative distribution
function (cdf) or a transformation of it.
• Bounds for the value of the population cumulative distribution function are obtained from the
Beta distribution at each point of the empirical cdf (the first sketch after this list illustrates this).
• Bayes’s theorem is developed from the properties of the screening test for a rare condition
(a worked example follows the list).
• The multinomial distribution provides an always-true model for any randomly sampled data.
• The model-free bootstrap method for finding the precision of a sample estimate has a model-based
parallel – the Bayesian bootstrap – based on the always-true multinomial distribution (both
bootstraps are sketched below).
• The Bayesian posterior distributions of model parameters can be obtained from the maximum
likelihood analysis of the model (the last sketch below shows one standard route).
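
The following sketches are illustrative, not code from the book (which prescribes no special
software). First, the pointwise Beta bounds for the population cdf: at the i-th order statistic of
a sample of size n, the population cdf value F(x_(i)) has a Beta(i, n - i + 1) distribution, so
Beta quantiles give exact pointwise bounds at each point of the empirical cdf. A minimal Python
sketch, assuming NumPy and SciPy are available:

    import numpy as np
    from scipy.stats import beta

    def ecdf_beta_bounds(x, level=0.95):
        # At the i-th order statistic x_(i), F(x_(i)) ~ Beta(i, n - i + 1),
        # giving exact pointwise bounds for the population cdf F.
        x = np.sort(np.asarray(x, dtype=float))
        n = len(x)
        i = np.arange(1, n + 1)                  # ranks 1..n
        a = (1.0 - level) / 2.0
        lower = beta.ppf(a, i, n - i + 1)
        upper = beta.ppf(1.0 - a, i, n - i + 1)
        return x, i / n, lower, upper            # points, ecdf values, bounds

    # Example: 95% pointwise bounds around the ecdf of a small sample.
    rng = np.random.default_rng(0)
    pts, ecdf, lower, upper = ecdf_beta_bounds(rng.exponential(size=20))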
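
Second, the screening-test route to Bayes’s theorem. With prevalence P(condition), sensitivity
P(+ | condition) and specificity P(- | no condition), Bayes’s theorem gives the probability of
the condition given a positive test. The numbers below are hypothetical, chosen only to show why
most positives for a rare condition are false:

    def positive_predictive_value(prevalence, sensitivity, specificity):
        # Bayes's theorem: P(condition | +) = P(+ | condition) P(condition) / P(+)
        true_pos = prevalence * sensitivity
        false_pos = (1.0 - prevalence) * (1.0 - specificity)
        return true_pos / (true_pos + false_pos)

    # Hypothetical: 1-in-1000 prevalence, 99% sensitivity, 95% specificity.
    print(positive_predictive_value(0.001, 0.99, 0.95))  # about 0.019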
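
Third, the two bootstraps. The frequentist bootstrap resamples the observed values with
replacement; the Bayesian bootstrap instead draws Dirichlet(1, ..., 1) weights over the observed
values, the posterior weighting implied by the always-true multinomial model. A sketch for the
precision of a sample mean (the gamma sample is arbitrary illustrative data):

    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.gamma(shape=2.0, size=50)        # arbitrary observed sample
    B = 4000

    # Frequentist bootstrap: resample the data with replacement.
    freq_means = np.array([rng.choice(x, size=x.size, replace=True).mean()
                           for _ in range(B)])

    # Bayesian bootstrap: Dirichlet(1,...,1) weights on the observed values.
    w = rng.dirichlet(np.ones(x.size), size=B)
    bayes_means = w @ x

    print(freq_means.std(), bayes_means.std())  # two similar precision estimates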
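
Finally, posterior distributions from maximum likelihood output. One standard large-sample route
(whether it matches the book’s exact construction is an assumption here) treats the posterior as
approximately Gaussian, centred at the MLE with the MLE’s covariance matrix; statsmodels is used
below purely for the ML fit:

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    z = rng.normal(size=200)
    X = sm.add_constant(z)                        # intercept + one covariate
    y = rng.binomial(1, 1.0 / (1.0 + np.exp(-(0.5 + z))))

    fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()  # ML fit

    # Approximate posterior under a flat prior: N(MLE, cov of the MLE).
    draws = rng.multivariate_normal(fit.params, fit.cov_params(), size=5000)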

This book is aimed at students in a wide range of disciplines, including Data Science. It is
based on the model-based theory, used widely by scientists in many fields, and compares it, in less
detail, with the model-free theory, popular in computer science, machine learning and official
survey analysis. The presentation of the model-based theory is accelerated by recent advances
in Bayesian analysis.

Author(s): Murray Aitkin
Publisher: CRC Press/Chapman & Hall
Year: 2022

Language: English
Pages: 390
City: Boca Raton

Cover
Half Title
Title Page
Copyright Page
Table of Contents
Preface
Chapter 1 Introduction
1.1 What is Statistical Modelling?
1.2 What is Statistical Analysis?
1.3 What is Statistical Inference?
Chapter 2 What is (or are) Big Data?
Chapter 3 Data and research studies
3.1 Lifetimes of radio transceivers
3.2 Clustering of V1 missile hits in South London
3.3 Court case on vaccination risk
3.4 Clinical trial of Depepsen for the treatment of duodenal ulcers
3.5 Effectiveness of treatments for respiratory distress in newborn babies
3.6 Vitamin K
3.7 Species counts
3.8 Toxicology in small animal experiments
3.9 Incidence of Down’s syndrome in four regions
3.10 Fish species in lakes
3.11 Absence from school
3.12 Hostility in husbands of suicide attempters
3.13 Tolerance of racial intermarriage
3.14 Hospital bed use
3.15 Dugong growth
3.16 Simulated motorcycle collision
3.17 Global warming
3.18 Social group membership
Chapter 4 The StatLab database
4.1 Types of variables
4.2 StatLab population questions
Chapter 5 Sample surveys – should we believe what we read?
5.1 Women and Love
5.2 Would you have children?
5.3 Representative sampling
5.4 Bias in the Newsday sample
5.5 Bias in the Women and Love sample
Chapter 6 Probability
6.1 Relative frequency
6.2 Degree of belief
6.3 StatLab dice sampling
6.4 Computer sampling
6.4.1 Natural random processes
6.5 Probability for sampling
6.5.1 Extrasensory perception
6.5.2 Representative sampling
6.6 Probability axioms
6.6.1 Dice example
6.6.2 Coin tossing
6.7 Screening tests and Bayes’s theorem
6.8 The misuse of probability in the Sally Clark case
6.9 Random variables and their probability distributions
6.9.1 Definitions
6.10 Sums of independent random variables
Chapter 7 Statistical inference I – discrete distributions
7.1 Evidence-based policy
7.2 The basis of statistical inference
7.3 The survey sampling approach
7.4 Model-based inference theories
7.5 The likelihood function
7.6 Binomial distribution
7.6.1 The binomial likelihood function
7.6.1.1 Sufficient and ancillary statistics
7.6.1.2 The maximum likelihood estimate (MLE)
7.7 Frequentist theory
7.7.1 Parameter transformations
7.7.2 Ambiguity of notation
7.8 Bayesian theory
7.8.1 Bayes’s theorem
7.8.2 Summaries of the posterior distribution
7.8.3 Conjugate prior distributions
7.8.4 Improving frequentist interval coverage
7.8.5 The bootstrap
7.8.6 Non-informative prior rules
7.8.7 Frequentist objections to flat priors
7.8.8 General prior specifications
7.8.9 Are parameters really just random variables?
7.9 Inferences from posterior sampling
7.9.1 The precision of posterior draws
7.10 Sample design
7.11 Parameter transformations
7.12 The Poisson distribution
7.12.1 Poisson likelihood and ML
7.12.2 Bayesian inference
7.12.3 Prediction of a new Poisson value
7.12.4 Side effect risk
7.12.4.1 Frequentist analysis
7.12.4.2 Bayesian analysis
7.12.5 A two-parameter binomial distribution
7.12.5.1 Frequentist analysis
7.12.5.2 Bayesian analysis
7.13 Categorical variables
7.13.1 The multinomial distribution
7.14 Maximum likelihood
7.15 Bayesian analysis
7.15.1 Posterior sampling
7.15.2 Sampling without replacement
Chapter 8 Comparison of binomials
8.1 Definition
8.2 Example – RCT of Depepsen for the treatment of duodenal ulcers
8.2.1 Frequentist analysis: confidence interval
8.2.2 Bayesian analysis: credible interval
8.3 Monte Carlo simulation
8.4 RCT continued
8.5 Bayesian hypothesis testing/model comparison
8.5.1 The null and alternative hypotheses, and the two models
8.6 Other measures of treatment difference
8.6.1 Frequentist analysis: hypothesis testing
8.6.2 How are the hypothetical samples to be drawn?
8.6.3 Conditional testing
8.7 The ECMO trials
8.7.1 The first trial
8.7.2 Frequentist analysis
8.7.3 The likelihood
8.7.3.1 Bayesian Analysis
8.7.4 The second ECMO study
Chapter 9 Data visualisation
9.1 The histogram
9.2 The empirical mass and cumulative distribution functions
9.3 Probability models for continuous variables
Chapter 10 Statistical inference II – the continuous exponential, Gaussian and uniform distributions
10.1 The exponential distribution
10.2 The exponential likelihood
10.3 Frequentist theory
10.3.1 Parameter transformations
10.3.2 Frequentist asymptotics
10.4 Bayesian theory
10.4.1 Conjugate priors
10.5 The Gaussian distribution
10.6 The Gaussian likelihood function
10.7 Frequentist inference
10.8 Bayesian inference
10.8.1 Prior arguments
10.9 Hypothesis testing
10.10 Frequentist hypothesis testing
10.10.1 µ1 vs µ2
10.10.2 µ0 vs µ ≠ µ0
10.11 Bayesian hypothesis testing
10.11.1 µ1 vs µ2
10.11.2 µ0 vs µ ≠ µ0
10.11.2.1 Use the credible interval
10.11.2.2 Use the likelihood ratio
10.11.2.3 The integrated likelihood
10.12 Pivotal functions
10.13 Conjugate priors
10.14 The uniform distribution
10.14.1 The location-shifted uniform distribution
Chapter 11 Statistical Inference III – two-parameter continuous distributions
11.1 The Gaussian distribution
11.2 Frequentist analysis
11.3 Bayesian analysis
11.3.1 Inference for σ
11.3.2 Inference for μ
11.3.2.1 Simulation marginalisation
11.3.3 Parametric functions
11.3.4 Prediction of a new observation
11.4 The lognormal distribution
11.4.1 The lognormal density
11.5 The Weibull distribution
11.5.1 The Weibull likelihood
11.5.2 Frequentist analysis
11.5.3 Bayesian analysis
11.5.4 The extreme value distribution
11.5.5 Median Rank Regression (MRR)
11.5.6 Censoring
11.6 The gamma distribution
11.7 The gamma likelihood
11.7.1 Frequentist analysis
11.7.2 Bayesian analysis
Chapter 12 Model assessment
12.1 Gaussian model assessment
12.2 Lognormal model assessment
12.3 Exponential model assessment
12.4 Weibull model assessment
12.5 Gamma model assessment
Chapter 13 The multinomial distribution
13.1 The multinomial likelihood
13.2 Frequentist analysis
13.3 Bayesian analysis
13.4 Criticisms of the Haldane prior
13.4.1 The Dirichlet process prior
13.4.2 Posterior sampling
13.5 Inference for multinomial quantiles
13.6 Dirichlet posterior weighting
13.7 The frequentist bootstrap
13.7.1 Two-category sample
13.8 Stratified sampling and weighting
Chapter 14 Model comparison and model averaging
14.1 Comparison of two fully specified models
14.2 General model comparison
14.2.1 Known parameters
14.2.2 Unknown parameters
14.3 Posterior distribution of the likelihood
14.4 The deviance
14.5 Asymptotic distribution of the deviance
14.6 Nested models
14.7 Model choice and model averaging
Chapter 15 Gaussian linear regression models
15.1 Simple linear regression
15.1.1 Vitamin K
15.2 Model assessment through residual examination
15.3 Likelihood for the simple linear regression model
15.4 Maximum likelihood
15.4.1 Vitamin K example
15.5 Bayesian and frequentist inferences
15.6 Model-robust analysis
15.6.1 The robust variance estimate
15.7 Correlation and prediction
15.7.1 Correlation
15.7.2 Prediction
15.7.3 Example
15.7.4 Prediction as a model assessment tool
15.8 Probability model assessment
15.9 “Dummy variable” regression
15.10 Two-variable models
15.11 Model assumptions
15.12 The p-variable linear model
15.13 The Gaussian multiple regression likelihood
15.13.1 Absence from school
15.14 Interactions
15.14.1 ANOVA, ANCOVA and MR
15.14.1.1 ANOVA
15.14.1.2 Backward elimination
15.14.1.3 ANCOVA
15.15 Ridge regression, the Lasso and the “elastic net”
15.16 Modelling boy birthweights
15.17 Modelling girl intelligence at age ten and family income
15.18 Modelling of the hostility data
15.18.1 Data structure
15.18.1.1 Replication and variance heterogeneity
15.19 Principal component regression
Chapter 16 Incomplete data and their analysis with the EM and DA algorithms
16.1 The general incomplete data model
16.2 The EM algorithm
16.3 Missingness
16.4 Lost data
16.5 Censoring in the exponential distribution
16.6 Randomly missing Gaussian observations
16.7 Missing responses and/or covariates in simple and multiple regression
16.7.1 Missing values in the single covariate in simple linear regression
16.7.2 Modelling the covariate distribution – Gaussian
16.7.3 Modelling the covariate distribution – multinomial
16.7.4 Multiple covariates missing
16.8 Mixture distributions
16.8.1 The two-component Gaussian mixture model
16.9 Bayesian analysis and the Data Augmentation algorithm
16.9.1 The galaxy recession velocity study
16.9.2 The Dirichlet process prior
Chapter 17 Generalised linear models (GLMs)
17.1 The exponential family
17.2 Maximum likelihood
17.3 The GLM algorithm
17.4 Bayesian package development
17.5 Bayesian analysis from ML
17.6 Binary response models
17.6.1 Probit or logit analysis?
17.6.2 Other binomial link functions and their origins
17.6.3 The Racine data
17.6.4 Maximum likelihood
17.6.5 Bayesian analysis
17.6.6 The beetle data
17.7 The menarche data
17.7.1 Down’s syndrome analysis
17.7.1.1 BC analysis
17.7.1.2 Four regions analysis
17.7.2 The Finney vasoconstriction data
17.7.3 Cross-classifications with binary data
17.7.3.1 Region 1
17.7.3.2 Region 2
17.7.3.3 Region 3
17.7.3.4 Region 4
17.7.3.5 Observed and (fitted) proportions, all regions
17.8 Poisson regression – fish species frequency
17.8.1 Gaussian approximation
17.8.2 The Bayesian bootstrap and posterior weighting
17.8.3 Omitted variables, overdispersion and the negative binomial model
17.8.4 Conjugate W
17.9 Gamma regression
Chapter 18 Extensions of GLMs
18.1 Double GLMs
18.2 Maximum likelihood
18.3 Bayesian analysis
18.3.1 Hospital beds and patients
18.3.2 The absence from school data
18.3.3 The fish species data
18.3.4 Sea temperatures
18.3.5 Model assessment
18.4 Segmented or broken-stick regressions
18.4.1 Nile flood volumes
18.4.2 Modelling the break
18.4.3 Down’s syndrome
18.5 Heterogeneous regressions
18.6 Highly non-linear functions
18.7 Neural networks
18.8 Social networks and social group membership
18.8.1 History of network structures
18.8.2 The Natchez women network
18.8.3 Statistical models
18.8.3.1 The “null” random graph model
18.8.3.2 The “saturated” model
18.8.3.3 The Rasch model
18.8.4 The Exponential Random Graph Model (ERGM)
18.8.4.1 The latent class or mixed Rasch model
18.9 The motorcycle data
Chapter 19 Appendix 1 – length-biased sampling
Chapter 20 Appendix 2 – two-component Gaussian mixture
Chapter 21 Appendix 3 – StatLab variables
21.1 Child variables
21.2 Family variables
21.3 Mother variables
21.4 Father variables
Chapter 22 Appendix 4 – a short history of statistics from 1890
22.1 Karl Pearson (1857–1936)
22.2 Ronald Fisher (1890–1962)
22.3 Jerzy Neyman (1894–1981)
22.4 Harold Jeffreys (1891–1989)
Chapter 23 References
Index