The first half of the book is aimed at quantitative research workers in biology, medicine, ecology and genetics. The book as a whole is aimed at graduate students in statistics, biostatistics, and other quantitative disciplines. Ten detailed examples show how the author approaches real-world statistical problems in a principled way that allows for adequate compromise and flexibility. The need to accommodate correlations associated with space, time and other relationships is a recurring theme, so variance-components models feature prominently. Statistical pitfalls are illustrated via examples taken from the recent scientific literature. Chapter 11 sets the scene, not just for the second half of the book, but for the book as a whole. It begins by defining fundamental concepts such as baseline, observational unit, experimental unit, covariates and relationships, randomization, treatment assignment, and the role that these play in model formulation. Compatibility of the model with the randomization scheme is crucial. The effect of treatment is invariably modelled as a group action on probability distributions. Technical matters connected with space-time covariance functions, residual likelihood, likelihood ratios, and transformations are discussed in later chapters.
Author(s): Peter McCullagh
Series: Springer Series in Statistics
Publisher: Springer
Year: 2023
Language: English
Pages: 414
City: Cham
Preface
Goals
Computation
Organization
Acknowledgments
Contents
1 Rat Surgery
1.1 Healing of Surgical Wounds
1.2 An Elementary Analysis
1.3 Two Incorrect Analyses
1.4 Model Formulae
1.5 A More Appropriate Formal Analysis
1.6 Further Issues
1.6.1 Exclusions
1.6.2 Missing Components
1.6.3 Back-Transformation
1.7 Summary of Statistical Concepts
1.8 Exercises
2 Chain Saws
2.1 Efficiency of Chain Saws
2.2 Covariate and Treatment Factors
2.3 Goals of Statistical Analysis
2.4 Formal Models
2.5 REML and Likelihood Ratios
2.6 Summary of Conclusions
2.7 Exercises
3 Fruit Flies
3.1 Diet and Mating Preferences
3.2 Initial Analyses
3.2.1 Assortative Mating
3.2.2 Initial Questions and Exercises
3.3 Refractory Effects
3.3.1 More Specific Mating Counts
3.3.2 Follow-Up Analyses
3.3.3 Lexis Dispersion
3.3.4 Is Under-Dispersion Possible?
3.3.5 Independence
3.3.6 Acknowledgement
3.4 Technical Points
3.4.1 Hypergeometric Simulation by Random Matching
3.4.2 Pearson's Statistic
3.5 Further Drosophila Project
3.6 Exercises
4 Growth Curves
4.1 Plant Growth: Data Description
4.2 Growth Curve Models
4.3 Technical Points
4.3.1 Non-linear Model with Variance Components
4.3.2 Fitted Versus Predicted Values
4.4 Modelling Strategies
4.5 Miscellaneous R Functions
4.6 Exercises
5 Louse Evolution
5.1 Evolution of Lice on Captive Pigeons
5.1.1 Background
5.1.2 Experimental Design
5.1.3 Deconstruction of the Experimental Design
5.2 Data Analysis
5.2.1 Role of Tables and Graphs
5.2.2 Trends in Mean Squares
5.2.3 Initial Values and Factorial Subspaces
5.2.4 A Simple Variance-Components Model
5.2.5 Conformity with Randomization
5.3 Critique of Published Claims
5.4 Further Remarks
5.4.1 Role of Louse Sex
5.4.2 Persistence of Initial Patterns
5.4.3 Observational Units
5.5 Follow-Up
5.5.1 New Design Information
5.5.2 Modifications to Analyses
5.5.3 Further Remarks
5.6 Exercises
6 Time Series I
6.1 A Meteorological Temperature Series
6.2 Seasonal Cycles
6.2.1 Means and Variances
6.2.2 Skewness and Kurtosis
6.3 Annual Statistics
6.3.1 Means and Variances
6.3.2 Variance of Block Averages
6.3.3 Variogram at Short and Long Lags
6.4 Stochastic Models for the Seasonal Cycle
6.4.1 Structure of Observational Units
6.4.2 Seasonal Structure
6.4.3 Stationary Periodic Processes
6.5 Estimation of Secular Trend
6.5.1 Gaussian Estimation and Prediction
6.5.2 Application to Trend Estimation
6.5.3 Matérn Models
6.5.4 Statistical Tests and Likelihood Ratios
6.5.5 Rough Paths Versus Smooth Paths
6.5.6 Smooth Versus Ultra-Smooth Paths
6.6 Exercises
7 Time Series II
7.1 Frequency-Domain Analyses
7.1.1 Fourier Transformation
7.1.2 Anova Decomposition by Frequency
7.2 Temperature Spectrum
7.2.1 Spectral Plots
7.2.2 A Parametric Spectral Model
7.3 Stationary Temporal Processes
7.3.1 Stationarity
7.3.2 Visualization of Trajectories
7.3.3 Whittle Likelihood
7.4 Exercises
8 Out of Africa
8.1 Linguistic Diversity
8.2 Phoneme Inventory
8.3 Distances
8.4 Maps and Scatterplots
8.5 Point Estimates and Confidence Regions
8.5.1 Simple Version
Shortcomings
8.5.2 Accommodating Correlations
Three Points of Clarification
Shortcomings
8.6 Matters for Further Consideration
8.6.1 Phoneme Inventory as Response
8.6.2 Vowels, Consonants and Tones
8.6.3 Granularity
8.7 Follow-Up Project
8.7.1 Extended Data Frame
8.7.2 An Elementary Misconception
8.8 Exercises
9 Environmental Projects
9.1 Effects of Atmospheric Warming
9.1.1 The Experiment
9.1.2 The Data
9.1.3 Exercises
9.2 The Plight of the Bumblebee
9.2.1 Introduction
9.2.2 Risk of Infection
9.2.3 Mixed Models
9.2.4 Exchangeability
9.2.5 Role of GLMs and GLMMs
9.3 Two Further Projects
9.4 Exercises
10 Fulmar Fitness
10.1 The Eynhallow Colony
10.1.1 Background
10.1.2 The Eynhallow Breeding Record
10.1.3 The Breeding Sequence
10.1.4 Averages for Cohorts
10.1.5 Averages for Disjoint Subsets
10.1.6 Resolution of a Paradox
10.2 Formal Models
10.2.1 A Linear Gaussian Model
10.2.2 Prediction
10.2.3 Model Adequacy
10.3 Mark-Recapture Designs
10.4 Further References
10.5 Exercises
11 Basic Concepts
11.1 Stochastic Processes
11.1.1 Process
11.1.2 Probability
11.1.3 Self-consistency
11.1.4 Statistical Model
11.2 Samples
11.2.1 Baseline
11.2.2 Observational Unit
11.2.3 Population
11.2.4 Biological Populations
11.2.5 Samples and Sub-samples
11.2.6 Illustrations
11.3 Variables
11.3.1 Ordinary Variables
Quantitative Variable
Qualitative Variable
Response
Covariate
Treatment
External Variable
11.3.2 Relationship
Block Factor
11.3.3 External Variable
11.4 Comparative Studies
11.4.1 Randomization
11.4.2 Experimental Unit
11.4.3 Covariate and Treatment Effects
11.4.4 Additivity
11.4.5 Design
11.4.6 Replication
11.4.7 Independence
11.4.8 Interference
11.4.9 State Space
11.4.10 State-Space Evolution
11.4.11 Longitudinal Study
11.4.12 Cemetery State
11.5 Non-comparative Studies
11.5.1 Examples
11.5.2 Stratified Population
11.5.3 Heterogeneity
11.5.4 Random Sample
11.5.5 Stratified Random Sample
11.5.6 Accessibility
11.5.7 Population Averages
11.5.8 Target of Estimation I
11.5.9 Inverse Probability Weighting
11.5.10 Target of Estimation II
11.6 Interpretations of Variability
11.6.1 A Tale of Two Variances
11.6.2 Which Variance Is Appropriate?
11.7 Exercises
12 Principles
12.1 Sampling Consistency
12.2 Adequacy for the Application
12.3 Likelihood Principle
12.4 Attitudes
12.5 Exercises
13 Initial Values
13.1 Randomization Protocols
13.2 Four Gaussian Models
13.2.1 Distribution and Likelihood
13.2.2 Numerical Comparison of Estimates
13.2.3 Initial Values Versus Covariates
13.2.4 Initial Values in an Observational Study
13.3 Exercises
14 Probability Distributions
14.1 Exchangeable Processes
14.1.1 Unconditional Exchangeability
14.1.2 Regression Processes
14.1.3 Block Exchangeability
14.1.4 Stationarity
14.1.5 Exchangeability
14.1.6 Axiomatic Point
14.1.7 Block Randomization
14.2 Families with Independent Components
14.2.1 Parametric Models
14.2.2 IID Model I
14.2.3 IID Model II
14.3 Non-i.d. Models
14.3.1 Classification Factor
14.3.2 Treatment
14.3.3 Classification Factor Plus Treatment
14.3.4 Quantitative Covariate Plus Treatment
14.3.5 Random Coefficient Models
14.4 Examples of Treatment Effects
14.4.1 Simple Gaussian Model Without Interaction
14.4.2 Additive Interaction
14.4.3 Survival Models
Hazard Multiplication
Temporal Dilation
Non-constant Hazard Multiplication
Classification Factor Plus Treatment
14.5 Incomplete Processes
14.5.1 Gosset Process
14.5.2 Factual and Counterfactual Processes
14.5.3 Limitations of Incomplete Processes
14.6 Exercises
15 Gaussian Distributions
15.1 Real Gaussian Distribution
15.1.1 Density and Moments
15.1.2 Gaussian Distribution on ℝⁿ
15.2 Complex Gaussian Distribution
15.2.1 One-Dimensional Distribution
15.2.2 Gaussian Distribution on ℂⁿ
15.2.3 Moments
15.3 Gaussian Hilbert Space
15.3.1 Euclidean Structure
15.3.2 Cautionary Remarks
15.3.3 Projections
Specification by Image
Specification by Kernel
Self-adjointness Identity
Mixed Products
Trace and Rank
Rank Degeneracy
15.3.4 Dual Space of Linear Combinations
15.4 Statistical Interpretations
15.4.1 Canonical Norm
15.4.2 Independence
Cochran's Theorem
15.4.3 Prediction and Conditional Expectation
Partitioned Matrix Representation
Example: Exchangeable Gaussian Process
15.4.4 Eddington's Formula
Scalar Signal Estimation
Isotropic Vector Signal Estimation
Spectral Moments for Matrix Reconstruction
15.4.5 Linear Regression
15.4.6 Linear Regression and Prediction
Notation for Component-Wise Transformation
Fiducial Prediction
15.5 Additivity
15.5.1 1DOFNA Algorithm
15.5.2 1DOFNA Theory
15.5.3 Scope and Rationale
15.6 Exercises
16 Space-Time Processes
16.1 Gaussian Processes
16.2 Stationarity and Isotropy
16.2.1 Definitions
16.2.2 Stationarity on Increments
16.2.3 Stationary Process on ℤ (mod k)
16.3 Stationary Gaussian Time Series
16.3.1 Spectral Representation
16.3.2 Matérn Class
16.4 Stationary Spatial Process
16.4.1 Spectral Decomposition
16.4.2 Matérn Spatial Class
Spectral Convolution
Frequency Translation
Decomposition of Spectral Measure
Domain Restriction
16.4.3 Illustration by Simulation
16.5 Covariance Products
16.5.1 Hadamard Product
16.5.2 Separable Products and Tensor Products
16.6 Real Spatio-Temporal Process
16.6.1 Covariance Products
16.6.2 Examples of Covariance Products
Patterned Covariance Matrices
Complex Moments
Complex Covariance Product
A 3D Real Process
16.6.3 Travelling Wave
16.6.4 Perturbation Theory
16.7 Hydrodynamic Processes
16.7.1 Frame of Reference
16.7.2 Rotation and Group Action
16.7.3 Action on Matrices
16.7.4 Borrowed Products
16.7.5 Hydrodynamic Symmetry
16.8 Summer Cloud Cover in Illinois
16.9 More on Gaussian Processes
16.9.1 White Noise
16.9.2 Limit Processes
Existence of a Limit Process
Existence of a Limit Distribution
Limit of Conditional Distributions
Conditional Distributions for the Limit Process
Limit Process as a Markov Kernel
16.10 Exercises
17 Likelihood
17.1 Introduction
17.1.1 Non-Bayesian Model
17.1.2 Bayesian Resolution
17.2 Likelihood Function
17.2.1 Definition
17.2.2 Bartlett Identities
17.2.3 Implications for Estimation
17.2.4 Likelihood-Ratio Statistic I
17.2.5 Profile Likelihood
17.2.6 Two Worked Examples
Example 1: Treatment Effect Estimation
Example 2: Inference for the LD90
17.3 Generalized Linear Models
17.4 Variance-Components Models
17.5 Mixture Models
17.5.1 Two-Component Mixtures
17.5.2 Likelihood-Ratio Statistic
17.5.3 Sparse Signal Detection
17.6 Inferential Compromises
17.7 Exercises
18 Residual Likelihood
18.1 Background
18.2 Simple Linear Regression
18.3 The REML Likelihood
18.3.1 Projections
18.3.2 Determinants
18.3.3 Marginal Likelihood with Arbitrary Kernel
18.3.4 Likelihood Ratios
18.4 Computation
18.4.1 Software Options
18.4.2 Likelihood-Ratios
18.4.3 Testing for Interaction
18.4.4 Singular Models
18.5 Exercises
19 Response Transformation
19.1 Likelihood for Gaussian Models
19.2 Box-Cox Transformation
19.2.1 Power Transformation
19.2.2 Re-scaled Power Transformation
19.2.3 Worked Example
19.2.4 Transformation and Residual Likelihood
19.3 Quantile-Matching Transformation
19.4 Exercises
20 Presentations and Reports
20.1 Coaching Tips I
20.2 Coaching Tips II
20.3 Exercises
21 Q & A
21.1 Scientific Investigations
21.1.1 Observational Unit
21.1.2 Clinical Trials
21.1.3 Agricultural Field Trials
21.1.4 Covariates
21.1.5 Matched Design
21.1.6 The Effect of Treatment
References
Index