Bayesian Modelling of Spatio-Temporal Data with R

Applied sciences, both physical and social, such as atmospheric, biological, climate, demographic, economic, ecological, environmental, oceanic and political, routinely gather large volumes of spatial and spatio-temporal data in order to make wide-ranging inference and prediction. Ideally such inferential tasks should be approached through modelling, which aids in the estimation of uncertainties in all conclusions drawn from such data. Unified Bayesian modelling, implemented through user-friendly software packages, provides a crucial key to unlocking the full power of these methods for solving challenging practical problems.

Key features of the book:

• Accessible, detailed discussion of most aspects of Bayesian methods and computations, with worked examples, numerical illustrations and exercises

• A spatial statistics jargon buster chapter that enables the reader to build up a vocabulary without getting lost in modeling and technicalities

• Computation and modeling illustrations are provided with the help of the dedicated R package bmstdr, allowing the reader to use well-known packages and platforms such as rstan, INLA, spBayes, spTimer, spTDyn, CARBayes and CARBayesST

• Included are R code notes detailing the algorithms used to produce all the tables and figures, with data and code available via an online supplement

• Two dedicated chapters discuss practical examples of spatio-temporal modeling of point referenced and areal unit data

• Throughout, the emphasis is on validating models by splitting data into test and training sets, following the philosophy of machine learning and data science

This book is designed to make spatio-temporal modeling and analysis accessible and understandable to a wide audience of students and researchers, from mathematicians and statisticians to practitioners in the applied sciences. It presents most of the modeling with the help of R commands written in a purposefully developed R package to facilitate spatio-temporal modeling. It does not compromise on rigour, as it presents the underlying theories of Bayesian inference and computation in standalone chapters, which will appeal to those interested in the theoretical details. By avoiding hard core mathematics and calculus, this book aims to be a bridge that closes the statistical knowledge gap among applied scientists.

Author(s): Sujit Kumar Sahu
Series: Chapman & Hall/CRC Interdisciplinary Statistics
Publisher: CRC Press
Year: 2022

Language: English
Pages: 440
City: Boca Raton

Cover
Half Title
Series Page
Title Page
Copyright Page
Dedication
Contents
Introduction
Preface
1. Examples of spatio-temporal data
1.1. Introduction
1.2. Spatio-temporal data types
1.3. Point referenced data sets used in the book
1.3.1. New York air pollution data set
1.3.2. Air pollution data from England and Wales
1.3.3. Air pollution in the eastern US
1.3.4. Hubbard Brook precipitation data
1.3.5. Ocean chlorophyll data
1.3.6. Atlantic ocean temperature and salinity data set
1.4. Areal unit data sets used in the book
1.4.1. Covid-19 mortality data from England
1.4.2. Childhood vaccination coverage in Kenya
1.4.3. Cancer rates in the United States
1.4.4. Hospitalization data from England
1.4.5. Child poverty in London
1.5. Conclusion
1.6. Exercises
2. Jargon of spatial and spatio-temporal modeling
2.1. Introduction
2.2. Stochastic processes
2.3. Stationarity
2.4. Variogram and covariogram
2.5. Isotropy
2.6. Matérn covariance function
2.7. Gaussian processes (GP) GP(0, C(·|ψ))
2.8. Space-time covariance functions
2.9. Kriging or optimal spatial prediction
2.10. Autocorrelation and partial autocorrelation
2.11. Measures of spatial association for areal data
2.12. Internal and external standardization for areal data
2.13. Spatial smoothers
2.14. CAR models
2.15. Point processes
2.16. Conclusion
2.17. Exercises
3. Exploratory data analysis methods
3.1. Introduction
3.2. Exploring point reference data
3.2.1. Non-spatial graphical exploration
3.2.2. Exploring spatial variation
3.3. Exploring spatio-temporal point reference data
3.4. Exploring areal Covid-19 case and death data
3.4.1. Calculating the expected numbers of cases and deaths
3.4.2. Graphical displays and covariate information
3.5. Conclusion
3.6. Exercises
4. Bayesian inference methods
4.1. Introduction
4.2. Prior and posterior distributions
4.3. The Bayes theorem for probability
4.4. Bayes theorem for random variables
4.5. Posterior ∝ Likelihood × Prior
4.6. Sequential updating of the posterior distribution
4.7. Normal-Normal example
4.8. Bayes estimators
4.8.1. Posterior mean
4.8.2. Posterior median
4.8.3. Posterior mode
4.9. Credible interval
4.10. Prior Distributions
4.10.1. Conjugate prior distribution
4.10.2. Locally uniform prior distribution
4.10.3. Non-informative prior distribution
4.11. Posterior predictive distribution
4.11.1. Normal-Normal example
4.12. Prior predictive distribution
4.13. Inference for more than one parameter
4.14. Normal example with both parameters unknown
4.15. Model choice
4.15.1. The Bayes factor
4.15.2. Posterior probability of a model
4.15.3. Hypothesis testing
4.16. Criteria-based Bayesian model selection
4.16.1. The DIC
4.16.2. The WAIC
4.16.3. Posterior predictive model choice criteria (PMCC)
4.17. Bayesian model checking
4.17.1. Nuisance parameters
4.18. The pressing need for Bayesian computation
4.19. Conclusion
4.20. Exercises
5. Bayesian computation methods
5.1. Introduction
5.2. Two motivating examples for Bayesian computation
5.3. Monte Carlo integration
5.4. Importance sampling
5.5. Rejection sampling
5.6. Notions of Markov chains for understanding MCMC
5.7. Metropolis-Hastings algorithm
5.8. The Gibbs sampler
5.9. Hamiltonian Monte Carlo
5.10. Integrated nested Laplace approximation (INLA)
5.11. MCMC implementation issues and MCMC output processing
5.11.1. Diagnostics based on visual plots and autocorrelation
5.11.2. How many chains?
5.11.3. Method of batching
5.12. Computing Bayesian model choice criteria
5.12.1. Computing DIC
5.12.2. Computing WAIC
5.12.3. Computing PMCC
5.12.4. Computing the model choice criteria for the New York air pollution data
5.13. Conclusion
5.14. Exercises
6. Bayesian modeling for point referenced spatial data
6.1. Introduction
6.2. Model versus procedure based methods
6.3. Formulating linear models
6.3.1. Data set preparation
6.3.2. Writing down the model formula
6.3.3. Predictive distributions
6.4. Linear model for spatial data
6.4.1. Spatial model fitting using bmstdr
6.5. A spatial model with nugget effect
6.5.1. Marginal model implementation
6.6. Model fitting using software packages
6.6.1. spBayes
6.6.2. R-Stan
6.6.3. R-inla
6.7. Model choice
6.8. Model validation methods
6.8.1. Four most important model validation criteria
6.8.2. K-fold cross-validation
6.8.3. Illustrating the model validation statistics
6.9. Posterior predictive checks
6.10. Conclusion
6.11. Exercises
7. Bayesian modeling for point referenced spatio-temporal data
7.1. Introduction
7.2. Models with spatio-temporal error distribution
7.2.1. Posterior distributions
7.2.2. Predictive distributions
7.2.3. Simplifying the expressions: Σ₁₂H⁻¹ and Σ₁₂H⁻¹Σ₂₁
7.2.4. Estimation of v
7.2.5. Illustration of a spatio-temporal model fitting
7.3. Independent GP model with nugget effect
7.3.1. Full model implementation using spTimer
7.3.2. Marginal model implementation using Stan
7.4. Auto regressive (AR) models
7.4.1. Hierarchical AR Models using spTimer
7.4.2. AR modeling using INLA
7.5. Spatio-temporal dynamic models
7.5.1. A spatially varying dynamic model spTDyn
7.5.2. A dynamic spatio-temporal model using spBayes
7.6. Spatio-temporal models based on Gaussian predictive processes (GPP)
7.7. Performance assessment of all the models
7.8. Conclusion
7.9. Exercises
8. Practical examples of point referenced data modeling
8.1. Introduction
8.2. Estimating annual average air pollution in England and Wales
8.3. Assessing probability of non-compliance in air pollution
8.4. Analyzing precipitation data from the Hubbard Experimental Forest
8.4.1. Exploratory data analysis
8.4.2. Modeling and validation
8.4.3. Predictive inference from model fitting
8.4.3.1. Selecting gauges for possible downsizing
8.4.3.2. Spatial patterns in 3-year rolling average annual precipitation
8.4.3.3. Catchment specific trends in annual precipitation
8.4.3.4. A note on model fitting
8.5. Assessing annual trends in ocean chlorophyll levels
8.6. Modeling temperature data from roaming ocean Argo floats
8.6.1. Predicting an annual average temperature map
8.7. Conclusion
8.8. Exercises
9. Bayesian forecasting for point referenced data
9.1. Introduction
9.2. Exact forecasting method for GP
9.2.1. Example: Hourly ozone levels in the Eastern US
9.3. Forecasting using the models implemented in spTimer
9.3.1. Forecasting using GP models
9.3.2. Forecasting using AR models
9.3.3. Forecasting using the GPP models
9.4. Forecast calibration methods
9.4.1. Theory
9.4.2. Illustrating the calibration plots
9.5. Example comparing GP, AR and GPP models
9.6. Example: Forecasting ozone levels in the Eastern US
9.7. Conclusion
9.8. Exercises
10. Bayesian modeling for areal unit data
10.1. Introduction
10.2. Generalized linear models
10.2.1. Exponential family of distributions
10.2.2. The link function
10.2.3. Offset
10.2.4. The implied likelihood function
10.2.5. Model specification using a GLM
10.3. Example: Bayesian generalized linear model
10.3.1. GLM fitting with binomial distribution
10.3.2. GLM fitting with Poisson distribution
10.3.3. GLM fitting with normal distribution
10.4. Spatial random effects for areal unit data
10.5. Revisited example: Bayesian spatial generalized linear model
10.5.1. Spatial GLM fitting with binomial distribution
10.5.2. Spatial GLM fitting with Poisson distribution
10.5.3. Spatial GLM fitting with normal distribution
10.6. Spatio-temporal random effects for areal unit data
10.6.1. Linear model of trend
10.6.2. Anova model
10.6.3. Separable model
10.6.4. Temporal autoregressive model
10.7. Example: Bayesian spatio-temporal generalized linear model
10.7.1. Spatio-temporal GLM fitting with binomial distribution
10.7.2. Spatio-temporal GLM fitting with Poisson distribution
10.7.3. Examining the model fit
10.7.4. Spatio-temporal GLM fitting with normal distribution
10.8. Using INLA for model fitting and validation
10.9. Conclusion
10.10. Exercises
11. Further examples of areal data modeling
11.1. Introduction
11.2. Assessing childhood vaccination coverage in Kenya
11.3. Assessing trend in cancer rates in the US
11.4. Localized modeling of hospitalization data from England
11.4.1. A localized model
11.4.2. Model fitting results
11.5. Assessing trend in child poverty in London
11.5.1. Adaptive CAR-AR model
11.5.2. Model fitting results
11.6. Conclusion
11.7. Exercises
12. Gaussian processes for data science and other applications
12.1. Introduction
12.2. Learning methods and their Bayesian interpretations
12.2.1. Learning with empirical risk minimization
12.2.2. Learning by complexity penalization
12.2.3. Supervised learning and generalized linear models
12.2.4. Ridge regression, LASSO and elastic net
12.2.5. Regression trees and random forests
12.3. Gaussian Process (GP) prior-based machine learning
12.3.1. Example: predicting house prices
12.4. Use of GP in Bayesian calibration of computer codes
12.5. Conclusion
12.6. Exercises
Appendix A: Statistical densities used in the book
A.1. Continuous
A.2. Discrete
Appendix B: Answers to selected exercises
B.1. Solutions to Exercises in Chapter 4
B.2. Solutions to Exercises in Chapter 5
Bibliography
Glossary
Index