Applied Spatial Statistics and Econometrics: Data Analysis in R

This textbook is a comprehensive introduction to applied spatial data analysis using R. Each chapter walks the reader through a different method, explaining how to interpret the results and what conclusions can be drawn. The author team showcases key topics, including unsupervised learning, causal inference, spatial weight matrices, spatial econometrics, heterogeneity and bootstrapping. It is accompanied by a suite of data and R code on GitHub to help readers practise techniques via replication and exercises. This text will be a valuable resource for advanced students of econometrics, spatial planning and regional science. It will also be suitable for researchers and data scientists working with spatial data.

Author(s): Katarzyna Kopczewska
Edition: 1
Publisher: Routledge
Year: 2021

Language: English
Pages: 621
City: New York
Tags: Spatial Statistics, Google Maps

Cover
Half Title
Series
Title
Copyright
Contents
List of figures
List of tables
List of contributors
Introduction
Statement by the American Statistical Association on statistical significance and p-value – use in the book
Acknowledgements
1 Basic operations in the R software
1.1 About the R software
1.2 The R software interface
1.2.1 R Commander
1.2.2 RStudio
1.3 Using help
1.4 Additional packages
1.5 R language – basic features
1.6 Defining and loading data
1.7 Basic operations on objects
1.8 Basic statistics of the dataset
1.9 Basic visualisations
1.9.1 Scatterplot and line chart
1.9.2 Column chart
1.9.3 Pie chart
1.9.4 Boxplot
1.10 Regression in examples
2 Data, spatial classes and basic graphics
2.1 Loading and basic operations on spatial vector data
2.2 Creating, checking and converting spatial classes
2.3 Selected colour palettes
2.4 Basic contour maps with a colour layer
Scheme 1 – with colorRampPalette() from the grDevices:: package
Scheme 2 – with choropleth() from the GISTools:: package
Scheme 3 – with findInterval() from the base:: package
Scheme 4 – with findColours() from the classInt:: package
Scheme 5 – with spplot() from the sp:: package
2.5 Basic operations and graphs for point data
Scheme 1 – with points() from the graphics:: package – locations only
Scheme 2 – with spplot() from the sp:: package – locations and values
Scheme 3 – with findInterval() from the base:: package – locations, values, different size of symbols
2.6 Basic operations on rasters
2.7 Basic operations on grids
2.8 Spatial geometries
3 Spatial data with Web APIs
3.1 What is an application programming interface (API)?
3.2 Creating background maps with use of an application programming interface
3.3 Ways to visualise spatial data – maps for point and regional data
Scheme 1 – with bubbleMap() from the RgoogleMaps:: package
Scheme 2 – with ggmap() from the ggmap:: package
Scheme 3 – with PlotOnStaticMap() from the RgoogleMaps:: package
Scheme 4 – with GetMap() from the RgoogleMaps:: package and conversion of staticMap into a raster
3.4 Spatial data in vector format – example of the OSM database
3.5 Access to non-spatial internet databases and resources via application programming interface – examples
3.6 Geocoding of data
4 Spatial weights matrix, distance measurement, tessellation, spatial statistics
4.1 Introduction to spatial data analysis
4.2 Spatial weights matrix
4.2.1 General framework for creating spatial weights matrices
4.2.2 Selection of a neighbourhood matrix
4.2.3 Neighbourhood matrices according to the contiguity criterion
4.2.4 Matrix of k nearest neighbours (knn)
4.2.5 Matrix based on distance criterion (neighbours in a radius of d km)
4.2.6 Inverse distance matrix
4.2.7 Summarising and editing spatial weights matrix
4.2.8 Spatial lags and higher-order neighbourhoods
4.2.9 Creating weights matrix based on group membership
4.3 Distance measurement and spatial aggregation
4.4 Tessellation
4.5 Spatial statistics
4.5.1 Global statistics
4.5.1.1 Global Moran’s I statistics
4.5.1.2 Global Geary’s C statistics
4.5.1.3 Join-count statistics
4.5.2 Local spatial autocorrelation statistics
4.5.2.2 Local Moran’s I statistics (local indicator of spatial association)
4.5.2.3 Local Geary’s C statistics
4.5.2.4 Local Getis-Ord Gi statistics
4.5.2.5 Local spatial heteroscedasticity
4.6 Spatial cross-correlations for two variables
4.7 Correlogram
5 Applied spatial econometrics
5.1 Added value from spatial modelling and classes of models
5.2 Basic cross-sectional models
5.2.1 Estimation
5.2.2 Quality assessment of spatial models
5.2.2.1 Information criteria and pseudo-R2 in assessing model fit
5.2.2.2 Test for heteroscedasticity of model residuals
5.2.2.3 Residual autocorrelation tests
5.2.2.4 Lagrange multiplier tests for model type selection
5.2.2.5 Likelihood ratio and Wald tests for model restrictions
5.2.3 Selection of spatial weights matrix and modelling of diffusion strength
5.2.4 Forecasts in spatial models
5.2.5 Causality
5.3 Selected specifications of cross-sectional spatial models
5.3.1 Unidirectional spatial interaction models
5.3.2 Cumulative models
5.3.3 Bootstrapped models for big data
5.3.4 Models for grid data
5.4 Spatial panel models
6 Geographically weighted regression – modelling spatial heterogeneity
6.1 Geographically weighted regression
6.2 Basic estimation of geographically weighted regression model
6.2.1 Estimation of the reference ordinary least squares model
6.2.2 Choosing the optimal bandwidth for a dataset
6.2.3 Local geographically weighted statistics
6.2.4 Geographically weighted regression estimation
6.2.5 Basic diagnostic tests of the geographically weighted regression model
6.2.6 Testing the significance of parameters in geographically weighted regression
6.2.7 Selection of the optimal functional form of the model
6.2.8 Geographically weighted regression with heteroscedastic random error
6.3 The problem of collinearity in geographically weighted regression models
6.3.1 Diagnosing collinearity in geographically weighted regression
6.4 Mixed geographically weighted regression
6.5 Robust regression in the geographically weighted regression model
6.6 Geographically and temporally weighted regression
7 Spatial unsupervised learning
7.1 Clustering of spatial points with k-means, PAM (partitioning around medoids) and CLARA (clustering large applications) algorithms
7.2 Clustering with the density-based spatial clustering of applications with noise algorithm
7.3 Spatial principal component analysis
7.4 Spatial drift
7.5 Spatial hierarchical clustering
7.6 Spatial oblique decision tree
8 Spatial point pattern analysis and spatial interpolation
8.1 Introduction and main definitions
8.1.1 Dataset
8.1.2 Creation of window and point pattern
8.1.3 Marks
8.1.4 Covariates
8.1.5 Duplicated points
8.1.6 Projection and rescaling
8.2 Intensity-based analysis of unmarked point pattern
8.2.1 Quadrat test
8.2.2 Tests with spatial covariates
8.3 Distance-based analysis of the unmarked point pattern
8.3.1 Distance-based measures
8.3.1.1 Ripley’s K function
8.3.1.2 F function
8.3.1.3 G function
8.3.1.4 J function
8.3.1.5 Distance-based complete spatial randomness tests
8.3.2 Monte Carlo tests
8.3.3 Envelopes
8.3.4 Non-graphical tests
8.4 Selection and estimation of a proper model for unmarked point pattern
8.4.1 Theoretical note
8.4.2 Choice of parameters
8.4.3 Estimation and results
8.4.4 Conclusions
8.5 Intensity-based analysis of marked point pattern
8.5.1 Segregation test
8.6 Correlation and spacing analysis of the marked point pattern
8.6.1 Analysis under assumption of stationarity
8.6.1.1 K function variations for multitype pattern
8.6.1.2 Mark connection function
8.6.1.3 Analysis of within- and between-type dependence
8.6.1.4 Randomisation test of components’ independence
8.6.2 Analysis under assumption of non-stationarity
8.6.2.1 Inhomogeneous K function variations for multitype pattern
8.7 Selection and estimation of a proper model for unmarked point pattern
8.7.1 Theoretical note
8.7.2 Choice of optimal radius
8.7.3 Within-industry interaction radius
8.7.4 Between-industry interaction radius
8.7.5 Estimation and results
8.7.6 Model with no between-industry interaction
8.7.7 Model with all possible interactions
8.8 Spatial interpolation methods – kriging
8.8.1 Basic definitions
8.8.2 Description of chosen kriging methods
8.8.3 Data preparation for the study
8.8.4 Estimation and discussion
9 Spatial sampling and bootstrapping
9.1 Spatial point data – object classes and spatial aggregation
9.2 Spatial sampling – randomisation/generation of new points on the surface
9.3 Spatial sampling – sampling of sub-samples from existing points
9.3.1 Simple sampling
9.3.2 The options of the sperrorest:: package
9.3.3 Sampling points from areas determined by the k-means algorithm – block bootstrap
9.3.4 Sampling points from moving blocks (moving block bootstrap)
9.4 Use of spatial sampling and bootstrapping in cross-validation of models
10 Spatial big data
10.1 Examples of big data applications
10.2 Spatial big data
10.2.1 Spatial data types
10.2.2 Challenges related to the use of spatial big data
10.2.2.1 Processing of large datasets
10.2.2.2 Mapping and reduction
10.2.2.3 Spatial data indexing
10.3 The sf:: package – simple features
10.3.1 sf class – a special data frame
10.3.2 Data with POLYGON geometry
10.3.3 Data with POINT geometry
10.3.4 Visualisation using the ggplot2:: package
10.3.5 Selected functions for spatial analysis
10.4 Use the dplyr:: package functions
10.5 Sample analysis of large raster data
10.5.1 Measurement of economic inequalities from space
10.5.2 Analysis using the raster:: package functions
10.5.3 Other functions of the raster:: package
10.5.4 Potential alternative – stars:: package
11 Spatial unsupervised learning – applications of market basket analysis in geomarketing
11.1 Introduction to market basket analysis
11.2 Data needed in spatial market basket analysis
11.3 Simulation of data
11.4 The market basket analysis technique applied to geolocation data
11.5 Spatial association rules
11.6 Applications to geomarketing
11.6.1 Finding the best location for a business
11.6.2 Targeting
11.6.3 Discovery of competitors
11.7 Conclusions and further approaches
Appendix A: Datasets used in examples
A1. Dataset no. 1 /dataset1/ – poviat panel data with many variables
A2. Dataset no. 2 /dataset2/ – geolocated point data
A3. Dataset no. 3 /dataset3/ – monthly unemployment rate in poviats (NTS4)
A4. Dataset no. 4 /dataset4/ – grid data for population
A5. Shapefiles of contour maps – for poviats (NTS4), regions (NTS2), country (NTS0) and registration areas
A6. Raster data on night light intensity on Earth in 2013
A7. Population in cities in Poland
Appendix B: Links between packages
References
Index