A reader-friendly introduction to geostatistics for students and researchers struggling with statistics. Using simple, clear explanations of both introductory and advanced material, it demystifies complex concepts and makes formulas and statistical tests easy to apply. Beginning with a critical evaluation of experimental and sampling design, the book moves on to explain essential concepts of probability, statistical significance and Type 1 and Type 2 error. An accessible graphical explanation of analysis of variance (ANOVA) leads on to advanced ANOVA designs, correlation and regression, and non-parametric tests including chi-square. Finally, it introduces the essentials of multivariate techniques, multidimensional scaling and cluster analysis, analysis of sequences and concepts of spatial analysis. Illustrated with wide-ranging examples from topics across the Earth and environmental sciences, Geostatistics Explained can be used for undergraduate courses or for self-study and reference. Worked examples at the end of each chapter reinforce a clear understanding of the statistical tests and their applications.
Author(s): Steve McKillup, Melinda Darby Dyar
Publisher: Cambridge University Press
Year: 2010
Language: English
Pages: 414
Half-title......Page 3
Title......Page 5
Copyright......Page 6
Contents......Page 7
Preface......Page 17
1.1 Why do earth scientists need to understand experimental design and statistics?......Page 19
1.2 What is this book designed to do?......Page 24
2.2 Basic scientific method......Page 26
2.4 Why can’t a hypothesis or theory ever be proven?......Page 29
2.6 Null and alternate hypotheses......Page 30
2.7 Conclusion......Page 31
2.8 Questions......Page 32
3.2 Variables, sampling units and types of data......Page 33
3.3.1 Histograms......Page 35
3.3.2 Frequency polygons or line graphs......Page 37
3.3.3 Cumulative graphs......Page 38
3.5 Bivariate data......Page 39
3.6 Data expressed as proportions of a total......Page 43
3.8 Multivariate data......Page 44
3.9 Conclusion......Page 45
4.1 Introduction......Page 46
4.2.1 Confusing a correlation with causality......Page 47
4.2.2 The inadvertent inclusion of a third variable: sampling confounded in time......Page 49
4.2.3 The need for independent samples in mensurative experiments......Page 50
4.2.4 The need to repeat the sampling on several occasions and elsewhere......Page 51
4.3.1 Independent replicates......Page 52
4.3.2 Control treatments......Page 53
4.3.3 Pseudoreplication......Page 55
4.4 Sometimes you can only do an unreplicated experiment......Page 58
4.5 Realism......Page 59
4.6 A bit of common sense......Page 60
4.7 Designing a “good” experiment......Page 61
4.9 Questions......Page 62
5.2.1 Plagiarism......Page 63
5.2.4 Acknowledging the input of others......Page 64
5.3.2 Ethics......Page 65
5.4 Evaluating and reporting results......Page 66
5.4.2 Record keeping......Page 67
5.6 Questions......Page 68
6.1 Introduction......Page 69
6.2 Statistical tests and significance levels......Page 70
6.4 Making the wrong decision......Page 75
6.5 Other probability levels......Page 76
6.8 A very simple example: the chi-square test for goodness of fit......Page 78
6.9 What if you get a statistic with a probability of exactly 0.05?......Page 82
6.11 Questions......Page 83
7.3 The normal distribution......Page 84
7.3.1 The mean of a normally distributed population......Page 85
7.3.2 The variance of a population......Page 86
7.3.4 The Z statistic......Page 88
7.4 Samples and populations......Page 89
7.4.2 The sample variance......Page 90
7.5 Your sample mean may not be an accurate estimate of the population mean......Page 91
7.6 What do you do when you only have data from one sample?......Page 93
7.7 Why are the statistics that describe the normal distribution so important?......Page 96
7.9 Other distributions......Page 98
7.10.1 The median......Page 100
7.11 Conclusion......Page 101
7.12 Questions......Page 102
8.2 The 95% confidence interval and 95% confidence limits......Page 103
8.3 Using the Z statistic to compare a sample mean and population mean when population statistics are known......Page 104
8.4 Comparing a sample mean to an expected value when population statistics are not known......Page 105
8.4.1 Degrees of freedom and looking up the appropriate critical value of t......Page 108
8.4.2 One-tailed and two-tailed tests......Page 109
8.4.3 The application of a single-sample t test......Page 112
8.5 Comparing the means of two related samples......Page 114
8.6 Comparing the means of two independent samples......Page 116
8.7 Are your data appropriate for a t test?......Page 118
8.7.2 Have the sample(s) been taken at random?......Page 119
8.8 Distinguishing between data that should be analyzed by a paired-sample test and a test for two independent samples......Page 120
8.10 Questions......Page 121
9.2 Type 1 error......Page 123
9.3.1 A worked example showing Type 2 error......Page 124
9.4 The power of a test......Page 127
9.4.1 What determines the power of a test?......Page 128
9.5 What sample size do you need to ensure the risk of Type 2 error is not too high?......Page 129
9.7 Conclusion......Page 131
9.8 Questions......Page 132
10.1 Introduction......Page 133
10.2 Single-factor analysis of variance......Page 134
10.3 An arithmetic/pictorial example......Page 140
10.3.1 Preliminary steps......Page 141
10.3.2 Calculation of within group variation (error)......Page 142
10.3.3 Calculation of among group variation (treatment)......Page 143
10.3.4 Calculation of the total variation......Page 144
10.6 Fixed or random effects......Page 146
10.7 Questions......Page 147
11.2 Multiple comparison tests after a Model I ANOVA......Page 149
11.3.1 Trace elements in New England granites......Page 152
11.3.2 Stable isotope data from tourmalines in Maine......Page 153
11.3.4 Power and a posteriori testing......Page 154
11.5 Planned comparisons......Page 156
11.6 Questions......Page 158
12.1.1 Why do an experiment with more than one factor?......Page 160
12.2 What does a two-factor ANOVA do?......Page 163
12.3 How does a two-factor ANOVA analyze these data?......Page 164
12.4 How does a two-factor ANOVA separate out the effects of each factor and interaction?......Page 168
12.5 An example of a two-factor analysis of variance......Page 171
12.6 Some essential cautions and important complications......Page 172
12.6.1 A posteriori testing is still needed when there is a significant effect of a fixed factor......Page 173
12.6.2 An interaction can obscure a main effect......Page 175
12.6.3 Fixed and random factors......Page 178
Factor A fixed, Factor B random and an interaction......Page 181
12.8 More complex designs......Page 182
12.9 Questions......Page 183
13.2 Homogeneity of variances......Page 184
13.3.1 Skew and outliers......Page 185
13.3.2 A worked example of a box-and-whiskers plot......Page 187
13.5.2 The logarithmic transformation......Page 189
13.6 Are transformations legitimate?......Page 190
13.7 Tests for heteroscedasticity......Page 192
13.8 Questions......Page 194
14.2 Two-factor ANOVA without replication......Page 196
14.3 A posteriori comparison of means after a two-factor ANOVA without replication......Page 201
14.4 Randomized blocks......Page 202
14.5 Nested ANOVA as a special case of a single-factor ANOVA......Page 203
14.6 A pictorial explanation of a nested ANOVA......Page 205
14.8 Questions......Page 210
15.1 Introduction......Page 212
15.3 Linear correlation......Page 213
15.4 Calculation of the Pearson r statistic......Page 214
15.7 Conclusion......Page 220
15.8 Questions......Page 221
16.2 Linear regression......Page 222
16.3 Calculation of the slope of the regression line......Page 223
16.4 Calculation of the intercept with the Y axis......Page 226
16.5 Testing the significance of the slope and the intercept of the regression line......Page 229
16.5.1 Testing the hypothesis that the slope is significantly different from zero......Page 230
16.6 An example: school cancellations and snow......Page 235
16.8 Predicting a value of X from a value of Y......Page 237
16.10 Assumptions of linear regression analysis......Page 238
16.11 Multiple linear regression......Page 241
16.12 Further topics in regression......Page 242
16.13 Questions......Page 243
17.2 The danger of assuming normality when a population is grossly non-normal......Page 245
17.3 The value of making a preliminary inspection of the data......Page 247
18.1 Introduction......Page 248
18.2 Comparing observed and expected frequencies: the chi-square test for goodness of fit......Page 249
18.2.1 Small sample sizes......Page 251
18.3 Comparing proportions among two or more independent samples......Page 252
18.3.1 The chi-square test for heterogeneity......Page 253
18.3.2 The G test or log-likelihood ratio......Page 254
18.4 Bias when there is one degree of freedom......Page 255
18.4.1 The Fisher Exact Test for 2×2 tables......Page 256
18.6 Inappropriate use of tests for goodness of fit and heterogeneity......Page 260
18.8 Comparing proportions among two or more related samples of nominal scale data......Page 261
18.9 Questions......Page 263
19.1 Introduction......Page 265
19.2 A non-parametric comparison between one sample and an expected distribution......Page 266
19.3.1 The Mann–Whitney test......Page 268
19.3.3 Exact tests for two independent samples......Page 270
19.3.4 Recommended non-parametric tests for two independent samples......Page 272
19.4.1 The Kruskal–Wallis test......Page 274
19.4.3 A posteriori comparisons after a non-parametric comparison......Page 275
19.4.4 Rank transformation followed by single-factor ANOVA......Page 276
19.5.1 The Wilcoxon paired-sample test......Page 277
19.5.2 Exact tests and randomization tests for two related samples......Page 278
19.6.1 The Friedman test......Page 280
19.7 Analyzing ratio, interval or ordinal data that show gross differences in variance among treatments and cannot be satisfactorily transformed......Page 282
19.8.1 Spearman’s rank correlation......Page 284
19.10 Questions......Page 286
20.1 Introduction......Page 288
20.2 Simplifying and summarizing multivariate data......Page 289
20.3 An R-mode analysis: principal components analysis......Page 290
20.4 How does a PCA combine two or more variables into one?......Page 291
20.5 What happens if the variables are not highly correlated?......Page 294
20.6 PCA for more than two variables......Page 295
20.7 The contribution of each variable to the principal components......Page 297
20.9 How many principal components should you plot?......Page 300
20.11 Summary and some cautions and restrictions on use of PCA......Page 301
20.12 Q-mode analyses: multidimensional scaling......Page 302
20.13 How is a univariate measure of dissimilarity among sampling units extracted from multivariate data?......Page 303
20.14 An example......Page 305
20.15 Stress......Page 307
20.16 Summary and cautions on the use of multidimensional scaling......Page 308
20.17 Q-mode analyses: cluster analysis......Page 309
20.19 Questions......Page 313
21.1 Introduction......Page 315
21.3 Preliminary inspection by graphing......Page 316
21.4 Detection of within-sequence similarity and dissimilarity......Page 317
21.4.1 Interpreting the correlogram......Page 321
21.5 Cross-correlation......Page 325
21.6 Regression analysis......Page 326
21.7 Simple linear regression......Page 327
21.8 More complex regression......Page 329
21.8.1 Polynomial modeling of a spatial sequence: hydrogen diffusion in a single crystal of pyroxene......Page 333
21.9 Simple autoregression......Page 335
21.10 More complex series with a cyclic component......Page 338
21.12 Some very important limitations and cautions......Page 340
21.13 Sequences of nominal scale data......Page 341
21.13.1 Sequences that have been sampled at regular intervals......Page 342
21.13.2 Sequences for which only true transitions have been recorded......Page 344
21.14 Records of the repeated occurrence of an event......Page 345
21.15 Conclusion......Page 349
21.16 Questions......Page 350
22.1 Introduction......Page 352
22.2 Testing whether a spatial distribution occurs at random......Page 353
22.2.1 A worked example......Page 355
22.2.3 Nearest neighbor analysis......Page 360
22.2.4 A worked example......Page 363
22.3 Data for the direction of objects......Page 364
22.3.2 Drawing a rose diagram......Page 366
22.3.3 Worked examples of rose diagrams......Page 368
22.3.6 Data for the orientation of objects......Page 369
22.4 Prediction and interpolation in two dimensions......Page 370
22.4.1 The semivariance and semivariogram......Page 371
22.4.2 A worked example......Page 376
22.4.3 Application of the theoretical semivariogram......Page 377
22.6 Questions......Page 380
23.1 Introduction......Page 382
Appendix A: Critical values of chi-square, t and F......Page 392
Appendix B: Answers to questions......Page 398
References......Page 407
Index......Page 409