Applied Survey Data Analysis (Chapman & Hall CRC Statistics in the Social and Behavioral Sciences)

Taking a practical approach that draws on the authors’ extensive teaching, consulting, and research experiences, Applied Survey Data Analysis provides an intermediate-level statistical overview of the analysis of complex sample survey data. It emphasizes methods and worked examples using available software procedures while reinforcing the principles and theory that underlie those methods. After introducing a step-by-step process for approaching a survey analysis problem, the book presents the fundamental features of complex sample designs and shows how to integrate design characteristics into the statistical methods and software for survey estimation and inference. The authors then focus on the methods and models used in analyzing continuous, categorical, and count dependent variables; event history data; and missing data problems. The techniques discussed include univariate descriptive and simple bivariate analyses, the linear regression model, generalized linear regression modeling methods, the Cox proportional hazards model, discrete time models, and the multiple imputation analysis method. The final chapter covers new developments in survey applications of advanced statistical techniques, including model-based analysis approaches. Designed for readers working in a wide array of disciplines who use survey data in their work, this book also provides a useful framework for integrating more in-depth studies of the theory and methods of survey data analysis. A guide to the applied statistical analysis and interpretation of survey data, it contains many examples and practical exercises based on major real-world survey data sets. Although the authors use Stata for most examples in the text, the book’s website offers SAS, SPSS, SUDAAN, R, WesVar, IVEware, and Mplus software code for replicating the examples: http://www.isr.umich.edu/src/smp/asda/

Author(s): Steven G. Heeringa, Brady T. West, Patricia A. Berglund
Series: Chapman & Hall CRC Statistics in the Social and Behavioral Sciences, 8
Edition: 1
Publisher: Chapman and Hall/CRC
Year: 2010

Language: English
Commentary: index is missing
Pages: 462
Tags: Library; Computer Literature; Stata

Applied Survey Data Analysis......Page 2
Statistics in the Social and Behavioral Sciences Series......Page 3
Applied Survey Data Analysis......Page 4
Contents......Page 6
Preface......Page 16
1.1 Introduction......Page 21
1.2.1 Key Theoretical Developments......Page 23
1.2.2 Key Software Developments......Page 25
1.3.1 The National Comorbidity Survey Replication (NCS-R)......Page 26
1.3.3 The National Health and Nutrition Examination Survey (NHANES): 2005, 2006......Page 27
1.3.4.1 Step 1: Definition of the Problem and Statement of the Objectives......Page 28
1.3.4.2 Step 2: Understanding the Sample Design......Page 29
1.3.4.3 Step 3: Understanding Design Variables, Underlying Constructs, and Missing Data......Page 30
1.3.4.5 Step 5: Interpreting and Evaluating the Results of the Analysis......Page 31
1.3.4.6 Step 6: Reporting of Estimates and Inferences from the Survey Data......Page 32
2.1.1 Technical Documentation and Supplemental Literature Review......Page 33
2.2 Classification of Sample Designs......Page 34
2.2.1 Sampling Plans......Page 35
2.3 Target Populations and Survey Populations......Page 36
2.4.1 Relevance of SRS to Complex Sample Survey Data Analysis......Page 38
2.4.2 SRS Fundamentals: A Framework for Design-Based Inference......Page 39
2.4.3 An Example of Design-Based Inference under SRS......Page 41
2.5.1 Design Effect Ratio......Page 43
2.5.2 Generalized Design Effects and Effective Sample Sizes......Page 45
2.6 Complex Samples: Clustering and Stratification......Page 47
2.6.1 Clustered Sampling Plans......Page 48
2.6.2 Stratification......Page 51
2.6.3 Joint Effects of Sample Stratification and Clustering......Page 54
2.7.1 Introduction to Weighted Analysis of Survey Data......Page 55
2.7.2 Weighting for Probabilities of Selection......Page 57
2.7.3 Nonresponse Adjustment Weights......Page 59
2.7.3.2 Propensity Cell Adjustment Approach......Page 60
2.7.4 Poststratification Weight Factors......Page 62
2.7.5 Design Effects Due to Weighted Analysis......Page 64
2.8 Multistage Area Probability Sample Designs......Page 66
2.8.1 Primary Stage Sampling......Page 67
2.8.2 Secondary Stage Sampling......Page 68
2.8.3 Third and Fourth Stage Sampling of Housing Units and Eligible Respondents......Page 69
2.9 Special Types of Sampling Plans Encountered in Surveys......Page 70
3.1 Introduction......Page 73
3.2 Finite Populations and Superpopulation Models......Page 74
3.4 Weighted Estimation of Population Parameters......Page 76
3.5.1 Sampling Distributions of Survey Estimates......Page 80
3.5.2 Degrees of Freedom for......Page 83
3.6 Variance Estimation......Page 85
3.6.1 Simplifying Assumptions Employed in Complex Sample Variance Estimation......Page 86
3.6.2 The Taylor Series Linearization Method......Page 88
3.6.2.1 TSL Step 1......Page 89
3.6.2.2 TSL Step 2......Page 90
3.6.2.4 TSL Step 4......Page 91
3.6.2.5 TSL Step 5......Page 93
3.6.3 Replication Methods for Variance Estimation......Page 94
3.6.3.1 Jackknife Repeated Replication......Page 95
3.6.3.2 Balanced Repeated Replication......Page 98
3.6.4 An Example Comparing the Results from the TSL, JRR, and BRR Methods......Page 102
3.7 Hypothesis Testing in Survey Data Analysis......Page 103
3.8 Total Survey Error and Its Impact on Survey Estimation and Inference......Page 105
3.8.1 Variable Errors......Page 106
3.8.2 Biases in Survey Data......Page 107
4.1 Introduction......Page 111
4.2 Analysis Weights: Review by the Data User......Page 112
4.2.1 Identification of the Correct Weight Variables for the Analysis......Page 113
4.2.2 Determining the Distribution and Scaling of the Weight Variables......Page 114
4.2.3 Weighting Applications: Sensitivity of Survey Estimates to the Weights......Page 116
4.3 Understanding and Checking the Sampling Error Calculation Model......Page 118
4.3.1 Stratum and Cluster Codes in Complex Sample Survey Data Sets......Page 119
4.3.2 Building the NCS-R Sampling Error Calculation Model......Page 120
4.3.3 Combining Strata, Randomly Grouping PSUs, and Collapsing Strata......Page 123
4.3.4 Checking the Sampling Error Calculation Model for the Survey Data Set......Page 125
4.4.1 Potential Bias Due to Ignoring Missing Data......Page 128
4.4.2 Exploring Rates and Patterns of Missing Data Prior to Analysis......Page 129
4.5 Preparing to Analyze Data for Sample Subpopulations......Page 130
4.5.1 Subpopulation Distributions across Sample Design Units......Page 131
4.5.3 Preparation for Subclass Analyses......Page 134
4.6 A Final Checklist for Data Users......Page 135
5.1 Introduction......Page 137
5.2.1 Weighted Estimation......Page 138
5.2.3 Matching the Method to the Variable Type......Page 139
5.3.1 Graphical Tools for Descriptive Analysis of Survey Data......Page 140
Example 5.1: A Weighted Histogram of Total Cholesterol Using the 2005–2006 NHANES Data......Page 141
Example 5.2: Weighted Boxplots of Total Cholesterol for U.S. Adult Men and Women Using the 2005–2006 NHANES Data......Page 142
5.3.2 Estimation of Population Totals......Page 143
Example 5.3: Using the NCS-R Data to Estimate the Total Count of U.S. Adults with Lifetime Major Depressive Episodes (MDE)......Page 146
Example 5.4: Using the HRS Data to Estimate Total Household Assets......Page 147
5.3.3 Means of Continuous, Binary, or Interval Scale Data......Page 148
Example 5.6: Estimating Mean Systolic Blood Pressure Using the NHANES Data......Page 149
5.3.4 Standard Deviations of Continuous Variables......Page 150
5.3.5 Estimation of Percentiles and Medians of Population Distributions......Page 151
Example 5.8: Estimating Population Quantiles for Total Household Assets Using the HRS Data......Page 152
5.4.1 X–Y Scatterplots......Page 154
5.4.2 Product Moment Correlation Statistic (......Page 155
Example 5.9: Estimating the Population Ratio of High-Density to Total Cholesterol for U.S. Adults......Page 156
5.5 Descriptive Statistics for Subpopulations......Page 157
Example 5.11: Estimating Mean Systolic Blood Pressure for Males and Females Age > 45 Using the NHANES Data......Page 158
5.6 Linear Functions of Descriptive Estimates and Differences of Means......Page 159
Example 5.12: Estimating Differences in Mean Total Household Assets between HRS Subpopulations Defined by Educational Attainment Level......Page 161
Example 5.13: Estimating Differences in Mean Total Household Assets from 2004 to 2006 Using Data from the HRS......Page 163
5.7 Exercises......Page 164
6.1 Introduction......Page 169
6.2.2 Proportions and Percentages......Page 170
6.2.3 Cross-Tabulations, Contingency Tables, and Weighted Frequencies......Page 171
6.3.1 Estimation of Proportions for Binary Variables......Page 172
Example 6.1: Estimating the Proportion of the U.S. Adult Population with an Irregular Heart Beat......Page 174
6.3.2 Estimation of Category Proportions for Multinomial Variables......Page 176
Example 6.3: Estimating the Proportions of U.S. Adults by Blood Pressure Status......Page 177
6.3.3 Testing Hypotheses Concerning a Vector of Population Proportions......Page 178
6.3.4 Graphical Display for a Single Categorical Variable......Page 179
6.4.1 Response and Factor Variables......Page 180
6.4.2 Estimation of Total, Row, and Column Proportions for Two-Way Tables......Page 182
Example 6.7: Comparing the Proportions of U.S. Adult Men and Women with Lifetime Major Depression......Page 183
6.4.4 Chi-Square Tests of Independence of Rows and Columns......Page 184
Example 6.8: Testing the Independence of Alcohol Dependence and Education Level in Young Adults (Ages 18–28) Using the NCS-R Data......Page 188
6.4.5 Odds Ratios and Relative Risks......Page 190
6.4.6 Simple Logistic Regression to Estimate the Odds Ratio......Page 191
Example 6.9: Simple Logistic Regression to Estimate the NCS-R Male/Female Odds Ratio for Lifetime Major Depressive Episode......Page 192
6.4.7 Bivariate Graphical Analysis......Page 193
Example 6.10: Using the NCS-R Data to Estimate and Test the Association between Gender and Depression in the U.S. Adult Population When Controlling for Age......Page 194
6.5.2 Log-Linear Models for Contingency Tables......Page 196
6.6 Exercises......Page 197
7.1 Introduction......Page 199
7.2 The Linear Regression Model......Page 200
7.2.1 The Standard Linear Regression Model......Page 202
7.2.2 Survey Treatment of the Regression Model......Page 203
7.3 Four Steps in Linear Regression Analysis......Page 205
7.3.1 Step 1: Specifying and Refining the Model......Page 206
7.3.2.1 Estimation for the Standard Linear Regression Model......Page 207
7.3.2.2.1 Estimation of Parameters......Page 208
7.3.2.2.2 Estimation of Variances of Parameter Estimates......Page 211
7.3.3.1 Explained Variance and Goodness of Fit......Page 213
7.3.3.3 Model Specification and Homogeneity of Variance......Page 214
7.3.3.4 Normality of the Residual Errors......Page 215
7.3.4 Step 4: Inference......Page 216
7.3.4.1 Inference Concerning Model Parameters......Page 219
7.3.4.2 Prediction Intervals......Page 222
7.4.1 Distribution of the Dependent Variable......Page 224
7.4.2 Parameterization and Scaling for Independent Variables......Page 225
7.4.4 Specification and Interpretation of Interactions and Nonlinear Relationships......Page 228
7.4.5 Model-Building Strategies......Page 230
7.5 Application: Modeling Diastolic Blood Pressure with the NHANES Data......Page 231
7.5.1 Exploring the Bivariate Relationships......Page 232
7.5.2 Naïve Analysis: Ignoring Sample Design Features......Page 235
7.5.3 Weighted Regression Analysis......Page 236
7.5.4 Appropriate Analysis: Incorporating All Sample Design Features......Page 238
7.6 Exercises......Page 244
8.1 Introduction......Page 249
8.2 Generalized Linear Models for Binary Survey Responses......Page 250
8.2.1 The Logistic Regression Model......Page 251
8.2.3 The Complementary Log–Log Model......Page 254
8.3 Building the Logistic Regression Model: Stage 1, Model Specification......Page 255
8.4 Building the Logistic Regression Model: Stage 2, Estimation of Model Parameters and Standard Errors......Page 256
8.5.1 Wald Tests of Model Parameters......Page 259
8.5.2 Goodness of Fit and Logistic Regression Diagnostics......Page 263
8.6 Building the Logistic Regression Model: Stage 4, Interpretation and Inference......Page 265
Example 8.1: Examining Predictors of a Lifetime Major Depressive Episode in the NCS-R Data......Page 271
8.7.1 Stage 1: Model Specification......Page 272
8.7.2 Stage 2: Model Estimation......Page 273
8.7.3 Stage 3: Model Evaluation......Page 275
8.7.4 Stage 4: Model Interpretation/Inference......Page 276
8.8 Comparing the Logistic, Probit, and Complementary Log–Log GLMs for Binary Dependent Variables......Page 279
8.9 Exercises......Page 282
9.2.1 The Multinomial Logit Regression Model......Page 285
9.2.2 Multinomial Logit Regression Model: Specification Stage......Page 287
9.2.4 Multinomial Logit Regression Model: Evaluation Stage......Page 288
9.2.5 Multinomial Logit Regression Model: Interpretation Stage......Page 290
9.2.6 Example: Fitting a Multinomial Logit Regression Model to Complex Sample Survey Data......Page 291
9.3 Logistic Regression Models for Ordinal Survey Data......Page 297
9.3.1 Cumulative Logit Regression Model......Page 298
9.3.3 Cumulative Logit Regression Model: Estimation Stage......Page 299
9.3.4 Cumulative Logit Regression Model: Evaluation Stage......Page 300
9.3.5 Cumulative Logit Regression Model: Interpretation Stage......Page 301
9.3.6 Example: Fitting a Cumulative Logit Regression Model to Complex Sample Survey Data......Page 302
9.4.1 Survey Count Variables and Regression Modeling Alternatives......Page 306
9.4.2.1 The Poisson Regression Model......Page 308
9.4.2.2 The Negative Binomial Regression Model......Page 309
9.4.2.3 Two-Part Models: Zero-Inflated Poisson and Negative Binomial Regression Models......Page 310
9.4.3 Regression Models for Count Data: Specification Stage......Page 311
9.4.5 Regression Models for Count Data: Evaluation Stage......Page 312
9.4.6 Regression Models for Count Data: Interpretation Stage......Page 313
9.4.7 Example: Fitting Poisson and Negative Binomial Regression Models to Complex Sample Survey Data......Page 314
9.5 Exercises......Page 318
10.2.1 Survey Measurement of Event History Data......Page 323
10.2.2 Data for Event History Models......Page 325
10.2.3 Important Notation and Definitions......Page 326
10.2.4 Models for Survival Analysis......Page 327
10.3 (Nonparametric) Kaplan–Meier Estimation of the Survivor Function......Page 328
10.3.1 K–M Model Specification and Estimation......Page 329
10.3.2 K–M Estimator: Evaluation and Interpretation......Page 330
10.3.3 K–M Survival Analysis Example......Page 331
10.4.1 Cox Proportional Hazards Model: Specification......Page 335
10.4.2 Cox Proportional Hazards Model: Estimation Stage......Page 336
10.4.3 Cox Proportional Hazards Model: Evaluation and Diagnostics......Page 337
10.4.5 Example: Fitting a Cox Proportional Hazards Model to Complex Sample Survey Data......Page 339
10.5 Discrete Time Survival Models......Page 342
10.5.1 The Discrete Time Logistic Model......Page 343
10.5.2 Data Preparation for Discrete Time Survival Models......Page 344
10.5.3 Discrete Time Models: Estimation Stage......Page 347
10.5.4 Discrete Time Models: Evaluation and Interpretation......Page 348
10.5.5 Fitting a Discrete Time Model to Complex Sample Survey Data......Page 349
10.6 Exercises......Page 353
11.1 Introduction......Page 355
11.2.1 Sources and Patterns of Item-Missing Data in Surveys......Page 356
11.2.2 Item-Missing Data Mechanisms......Page 358
11.2.3 Implications of Item-Missing Data for Survey Data Analysis......Page 361
11.2.4 Review of Strategies to Address Item-Missing Data in Surveys......Page 362
11.3.1 A Brief History of Imputation Procedures......Page 365
11.3.2 Why the Multiple Imputation Method?......Page 366
11.3.3 Overview of Multiple Imputation and MI Phases......Page 368
11.4.1 Choosing the Variables to Include in the Imputation Model......Page 370
11.4.2 Distributional Assumptions for the Imputation Model......Page 372
11.5.1 Transforming the Imputation Problem to Monotonic Missing Data......Page 373
11.5.3 Sequential Regression or "Chained Regressions"......Page 374
11.6 Estimation and Inference for Multiply Imputed Data......Page 375
11.6.1 Estimators for Population Parameters and Associated Variance Estimators......Page 376
11.6.2 Model Evaluation and Inference......Page 377
11.7.1 Problem Definition......Page 379
11.7.2 The Imputation Model for the NHANES Blood Pressure Example......Page 380
11.7.3 Imputation of the Item-Missing Data......Page 381
11.7.4 Multiple Imputation Estimation and Inference......Page 383
11.7.4.1 Multiple Imputation Analysis 1: Estimation of Mean Diastolic Blood Pressure......Page 384
11.7.4.2 Multiple Imputation Analysis 2: Estimation of the Linear Regression Model for Diastolic Blood Pressure......Page 385
11.8 Exercises......Page 388
12.1 Introduction......Page 391
12.2 Bayesian Analysis of Complex Sample Survey Data......Page 392
12.3.1 Overview of Generalized Linear Mixed Models......Page 395
12.3.2 Generalized Linear Mixed Models and Complex Sample Survey Data......Page 399
12.3.3 GLMM Approaches to Analyzing Longitudinal Survey Data......Page 402
12.3.4 Example: Longitudinal Analysis of the HRS Data......Page 409
12.4 Fitting Structural Equation Models to Complex Sample Survey Data......Page 415
12.5 Small Area Estimation and Complex Sample Survey Data......Page 416
12.6 Nonparametric Methods for Complex Sample Survey Data......Page 417
A.1 Introduction......Page 419
A.1.1 Historical Perspective......Page 420
A.1.2 Software for Sampling Error Estimation......Page 421
A.2 Overview of Stata......Page 427
A.3 Overview of SAS......Page 430
A.3.1 The SAS SURVEY Procedures......Page 431
A.4 Overview of SUDAAN Version 9.0......Page 434
A.4.1 The SUDAAN Procedures......Page 435
A.5 Overview of SPSS......Page 441
A.5.1 The SPSS Complex Samples Commands......Page 442
A.6.1 WesVar......Page 447
A.6.2 IVEware (Imputation and Variance Estimation Software)......Page 448
A.6.4 The R survey Package......Page 449
A.7 Summary......Page 450
References......Page 451