This thoroughly practical and engaging textbook is designed to equip students with the skills needed to undertake sound regression analysis, and it does so without requiring high-level math.
Regression Analysis covers the concepts needed to design optimal regression models and to properly interpret regressions. It details the most common pitfalls, including three sources of bias not covered in other textbooks. Rather than focusing on equations and proofs, the book develops an understanding of these biases visually and with examples of situations in which such biases could arise. In addition, it describes how ‘holding other factors constant’ actually works and when it does not work. This second edition features a new chapter on integrity and ethics, and has been updated throughout to include more international examples. Each chapter offers examples, exercises, and clear summaries, all designed to support student learning and to help students produce responsible research.
This is the textbook the author wishes he had learned from, as it would have helped him avoid many of the research mistakes he made in his career. It is ideal for anyone learning quantitative methods in the social sciences, business, medicine, and data analytics. It will also appeal to researchers and academics looking to better understand regressions. Additional digital supplements are available at: www.youtube.com/channel/UCenm3BWqQyXA2JRKB_QXGyw.
Author(s): Jeremy Arkes
Edition: 2
Publisher: Routledge
Year: 2023
Language: English
Pages: 411
City: London
Cover
Half Title
Title Page
Copyright Page
Dedication
Table of Contents
List of figures
List of tables
About the author
Preface
Acknowledgments
List of abbreviations
1 Introduction
1.1 The problem
1.2 The purpose of research
1.3 What causes problems in the research process?
1.4 About this book
1.5 Quantitative vs. qualitative research
1.6 Stata and R code
1.7 Chapter summary
2 Regression analysis basics
2.1 What is a regression?
2.2 The four main objectives for regression analysis
2.3 The Simple Regression Model
2.4 How are regression lines determined?
2.5 The explanatory power of the regression
2.6 What contributes to slopes of regression lines?
2.7 Using residuals to gauge relative performance
2.8 Correlation vs. causation
2.9 The Multiple Regression Model
2.10 Assumptions of regression models
2.11 Everyone has their own effect
2.12 Causal effects can change over time
2.13 Why regression results might be wrong: inaccuracy and imprecision
2.14 The use of regression flowcharts
2.15 The underlying Linear Algebra in regression equations
2.16 Definitions and key concepts
2.17 Chapter summary
3 Essential tools for regression analysis
3.1 Using dummy (binary) variables
3.2 Non-linear functional forms using Ordinary Least Squares
3.3 Weighted regression models
3.4 Calculating standardized coefficient estimates to allow comparisons
3.5 Chapter summary
4 What does "holding other factors constant" mean?
4.1 Why do we want to "hold other factors constant"?
4.2 Operative-vs-"held constant" and good-vs-bad variation in a key-explanatory variable
4.3 How "holding other factors constant" works when done cleanly
4.4 Why is it difficult to "hold a factor constant"?
4.5 When you do not want to hold a factor constant
4.6 Proper terminology for controlling for a variable
4.7 Chapter summary
5 Standard errors, hypothesis tests, p-values, and aliens
5.1 Standard errors
5.2 How the standard error determines the likelihood of various values of the true coefficient
5.3 Hypothesis testing in regression analysis
5.4 Problems with standard errors (multicollinearity, heteroskedasticity, and clustering) and how to fix them
5.5 The Bayesian critique of p-values (and statistical significance)
5.6 What model diagnostics should you do?
5.7 What the research on the hot hand in basketball tells us about the existence of other life in the universe
5.8 What does an insignificant estimate tell you?
5.9 Statistical significance is not the goal
5.10 Why I believe we should scrap hypothesis tests
5.11 Chapter summary
6 What could go wrong when estimating causal effects?
6.1 Setting up the problem for estimating a causal effect
6.2 Good variation vs. bad variation in the key-explanatory variable
6.3 An introduction to the PITFALLS
6.4 PITFALL #1: Reverse causality
6.5 PITFALL #2: Omitted-factors bias
6.6 PITFALL #3: Self-selection bias
6.7 PITFALL #4: Measurement error
6.8 PITFALL #5: Using mediating factors or outcomes as control variables
6.9 PITFALL #6: Improper reference groups
6.10 PITFALL #7: Over-weighting groups (when using fixed effects or dummy variables)
6.11 How to choose the best set of control variables (model selection)
6.12 What could affect the validity of the sample?
6.13 Applying the PITFALLS to studies on estimating divorce effects on children
6.14 Applying the PITFALLS to nutritional studies
6.15 Chapter summary
7 Strategies for other regression objectives
7.1 Strategies and PITFALLS for forecasting/predicting an outcome
7.2 Strategies and PITFALLS for determining predictors of an outcome
7.3 Strategies and PITFALLS for adjusting outcomes for various factors and anomaly detection
7.4 Summary of the strategies and PITFALLS for each regression objective
8 Methods to address biases
8.1 Fixed effects
8.2 Correcting for over-weighted groups (PITFALL #7) using fixed effects
8.3 Random effects
8.4 First-differences
8.5 Difference-in-differences
8.6 Two-stage least squares (instrumental-variables)
8.7 Regression discontinuities
8.8 Knowing when to punt
8.9 Summary
9 Other methods besides Ordinary Least Squares
9.1 Types of outcome variables
9.2 Dichotomous outcomes
9.3 Ordinal outcomes – ordered models
9.4 Categorical outcomes – Multinomial Logit Model
9.5 Censored outcomes – Tobit models
9.6 Count variables – Negative Binomial and Poisson models
9.7 Duration models
9.8 Summary
10 Time-series models
10.1 The components of a time-series variable
10.2 Autocorrelation
10.3 Autoregressive models
10.4 Distributed-lag models
10.5 Consequences of and tests for autocorrelation
10.6 Stationarity
10.7 Vector Autoregression
10.8 Forecasting with time series
10.9 Summary
11 Some really interesting research
11.1 Can discrimination be a self-fulfilling prophecy?
11.2 Does Medicaid participation improve health outcomes?
11.3 Estimating peer effects on academic outcomes
11.4 How much does a GED improve labor-market outcomes?
11.5 How female integration in the Norwegian military affects gender attitudes among males
12 How to conduct a research project
12.1 Choosing a topic
12.2 Conducting the empirical part of the study
12.3 Writing the report
13 The ethics of regression analysis
13.1 What do we hope to see and not to see in others' research?
13.2 The incentives that could lead to unethical practices
13.3 P-hacking and other unethical practices
13.4 How to be ethical in your research
13.5 Examples of how studies could have been improved under the ethical guidelines I describe
13.6 Summary
14 Summarizing thoughts
14.1 Be aware of your cognitive biases
14.2 What betrays trust in published studies
14.3 How to do a referee report responsibly
14.4 Summary of the most important points and interpretations
14.5 Final words of wisdom (and one final Yogi quote)
Appendix of background statistical tools
A.1 Random variables and probability distributions
A.2 The normal distribution and other important distributions
A.3 Sampling distributions
A.4 Desired properties of estimators
Glossary
Index