Introduces Python, the popular, powerful, and free programming language and software package.
Python is an ideal candidate for starting to learn econometrics and data analysis. It has a huge user base, especially in the fields of data science, machine learning, and artificial intelligence, where it is arguably the most popular software overall. These are very exciting areas, and there is a lot of cutting-edge research on integrating their tools into the econometrics toolbox. So why not kill two birds with one stone and master a powerful and important software package while learning econometrics at the same time? Because Python must be hard to learn and to apply to econometrics? It is not hard at all, as this book shows.
And Python is completely free and available for all relevant operating systems. When using it in econometrics courses, students can easily download a copy to their own computers and use it at home (or in their favorite cafés) to replicate examples and work on take-home assignments. This hands-on experience is essential for understanding econometric models and methods. It also prepares students to conduct their own empirical analyses for their theses, research projects, and professional work.
A problem we encountered when teaching introductory econometrics classes is that the textbooks that introduce Python do not discuss econometrics. Conversely, our favorite introductory econometrics textbooks do not cover Python. Although it is possible to combine a good econometrics textbook with an unrelated introduction to Python, this creates substantial hurdles because the topics and order of presentation differ, and the terminology and notation are inconsistent.
Focus: implementation of standard tools and methods used in econometrics
Compatible with "Introductory Econometrics" by Jeffrey M. Wooldridge in terms of topics, organization, terminology and notation
Companion website with full text, all code for download and other goodies
Topics
A gentle introduction to Python
Simple and multiple regression in matrix form and using black box routines
Inference in small samples and asymptotics
Monte Carlo simulations
Heteroscedasticity
Time series regression
Pooled cross-sections and panel data
Instrumental variables and two-stage least squares
Simultaneous equation models
Limited dependent variables: binary, count data, censoring, truncation, and sample selection
Formatted reports using Jupyter Notebooks
The book is designed mainly for students of introductory econometrics who ideally use Wooldridge as their main textbook. It can also be useful for readers who are familiar with econometrics and possibly other software packages. For them, it offers an introduction to Python and can be used to look up the implementation of standard econometric methods.
Author(s): Florian Heiss, Daniel Brunner
Publisher: UPfIE
Year: 2023
Language: English
Pages: 428
Preface
Introduction
Getting Started
Software
Python Scripts
Modules
File Names and the Working Directory
Errors and Warnings
Other Resources
Objects in Python
Variables
Objects in Python
Objects in numpy
Objects in pandas
External Data
Data Sets in the Examples
Import and Export of Data Files
Data from other Sources
Base Graphics with matplotlib
Basic Graphs
Customizing Graphs with Options
Overlaying Several Plots
Exporting to a File
Descriptive Statistics
Discrete Distributions: Frequencies and Contingency Tables
Continuous Distributions: Histogram and Density
Empirical Cumulative Distribution Function (ECDF)
Fundamental Statistics
Probability Distributions
Discrete Distributions
Continuous Distributions
Cumulative Distribution Function (CDF)
Random Draws from Probability Distributions
Confidence Intervals and Statistical Inference
Confidence Intervals
t Tests
p Values
Advanced Python
Conditional Execution
Loops
Functions
Object Orientation
Outlook
Monte Carlo Simulation
Finite Sample Properties of Estimators
Asymptotic Properties of Estimators
Simulation of Confidence Intervals and t Tests
Regression Analysis with Cross-Sectional Data
The Simple Regression Model
Simple OLS Regression
Coefficients, Fitted Values, and Residuals
Goodness of Fit
Nonlinearities
Regression through the Origin and Regression on a Constant
Expected Values, Variances, and Standard Errors
Monte Carlo Simulations
One Sample
Many Samples
Violation of SLR.4
Violation of SLR.5
Multiple Regression Analysis: Estimation
Multiple Regression in Practice
OLS in Matrix Form
Ceteris Paribus Interpretation and Omitted Variable Bias
Standard Errors, Multicollinearity, and VIF
Multiple Regression Analysis: Inference
The t Test
General Setup
Standard Case
Other Hypotheses
Confidence Intervals
Linear Restrictions: F-Tests
Multiple Regression Analysis: OLS Asymptotics
Simulation Exercises
Normally Distributed Error Terms
Non-Normal Error Terms
(Not) Conditioning on the Regressors
LM Test
Multiple Regression Analysis: Further Issues
Model Formulae
Data Scaling: Arithmetic Operations Within a Formula
Standardization: Beta Coefficients
Logarithms
Quadratics and Polynomials
Hypothesis Testing
Interaction Terms
Prediction
Confidence and Prediction Intervals for Predictions
Effect Plots for Nonlinear Specifications
Multiple Regression Analysis with Qualitative Regressors
Linear Regression with Dummy Variables as Regressors
Boolean Variables
Categorical Variables
ANOVA Tables
Breaking a Numeric Variable Into Categories
Interactions and Differences in Regression Functions Across Groups
Heteroscedasticity
Heteroscedasticity-Robust Inference
Heteroscedasticity Tests
Weighted Least Squares
More on Specification and Data Issues
Functional Form Misspecification
Measurement Error
Missing Data and Nonrandom Samples
Outlying Observations
Least Absolute Deviations (LAD) Estimation
Regression Analysis with Time Series Data
Basic Regression Analysis with Time Series Data
Static Time Series Models
Time Series Data Types in Python
Equispaced Time Series in Python
Irregular Time Series in Python
Other Time Series Models
Finite Distributed Lag Models
Trends
Seasonality
Further Issues in Using OLS with Time Series Data
Asymptotics with Time Series
The Nature of Highly Persistent Time Series
Differences of Highly Persistent Time Series
Regression with First Differences
Serial Correlation and Heteroscedasticity in Time Series Regressions
Testing for Serial Correlation of the Error Term
FGLS Estimation
Serial Correlation-Robust Inference with OLS
Autoregressive Conditional Heteroscedasticity
Advanced Topics
Pooling Cross-Sections Across Time: Simple Panel Data Methods
Pooled Cross-Sections
Difference-in-Differences
Organizing Panel Data
First Differenced Estimator
Advanced Panel Data Methods
Fixed Effects Estimation
Random Effects Models
Dummy Variable Regression and Correlated Random Effects
Robust (Clustered) Standard Errors
Instrumental Variables Estimation and Two Stage Least Squares
Instrumental Variables in Simple Regression Models
More Exogenous Regressors
Two Stage Least Squares
Testing for Exogeneity of the Regressors
Testing Overidentifying Restrictions
Instrumental Variables with Panel Data
Simultaneous Equations Models
Setup and Notation
Estimation by 2SLS
Outlook: Estimation by 3SLS
Limited Dependent Variable Models and Sample Selection Corrections
Binary Responses
Linear Probability Models
Logit and Probit Models: Estimation
Inference
Predictions
Partial Effects
Count Data: The Poisson Regression Model
Corner Solution Responses: The Tobit Model
Censored and Truncated Regression Models
Sample Selection Corrections
Advanced Time Series Topics
Infinite Distributed Lag Models
Testing for Unit Roots
Spurious Regression
Cointegration and Error Correction Models
Forecasting
Carrying Out an Empirical Project
Working with Python Scripts
Logging Output in Text Files
Formatted Documents with Jupyter Notebook
Getting Started
Cells
Markdown Basics
Appendices
Python Scripts
Scripts Used in Chapter 01
Scripts Used in Chapter 02
Scripts Used in Chapter 03
Scripts Used in Chapter 04
Scripts Used in Chapter 05
Scripts Used in Chapter 06
Scripts Used in Chapter 07
Scripts Used in Chapter 08
Scripts Used in Chapter 09
Scripts Used in Chapter 10
Scripts Used in Chapter 11
Scripts Used in Chapter 12
Scripts Used in Chapter 13
Scripts Used in Chapter 14
Scripts Used in Chapter 15
Scripts Used in Chapter 16
Scripts Used in Chapter 17
Scripts Used in Chapter 18
Scripts Used in Chapter 19
Bibliography
List of Wooldridge (2019) Examples
Index