Econometrics with Machine Learning

This book promotes the use of machine learning tools and techniques in econometrics and explains how machine learning can enhance and expand the econometrics toolbox, both in theory and in practice.
Throughout the volume, the authors raise and answer six questions: 1) What are the similarities between existing econometric and machine learning techniques? 2) To what extent can machine learning techniques assist econometric investigation? In particular, how robust or stable are predictions from machine learning algorithms, given the ever-changing nature of human behavior? 3) Can machine learning techniques assist in testing statistical hypotheses and identifying causal relationships in ‘big data’? 4) How can existing econometric techniques be extended by incorporating machine learning concepts? 5) How can new econometric tools and approaches be developed from machine learning techniques? 6) Is it possible to develop machine learning techniques further and make them even more readily applicable in econometrics?
As data structures in economics and finance become more complex and models more sophisticated, the book takes a multidisciplinary approach, developing machine learning and econometrics in conjunction rather than in isolation. This volume is a must-read for scholars, researchers, students, policy-makers, and practitioners who use econometrics in theory or in practice.

Author(s): Felix Chan, László Mátyás
Series: Advanced Studies in Theoretical and Applied Econometrics, 53
Publisher: Springer
Year: 2022

Language: English
Pages: 384
City: Cham

Foreword
Preface
Acknowledgements
Contents
List of Contributors
Chapter 1 Linear Econometric Models with Machine Learning
1.1 Introduction
1.2 Shrinkage Estimators and Regularizers
1.2.1 ℓp Norm, Bridge, LASSO and Ridge
1.2.2 Elastic Net and SCAD
1.2.3 Adaptive LASSO
1.2.4 Group LASSO
1.3 Estimation
1.3.1 Computation and Least Angle Regression
1.3.2 Cross Validation and Tuning Parameters
1.4 Asymptotic Properties of Shrinkage Estimators
1.4.1 Oracle Properties
1.4.2 Asymptotic Distributions
1.4.3 Partially Penalized (Regularized) Estimator
1.5 Monte Carlo Experiments
1.5.1 Inference on Unpenalized Parameters
1.5.2 Variable Transformations and Selection Consistency
1.6 Econometrics Applications
1.6.1 Distributed Lag Models
1.6.2 Panel Data Models
1.6.3 Structural Breaks
1.7 Concluding Remarks
Appendix
Proof of Proposition 1.1
References
Chapter 2 Nonlinear Econometric Models with Machine Learning
2.1 Introduction
2.2 Regularization for Nonlinear Econometric Models
2.2.1 Regularization with Nonlinear Least Squares
2.2.2 Regularization with Likelihood Function
Continuous Response Variable
Discrete Response Variables
2.2.3 Estimation, Tuning Parameter and Asymptotic Properties
Estimation
Tuning Parameter and Cross-Validation
Asymptotic Properties and Statistical Inference
2.2.4 Monte Carlo Experiments – Binary Model with Shrinkage
2.2.5 Applications to Econometrics
2.3 Overview of Tree-Based Methods: Classification Trees and Random Forests
2.3.1 Conceptual Example of a Tree
2.3.2 Bagging and Random Forests
2.3.3 Applications and Connections to Econometrics
Inference
2.4 Concluding Remarks
Appendix
Proof of Proposition 2.1
Proof of Proposition 2.2
References
Chapter 3 The Use of Machine Learning in Treatment Effect Estimation
3.1 Introduction
3.2 The Role of Machine Learning in Treatment Effect Estimation: a Selection-on-Observables Setup
3.3 Using Machine Learning to Estimate Average Treatment Effects
3.3.1 Direct versus Double Machine Learning
3.3.2 Why Does Double Machine Learning Work and Direct Machine Learning Does Not?
3.3.3 DML in a Method of Moments Framework
3.3.4 Extensions and Recent Developments in DML
3.4 Using Machine Learning to Discover Treatment Effect Heterogeneity
3.4.1 The Problem of Estimating the CATE Function
3.4.2 The Causal Tree Approach
3.4.3 Extensions and Technical Variations on the Causal Tree Approach
3.4.4 The Dimension Reduction Approach
3.5 Empirical Illustration
3.6 Conclusion
References
Chapter 4 Forecasting with Machine Learning Methods
4.1 Introduction
4.1.1 Notation
4.1.2 Organization
4.2 Modeling Framework and Forecast Construction
4.2.1 Setup
4.2.2 Forecasting Equation
4.2.3 Backtesting
4.2.4 Model Choice and Estimation
4.3 Forecast Evaluation and Model Comparison
4.3.1 The Diebold-Mariano Test
4.3.2 Li-Liao-Quaedvlieg Test
4.3.3 Model Confidence Sets
4.4 Linear Models
4.4.1 Factor Regression
4.4.2 Bridging Sparse and Dense Models
4.4.3 Ensemble Methods
4.4.3.1 Bagging
4.4.3.2 Complete Subset Regression
4.5 Nonlinear Models
4.5.1 Feedforward Neural Networks
4.5.1.1 Shallow Neural Networks
4.5.1.2 Deep Neural Networks
4.5.2 Long Short-Term Memory Networks
4.5.3 Convolutional Neural Networks
4.5.4 Autoencoders: Nonlinear Factor Regression
4.5.5 Hybrid Models
4.6 Concluding Remarks
References
Chapter 5 Causal Estimation of Treatment Effects From Observational Health Care Data Using Machine Learning Methods
5.1 Introduction
5.2 Naïve Estimation of Causal Effects in Outcomes Models with Binary Treatment Variables
5.3 Is Machine Learning Compatible with Causal Inference?
5.4 The Potential Outcomes Model
5.5 Modeling the Treatment Exposure Mechanism: Propensity Score Matching and Inverse Probability Treatment Weights
5.6 Modeling Outcomes and Exposures: Doubly Robust Methods
5.7 Targeted Maximum Likelihood Estimation (TMLE) for Causal Inference
5.8 Empirical Applications of TMLE in Health Outcomes Studies
5.8.1 Use of Machine Learning to Estimate TMLE Models
5.9 Extending TMLE to Incorporate Instrumental Variables
5.10 Some Practical Considerations on the Use of IVs
5.11 Alternative Definitions of Treatment Effects
5.12 A Final Word on the Importance of Study Design in Mitigating Bias
References
Chapter 6 Econometrics of Networks with Machine Learning
6.1 Introduction
6.2 Structure, Representation, and Characteristics of Networks
6.3 The Challenges of Working with Network Data
6.4 Graph Dimensionality Reduction
6.4.1 Types of Embeddings
6.4.2 Algorithmic Foundations of Embeddings
6.5 Sampling Networks
6.5.1 Node Sampling Approaches
6.5.2 Edge Sampling Approaches
Hybrid Approaches and the Importance of the Problem
6.5.3 Traversal-Based Sampling Approaches
6.5.3.1 Search Based Techniques
Pseudocode for Search-Based Sampling Algorithms
6.5.3.2 Random Walk-Based Techniques
6.6 Applications of Machine Learning in the Econometrics of Networks
6.6.1 Applications of Machine Learning in Spatial Models
6.6.2 Gravity Models for Flow Prediction
6.6.3 The Geographically Weighted Regression Model and ML
6.7 Concluding Remarks
References
Chapter 7 Fairness in Machine Learning and Econometrics
7.1 Introduction
7.2 Examples in Econometrics
7.2.1 Linear IV Model
7.2.2 A Nonlinear IV Model with Binary Sensitive Attribute
7.2.3 Fairness and Structural Econometrics
7.3 Fairness for Inverse Problems
7.4 Full Fairness IV Approximation
7.4.1 Projection onto Fairness
7.4.2 Fair Solution of the Structural IV Equation
7.4.3 Approximate Fairness
7.5 Estimation with an Exogenous Binary Sensitive Attribute
7.6 An Illustration
7.7 Conclusions
References
Chapter 8 Graphical Models and their Interactions with Machine Learning in the Context of Economics and Finance
8.1 Introduction
8.1.1 Notation
8.2 Graphical Models: Methodology and Existing Approaches
8.2.1 Graphical LASSO
8.2.2 Nodewise Regression
8.2.3 CLIME
8.2.4 Solution Techniques
8.3 Graphical Models in the Context of Finance
8.3.1 The No-Short-Sale Constraint and Shrinkage
8.3.2 The A-Norm Constraint and Shrinkage
8.3.3 Classical Graphical Models for Finance
8.3.4 Augmented Graphical Models for Finance Applications
8.4 Graphical Models in the Context of Economics
8.4.1 Forecast Combinations
8.4.2 Vector Autoregressive Models
8.5 Further Integration of Graphical Models with Machine Learning
References
Chapter 9 Poverty, Inequality and Development Studies with Machine Learning
9.1 Introduction
9.2 Measurement and Forecasting
9.2.1 Combining Sources to Improve Data Availability
9.2.2 More Granular Measurements
9.2.2.1 Data Visualization and High-Resolution Maps
9.2.2.2 Interpolation
9.2.2.3 Extended Regional Coverage
9.2.2.4 Extrapolation
9.2.3 Dimensionality Reduction
9.2.4 Data Imputation
9.2.5 Methods
9.3 Causal Inference
9.3.1 Heterogeneous Treatment Effects
9.3.2 Optimal Treatment Assignment
9.3.3 Handling High-Dimensional Data and Debiased ML
9.3.4 Machine-Building Counterfactuals
9.3.5 New Data Sources for Outcomes and Treatments
9.3.6 Combining Observational and Experimental Data
9.4 Computing Power and Tools
9.5 Concluding Remarks
References
Chapter 10 Machine Learning for Asset Pricing
10.1 Introduction
10.2 How Machine Learning Techniques Can Help Identify Stochastic Discount Factors
10.3 How Machine Learning Techniques Can Test/Evaluate Asset Pricing Models
10.4 How Machine Learning Techniques Can Estimate Linear Factor Models
10.4.1 Gagliardini, Ossola, and Scaillet’s (2016) Econometric Two-Pass Approach for Assessing Linear Factor Models
10.4.2 Kelly, Pruitt, and Su’s (2019) Instrumented Principal Components Analysis
10.4.3 Gu, Kelly, and Xiu’s (2021) Autoencoder
10.4.4 Kozak, Nagel, and Santosh’s (2020) Regularized Bayesian Approach
10.4.5 Which Factors to Choose and How to Deal with Weak Factors?
10.5 How Machine Learning Can Predict in Empirical Asset Pricing
10.6 Concluding Remarks
Appendix 1: An Upper Bound for the Sharpe Ratio
Appendix 2: A Comparison of Different PCA Approaches
References
Appendix A Terminology
A.1 Introduction
A.2 Terms