Bayesian Nonparametrics for Causal Inference and Missing Data


Bayesian Nonparametrics for Causal Inference and Missing Data provides an overview of flexible Bayesian nonparametric (BNP) methods for modeling joint or conditional distributions and functional relationships, and their interplay with causal inference and missing data. The book emphasizes the importance of making untestable assumptions to identify estimands of interest, such as the missing at random (MAR) assumption for missing data and unconfoundedness for causal inference in observational studies. Unlike parametric methods, the BNP approach can accommodate possible violations of these assumptions and minimize concerns about model misspecification. The overall strategy is to first specify a BNP model for the observed data and then to impose additional uncheckable assumptions to identify the estimands of interest.
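The two-step strategy described above (model the observed data, then apply identifying assumptions) can be illustrated with a minimal Bayesian g-computation sketch. This is a toy Python example with a simple normal linear outcome model standing in for a flexible BNP prior such as BART; all variable names and the simulated data are illustrative, not drawn from the book.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy observed data: binary treatment a, confounder x, outcome y.
n = 500
x = rng.normal(size=n)
a = rng.binomial(1, 1 / (1 + np.exp(-x)))    # treatment depends on x (confounding)
y = 1.0 * a + 0.5 * x + rng.normal(size=n)   # true average treatment effect = 1.0

# Step 1: model the observed data.  A conjugate-style normal linear model
# stands in here for a flexible BNP outcome model.
X = np.column_stack([np.ones(n), a, x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (n - 3)
cov = sigma2_hat * np.linalg.inv(X.T @ X)

# Step 2: draw from the (approximate) posterior of the model parameters.
draws = rng.multivariate_normal(beta_hat, cov, size=2000)

# Step 3: g-computation under unconfoundedness -- for each posterior draw,
# average predicted outcomes over the empirical covariate distribution with
# treatment set to 1 and to 0, and take the difference.
X1 = np.column_stack([np.ones(n), np.ones(n), x])
X0 = np.column_stack([np.ones(n), np.zeros(n), x])
ate_draws = (X1 @ draws.T - X0 @ draws.T).mean(axis=0)

print(ate_draws.mean())                        # posterior mean of the ATE
print(np.quantile(ate_draws, [0.025, 0.975]))  # 95% credible interval
```

The key point is that the identifying assumption (unconfoundedness) enters only in Step 3, where predictions are averaged as if treatment were assigned independently of x; Steps 1 and 2 involve only the observed-data model.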

The book is divided into three parts. Part I develops the key concepts in causal inference and missing data and reviews relevant concepts in Bayesian inference. Part II introduces the fundamental BNP tools required to address causal inference and missing data problems. Part III shows how the BNP approach can be applied in a variety of case studies. The datasets in the case studies come from electronic health records data, survey data, cohort studies, and randomized clinical trials.

Features

• Thorough discussion of BNP methods and their interplay with causal inference and missing data

• How to use BNP and g-computation for causal inference and non-ignorable missingness

• How to derive and calibrate sensitivity parameters to assess sensitivity to deviations from uncheckable causal and/or missingness assumptions

• Detailed case studies illustrating the application of BNP methods to causal inference and missing data

• R code and/or packages to implement BNP in causal inference and missing data problems
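The sensitivity-analysis idea in the features above can be sketched with a toy pattern-mixture example: a sensitivity parameter delta shifts the mean among nonrespondents relative to the observed mean, with delta = 0 recovering MAR. This is an illustrative Python stand-in (the book's own implementations are in R), and the simulated data and names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: outcome y is observed only when r == 1.
n = 1000
y = rng.normal(loc=2.0, scale=1.0, size=n)
r = rng.binomial(1, 0.7, size=n)
y_obs = y[r == 1]
p_miss = 1 - r.mean()

# Pattern-mixture sensitivity analysis: the nonrespondent mean is assumed to
# equal the observed mean plus delta (an unidentified sensitivity parameter).
# The marginal mean is the mixture of the two pattern means.
for delta in [-1.0, -0.5, 0.0, 0.5, 1.0]:
    mu = (1 - p_miss) * y_obs.mean() + p_miss * (y_obs.mean() + delta)
    print(f"delta = {delta:+.1f}  ->  estimated mean = {mu:.3f}")
```

Because delta is not identified by the observed data, it is fixed (or given a prior) rather than estimated; calibrating its plausible range is the subject of the sensitivity-analysis material in Part I.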

The book is primarily aimed at researchers and graduate students from statistics and biostatistics. It will also serve as a useful practical reference for mathematically sophisticated epidemiologists and medical researchers.

Author(s): Michael J. Daniels, Antonio Linero, Jason Roy
Series: Chapman & Hall/CRC Monographs on Statistics and Applied Probability
Publisher: CRC Press/Chapman & Hall
Year: 2023

Language: English
Pages: 262
City: Boca Raton

Cover
Half Title
Series Page
Title Page
Copyright Page
Dedication
Contents
Preface
I. Overview of Bayesian inference in causal inference and missing data and identifiability
1. Overview of causal inference
1.1. Introduction
1.1.1. Types of Causal Effects
1.1.2. Identifiability and Causal Assumptions
1.2. The g-Formula
1.2.1. Time-Dependent Confounding
1.2.2. Bayesian Nonparametrics and the g-Formula
1.3. Propensity Scores
1.3.1. Covariate Balance
1.3.2. Conditioning on the Propensity Score
1.3.3. Positivity and Overlap
1.4. Marginal Structural Models
1.5. Principal Stratification
1.6. Causal Mediation
1.7. Summary
2. Overview of missing data
2.1. Introduction
2.2. Overview of Missing Data
2.2.1. What is “Missing Data?”
2.2.2. Full vs. Observed Data
2.2.3. Notation and Data Structures
2.2.4. Processes Leading to Missing Data
2.3. Defining Estimands with Missing Data
2.4. Classification of Missing Data Mechanisms
2.4.1. Missing Completely at Random (MCAR)
2.4.2. Missing at Random (MAR)
2.4.3. Missing Not at Random (MNAR)
2.4.4. Everywhere MAR and MCAR
2.4.5. Identifiability of Estimands under MCAR, MAR, and MNAR
2.4.6. Deciding between MCAR, MAR, and MNAR
2.5. Ignorable versus Non-Ignorable Missingness
2.6. Types of Non-Ignorable Models
2.6.1. Selection Models
2.6.2. Pattern Mixture Models
2.6.3. Shared Parameter Models
2.6.4. Observed Data Modeling Strategies
2.7. Summary and a Look Forward
3. Overview of Bayesian inference for missing data and causal inference
3.1. The Posterior Distribution
3.2. Priors and Identifiability
3.2.1. Priors in General
3.2.2. Priors for Unidentified Parameters
3.2.3. Priors for the Distribution of Covariates
3.3. Computation of the Posterior
3.3.1. An Overview of Markov Chain Monte Carlo
3.3.2. Gibbs Sampling
3.3.3. The Metropolis-Hastings Algorithm
3.3.4. Slice Sampling
3.3.5. Hamiltonian Monte Carlo
3.3.6. Drawing Inferences from MCMC Output
3.4. Model Selection/Checking
3.4.1. Model Selection
3.4.2. Model Checking
3.5. Data Augmentation
3.6. Bayesian g-Computation
3.7. Summary
4. Identifiability and sensitivity analysis
4.1. Calibration of Sensitivity Parameters
4.2. Identifiability
4.2.1. Sensitivity to the Ignorability Assumption for Causal Inference with a Point Treatment
4.2.2. Sensitivity to the Sequential Ignorability Assumption for Causal Mediation
4.2.3. Monotonicity Assumptions for Principal Stratification
4.3. Monotone Restrictions
4.3.1. Pattern Mixture Alternatives to MAR
4.3.2. The Non-Future Dependence Assumption
4.3.3. Completing the NFD Specification
4.3.4. g-Computation for Interior Family Restrictions
4.3.5. g-Computation for the NFD Restriction
4.3.6. Differential Reasons for Dropout
4.4. Non-Monotone Restrictions
4.4.1. The Partial Missing at Random Assumption
4.4.2. Generic Non-Monotone Restrictions
4.4.3. Computation of Treatment Effects under Non-Monotone Missingness
4.4.4. Strategies for Introducing Sensitivity Parameters
4.5. Summary
II. Bayesian nonparametrics for causal inference and missing data
5. Bayesian decision trees and their ensembles
5.1. Motivation: The Need for Priors on Functions
5.1.1. Nonparametric Binary Regression and Semiparametric Gaussian Regression
5.1.2. Running Example: Medical Expenditure Data
5.2. From Basis Expansions to Tree Ensembles
5.3. Bayesian Additive Regression Trees
5.3.1. Decision Trees
5.3.2. Priors over Decision Trees
5.3.3. Ensembles of Decision Trees
5.3.4. Prior Specification for Bayesian Additive Regression Trees
5.3.5. Posterior Computation for Bayesian Additive Regression Trees
5.3.6. Non-Bayesian Approaches
5.4. Bayesian Additive Regression Trees Applied to Causal Inference
5.4.1. Estimating the Outcome Regression Function
5.4.2. Regularization-Induced Confounding and Bayesian Causal Forests
5.5. BART Models for Other Data Types
5.6. Summary
6. Dirichlet process mixtures and extensions
6.1. Motivation for Dirichlet Process Mixtures
6.2. Dirichlet Process Priors
6.3. Dirichlet Process Mixtures (DPMs)
6.3.1. Posterior Computations
6.3.2. DPMs for Causal Inference and Missing Data
6.3.3. Shahbaba and Neal DPM
6.3.4. Priors on Parameters of the Base Measure
6.4. Enriched Dirichlet Process Mixtures (EDPMs)
6.4.1. EDPM for Causal Inference and Missing Data
6.4.2. Posterior Computations
6.4.2.1. MCMC
6.4.2.2. Post-Processing Steps (after MCMC): g-Computation
6.5. Summary
7. Gaussian process priors and dependent Dirichlet processes
7.1. Motivation: Alternate Priors for Functions and Nonparametric Modeling of Conditional Distributions
7.2. Gaussian Process Priors
7.2.1. Normal Outcomes
7.2.2. Binary or Count Outcomes
7.2.3. Priors on GP Parameters
7.2.4. Posterior Computations
7.2.5. GP for Causal Inference
7.3. Dependent Dirichlet Process Priors
7.3.1. Sampling Algorithms
7.3.2. DDP+GP for Causal Inference
7.3.3. Considerations for Choosing between Various DP Mixture Models
7.4. Summary
III. Case studies
8. Causal inference on quantiles using propensity scores
8.1. EHR Data and Questions of Interest
8.2. Methods
8.3. Analysis
8.4. Conclusions
9. Causal inference with a point treatment using an EDPM model
9.1. Hepatic Safety of Therapies for HIV/HCV Coinfection
9.2. Methods
9.3. Analysis
9.4. Conclusions
10. DDP+GP for causal inference using marginal structural models
10.1. Changes in Neurocognitive Function among Individuals with HIV
10.2. Methods
10.3. Analysis
10.4. Conclusions
11. DPMs for dropout in longitudinal studies
11.1. Schizophrenia Clinical Trial
11.2. Methods
11.3. Posterior Computation
11.4. Analysis
11.5. Conclusions
12. DPMs for non-monotone missingness
12.1. The Breast Cancer Prevention Trial (BCPT)
12.2. Methods
12.3. Posterior Computation
12.4. Analysis
12.5. Conclusions
13. Causal mediation using DPMs
13.1. STRIDE Project
13.2. Methods
13.3. Analysis
13.4. Conclusions
14. Causal mediation using BART
14.1. Motivation
14.2. Methods
14.3. Regularization-Induced Confounding and the Prior on Selection Bias
14.4. Results
14.5. Conclusions
15. Causal analysis of semicompeting risks using a principal stratification estimand and DDP+GP
15.1. Brain Cancer Clinical Trial
15.2. Methods
15.3. Analysis
15.4. Conclusions
Bibliography
Index