Bayesian Analysis with Python: Introduction to statistical modeling and probabilistic programming using PyMC3 and ArviZ

Bayesian modeling with PyMC3 and exploratory analysis of Bayesian models with ArviZ

Key Features
• A step-by-step guide to conducting Bayesian data analyses using PyMC3 and ArviZ
• A modern, practical, and computational approach to Bayesian statistical modeling
• A tutorial for Bayesian analysis and best practices, with sample problems and practice exercises

Book Description
The second edition of Bayesian Analysis with Python is an introduction to the main concepts of applied Bayesian inference and their practical implementation in Python using PyMC3, a state-of-the-art probabilistic programming library, and ArviZ, a new library for exploratory analysis of Bayesian models. The main concepts of Bayesian statistics are covered using a practical and computational approach. Synthetic and real data sets are used to introduce several types of models, such as generalized linear models for regression and classification, mixture models, hierarchical models, and Gaussian processes, among others. By the end of the book, you will have a working knowledge of probabilistic modeling and will be able to design and implement Bayesian models for your own data science problems. After reading the book, you will be better prepared to delve into more advanced material or specialized statistical modeling if you need to.
What you will learn
• Build probabilistic models using the Python library PyMC3
• Analyze probabilistic models with the help of ArviZ
• Acquire the skills required to sanity-check models and modify them if necessary
• Understand the advantages and caveats of hierarchical models
• Find out how different models can be used to answer different data analysis questions
• Compare models and choose between alternatives
• Discover how different models are unified from a probabilistic perspective
• Think probabilistically and benefit from the flexibility of the Bayesian framework

Who this book is for
If you are a student, data scientist, researcher, or developer looking to get started with Bayesian data analysis and probabilistic programming, this book is for you. The book is introductory, so no previous statistical knowledge is required, although some experience with Python and NumPy is expected.
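As a taste of the coin-flipping problem the book opens with, here is a minimal sketch using only the Python standard library. With a Beta prior and a Binomial likelihood, conjugacy gives the posterior in closed form; for non-conjugate models, PyMC3 obtains the posterior by sampling instead. The function names and the 6-heads-in-9-tosses data below are illustrative, not taken from the book.

```python
# Coin flipping with a Beta(a, b) prior and a Binomial likelihood.
# Conjugacy gives the posterior in closed form: Beta(a + heads, b + tails).

def beta_binomial_posterior(heads, tails, a=1.0, b=1.0):
    """Posterior (alpha, beta) after observing `heads` and `tails`."""
    return a + heads, b + tails

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution."""
    return a / (a + b)

# 6 heads in 9 tosses, starting from a uniform Beta(1, 1) prior:
a_post, b_post = beta_binomial_posterior(6, 3)
posterior_mean = beta_mean(a_post, b_post)  # 7 / 11, about 0.636
```

The same update applied sequentially, one toss at a time, yields the identical posterior, which is a useful sanity check when first experimenting with Bayesian updating.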

Author(s): Osvaldo Martin
Edition: 2
Publisher: Packt Publishing
Year: 2018

Language: English
Commentary: Vector PDF
Pages: 356
City: Birmingham, UK
Tags: Python; Probabilistic Programming; Bayesian Inference; Statistics; Linear Regression; Logistic Regression; Poisson Process; PyMC3; Gaussian Processes; Markov Models; Markov Decision Process; ArviZ; Information Criteria

Chapter 1: Thinking Probabilistically
  Statistics, models, and this book's approach
  Working with data
  Bayesian modeling
  Interpreting probabilities
  Defining probabilities
  Probability distributions
  Independently and identically distributed variables
  Bayes' theorem
  The coin-flipping problem
  Choosing the likelihood
  Choosing the prior
  Getting the posterior
  Computing and plotting the posterior
  The influence of the prior and how to choose one
  Model notation and visualization
  Highest-posterior density
  Posterior predictive checks
  Summary
  Exercises
Chapter 2: Programming Probabilistically
  Probabilistic programming
  PyMC3 primer
  Model specification
  Pushing the inference button
  Summarizing the posterior
  ROPE
  Loss functions
  Gaussian inferences
  Robust inferences
  Student's t-distribution
  Groups comparison
  Cohen's d
  Probability of superiority
  The tips dataset
  Hierarchical models
  Shrinkage
  One more example
  Summary
  Exercises
Chapter 3: Modeling with Linear Regression
  The machine learning connection
  The core of the linear regression models
  Linear models and high autocorrelation
  Modifying the data before running
  Interpreting and visualizing the posterior
  Pearson correlation coefficient
  Pearson coefficient from a multivariate Gaussian
  Robust linear regression
  Hierarchical linear regression
  Correlation, causation, and the messiness of life
  Polynomial regression
  Polynomial regression – the ultimate model?
  Multiple linear regression
  Confounding variables and redundant variables
  Multicollinearity or when the correlation is too high
  Masking effect variables
  Adding interactions
  Variable variance
  Exercises
Chapter 4: Generalizing Linear Models
  Generalized linear models
  Logistic regression
  The logistic model
  The Iris dataset
  The logistic model applied to the iris dataset
  The boundary decision
  Implementing the model
  Interpreting the coefficients of a logistic regression
  Dealing with correlated variables
  Dealing with unbalanced classes
  Softmax regression
  Discriminative and generative models
  Poisson distribution
  The zero-inflated Poisson model
  Poisson regression and ZIP regression
  Robust logistic regression
  The GLM module
  Summary
  Exercises
Chapter 5: Model Comparison
  Posterior predictive checks
  Occam's razor – simplicity and accuracy
  Too many parameters leads to overfitting
  The balance between simplicity and accuracy
  Predictive accuracy measures
  Cross-validation
  Log-likelihood and deviance
  Akaike information criterion
  Pareto smoothed importance sampling leave-one-out cross-validation
  Model comparison with PyMC3
  Model averaging
  Bayes factors
  Some remarks
  Computing Bayes factors
  Common problems when computing Bayes factors
  Using Sequential Monte Carlo to compute Bayes factors
  Bayes factors and Information Criteria
  Regularizing priors
  WAIC in depth
  Entropy
  Kullback-Leibler divergence
  Summary
  Exercises
Chapter 6: Mixture Models
  Mixture models
  Finite mixture models
  The categorical distribution
  The Dirichlet distribution
  Non-identifiability of mixture models
  How to choose K
  Non-finite mixture model
  Dirichlet process
  Beta-binomial and negative binomial
  The Student's t-distribution
  Summary
  Exercises
Chapter 7: Gaussian Processes
  Linear models and non-linear data
  Modeling functions
  Covariance functions and kernels
  Gaussian processes
  Gaussian process regression
  Regression with spatial autocorrelation
  Gaussian process classification
  The coal-mining disasters
  The redwood dataset
  Exercises
Chapter 8: Inference Engines
  Inference engines
  Grid computing
  Quadratic method
  Variational methods
  Markovian methods
  Monte Carlo
  Metropolis-Hastings
  Hamiltonian Monte Carlo
  Sequential Monte Carlo
  Diagnosing the samples
  Convergence
  Autocorrelation
  Effective sample sizes
  Divergences
  Non-centered parameterization
  Exercises
Chapter 9: Where To Go Next?
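Among the inference engines covered in Chapter 8 is Metropolis-Hastings. The following is a minimal, self-contained sketch of a random-walk Metropolis sampler for a standard normal target, using only the Python standard library; the function and parameter names are illustrative, and PyMC3's default NUTS sampler is far more sophisticated than this toy version.

```python
import math
import random

def metropolis(logp, start, n_samples, step=1.0, seed=42):
    """Random-walk Metropolis: propose a Gaussian jump, accept it with
    probability min(1, p(proposal) / p(current)), computed in log space."""
    rng = random.Random(seed)
    x = start
    lp = logp(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        lp_prop = logp(proposal)
        if math.log(rng.random()) < lp_prop - lp:  # acceptance test
            x, lp = proposal, lp_prop
        samples.append(x)  # on rejection, the current point is repeated
    return samples

# Target: standard normal, known only up to a normalizing constant.
logp = lambda x: -0.5 * x * x
draws = metropolis(logp, start=0.0, n_samples=5000)
mean = sum(draws) / len(draws)
var = sum((d - mean) ** 2 for d in draws) / len(draws)
```

For this target, `mean` should land near 0 and `var` near 1; the diagnostics listed under Chapter 8 (convergence, autocorrelation, effective sample size) are how one checks such output in practice.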