Hamiltonian Monte Carlo Methods in Machine Learning introduces methods for the optimal tuning of HMC parameters, together with an introduction to Shadow and Non-canonical HMC methods and the improvements and speedups they offer. The authors also address the critical issue of variance reduction for parameter estimates of numerous HMC-based samplers. The book offers a comprehensive introduction to Hamiltonian Monte Carlo methods and a cutting-edge exposition of the current pathologies of HMC-based methods in tuning, scaling and sampling complex real-world posteriors. These pathologies arise mainly in scaling inference (e.g., to deep neural networks), tuning performance-sensitive sampling parameters and mitigating high sample autocorrelation.
Other sections provide numerous solutions to potential pitfalls, presenting advanced HMC methods with applications in renewable energy, finance and image classification for biomedical applications. Readers will become acquainted with both HMC sampling theory and algorithm implementation.
Provides in-depth analysis for conducting optimal tuning of Hamiltonian Monte Carlo (HMC) parameters
Presents readers with an introduction and improvements on Shadow HMC methods as well as non-canonical HMC methods
Demonstrates how to perform variance reduction for numerous HMC-based samplers
Includes source code from applications and algorithms
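To make the subject of the book concrete: Hamiltonian Monte Carlo augments the target parameters with auxiliary Gaussian momenta, simulates Hamiltonian dynamics with a leapfrog integrator, and applies a Metropolis accept/reject step. The following is a minimal, generic sketch of this standard scheme (it is an illustration, not code from the book; the function names `leapfrog` and `hmc_sample` and the Gaussian target are our own choices), with the step size and trajectory length as the performance-sensitive tuning parameters the book discusses:

```python
import numpy as np

def leapfrog(q, p, grad_U, step_size, n_steps):
    """Leapfrog integration of Hamiltonian dynamics for potential energy U."""
    q, p = q.copy(), p.copy()
    p -= 0.5 * step_size * grad_U(q)           # initial half step for momentum
    for _ in range(n_steps - 1):
        q += step_size * p                     # full step for position
        p -= step_size * grad_U(q)             # full step for momentum
    q += step_size * p                         # final full step for position
    p -= 0.5 * step_size * grad_U(q)           # final half step for momentum
    return q, -p                               # negate momentum for reversibility

def hmc_sample(U, grad_U, q0, n_samples, step_size=0.1, n_steps=20, seed=0):
    """Basic HMC: Gaussian momenta, leapfrog proposal, Metropolis correction."""
    rng = np.random.default_rng(seed)
    q = np.asarray(q0, dtype=float)
    samples = []
    for _ in range(n_samples):
        p = rng.standard_normal(q.shape)       # resample momentum each iteration
        current_H = U(q) + 0.5 * p @ p
        q_new, p_new = leapfrog(q, p, grad_U, step_size, n_steps)
        proposed_H = U(q_new) + 0.5 * p_new @ p_new
        if rng.random() < np.exp(current_H - proposed_H):  # Metropolis test
            q = q_new
        samples.append(q.copy())
    return np.array(samples)

# Toy target: standard 2-D Gaussian, so U(q) = 0.5 * ||q||^2
U = lambda q: 0.5 * q @ q
grad_U = lambda q: q
samples = hmc_sample(U, grad_U, np.zeros(2), n_samples=2000)
```

The variants the book develops (Magnetic, Quantum-Inspired, Shadow, No-U-Turn, Antithetic HMC) modify pieces of this loop: the dynamics being integrated, the Hamiltonian used in the acceptance test, the trajectory-length rule, or the momentum-resampling step.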
Author(s): Tshilidzi Marwala, Wilson Tsakane Mongwe, Rendani Mbuvha
Publisher: Elsevier
Year: 2023
Language: English
Commentary: true
Pages: 222
Front Cover
Hamiltonian Monte Carlo Methods in Machine Learning
Copyright
Contents
List of figures
List of tables
Authors
Tshilidzi Marwala
Wilson Tsakane Mongwe
Rendani Mbuvha
Foreword
Preface
Nomenclature
List of symbols
1 Introduction to Hamiltonian Monte Carlo
1.1 Introduction
1.2 Background to Markov Chain Monte Carlo
1.3 Metropolis-Hastings algorithm
1.4 Metropolis Adjusted Langevin algorithm
1.5 Hamiltonian Monte Carlo
1.6 Magnetic Hamiltonian Monte Carlo
1.7 Quantum-Inspired Hamiltonian Monte Carlo
1.8 Separable Shadow Hamiltonian Hybrid Monte Carlo
1.9 No-U-Turn Sampler algorithm
1.10 Antithetic Hamiltonian Monte Carlo
1.11 Book objectives
1.12 Book contributions
1.13 Conclusion
2 Sampling benchmarks and performance metrics
2.1 Benchmark problems and datasets
2.1.1 Banana shaped distribution
2.1.2 Multivariate Gaussian distributions
2.1.3 Neal's funnel density
2.1.4 Merton jump-diffusion process model
2.1.5 Bayesian logistic regression
2.1.6 Bayesian neural networks
2.1.7 Benchmark datasets
2.1.8 Processing of the datasets
2.2 Performance metrics
2.2.1 Effective sample size
2.2.2 Convergence analysis
2.2.3 Predictive performance on unseen data
2.3 Algorithm parameter tuning
2.4 Conclusion
3 Stochastic volatility Metropolis-Hastings
3.1 Proposed methods
3.2 Experiments
3.3 Results and discussion
3.4 Conclusion
4 Quantum-inspired magnetic Hamiltonian Monte Carlo
4.1 Proposed algorithm
4.2 Experiment description
4.2.1 Experiment settings
4.2.2 Sensitivity to the vol-of-vol parameter
4.3 Results and discussion
4.4 Conclusion
5 Generalised magnetic and shadow Hamiltonian Monte Carlo
5.1 Proposed partial momentum retention algorithms
5.2 Experiment description
5.2.1 Experiment settings
5.2.2 Sensitivity to momentum refreshment parameter
5.3 Results and discussion
5.4 Conclusion
6 Shadow Magnetic Hamiltonian Monte Carlo
6.1 Background
6.2 Shadow Hamiltonian for MHMC
6.3 Proposed Shadow Magnetic algorithm
6.4 Experiment description
6.4.1 Experiment settings
6.4.2 Sensitivity to momentum refreshment parameter
6.5 Results and discussion
6.6 Conclusion
7 Adaptive Shadow Hamiltonian Monte Carlo
7.1 Proposed adaptive shadow algorithm
7.2 Experiment description
7.3 Results and discussion
7.4 Conclusion
8 Adaptive noncanonical Hamiltonian Monte Carlo
8.1 Background
8.2 Proposed algorithm
8.3 Experiments
8.4 Results and discussion
8.5 Conclusions
9 Antithetic Hamiltonian Monte Carlo techniques
9.1 Proposed antithetic samplers
9.2 Experiment description
9.3 Results and discussion
9.4 Conclusion
10 Bayesian neural network inference in wind speed nowcasting
10.1 Background
10.1.1 Automatic relevance determination
10.1.1.1 Inference of ARD hyperparameters
10.1.1.2 ARD committees
10.2 Experiment setup
10.2.1 WASA meteorological datasets
10.2.2 Relationship to wind power
10.2.3 Performance evaluation
10.2.4 Preliminary step size tuning runs
10.3 Results and discussion
10.3.1 Sampling performance
10.3.2 Predictive performance with ARD
10.3.3 ARD committees and feature importance
10.3.4 Re-training BNNs on relevant features
10.4 Conclusion
11 A Bayesian analysis of the efficacy of Covid-19 lockdown measures
11.1 Background
11.1.1 Review of compartment models for Covid-19
11.1.2 Lockdown alert levels
11.2 Methods
11.2.1 Infection data
11.2.2 The adjusted SIR model
11.2.3 Parameter inference using the No-U-Turn sampler
11.2.3.1 Prior distributions
11.3 Results and discussion
11.3.1 Spreading rate under various lockdown alert levels
11.3.1.1 No restrictions (alert level 0): 5 March 2020 – 18 March 2020
11.3.1.2 Initial restrictions (adjusted alert level 0): 18 March 2020 – 25 March 2020
11.3.1.3 Alert level 5: 26 March 2020 – 30 April 2020
11.3.1.4 Alert level 4: 1 May 2020 – 31 May 2020
11.3.1.5 Alert level 3: 1 June 2020 – 17 August 2020
11.3.1.6 Alert level 2: 18 August 2020 – 20 September 2020
11.3.1.7 Alert level 1: 21 September 2020 – 28 December 2020
11.3.1.8 Adjusted level 3: 29 December 2020 – 28 February 2021
11.3.1.9 Adjusted alert levels [1-4]: 1 March 2021 – 18 December 2021
11.3.1.10 Transitions between alert levels and efficacy of restrictions
11.3.1.11 Recovery rate
11.4 Conclusion
12 Probabilistic inference of equity option prices under jump-diffusion processes
12.1 Background
12.1.1 Merton jump diffusion option pricing model
12.2 Numerical experiments
12.2.1 Data description
12.2.2 Experiment description
12.3 Results and discussions
12.4 Conclusion
13 Bayesian inference of local government audit outcomes
13.1 Background
13.2 Experiment description
13.2.1 Data description
13.2.2 Financial ratio calculation
13.2.3 Bayesian logistic regression with ARD
13.3 Results and discussion
13.4 Conclusion
14 Conclusions
14.1 Summary of contributions
14.2 Ongoing and future work
A Separable shadow Hamiltonian
A.1 Derivation of separable shadow Hamiltonian
A.2 S2HMC satisfies detailed balance
A.3 Derivatives from non-canonical Poisson brackets
B ARD posterior variances
C ARD committee feature selection
D Summary of audit outcome literature survey
References
Index
Back Cover