Apply advanced techniques for optimizing machine learning processes. Bayesian optimization helps pinpoint the best configuration for your machine learning models with speed and accuracy.
In Bayesian Optimization in Action you will learn how to:
• Train Gaussian processes on both sparse and large datasets
• Combine Gaussian processes with deep neural networks to make them flexible and expressive
• Find the most successful strategies for hyperparameter tuning
• Navigate a search space and identify high-performing regions
• Apply Bayesian optimization to practical use cases such as cost-constrained, multi-objective, and preference optimization
• Use PyTorch, GPyTorch, and BoTorch to implement Bayesian optimization
Bayesian Optimization in Action shows you how to optimize hyperparameter tuning, A/B testing, and other aspects of the machine learning process by applying cutting-edge Bayesian techniques. Using clear language, illustrations, and concrete examples, this book proves that Bayesian optimization doesn’t have to be difficult! You’ll get in-depth insights into how Bayesian optimization works and learn how to implement it with modern Python libraries. The book’s easy-to-reuse code samples let you hit the ground running by plugging them straight into your own projects.
About the technology
Experimenting in science and engineering can be costly and time-consuming, especially without a reliable way to narrow down your choices. Bayesian optimization helps you identify optimal configurations to pursue in a search space. It uses a Gaussian process and machine learning techniques to model an objective function and quantify the uncertainty of predictions. Whether you’re tuning machine learning models, recommending products to customers, or engaging in research, Bayesian optimization can help you make better decisions, faster.
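The paragraph above names the two ingredients of Bayesian optimization: a Gaussian process that models the objective and quantifies the uncertainty of its predictions, and a decision rule that uses those predictions to pick the next configuration to try. The following is a minimal stdlib-only sketch of that idea (the book itself works in PyTorch, GPyTorch, and BoTorch); the toy data, the RBF kernel choice, and the Probability of Improvement policy here are illustrative assumptions, not the book's code.

```python
import math

def rbf(xa, xb, length_scale=1.0):
    # RBF (squared exponential) covariance between two scalar inputs
    return math.exp(-((xa - xb) ** 2) / (2 * length_scale ** 2))

def solve(A, b):
    # Solve A x = b by Gauss-Jordan elimination (small dense systems only)
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]

def gp_posterior(x_train, y_train, x, noise=1e-6):
    # Posterior mean and variance of a zero-mean GP at one test point x
    K = [[rbf(xi, xj) + (noise if i == j else 0.0)
          for j, xj in enumerate(x_train)] for i, xi in enumerate(x_train)]
    k_star = [rbf(xi, x) for xi in x_train]
    alpha = solve(K, y_train)   # K^{-1} y
    v = solve(K, k_star)        # K^{-1} k*
    mean = sum(ks * a for ks, a in zip(k_star, alpha))
    var = rbf(x, x) - sum(ks * vi for ks, vi in zip(k_star, v))
    return mean, max(var, 0.0)

def probability_of_improvement(mean, var, best_y):
    # PoI policy: P(f(x) > best observed value) under the GP posterior
    if var == 0.0:
        return 0.0
    z = (mean - best_y) / math.sqrt(var)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Toy data: three evaluations of an (expensive) black-box objective
x_train, y_train = [0.0, 1.0, 2.0], [0.5, 1.2, 0.9]
mean, var = gp_posterior(x_train, y_train, 1.5)
poi = probability_of_improvement(mean, var, max(y_train))
```

The optimization loop simply repeats this: score candidate points with the acquisition function, evaluate the objective at the best-scoring one, add the result to the training data, and refit the GP. Uncertainty collapses near observed points and stays large far from the data, which is exactly what lets the policy trade off exploration against exploitation.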
About the book
Bayesian Optimization in Action teaches you how to build Bayesian optimization systems from the ground up. This book transforms state-of-the-art research into usable techniques that you can easily put into practice, all fully illustrated with useful code samples. In it, you’ll hone your understanding of Bayesian optimization through engaging examples—from forecasting the weather, to finding the optimal amount of sugar for coffee, and even deciding if someone is psychic! Along the way, you’ll explore scenarios for when there are multiple objectives, when each decision has its own cost, and when feedback is in the form of pairwise comparisons. With this collection of techniques, you’ll be ready to find the optimal solution for everything from transport and logistics to cancer treatments.
About the reader
For machine learning practitioners who are confident in math and statistics.
About the author
Quan Nguyen is a Python programmer and machine learning enthusiast interested in solving decision-making problems that involve uncertainty. He has authored several books on Python programming and scientific computing, and is currently pursuing a Ph.D. in computer science at Washington University in St. Louis, where he researches Bayesian methods in machine learning.
Author(s): Quan Nguyen
Edition: 1
Publisher: Manning Publications
Year: 2023
Language: English
Commentary: Publisher's PDF
Pages: 424
City: Shelter Island, NY
Tags: Python; Hyperparameter Tuning; Gaussian Processes; Bayesian Optimization
Bayesian Optimization in Action
contents
forewords
preface
acknowledgments
about this book
Who should read this book?
How this book is organized: A roadmap
About the code
liveBook discussion forum
about the author
About the technical editor
about the cover illustration
1 Introduction to Bayesian optimization
1.1 Finding the optimum of an expensive black box function
1.1.1 Hyperparameter tuning as an example of an expensive black box optimization problem
1.1.2 The problem of expensive black box optimization
1.1.3 Other real-world examples of expensive black box optimization problems
1.2 Introducing Bayesian optimization
1.2.1 Modeling with a Gaussian process
1.2.2 Making decisions with a BayesOpt policy
1.2.3 Combining the GP and the optimization policy to form the optimization loop
1.2.4 BayesOpt in action
1.3 What will you learn in this book?
Summary
Part 1—Modeling with Gaussian processes
2 Gaussian processes as distributions over functions
2.1 How to sell your house the Bayesian way
2.2 Modeling correlations with multivariate Gaussian distributions and Bayesian updates
2.2.1 Using multivariate Gaussian distributions to jointly model multiple variables
2.2.2 Updating MVN distributions
2.2.3 Modeling many variables with high-dimensional Gaussian distributions
2.3 Going from a finite to an infinite Gaussian
2.4 Implementing GPs in Python
2.4.1 Setting up the training data
2.4.2 Implementing a GP class
2.4.3 Making predictions with a GP
2.4.4 Visualizing predictions of a GP
2.4.5 Going beyond one-dimensional objective functions
2.5 Exercise
Summary
3 Customizing a Gaussian process with the mean and covariance functions
3.1 The importance of priors in Bayesian models
3.2 Incorporating what you already know into a GP
3.3 Defining the functional behavior with the mean function
3.3.1 Using the zero mean function as the base strategy
3.3.2 Using the constant function with gradient descent
3.3.3 Using the linear function with gradient descent
3.3.4 Using the quadratic function by implementing a custom mean function
3.4 Defining variability and smoothness with the covariance function
3.4.1 Setting the scales of the covariance function
3.4.2 Controlling smoothness with different covariance functions
3.4.3 Modeling different levels of variability with multiple length scales
3.5 Exercise
Summary
Part 2—Making decisions with Bayesian optimization
4 Refining the best result with improvement-based policies
4.1 Navigating the search space in BayesOpt
4.1.1 The BayesOpt loop and policies
4.1.2 Balancing exploration and exploitation
4.2 Finding improvement in BayesOpt
4.2.1 Measuring improvement with a GP
4.2.2 Computing the Probability of Improvement
4.2.3 Running the PoI policy
4.3 Optimizing the expected value of improvement
4.4 Exercises
4.4.1 Exercise 1: Encouraging exploration with PoI
4.4.2 Exercise 2: BayesOpt for hyperparameter tuning
Summary
5 Exploring the search space with bandit-style policies
5.1 Introduction to the MAB problem
5.1.1 Finding the best slot machine at a casino
5.1.2 From MAB to BayesOpt
5.2 Being optimistic under uncertainty with the Upper Confidence Bound policy
5.2.1 Optimism under uncertainty
5.2.2 Balancing exploration and exploitation
5.2.3 Implementation with BoTorch
5.3 Smart sampling with the Thompson sampling policy
5.3.1 One sample to represent the unknown
5.3.2 Implementation with BoTorch
5.4 Exercises
5.4.1 Exercise 1: Setting an exploration schedule for the UCB
5.4.2 Exercise 2: BayesOpt for hyperparameter tuning
Summary
6 Using information theory with entropy-based policies
6.1 Measuring knowledge with information theory
6.1.1 Measuring uncertainty with entropy
6.1.2 Looking for a remote control using entropy
6.1.3 Binary search using entropy
6.2 Entropy search in BayesOpt
6.2.1 Searching for the optimum using information theory
6.2.2 Implementing entropy search with BoTorch
6.3 Exercises
6.3.1 Exercise 1: Incorporating prior knowledge into entropy search
6.3.2 Exercise 2: Bayesian optimization for hyperparameter tuning
Summary
Part 3—Extending Bayesian optimization to specialized settings
7 Maximizing throughput with batch optimization
7.1 Making multiple function evaluations simultaneously
7.1.1 Making use of all available resources in parallel
7.1.2 Why can’t we use regular BayesOpt policies in the batch setting?
7.2 Computing the improvement and upper confidence bound of a batch of points
7.2.1 Extending optimization heuristics to the batch setting
7.2.2 Implementing batch improvement and UCB policies
7.3 Exercise 1: Extending TS to the batch setting via resampling
7.4 Computing the value of a batch of points using information theory
7.4.1 Finding the most informative batch of points with cyclic refinement
7.4.2 Implementing batch entropy search with BoTorch
7.5 Exercise 2: Optimizing airplane designs
Summary
8 Satisfying extra constraints with constrained optimization
8.1 Accounting for constraints in a constrained optimization problem
8.1.1 Constraints can change the solution of an optimization problem
8.1.2 The constraint-aware BayesOpt framework
8.2 Constraint-aware decision-making in BayesOpt
8.3 Exercise 1: Manual computation of constrained EI
8.4 Implementing constrained EI with BoTorch
8.5 Exercise 2: Constrained optimization of airplane design
Summary
9 Balancing utility and cost with multifidelity optimization
9.1 Using low-fidelity approximations to study expensive phenomena
9.2 Multifidelity modeling with GPs
9.2.1 Formatting a multifidelity dataset
9.2.2 Training a multifidelity GP
9.3 Balancing information and cost in multifidelity optimization
9.3.1 Modeling the costs of querying different fidelities
9.3.2 Optimizing the amount of information per dollar to guide optimization
9.4 Measuring performance in multifidelity optimization
9.5 Exercise 1: Visualizing average performance in multifidelity optimization
9.6 Exercise 2: Multifidelity optimization with multiple low-fidelity approximations
Summary
10 Learning from pairwise comparisons with preference optimization
10.1 Black-box optimization with pairwise comparisons
10.2 Formulating a preference optimization problem and formatting pairwise comparison data
10.3 Training a preference-based GP
10.4 Preference optimization by playing king of the hill
Summary
11 Optimizing multiple objectives at the same time
11.1 Balancing multiple optimization objectives with BayesOpt
11.2 Finding the boundary of the most optimal data points
11.3 Seeking to improve the optimal data boundary
11.4 Exercise: Multiobjective optimization of airplane design
Summary
Part 4—Special Gaussian process models
12 Scaling Gaussian processes to large datasets
12.1 Training a GP on a large dataset
12.1.1 Setting up the learning task
12.1.2 Training a regular GP
12.1.3 Problems with training a regular GP
12.2 Automatically choosing representative points from a large dataset
12.2.1 Minimizing the difference between two GPs
12.2.2 Training the model in small batches
12.2.3 Implementing the approximate model
12.3 Optimizing better by accounting for the geometry of the loss surface
12.4 Exercise
Summary
13 Combining Gaussian processes with neural networks
13.1 Data that contains structures
13.2 Capturing similarity within structured data
13.2.1 Using a kernel with GPyTorch
13.2.2 Working with images in PyTorch
13.2.3 Computing the covariance of two images
13.2.4 Training a GP on image data
13.3 Using neural networks to process complex structured data
13.3.1 Why use neural networks for modeling?
13.3.2 Implementing the combined model in GPyTorch
Summary
Appendix—Solutions to the exercises
A.1 Chapter 2: Gaussian processes as distributions over functions
A.2 Chapter 3: Incorporating prior knowledge with the mean and covariance functions
A.3 Chapter 4: Refining the best result with improvement-based policies
A.3.1 Exercise 1: Encouraging exploration with Probability of Improvement
A.3.2 Exercise 2: BayesOpt for hyperparameter tuning
A.4 Chapter 5: Exploring the search space with bandit-style policies
A.4.1 Exercise 1: Setting an exploration schedule for Upper Confidence Bound
A.4.2 Exercise 2: BayesOpt for hyperparameter tuning
A.5 Chapter 6: Using information theory with entropy-based policies
A.5.1 Exercise 1: Incorporating prior knowledge into entropy search
A.5.2 Exercise 2: BayesOpt for hyperparameter tuning
A.6 Chapter 7: Maximizing throughput with batch optimization
A.6.1 Exercise 1: Extending TS to the batch setting via resampling
A.6.2 Exercise 2: Optimizing airplane designs
A.7 Chapter 8: Satisfying extra constraints with constrained optimization
A.7.1 Exercise 1: Manual computation of constrained EI
A.7.2 Exercise 2: Constrained optimization of airplane design
A.8 Chapter 9: Balancing utility and cost with multifidelity optimization
A.8.1 Exercise 1: Visualizing average performance in multifidelity optimization
A.8.2 Exercise 2: Multifidelity optimization with multiple low-fidelity approximations
A.9 Chapter 11: Optimizing multiple objectives at the same time
A.10 Chapter 12: Scaling Gaussian processes to large datasets
index