Experimentation for Engineers: From A/B testing to Bayesian optimization

Optimize the performance of your systems with practical experiments used by engineers in the world’s most competitive industries.

In Experimentation for Engineers: From A/B testing to Bayesian optimization you will learn how to:

• Design, run, and analyze an A/B test
• Break the “feedback loops” caused by periodic retraining of ML models
• Increase experimentation rate with multi-armed bandits
• Tune multiple parameters experimentally with Bayesian optimization
• Clearly define business metrics used for decision making
• Identify and avoid the common pitfalls of experimentation

Experimentation for Engineers: From A/B testing to Bayesian optimization is a toolbox of techniques for evaluating new features and fine-tuning parameters. You’ll start with a deep dive into methods like A/B testing, and then graduate to advanced techniques used to measure performance in industries such as finance and social media. Learn how to evaluate the changes you make to your system and ensure that your testing doesn’t undermine revenue or other business metrics. By the time you’re done, you’ll be able to seamlessly deploy experiments in production while avoiding common pitfalls.

About the technology
Does my software really work? Did my changes make things better or worse? Should I trade features for performance? Experimentation is the only way to answer questions like these. This unique book reveals sophisticated experimentation practices developed and proven in the world’s most competitive industries that will help you enhance machine learning systems, software applications, and quantitative trading solutions.

About the book
Experimentation for Engineers: From A/B testing to Bayesian optimization delivers a toolbox of processes for optimizing software systems. You’ll start by learning the limits of A/B testing, and then graduate to advanced experimentation strategies that take advantage of machine learning and probabilistic methods. The skills you’ll master in this practical guide will help you minimize the costs of experimentation and quickly reveal which approaches and features deliver the best business results.

What’s inside
• Design, run, and analyze an A/B test
• Break the “feedback loops” caused by periodic retraining of ML models
• Increase experimentation rate with multi-armed bandits
• Tune multiple parameters experimentally with Bayesian optimization

About the reader
For ML and software engineers looking to extract the most value from their systems. Examples in Python and NumPy.

About the author
David Sweet has worked as a quantitative trader at GETCO and a machine learning engineer at Instagram. He teaches in the AI and Data Science master’s programs at Yeshiva University.

Author(s): David Sweet
Edition: 1
Publisher: Manning Publications
Year: 2023

Language: English
Commentary: Publisher's PDF
Pages: 248
City: Shelter Island, NY
Tags: Machine Learning; Bayesian Inference; Statistics; Optimization; Statistical Inference; Business Analytics; A/B Testing; Metrics; Experiment Planning

Experimentation for Engineers
brief contents
contents
preface
acknowledgments
about this book
Who should read this book
How this book is organized: A road map
About the code
liveBook discussion forum
about the author
about the cover illustration
1 Optimizing systems by experiment
1.1 Examples of engineering workflows
1.1.1 Machine learning engineer’s workflow
1.1.2 Quantitative trader’s workflow
1.1.3 Software engineer’s workflow
1.2 Measuring by experiment
1.2.1 Experimental methods
1.2.2 Practical problems and pitfalls
1.3 Why are experiments necessary?
1.3.1 Domain knowledge
1.3.2 Offline model quality
1.3.3 Simulation
Summary
2 A/B testing: Evaluating a modification to your system
2.1 Take an ad hoc measurement
2.1.1 Simulate the trading system
2.1.2 Compare execution costs
2.2 Take a precise measurement
2.2.1 Mitigate measurement variation with replication
2.3 Run an A/B test
2.3.1 Analyze your measurements
2.3.2 Design the A/B test
2.3.3 Measure and analyze
2.3.4 Recap of A/B test stages
Summary
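
As a taste of where chapter 2 ends up, here is a minimal two-sample z-test sketch in Python/NumPy. The conversion rates and sample sizes are made up for illustration; this is not the book's code.

```python
import numpy as np
from math import erfc, sqrt

rng = np.random.default_rng(17)

# Hypothetical conversion data for control (A) and treatment (B).
a = rng.binomial(1, 0.100, size=10_000)
b = rng.binomial(1, 0.105, size=10_000)

# Difference in means and its standard error (independent samples).
delta = b.mean() - a.mean()
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))

# Two-sided z-test: how surprising is this delta if A and B are identical?
z = delta / se
p = erfc(abs(z) / sqrt(2))  # equals 2 * P(Z > |z|) for standard normal Z
print(f"delta={delta:.4f}  z={z:.2f}  p={p:.3f}")
```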
3 Multi-armed bandits: Maximizing business metrics while experimenting
3.1 Epsilon-greedy: Account for the impact of evaluation on business metrics
3.1.1 A/B testing as a baseline
3.1.2 The epsilon-greedy algorithm
3.1.3 Deciding when to stop
3.2 Evaluating multiple system changes simultaneously
3.3 Thompson sampling: A more efficient MAB algorithm
3.3.1 Estimate the probability that an arm is the best
3.3.2 Randomized probability matching
3.3.3 The complete algorithm
Summary
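
A compact sketch of the two bandit algorithms chapter 3 covers, epsilon-greedy and Thompson sampling, on hypothetical Bernoulli arms; illustrative only, not the book's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
true_rates = [0.04, 0.05, 0.06]   # hypothetical arm conversion rates
n_steps, epsilon = 10_000, 0.1

# Epsilon-greedy: explore with probability epsilon, else pull the best arm so far.
pulls, wins = np.zeros(3), np.zeros(3)
for _ in range(n_steps):
    if rng.random() < epsilon:
        arm = rng.integers(3)
    else:
        arm = np.argmax(wins / np.maximum(pulls, 1))
    pulls[arm] += 1
    wins[arm] += rng.random() < true_rates[arm]

# Thompson sampling: sample each arm's rate from its Beta posterior and pull
# the arm with the highest sample (randomized probability matching).
alpha, beta = np.ones(3), np.ones(3)
for _ in range(n_steps):
    arm = np.argmax(rng.beta(alpha, beta))
    reward = rng.random() < true_rates[arm]
    alpha[arm] += reward
    beta[arm] += 1 - reward

print("eps-greedy estimates:", wins / np.maximum(pulls, 1))
print("Thompson posterior means:", alpha / (alpha + beta))
```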
4 Response surface methodology: Optimizing continuous parameters
4.1 Optimize a single continuous parameter
4.1.1 Design: Choose parameter values to measure
4.1.2 Take the measurements
4.1.3 Analyze I: Interpolate between measurements
4.1.4 Analyze II: Optimize the business metric
4.1.5 Validate the optimal parameter value
4.2 Optimizing two or more continuous parameters
4.2.1 Design the two-parameter experiment
4.2.2 Measure, analyze, and validate the 2D experiment
Summary
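
The core RSM loop of chapter 4 (design, measure, interpolate, optimize) can be sketched in a few lines; the parameter values and the quadratic response below are stand-ins, not the book's example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Design and measure: evaluate a business metric at a few parameter values,
# with replication to beat down measurement noise.
x = np.repeat(np.array([0.5, 1.0, 1.5, 2.0, 2.5]), 20)
y = -(x - 1.4) ** 2 + rng.normal(0, 0.2, size=x.size)  # unknown true optimum at 1.4

# Analyze: interpolate with a quadratic response surface.
c2, c1, c0 = np.polyfit(x, y, deg=2)

# Optimize: with c2 < 0 the parabola's vertex estimates the best parameter value.
x_opt = -c1 / (2 * c2)
print(f"estimated optimum: {x_opt:.2f}")
```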
5 Contextual bandits: Making targeted decisions
5.1 Model a business metric offline to make decisions online
5.1.1 Model the business-metric outcome of a decision
5.1.2 Add the decision-making component
5.1.3 Run and evaluate the greedy recommender
5.2 Explore actions with epsilon-greedy
5.2.1 Missing counterfactuals degrade predictions
5.2.2 Explore with epsilon-greedy to collect counterfactuals
5.3 Explore parameters with Thompson sampling
5.3.1 Create an ensemble of prediction models
5.3.2 Randomized probability matching
5.4 Validate the contextual bandit
Summary
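
A rough sketch of a contextual bandit in the spirit of chapter 5: one linear reward model per action, with epsilon-greedy exploration to collect the counterfactuals the models would otherwise miss. All names and numbers are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
n_arms, dim, epsilon = 3, 4, 0.1
true_w = rng.normal(size=(n_arms, dim))   # hypothetical per-arm reward weights

# Logged (context, reward) pairs per arm; each arm gets its own linear model.
contexts = [[] for _ in range(n_arms)]
rewards = [[] for _ in range(n_arms)]

def predict(arm, x):
    # Least-squares fit of reward on context; force exploration until enough data.
    if len(rewards[arm]) < dim:
        return -np.inf
    w, *_ = np.linalg.lstsq(np.array(contexts[arm]), np.array(rewards[arm]), rcond=None)
    return x @ w

for _ in range(2_000):
    x = rng.normal(size=dim)                       # context for this decision
    if rng.random() < epsilon:                     # explore: collect counterfactuals
        arm = int(rng.integers(n_arms))
    else:                                          # exploit the fitted reward models
        arm = int(np.argmax([predict(a, x) for a in range(n_arms)]))
    reward = float(true_w[arm] @ x + rng.normal(0, 0.1))
    contexts[arm].append(x)
    rewards[arm].append(reward)

print("pulls per arm:", [len(r) for r in rewards])
```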
6 Bayesian optimization: Automating experimental optimization
6.1 Optimizing a single compiler parameter, a visual explanation
6.1.1 Simulate the compiler
6.1.2 Run the initial experiment
6.1.3 Analyze: Model the response surface
6.1.4 Design: Select the parameter value to measure next
6.1.5 Design: Balance exploration with exploitation
6.2 Model the response surface with Gaussian process regression
6.2.1 Estimate the expected CPU time
6.2.2 Estimate uncertainty with GPR
6.3 Optimize over an acquisition function
6.3.1 Minimize the acquisition function
6.4 Optimize all seven compiler parameters
6.4.1 Random search
6.4.2 A complete Bayesian optimization
Summary
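
A self-contained NumPy sketch of the Bayesian optimization loop chapter 6 walks through, with a squared-exponential GPR model and a lower-confidence-bound rule standing in for the acquisition function; the objective below is a stand-in, not the book's compiler simulation:

```python
import numpy as np

rng = np.random.default_rng(3)

def cpu_time(x):
    # Stand-in objective to minimize; the book simulates a compiler instead.
    return (x - 0.7) ** 2 + rng.normal(0, 0.01)

def kernel(a, b, length=0.15):
    # Squared-exponential covariance between 1-D parameter values.
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / length) ** 2)

def gpr(x_obs, y_obs, x_new, jitter=1e-4):
    # GPR posterior mean and standard deviation at x_new.
    K = kernel(x_obs, x_obs) + jitter * np.eye(len(x_obs))
    Ks = kernel(x_obs, x_new)
    solve = np.linalg.solve(K, np.column_stack([y_obs, Ks]))
    mean = Ks.T @ solve[:, 0]
    var = 1.0 - np.einsum("ij,ij->j", Ks, solve[:, 1:])
    return mean, np.sqrt(np.maximum(var, 0))

x_obs = np.array([0.1, 0.5, 0.9])              # initial design
y_obs = np.array([cpu_time(x) for x in x_obs])
grid = np.linspace(0, 1, 201)

for _ in range(10):
    mean, sd = gpr(x_obs, y_obs, grid)
    acq = mean - 2.0 * sd                      # LCB balances exploration/exploitation
    x_next = grid[np.argmin(acq)]              # design: next value to measure
    x_obs = np.append(x_obs, x_next)
    y_obs = np.append(y_obs, cpu_time(x_next))

print(f"best parameter so far: {x_obs[np.argmin(y_obs)]:.3f}")
```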
7 Managing business metrics
7.1 Focus on the business
7.1.1 Don’t evaluate a model
7.1.2 Evaluate the product
7.2 Define business metrics
7.2.1 Be specific to your business
7.2.2 Update business metrics periodically
7.2.3 Business metric timescales
7.3 Trade off multiple business metrics
7.3.1 Reduce negative side effects
7.3.2 Evaluate with multiple metrics
Summary
8 Practical considerations
8.1 Violations of statistical assumptions
8.1.1 Violation of the iid assumption
8.1.2 Nonstationarity
8.2 Don’t stop early
8.3 Control family-wise error
8.3.1 Cherry-picking increases the false-positive rate
8.3.2 Control false positives with the Bonferroni correction
8.4 Be aware of common biases
8.4.1 Confounder bias
8.4.2 Small-sample bias
8.4.3 Optimism bias
8.4.4 Experimenter bias
8.5 Replicate to validate results
8.5.1 Validate complex experiments
8.5.2 Monitor changes with a reverse A/B test
8.5.3 Measure quarterly changes with holdouts
8.6 Wrapping up
Summary
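
The Bonferroni correction from section 8.3.2 amounts to one comparison: with m tests, require p < alpha / m to hold the family-wise error rate at alpha. A sketch with made-up p-values:

```python
import numpy as np

# Hypothetical p-values from testing m variants against one control.
p_values = np.array([0.030, 0.011, 0.210, 0.004, 0.047])
alpha = 0.05

# Bonferroni: compare each p-value to alpha / m rather than alpha.
m = len(p_values)
significant = p_values < alpha / m
print(significant)   # only p=0.004 survives the correction
```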
Appendix A—Linear regression and the normal equations
A.1 Univariate linear regression
A.2 Multivariate linear regression
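
Appendix A's normal equations, w = (X^T X)^{-1} X^T y, translate directly to NumPy; a sketch with synthetic data (the weights and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

# Synthetic data: y = X @ w + noise, with an intercept column of ones.
X = np.column_stack([np.ones(100), rng.normal(size=(100, 2))])
w_true = np.array([1.0, 2.0, -0.5])
y = X @ w_true + rng.normal(0, 0.1, size=100)

# Normal equations, solved without forming an explicit matrix inverse.
w_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(w_hat)   # close to [1.0, 2.0, -0.5]
```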
Appendix B—One factor at a time
Appendix C—Gaussian process regression
index