Learn the skills necessary to design, build, and deploy applications powered by machine learning (ML). Through the course of this hands-on book, you’ll build an example ML-driven application from initial idea to deployed product. Data scientists, software engineers, and product managers—including experienced practitioners and novices alike—will learn the tools, best practices, and challenges involved in building a real-world ML application step by step.
Author Emmanuel Ameisen, an experienced data scientist who led an AI education program, demonstrates practical ML concepts using code snippets, illustrations, screenshots, and interviews with industry leaders. Part I teaches you how to plan an ML application and measure success. Part II explains how to build a working ML model. Part III demonstrates ways to improve the model until it fulfills your original vision. Part IV covers deployment and monitoring strategies.
This book will help you:
• Define your product goal and set up a machine learning problem
• Build your first end-to-end pipeline quickly and acquire an initial dataset
• Train and evaluate your ML models and address performance bottlenecks
• Deploy and monitor your models in a production environment
Author(s): Emmanuel Ameisen
Edition: 1
Publisher: O'Reilly Media
Year: 2020
Language: English
Commentary: Vector PDF
Pages: 260
City: Sebastopol, CA
Tags: Machine Learning; Debugging; Recommender Systems; Classification; Monitoring; Pipelines; Deployment; Application Development; Model Evaluation
Copyright
Table of Contents
Preface
The Goal of Using Machine Learning Powered Applications
Use ML to Build Practical Applications
Additional Resources
Practical ML
What This Book Covers
Prerequisites
Our Case Study: ML-Assisted Writing
The ML Process
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
Part I. Find the Correct ML Approach
Chapter 1. From Product Goal to ML Framing
Estimate What Is Possible
Models
Data
Framing the ML Editor
Trying to Do It All with ML: An End-to-End Framework
The Simplest Approach: Being the Algorithm
Middle Ground: Learning from Our Experience
Monica Rogati: How to Choose and Prioritize ML Projects
Conclusion
Chapter 2. Create a Plan
Measuring Success
Business Performance
Model Performance
Freshness and Distribution Shift
Speed
Estimate Scope and Challenges
Leverage Domain Expertise
Stand on the Shoulders of Giants
ML Editor Planning
Initial Plan for an Editor
Always Start with a Simple Model
To Make Regular Progress: Start Simple
Start with a Simple Pipeline
Pipeline for the ML Editor
Conclusion
Part II. Build a Working Pipeline
Chapter 3. Build Your First End-to-End Pipeline
The Simplest Scaffolding
Prototype of an ML Editor
Parse and Clean Data
Tokenizing Text
Generating Features
Test Your Workflow
User Experience
Modeling Results
ML Editor Prototype Evaluation
Model
User Experience
Conclusion
Chapter 4. Acquire an Initial Dataset
Iterate on Datasets
Do Data Science
Explore Your First Dataset
Be Efficient, Start Small
Insights Versus Products
A Data Quality Rubric
Label to Find Data Trends
Summary Statistics
Explore and Label Efficiently
Be the Algorithm
Data Trends
Let Data Inform Features and Models
Build Features Out of Patterns
ML Editor Features
Robert Munro: How Do You Find, Label, and Leverage Data?
Conclusion
Part III. Iterate on Models
Chapter 5. Train and Evaluate Your Model
The Simplest Appropriate Model
Simple Models
From Patterns to Models
Split Your Dataset
ML Editor Data Split
Judge Performance
Evaluate Your Model: Look Beyond Accuracy
Contrast Data and Predictions
Confusion Matrix
ROC Curve
Calibration Curve
Dimensionality Reduction for Errors
The Top-k Method
Other Models
Evaluate Feature Importance
Directly from a Classifier
Black-Box Explainers
Conclusion
Chapter 6. Debug Your ML Problems
Software Best Practices
ML-Specific Best Practices
Debug Wiring: Visualizing and Testing
Start with One Example
Test Your ML Code
Debug Training: Make Your Model Learn
Task Difficulty
Optimization Problems
Debug Generalization: Make Your Model Useful
Data Leakage
Overfitting
Consider the Task at Hand
Conclusion
Chapter 7. Using Classifiers for Writing Recommendations
Extracting Recommendations from Models
What Can We Achieve Without a Model?
Extracting Global Feature Importance
Using a Model’s Score
Extracting Local Feature Importance
Comparing Models
Version 1: The Report Card
Version 2: More Powerful, More Unclear
Version 3: Understandable Recommendations
Generating Editing Recommendations
Conclusion
Part IV. Deploy and Monitor
Chapter 8. Considerations When Deploying Models
Data Concerns
Data Ownership
Data Bias
Systemic Bias
Modeling Concerns
Feedback Loops
Inclusive Model Performance
Considering Context
Adversaries
Abuse Concerns and Dual-Use
Chris Harland: Shipping Experiments
Conclusion
Chapter 9. Choose Your Deployment Option
Server-Side Deployment
Streaming Application or API
Batch Predictions
Client-Side Deployment
On Device
Browser Side
Federated Learning: A Hybrid Approach
Conclusion
Chapter 10. Build Safeguards for Models
Engineer Around Failures
Input and Output Checks
Model Failure Fallbacks
Engineer for Performance
Scale to Multiple Users
Model and Data Life Cycle Management
Data Processing and DAGs
Ask for Feedback
Chris Moody: Empowering Data Scientists to Deploy Models
Conclusion
Chapter 11. Monitor and Update Models
Monitoring Saves Lives
Monitoring to Inform Refresh Rate
Monitor to Detect Abuse
Choose What to Monitor
Performance Metrics
Business Metrics
CI/CD for ML
A/B Testing and Experimentation
Other Approaches
Conclusion
Index
About the Author
Colophon