MLOps Engineering at Scale

Dodge costly and time-consuming infrastructure tasks, and rapidly bring your machine learning models to production with MLOps and pre-built serverless tools! In MLOps Engineering at Scale you will learn:

• Extracting, transforming, and loading datasets
• Querying datasets with SQL
• Understanding automatic differentiation in PyTorch
• Deploying model training pipelines as a service endpoint
• Monitoring and managing your pipeline’s life cycle
• Measuring performance improvements

MLOps Engineering at Scale shows you how to put machine learning into production efficiently by using pre-built services from AWS and other cloud vendors. You’ll learn how to rapidly create flexible and scalable machine learning systems without laboring over time-consuming operational tasks or taking on the costly overhead of physical hardware. Following a real-world use case for calculating taxi fares, you will engineer an MLOps pipeline for a PyTorch model using AWS serverless capabilities.

About the technology
A production-ready machine learning system includes efficient data pipelines, integrated monitoring, and the means to scale up and down based on demand. Using cloud-based services to implement ML infrastructure reduces development time and lowers hosting costs. Serverless MLOps eliminates the need to build and maintain custom infrastructure, so you can concentrate on your data, models, and algorithms.

About the book
MLOps Engineering at Scale teaches you how to implement efficient machine learning systems using pre-built services from AWS and other cloud vendors. This easy-to-follow book guides you step-by-step as you set up your serverless ML infrastructure, even if you’ve never used a cloud platform before. You’ll also explore tools like PyTorch Lightning, Optuna, and MLFlow that make it easy to build pipelines and scale your deep learning models in production.

What's inside
• Reduce or eliminate ML infrastructure management
• Learn state-of-the-art MLOps tools like PyTorch Lightning and MLFlow
• Deploy training pipelines as a service endpoint
• Monitor and manage your pipeline’s life cycle
• Measure performance improvements

About the reader
Readers need to know Python, SQL, and the basics of machine learning. No cloud experience required.

About the author
Carl Osipov implemented his first neural net in 2000 and has worked on deep learning and machine learning at Google and IBM.
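As a taste of the PyTorch material the description mentions (automatic differentiation, covered in Part 2), the sketch below runs a toy linear-regression training step with autograd. It is an illustrative example only, not code from the book; the variable names (X, y, w, b) and the synthetic data are assumptions made for the sketch.

import torch

# Toy data: y = 3x + 1 with a little noise
X = torch.linspace(0, 1, steps=100).unsqueeze(1)
y = 3 * X + 1 + 0.05 * torch.randn_like(X)

# Model parameters tracked by autograd
w = torch.randn(1, requires_grad=True)
b = torch.zeros(1, requires_grad=True)

optimizer = torch.optim.SGD([w, b], lr=0.1)

for step in range(200):
    y_pred = X * w + b                    # forward pass
    loss = torch.mean((y_pred - y) ** 2)  # mean squared error
    optimizer.zero_grad()
    loss.backward()                       # autograd fills w.grad and b.grad
    optimizer.step()

print(w.item(), b.item())  # should approach 3 and 1

The book builds the same ideas out to out-of-memory data sets, GPUs, and distributed training, but this is the core loop that autograd and the PyTorch optimizers automate.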

Author(s): Carl Osipov
Edition: 1
Publisher: Manning
Year: 2022

Language: English
Commentary: Vector PDF
Pages: 344
City: Shelter Island, NY
Tags: Machine Learning; Pipelines; Scalability; Hyperparameter Tuning; PyTorch; Serverless Applications; MLOps; Datasets; Data Exploration; Feature Selection

MLOps Engineering at Scale
brief contents
contents
preface
acknowledgments
about this book
Who should read this book
How this book is organized: A road map
About the code
liveBook discussion forum
about the author
about the cover illustration
Part 1—Mastering the data set
1 Introduction to serverless machine learning
1.1 What is a machine learning platform?
1.2 Challenges when designing a machine learning platform
1.3 Public clouds for machine learning platforms
1.4 What is serverless machine learning?
1.5 Why serverless machine learning?
1.5.1 Serverless vs. IaaS and PaaS
1.5.2 Serverless machine learning life cycle
1.6 Who is this book for?
1.6.1 What you can get out of this book
1.7 How does this book teach?
1.8 When is this book not for you?
1.9 Conclusions
Summary
2 Getting started with the data set
2.1 Introducing the Washington, DC, taxi rides data set
2.1.1 What is the business use case?
2.1.2 What are the business rules?
2.1.3 What is the schema for the business service?
2.1.4 What are the options for implementing the business service?
2.1.5 What data assets are available for the business service?
2.1.6 Downloading and unzipping the data set
2.2 Starting with object storage for the data set
2.2.1 Understanding object storage vs. filesystems
2.2.2 Authenticating with Amazon Web Services
2.2.3 Creating a serverless object storage bucket
2.3 Discovering the schema for the data set
2.3.1 Introducing AWS Glue
2.3.2 Authorizing the crawler to access your objects
2.3.3 Using a crawler to discover the data schema
2.4 Migrating to columnar storage for more efficient analytics
2.4.1 Introducing column-oriented data formats for analytics
2.4.2 Migrating to a column-oriented data format
Summary
3 Exploring and preparing the data set
3.1 Getting started with interactive querying
3.1.1 Choosing the right use case for interactive querying
3.1.2 Introducing AWS Athena
3.1.3 Preparing a sample data set
3.1.4 Interactive querying using Athena from a browser
3.1.5 Interactive querying using a sample data set
3.1.6 Querying the DC taxi data set
3.2 Getting started with data quality
3.2.1 From “garbage in, garbage out” to data quality
3.2.2 Before starting with data quality
3.2.3 Normative principles for data quality
3.3 Applying VACUUM to the DC taxi data
3.3.1 Enforcing the schema to ensure valid values
3.3.2 Cleaning up invalid fare amounts
3.3.3 Improving the accuracy
3.4 Implementing VACUUM in a PySpark job
Summary
4 More exploratory data analysis and data preparation
4.1 Getting started with data sampling
4.1.1 Exploring the summary statistics of the cleaned-up data set
4.1.2 Choosing the right sample size for the test data set
4.1.3 Exploring the statistics of alternative sample sizes
4.1.4 Using a PySpark job to sample the test set
Summary
Part 2—PyTorch for serverless machine learning
5 Introducing PyTorch: Tensor basics
5.1 Getting started with tensors
5.2 Getting started with PyTorch tensor creation operations
5.3 Creating PyTorch tensors of pseudorandom and interval values
5.4 PyTorch tensor operations and broadcasting
5.5 PyTorch tensors vs. native Python lists
Summary
6 Core PyTorch: Autograd, optimizers, and utilities
6.1 Understanding the basics of autodiff
6.2 Linear regression using PyTorch automatic differentiation
6.3 Transitioning to PyTorch optimizers for gradient descent
6.4 Getting started with data set batches for gradient descent
6.5 Data set batches with PyTorch Dataset and DataLoader
6.6 Dataset and DataLoader classes for gradient descent with batches
Summary
7 Serverless machine learning at scale
7.1 What if a single node is enough for my machine learning model?
7.2 Using IterableDataset and ObjectStorageDataset
7.3 Gradient descent with out-of-memory data sets
7.4 Faster PyTorch tensor operations with GPUs
7.5 Scaling up to use GPU cores
Summary
8 Scaling out with distributed training
8.1 What if the training data set does not fit in memory?
8.1.1 Illustrating gradient accumulation
8.1.2 Preparing a sample model and data set
8.1.3 Understanding gradient descent using out-of-memory data shards
8.2 Parameter server approach to gradient accumulation
8.3 Introducing logical ring-based gradient descent
8.4 Understanding ring-based distributed gradient descent
8.5 Phase 1: Reduce-scatter
8.6 Phase 2: All-gather
Summary
Part 3—Serverless machine learning pipeline
9 Feature selection
9.1 Guiding principles for feature selection
9.1.1 Related to the label
9.1.2 Recorded before inference time
9.1.3 Supported by abundant examples
9.1.4 Expressed as a number with a meaningful scale
9.1.5 Based on expert insights about the project
9.2 Feature selection case studies
9.3 Feature selection using guiding principles
9.3.1 Related to the label
9.3.2 Recorded before inference time
9.3.3 Supported by abundant examples
9.3.4 Numeric with meaningful magnitude
9.3.5 Bring expert insight to the problem
9.4 Selecting features for the DC taxi data set
Summary
10 Adopting PyTorch Lightning
10.1 Understanding PyTorch Lightning
10.1.1 Converting PyTorch model training to PyTorch Lightning
10.1.2 Enabling test and reporting for a trained model
10.1.3 Enabling validation during model training
Summary
11 Hyperparameter optimization
11.1 Hyperparameter optimization with Optuna
11.1.1 Understanding loguniform hyperparameters
11.1.2 Using categorical and log-uniform hyperparameters
11.2 Neural network layers configuration as a hyperparameter
11.3 Experimenting with the batch normalization hyperparameter
11.3.1 Using Optuna study for hyperparameter optimization
11.3.2 Visualizing an HPO study in Optuna
Summary
12 Machine learning pipeline
12.1 Describing the machine learning pipeline
12.2 Enabling PyTorch-distributed training support with Kaen
12.2.1 Understanding PyTorch-distributed training settings
12.3 Unit testing model training in a local Kaen container
12.4 Hyperparameter optimization with Optuna
12.4.1 Enabling MLFlow support
12.4.2 Using HPO for DcTaxiModel in a local Kaen provider
12.4.3 Training with the Kaen AWS provider
Summary
Appendix A—Introduction to machine learning
A.1 Why machine learning?
A.2 Machine learning at first glance
A.3 Machine learning with structured data sets
A.4 Regression with structured data sets
A.5 Classification with structured data sets
A.6 Training a supervised machine learning model
Appendix B—Getting started with Docker
B.1 Getting started with Docker
B.2 Building a custom image
B.3 Sharing your custom image with the world
index