Kubeflow for Machine Learning: From Lab to Production

If you're training a machine learning model but aren't sure how to put it into production, this book will get you there. Kubeflow provides a collection of cloud native tools for different stages of a model's lifecycle, from data exploration, feature preparation, and model training to model serving. This guide helps data scientists build production-grade machine learning implementations with Kubeflow and shows data engineers how to make models scalable and reliable. Using examples throughout the book, authors Holden Karau, Trevor Grant, Ilan Filonenko, Richard Liu, and Boris Lublinsky explain how to use Kubeflow to train and serve your machine learning models on top of Kubernetes in the cloud or in a development environment on-premises.

• Understand Kubeflow's design, core components, and the problems it solves
• Understand the differences between Kubeflow on different cluster types
• Train models using Kubeflow with popular tools including Scikit-learn, TensorFlow, and Apache Spark
• Keep your model up to date with Kubeflow Pipelines
• Understand how to capture model training metadata
• Explore how to extend Kubeflow with additional open source tools
• Use hyperparameter tuning for training
• Learn how to serve your model in production
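To give a sense of the pipeline-oriented workflow the book teaches, here is a minimal, illustrative sketch of a Kubeflow Pipelines definition written with the kfp Python SDK (the v1-style API contemporary with this edition). The pipeline name, container image, and echo step are assumptions for illustration only, not examples taken from the book:

    import kfp
    from kfp import dsl

    def echo_op(message):
        # One pipeline step: run a small container that echoes its input.
        # (Illustrative step, not from the book.)
        return dsl.ContainerOp(
            name="echo",
            image="alpine:3.12",
            command=["echo"],
            arguments=[message],
        )

    @dsl.pipeline(
        name="hello-kubeflow",
        description="Minimal illustrative pipeline with a single step.",
    )
    def hello_pipeline(message: str = "Hello, Kubeflow!"):
        echo_op(message)

    if __name__ == "__main__":
        # Compile to an Argo workflow package that can be uploaded through the
        # Kubeflow Pipelines UI or submitted with the kfp client.
        kfp.compiler.Compiler().compile(hello_pipeline, "hello_pipeline.yaml")

Chapter 4 walks through the real pipeline-building workflow in detail, including storing data between steps and how Kubeflow Pipelines builds on Argo.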

Author(s): Trevor Grant, Holden Karau, Boris Lublinsky, Richard Liu, Ilan Filonenko
Edition: 1
Publisher: O'Reilly Media
Year: 2020

Language: English
Pages: 264
City: Sebastopol, CA

Copyright
Table of Contents
Foreword
Preface
Our Assumption About You
Your Responsibility as a Practitioner
Conventions Used in This Book
Code Examples
Using Code Examples
O’Reilly Online Learning
How to Contact the Authors
How to Contact Us
Acknowledgments
Grievances
Chapter 1. Kubeflow: What It Is and Who It Is For
Model Development Life Cycle
Where Does Kubeflow Fit In?
Why Containerize?
Why Kubernetes?
Kubeflow’s Design and Core Components
Data Exploration with Notebooks
Data/Feature Preparation
Training
Hyperparameter Tuning
Model Validation
Inference/Prediction
Pipelines
Component Overview
Alternatives to Kubeflow
Clipper (RiseLabs)
MLflow (Databricks)
Others
Introducing Our Case Studies
Modified National Institute of Standards and Technology
Mailing List Data
Product Recommender
CT Scans
Conclusion
Chapter 2. Hello Kubeflow
Getting Set Up with Kubeflow
Installing Kubeflow and Its Dependencies
Setting Up Local Kubernetes
Setting Up Your Kubeflow Development Environment
Creating Our First Kubeflow Project
Training and Deploying a Model
Training and Monitoring Progress
Test Query
Going Beyond a Local Deployment
Conclusion
Chapter 3. Kubeflow Design: Beyond the Basics
Getting Around the Central Dashboard
Notebooks (JupyterHub)
Training Operators
Kubeflow Pipelines
Hyperparameter Tuning
Model Inference
Metadata
Component Summary
Support Components
MinIO
Istio
Knative
Apache Spark
Kubeflow Multiuser Isolation
Conclusion
Chapter 4. Kubeflow Pipelines
Getting Started with Pipelines
Exploring the Prepackaged Sample Pipelines
Building a Simple Pipeline in Python
Storing Data Between Steps
Introduction to Kubeflow Pipelines Components
Argo: the Foundation of Pipelines
What Kubeflow Pipelines Adds to Argo Workflow
Building a Pipeline Using Existing Images
Kubeflow Pipeline Components
Advanced Topics in Pipelines
Conditional Execution of Pipeline Stages
Running Pipelines on Schedule
Conclusion
Chapter 5. Data and Feature Preparation
Deciding on the Correct Tooling
Local Data and Feature Preparation
Fetching the Data
Data Cleaning: Filtering Out the Junk
Formatting the Data
Feature Preparation
Custom Containers
Distributed Tooling
TensorFlow Extended
Distributed Data Using Apache Spark
Distributed Feature Preparation Using Apache Spark
Putting It Together in a Pipeline
Using an Entire Notebook as a Data Preparation Pipeline Stage
Conclusion
Chapter 6. Artifact and Metadata Store
Kubeflow ML Metadata
Programmatic Query
Kubeflow Metadata UI
Using MLflow’s Metadata Tools with Kubeflow
Creating and Deploying an MLflow Tracking Server
Logging Data on Runs
Using the MLflow UI
Conclusion
Chapter 7. Training a Machine Learning Model
Building a Recommender with TensorFlow
Getting Started
Starting a New Notebook Session
TensorFlow Training
Deploying a TensorFlow Training Job
Distributed Training
Using GPUs
Using Other Frameworks for Distributed Training
Training a Model Using Scikit-Learn
Starting a New Notebook Session
Data Preparation
Scikit-Learn Training
Explaining the Model
Exporting Model
Integration into Pipelines
Conclusion
Chapter 8. Model Inference
Model Serving
Model Serving Requirements
Model Monitoring
Model Accuracy, Drift, and Explainability
Model Monitoring Requirements
Model Updating
Model Updating Requirements
Summary of Inference Requirements
Model Inference in Kubeflow
TensorFlow Serving
Review
Seldon Core
Designing a Seldon Inference Graph
Testing Your Model
Serving Requests
Monitoring Your Models
Review
KFServing
Serverless and the Service Plane
Data Plane
Example Walkthrough
Peeling Back the Underlying Infrastructure
Review
Conclusion
Chapter 9. Case Study Using Multiple Tools
The Denoising CT Scans Example
Data Prep with Python
DS-SVD with Apache Spark
Visualization
The CT Scan Denoising Pipeline
Sharing the Pipeline
Conclusion
Chapter 10. Hyperparameter Tuning and Automated Machine Learning
AutoML: An Overview
Hyperparameter Tuning with Kubeflow Katib
Katib Concepts
Installing Katib
Running Your First Katib Experiment
Prepping Your Training Code
Configuring an Experiment
Running the Experiment
Katib User Interface
Tuning Distributed Training Jobs
Neural Architecture Search
Advantages of Katib over Other Frameworks
Conclusion
Appendix A. Argo Executor Configurations and Trade-Offs
Appendix B. Cloud-Specific Tools and Configuration
Google Cloud
TPU-Accelerated Instances
Dataflow for TFX
Appendix C. Using Model Serving in Applications
Building Streaming Applications Leveraging Model Serving
Stream Processing Engines and Libraries
Introducing Cloudflow
Building Batch Applications Leveraging Model Serving
Index
About the Authors
Colophon