Kubeflow Operations Guide: Managing Cloud and On-Premise Deployment

When deploying machine learning applications, building models is only a small part of the story. The entire process involves developing, orchestrating, deploying, and running scalable and portable machine learning workloads, a process Kubeflow makes much easier. With this practical guide, data scientists, data engineers, and platform architects will learn how to plan and execute a Kubeflow project that can support workflows from on-premises to the cloud.

Kubeflow is an open source Kubernetes-native platform based on Google’s internal machine learning pipelines, and major cloud vendors including AWS and Azure also advocate the use of Kubernetes and Kubeflow to manage containers and machine learning infrastructure. In today’s cloud-based world, this book is ideal for any team planning to build machine learning applications.

With this book, you will:
• Get a concise overview of Kubernetes and Kubeflow
• Learn how to plan and build a Kubeflow installation
• Operate, monitor, and automate your installation
• Provide your Kubeflow installation with adequate security
• Serve machine learning models on Kubeflow

Author(s): Josh Patterson, Michael Katzenellenbogen, Austin Harris
Edition: 1
Publisher: O'Reilly Media
Year: 2020

Language: English
Pages: 300
City: Sebastopol, CA
Tags: DevOps; Google Cloud Platform; Amazon Web Services; Microsoft Azure; Cloud Computing; Machine Learning; Security; Networking; Kubeflow Pipelines; Kubeflow

Copyright
Table of Contents
Preface
What Is in This Book?
Who Is This Book For?
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
Josh
Michael
Austin
Chapter 1. Introduction to Kubeflow
Machine Learning on Kubernetes
The Evolution of Machine Learning in Enterprise
It’s Harder Than Ever to Run Enterprise Infrastructure
Identifying Next-Generation Infrastructure (NGI) Core Principles
Kubernetes for Production Application Deployment
Enter: Kubeflow
What Problems Does Kubeflow Solve?
Origin of Kubeflow
Who Uses Kubeflow?
Common Kubeflow Use Cases
Running Notebooks on GPUs
Shared Multitenant Machine Learning Environment
Building a Transfer Learning Pipeline
Deploying Models to Production for Application Integration
Components of Kubeflow
Machine Learning Tools
Applications and Scaffolding
Machine Learning Model Inference Serving with KFServing
Platforms and Clouds
Summary
Chapter 2. Kubeflow Architecture and Best Practices
Kubeflow Architecture Overview
Kubeflow and Kubernetes
Ways to Run a Job on Kubeflow
Machine Learning Metadata Service
Artifact Storage
Istio Operations in Kubeflow
Kubeflow Multitenancy Architecture
Multitenancy and Isolation
Multiuser Architecture
Multiuser Authorization Flow
Kubeflow Profiles
Multiuser Isolation
Notebook Architecture
Notebook Server Launcher UI
Notebook Controller
Pipelines Architecture
Kubeflow Best Practices
Managing Job Dependencies
Using GPUs
Experiment Management
Summary
Chapter 3. Planning a Kubeflow Installation
Security Planning
Components That Extend the Kubernetes API
Components Running Atop Kubernetes
Background and Motivation
Kubeflow and Deployed Applications
Integration
Users
Profiling Users
Varying Skillsets
Workloads
Cluster Utilization
Data Patterns
GPU Planning
Planning for GPUs
Models that Benefit from GPUs
Infrastructure Planning
Kubernetes Considerations
On-Premise
Cloud
Placement
Container Management
Serverless Container Operations with Knative
Sizing and Growing
Forecasting
Storage
Scaling
Summary
Chapter 4. Installing Kubeflow On-Premise
Kubernetes Operations from the Command Line
Installing kubectl
Using kubectl
Using Docker
Basic Install Process
Installing On-Premise
Considerations for Building Kubernetes Clusters
Gateway Host Access to Kubernetes Cluster
Active Directory Integration and User Management
Kerberos Integration
Storage Integration
Container Management and Artifact Repositories
Accessing and Interacting with Kubeflow
Common Command-Line Operations
Accessible Web UIs
Installing Kubeflow
System Requirements
Set Up and Deploy
Summary
Chapter 5. Running Kubeflow on Google Cloud
Overview of the Google Cloud Platform
Storage
Google Cloud Identity-Aware Proxy
Google Cloud Security and the Cloud Identity-Aware Proxy
GCP Projects for Application Deployments
GCP Service Accounts
Signing Up for Google Cloud Platform
Installing the Google Cloud SDK
Update Python
Download and Install Google Cloud SDK
Installing Kubeflow on Google Cloud Platform
Create a Project in the GCP Console
Enabling APIs for a Project
Set Up OAuth for GCP Cloud IAP
Deploy Kubeflow Using the Command-Line Interface
Accessing the Kubeflow UI Post-Installation
Summary
Chapter 6. Running Kubeflow on Amazon Web Services
Overview of Amazon Web Services
Storage
Amazon Storage Pricing
Amazon Cloud Security
AWS Compute Services
Managed Kubernetes on EKS
Signing Up for Amazon Web Services
Installing the AWS CLI
Update Python
Install the AWS CLI
Kubeflow on Amazon Web Services
Installing kubectl
Install the eksctl CLI for Amazon EKS
Install AWS IAM Authenticator
Install jq
Using Managed Kubernetes on Amazon EKS
Create an EKS Service Role
Create an AWS VPC
Creating EKS Clusters
Deploying an EKS Cluster with eksctl
Understanding the Deployment Process
Kubeflow Configuration and Deployment
Customize the Kubeflow Deployment
Customize Authentication
Resizing EKS Clusters
Deleting EKS Clusters
Adding Logging
Troubleshooting Deployments
Summary
Chapter 7. Running Kubeflow on Azure
Overview of the Azure Cloud Platform
Key Azure Components
Storage on Azure
The Azure Security Model
Service Accounts
Resources and Resource Groups
Azure Virtual Machines
Containers and Managed Azure Kubernetes Services
The Azure CLI
Installing the Azure CLI
Installing Kubeflow on Azure Kubernetes
Azure Login and Configuration
Create an AKS Cluster for Kubeflow
Kubeflow Installation
Authorizing Network Access to Deployment
Summary
Chapter 8. Model Serving and Integration
Basic Concepts of Model Management
Understanding Training Models Versus Model Inference
Building an Intuition for Model Integration
Scaling Model Inference Throughput
Model Management
Introduction to KFServing
Advantages of Using KFServing
Core Concepts in KFServing
Supported Pre-Built Model Servers
KFServing Security Model
Managing Models with KFServing
Installing KFServing on a Kubernetes Cluster
Deploying a Model on KFServing
Managing Model Traffic with Canarying
Deploying a Custom Transformer
Roll Back a Deployed Model
Removing a Deployed Model
Summary
Appendix A. Infrastructure Concepts
Public Key Infrastructure
Authentication
Kubeflow and Authentication
Authorization
Authorization and Role-Based Access Control
Lightweight Directory Access Protocol
Kerberos
Transport Layer Security
X.509 Cert
Webhook
Active Directory
Identity Providers
Identity-Aware Proxy (IAP)
IAP and Google Cloud Platform
OAuth
OpenID Connect
End-User Authentication with JWT
Simple and Protected GSS_API Negotiation Mechanism
Dex: A Federated OpenID Connect Provider
Dex and Kerberos
Service Accounts
The Control Plane
Options for Securing the Control Plane
Appendix B. An Overview of Kubernetes
Core Kubernetes Concepts
Pod
Object Spec and Status
Describing a Kubernetes Object
Submitting Containers to Kubernetes
Kubernetes Resource Model
Custom Resources, Controllers, and Operators
Custom Controllers
Custom Resource Definition
Appendix C. Istio Operations and Kubeflow
Service Mesh Management with Istio
Istio Architecture
Traffic Management
Istio Security Architecture
Istio Authorization and Role-Based Access Control
Index
About the Authors
Colophon