Machine Learning Bookcamp: Build a portfolio of real-life projects

Time to flex your machine learning muscles! Take on the carefully designed challenges of the Machine Learning Bookcamp and master essential ML techniques through practical application.

In Machine Learning Bookcamp you will:
• Collect and clean data for training models
• Use popular Python tools, including NumPy, Scikit-Learn, and TensorFlow
• Apply ML to complex datasets with images
• Deploy ML models to a production-ready environment

The only way to learn is to practice! In Machine Learning Bookcamp, you’ll create and deploy Python-based machine learning models for a variety of increasingly challenging projects. Taking you from the basics of machine learning to complex applications such as image analysis, each new project builds on what you’ve learned in previous chapters. You’ll build a portfolio of business-relevant machine learning projects that hiring managers will be excited to see.

About the technology
Master key machine learning concepts as you build actual projects! Machine learning is what you need for analyzing customer behavior, predicting price trends, evaluating risk, and much more. To master ML, you need great examples, clear explanations, and lots of practice. This book delivers all three!

About the book
Machine Learning Bookcamp presents realistic, practical machine learning scenarios, along with crystal-clear coverage of key concepts. In it, you’ll complete engaging projects, such as creating a car price predictor using linear regression and deploying a churn prediction service. You’ll go beyond the algorithms and explore important techniques like deploying ML applications on serverless systems and serving models with Kubernetes and Kubeflow. Dig in, get your hands dirty, and have fun building your ML skills!

What's inside
• Collect and clean data for training models
• Use popular Python tools, including NumPy, Scikit-Learn, and TensorFlow
• Deploy ML models to a production-ready environment

About the reader
Python programming skills assumed. No previous machine learning knowledge is required.

About the author
Alexey Grigorev is a principal data scientist at OLX Group. He runs DataTalks.Club, a community of people who love data.
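
To give a feel for the kind of project the book describes, here is a minimal, hypothetical sketch of a car-price predictor built with scikit-learn's linear regression. The file name (cars.csv) and the column names (engine_hp, year, city_mpg, msrp) are illustrative assumptions, not the book's actual dataset or code.

# Minimal sketch of a linear-regression price predictor (illustrative only;
# the file name and columns are assumptions, not the book's own example).
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

df = pd.read_csv('cars.csv')                      # hypothetical dataset
X = df[['engine_hp', 'year', 'city_mpg']]         # hypothetical numeric features
y = np.log1p(df['msrp'])                          # log-transform a skewed price target

X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=1)

model = LinearRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_val)
rmse = np.sqrt(mean_squared_error(y_val, y_pred))  # RMSE on the validation set
print(f'validation RMSE: {rmse:.3f}')

The book's own projects go further (feature engineering, regularization, and deployment), but the basic train/validate/evaluate loop shown here is the pattern they build on.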

Author(s): Alexey Grigorev
Edition: 1
Publisher: Manning
Year: 2021

Language: English
Commentary: Vector PDF
Pages: 472
City: Shelter Island, NY
Tags: Machine Learning; Neural Networks; Deep Learning; Regression; Decision Trees; Python; Classification; Predictive Models; Feature Engineering; TensorFlow; Ensemble Learning; Kubernetes; Model Evaluation; Random Forest; AWS Lambda; Serverless Applications; Gradient Boosting; Kubeflow; Churn Rate; Data Exploration; Metrics

Machine Learning Bookcamp
brief contents
contents
foreword
preface
acknowledgments
about this book
Who should read this book
How this book is organized: a roadmap
About the code
liveBook discussion forum
Other online resources
about the author
about the cover illustration
1 Introduction to machine learning
1.1 Machine learning
1.1.1 Machine learning vs. rule-based systems
1.1.2 When machine learning isn’t helpful
1.1.3 Supervised machine learning
1.2 Machine learning process
1.2.1 Business understanding
1.2.2 Data understanding
1.2.3 Data preparation
1.2.4 Modeling
1.2.5 Evaluation
1.2.6 Deployment
1.2.7 Iterate
1.3 Modeling and model validation
Summary
2 Machine learning for regression
2.1 Car-price prediction project
2.1.1 Downloading the dataset
2.2 Exploratory data analysis
2.2.1 Exploratory data analysis toolbox
2.2.2 Reading and preparing data
2.2.3 Target variable analysis
2.2.4 Checking for missing values
2.2.5 Validation framework
2.3 Machine learning for regression
2.3.1 Linear regression
2.3.2 Training linear regression model
2.4 Predicting the price
2.4.1 Baseline solution
2.4.2 RMSE: Evaluating model quality
2.4.3 Validating the model
2.4.4 Simple feature engineering
2.4.5 Handling categorical variables
2.4.6 Regularization
2.4.7 Using the model
2.5 Next steps
2.5.1 Exercises
2.5.2 Other projects
Summary
Answers to exercises
3 Machine learning for classification
3.1 Churn prediction project
3.1.1 Telco churn dataset
3.1.2 Initial data preparation
3.1.3 Exploratory data analysis
3.1.4 Feature importance
3.2 Feature engineering
3.2.1 One-hot encoding for categorical variables
3.3 Machine learning for classification
3.3.1 Logistic regression
3.3.2 Training logistic regression
3.3.3 Model interpretation
3.3.4 Using the model
3.4 Next steps
3.4.1 Exercises
3.4.2 Other projects
Summary
Answers to exercises
4 Evaluation metrics for classification
4.1 Evaluation metrics
4.1.1 Classification accuracy
4.1.2 Dummy baseline
4.2 Confusion table
4.2.1 Introduction to the confusion table
4.2.2 Calculating the confusion table with NumPy
4.2.3 Precision and recall
4.3 ROC curve and AUC score
4.3.1 True positive rate and false positive rate
4.3.2 Evaluating a model at multiple thresholds
4.3.3 Random baseline model
4.3.4 The ideal model
4.3.5 ROC curve
4.3.6 Area under the ROC curve (AUC)
4.4 Parameter tuning
4.4.1 K-fold cross-validation
4.4.2 Finding best parameters
4.5 Next steps
4.5.1 Exercises
4.5.2 Other projects
Summary
Answers to exercises
5 Deploying machine learning models
5.1 Churn-prediction model
5.1.1 Using the model
5.1.2 Using Pickle to save and load the model
5.2 Model serving
5.2.1 Web services
5.2.2 Flask
5.2.3 Serving churn model with Flask
5.3 Managing dependencies
5.3.1 Pipenv
5.3.2 Docker
5.4 Deployment
5.4.1 AWS Elastic Beanstalk
5.5 Next steps
5.5.1 Exercises
5.5.2 Other projects
Summary
6 Decision trees and ensemble learning
6.1 Credit risk scoring project
6.1.1 Credit scoring dataset
6.1.2 Data cleaning
6.1.3 Dataset preparation
6.2 Decision trees
6.2.1 Decision tree classifier
6.2.2 Decision tree learning algorithm
6.2.3 Parameter tuning for decision tree
6.3 Random forest
6.3.1 Training a random forest
6.3.2 Parameter tuning for random forest
6.4 Gradient boosting
6.4.1 XGBoost: Extreme gradient boosting
6.4.2 Model performance monitoring
6.4.3 Parameter tuning for XGBoost
6.4.4 Testing the final model
6.5 Next steps
6.5.1 Exercises
6.5.2 Other projects
Summary
Answers to exercises
7 Neural networks and deep learning
7.1 Fashion classification
7.1.1 GPU vs. CPU
7.1.2 Downloading the clothing dataset
7.1.3 TensorFlow and Keras
7.1.4 Loading images
7.2 Convolutional neural networks
7.2.1 Using a pretrained model
7.2.2 Getting predictions
7.3 Internals of the model
7.3.1 Convolutional layers
7.3.2 Dense layers
7.4 Training the model
7.4.1 Transfer learning
7.4.2 Loading the data
7.4.3 Creating the model
7.4.4 Training the model
7.4.5 Adjusting the learning rate
7.4.6 Saving the model and checkpointing
7.4.7 Adding more layers
7.4.8 Regularization and dropout
7.4.9 Data augmentation
7.4.10 Training a larger model
7.5 Using the model
7.5.1 Loading the model
7.5.2 Evaluating the model
7.5.3 Getting the predictions
7.6 Next steps
7.6.1 Exercises
7.6.2 Other projects
Summary
Answers to exercises
8 Serverless deep learning
8.1 Serverless: AWS Lambda
8.1.1 TensorFlow Lite
8.1.2 Converting the model to TF Lite format
8.1.3 Preparing the images
8.1.4 Using the TensorFlow Lite model
8.1.5 Code for the lambda function
8.1.6 Preparing the Docker image
8.1.7 Pushing the image to AWS ECR
8.1.8 Creating the lambda function
8.1.9 Creating the API Gateway
8.2 Next steps
8.2.1 Exercises
8.2.2 Other projects
Summary
9 Serving models with Kubernetes and Kubeflow
9.1 Kubernetes and Kubeflow
9.2 Serving models with TensorFlow Serving
9.2.1 Overview of the serving architecture
9.2.2 The saved_model format
9.2.3 Running TensorFlow Serving locally
9.2.4 Invoking the TF Serving model from Jupyter
9.2.5 Creating the Gateway service
9.3 Model deployment with Kubernetes
9.3.1 Introduction to Kubernetes
9.3.2 Creating a Kubernetes cluster on AWS
9.3.3 Preparing the Docker images
9.3.4 Deploying to Kubernetes
9.3.5 Testing the service
9.4 Model deployment with Kubeflow
9.4.1 Preparing the model: Uploading it to S3
9.4.2 Deploying TensorFlow models with KFServing
9.4.3 Accessing the model
9.4.4 KFServing transformers
9.4.5 Testing the transformer
9.4.6 Deleting the EKS cluster
9.5 Next steps
9.5.1 Exercises
9.5.2 Other projects
Summary
Appendix A—Preparing the environment
A.1 Installing Python and Anaconda
A.1.1 Installing Python and Anaconda on Linux
A.1.2 Installing Python and Anaconda on Windows
A.1.3 Installing Python and Anaconda on macOS
A.2 Running Jupyter
A.2.1 Running Jupyter on Linux
A.2.2 Running Jupyter on Windows
A.2.3 Running Jupyter on macOS
A.3 Installing the Kaggle CLI
A.4 Accessing the source code
A.5 Installing Docker
A.5.1 Installing Docker on Linux
A.5.2 Installing Docker on Windows
A.5.3 Installing Docker on macOS
A.6 Renting a server on AWS
A.6.1 Registering on AWS
A.6.2 Accessing billing information
A.6.3 Creating an EC2 instance
A.6.4 Connecting to the instance
A.6.5 Shutting down the instance
A.6.6 Configuring AWS CLI
Appendix B—Introduction to Python
B.1 Variables
B.1.1 Control flow
B.1.2 Collections
B.1.3 Code reusability
B.1.4 Installing libraries
B.1.5 Python programs
Appendix C—Introduction to NumPy
C.1 NumPy
C.1.1 NumPy arrays
C.1.2 Two-dimensional NumPy arrays
C.1.3 Randomly generated arrays
C.2 NumPy operations
C.2.1 Element-wise operations
C.2.2 Summarizing operations
C.2.3 Sorting
C.2.4 Reshaping and combining
C.2.5 Slicing and filtering
C.3 Linear algebra
C.3.1 Multiplication
C.3.2 Matrix inverse
C.3.3 Normal equation
Appendix D—Introduction to Pandas
D.1 Pandas
D.1.1 DataFrame
D.1.2 Series
D.1.3 Index
D.1.4 Accessing rows
D.1.5 Splitting a DataFrame
D.2 Operations
D.2.1 Element-wise operations
D.2.2 Filtering
D.2.3 String operations
D.2.4 Summarizing operations
D.2.5 Missing values
D.2.6 Sorting
D.2.7 Grouping
Appendix E—AWS SageMaker
E.1 AWS SageMaker Notebooks
E.1.1 Increasing the GPU quota limits
E.1.2 Creating a notebook instance
E.1.3 Training a model
E.1.4 Turning off the notebook
index