Optimize every stage of your machine learning pipelines with powerful automation components and cutting-edge tools like AutoKeras and KerasTuner.
In Automated Machine Learning in Action you will learn how to:
• Improve a machine learning model by automatically tuning its hyperparameters
• Pick the optimal components for creating and improving your pipelines
• Use AutoML toolkits such as AutoKeras and KerasTuner
• Design and implement search algorithms to find the best component for your ML task
• Accelerate the AutoML process with data parallelism, model pretraining, and other techniques
Automated Machine Learning in Action reveals how you can automate the burdensome elements of designing and tuning your machine learning systems. It’s written in a math-lite and accessible style, and filled with hands-on examples for applying AutoML techniques to every stage of a pipeline. AutoML can even be implemented by machine learning novices! If you’re new to ML, you’ll appreciate how the book primes you on machine learning basics. Experienced practitioners will love learning how automated tools like AutoKeras and KerasTuner can create pipelines that automatically select the best approach for your task, or tune any customized search space with user-defined hyperparameters, which removes the burden of manual tuning.
About the technology
Machine learning tasks like data pre-processing, feature selection, and model optimization can be time-consuming and highly technical. Automated machine learning, or AutoML, applies pre-built solutions to these chores, eliminating errors caused by manual processing. By accelerating and standardizing work throughout the ML pipeline, AutoML frees up valuable data scientist time and enables less experienced users to apply machine learning effectively.
About the book
Automated Machine Learning in Action shows you how to save time and get better results using AutoML. As you go, you’ll learn how each component of an ML pipeline can be automated with AutoKeras and KerasTuner. The book is packed with techniques for automating classification, regression, data augmentation, and more. The payoff: Your ML systems will be able to tune themselves with little manual work.
What's inside
• Automatically tune model hyperparameters
• Pick the optimal pipeline components
• Select appropriate models and features
• Learn different search algorithms and acceleration strategies
About the reader
For ML novices building their first pipelines and experienced ML engineers looking to automate tasks.
About the author
Drs. Qingquan Song, Haifeng Jin, and Xia “Ben” Hu are the creators of the AutoKeras automated deep learning library.
Author(s): Qingquan Song, Haifeng Jin, Xia Hu
Edition: 1
Publisher: Manning Publications
Year: 2022
Language: English
Commentary: Vector PDF
Pages: 336
City: Shelter Island, NY
Tags: Machine Learning; Deep Learning; Bayesian Inference; Pipelines; Scalability; AutoML; Automation; AutoKeras
Automated Machine Learning in Action
brief contents
contents
preface
acknowledgments
about this book
Who should read this book
How this book is organized: A road map
About the code
liveBook discussion forum
Other online resources
about the authors
about the cover illustration
Part 1—Fundamentals of AutoML
1 From machine learning to automated machine learning
1.1 A glimpse of automated machine learning
1.2 Getting started with machine learning
1.2.1 What is machine learning?
1.2.2 The machine learning process
1.2.3 Hyperparameter tuning
1.2.4 The obstacles to applying machine learning
1.3 AutoML: The automation of automation
1.3.1 Three key components of AutoML
1.3.2 Are we able to achieve full automation?
Summary
2 The end-to-end pipeline of an ML project
2.1 An overview of the end-to-end pipeline
2.2 Framing the problem and assembling the dataset
2.3 Data preprocessing
2.4 Feature engineering
2.5 ML algorithm selection
2.5.1 Building the linear regression model
2.5.2 Building the decision tree model
2.6 Fine-tuning the ML model: Introduction to grid search
Summary
3 Deep learning in a nutshell
3.1 What is deep learning?
3.2 TensorFlow and Keras
3.3 California housing price prediction with a multilayer perceptron
3.3.1 Assembling and preparing the data
3.3.2 Building up the multilayer perceptron
3.3.3 Training and testing the neural network
3.3.4 Tuning the number of epochs
3.4 Classifying handwritten digits with convolutional neural networks
3.4.1 Assembling and preparing the dataset
3.4.2 Addressing the problem with an MLP
3.4.3 Addressing the problem with a CNN
3.5 IMDB review classification with recurrent neural networks
3.5.1 Preparing the data
3.5.2 Building up the RNN
3.5.3 Training and validating the RNN
Summary
Part 2—AutoML in practice
4 Automated generation of end-to-end ML solutions
4.1 Preparing the AutoML toolkit: AutoKeras
4.2 Automated image classification
4.2.1 Attacking the problem with five lines of code
4.2.2 Dealing with different data formats
4.2.3 Configuring the tuning process
4.3 End-to-end AutoML solutions for four supervised learning problems
4.3.1 Text classification with the 20 newsgroups dataset
4.3.2 Structured data classification with the Titanic dataset
4.3.3 Structured data regression with the California housing dataset
4.3.4 Multilabel image classification
4.4 Addressing tasks with multiple inputs or outputs
4.4.1 Automated image classification with the AutoKeras IO API
4.4.2 Automated multi-input learning
4.4.3 Automated multi-output learning
Summary
5 Customizing the search space by creating AutoML pipelines
5.1 Working with sequential AutoML pipelines
5.2 Creating a sequential AutoML pipeline for automated hyperparameter tuning
5.2.1 Tuning MLPs for structured data regression
5.2.2 Tuning CNNs for image classification
5.3 Automated pipeline search with hyperblocks
5.3.1 Automated model selection for image classification
5.3.2 Automated selection of image preprocessing methods
5.4 Designing a graph-structured AutoML pipeline
5.5 Designing custom AutoML blocks
5.5.1 Tuning MLPs with a custom MLP block
5.5.2 Designing a hyperblock for model selection
Summary
6 AutoML with a fully customized search space
6.1 Customizing the search space in a layerwise fashion
6.1.1 Tuning an MLP for regression with KerasTuner
6.1.2 Tuning an autoencoder model for unsupervised learning
6.2 Tuning the autoencoder model
6.3 Tuning shallow models with different search methods
6.3.1 Selecting and tuning shallow models
6.3.2 Tuning a shallow model pipeline
6.3.3 Trying out different search methods
6.3.4 Automated feature engineering
6.4 Controlling the AutoML process by customizing tuners
6.4.1 Creating a tuner for tuning scikit-learn models
6.4.2 Creating a tuner for tuning Keras models
6.4.3 Jointly tuning and selection among deep learning and shallow models
6.4.4 Hyperparameter tuning beyond Keras and scikit-learn models
Summary
Part 3—Advanced topics in AutoML
7 Customizing the search method of AutoML
7.1 Sequential search methods
7.2 Getting started with a random search method
7.3 Customizing a Bayesian optimization search method
7.3.1 Vectorizing the hyperparameters
7.3.2 Updating the surrogate function based on historical model evaluations
7.3.3 Designing the acquisition function
7.3.4 Sampling the new hyperparameters via the acquisition function
7.3.5 Tuning the GBDT model with the Bayesian optimization method
7.3.6 Resuming the search process and recovering the search method
7.4 Customizing an evolutionary search method
7.4.1 Selection strategies in the evolutionary search method
7.4.2 The aging evolutionary search method
7.4.3 Implementing a simple mutation operation
7.4.4 Evaluating the aging evolutionary search method
Summary
8 Scaling up AutoML
8.1 Handling large-scale datasets
8.1.1 Loading an image-classification dataset
8.1.2 Splitting the loaded dataset
8.1.3 Loading a text-classification dataset
8.1.4 Handling large datasets in general
8.2 Parallelization on multiple GPUs
8.2.1 Data parallelism
8.2.2 Model parallelism
8.2.3 Parallel tuning
8.3 Search speedup strategies
8.3.1 Model scheduling with Hyperband
8.3.2 Faster convergence with pretrained weights in the search space
8.3.3 Warm-starting the search space
Summary
9 Wrapping up
9.1 Key concepts in review
9.1.1 The AutoML process and its key components
9.1.2 The machine learning pipeline
9.1.3 The taxonomy of AutoML
9.1.4 Applications of AutoML
9.1.5 Automated deep learning with AutoKeras
9.1.6 Fully personalized AutoML with KerasTuner
9.1.7 Implementing search techniques
9.1.8 Scaling up the AutoML process
9.2 AutoML tools and platforms
9.2.1 Open source AutoML tools
9.2.2 Commercial AutoML platforms
9.3 The challenges and future of AutoML
9.3.1 Measuring the performance of AutoML
9.3.2 Resource complexity
9.3.3 Interpretability and transparency
9.3.4 Reproducibility and robustness
9.3.5 Generalizability and transferability
9.3.6 Democratization and productionization
9.4 Staying up-to-date in a fast-moving field
Summary
appendix A—Setting up an environment for running code
A.1 Getting started with Google Colaboratory
A.1.1 Basic Google Colab notebook operations
A.1.2 Packages and hardware configuration
A.2 Setting up a Jupyter Notebook environment on a local Ubuntu system
A.2.1 Creating a Python 3 virtual environment
A.2.2 Installing the required Python packages
A.2.3 Setting up the IPython kernel
A.2.4 Working on the Jupyter notebooks
appendix B—Three examples: Classification of image, text, and tabular data
B.1 Image classification: Recognizing handwritten digits
B.1.1 Problem framing and data assembly
B.1.2 Exploring and preparing the data
B.1.3 Using principal component analysis to condense the features
B.1.4 Classification with a support vector machine
B.1.5 Building a data-processing pipeline with PCA and SVM
B.1.6 Jointly tuning multiple components in the pipeline
B.2 Text classification: Classifying topics of newsgroup posts
B.2.1 Problem framing and data assembly
B.2.2 Data preprocessing and feature engineering
B.2.3 Building a text classifier with the logistic regression model
B.2.4 Building a text classifier with the naive Bayes model
B.2.5 Tuning the text-classification pipeline with grid search
B.3 Tabular classification: Identifying Titanic survivors
B.3.1 Problem framing and data assembly
B.3.2 Data preprocessing and feature engineering
B.3.3 Building tree-based classifiers
index