Take a data-first, use-case-driven approach with Low-Code AI to understand machine learning and deep learning concepts. This hands-on guide presents three problem-focused paths to learning ML: no-code using AutoML, low-code using BigQuery ML, and custom code using scikit-learn and Keras. In each case, you'll learn key ML concepts by working through realistic problems on real-world datasets.
Business and data analysts get a project-based introduction to ML/AI using a detailed, data-driven approach: loading and analyzing data; feeding data into an ML model; building, training, and testing the model; and deploying it into production. Authors Michael Abel and Gwendolyn Stripling show you how to build machine learning models for retail, healthcare, financial services, energy, and telecommunications.
You'll learn how to:
• Distinguish between structured and unstructured data and the challenges they present
• Visualize and analyze data
• Preprocess data for input into a machine learning model
• Differentiate between regression and classification supervised learning models
• Compare different ML model types and architectures, from no code to low code to custom training
• Design, implement, and tune ML models
• Export data to a GitHub repository for data management and governance
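To give a flavor of the custom-code end of that spectrum, here is a minimal sketch of the workflow the description above outlines (load data, feed it into a model, train, test, and get predictions), using pandas and scikit-learn. The CSV path and column names are placeholders for illustration only, not the book's actual datasets.

```python
# Minimal sketch of the load -> train -> evaluate -> predict workflow
# described above, using pandas and scikit-learn. The file name and the
# "TV", "radio", and "sales" columns are hypothetical placeholders.
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

# Load and inspect the data
df = pd.read_csv("advertising.csv")  # hypothetical file
print(df.describe())

# Separate features and label, then hold out a test set
X = df[["TV", "radio"]]              # hypothetical feature columns
y = df["sales"]                      # hypothetical label column
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Build and train a simple linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Evaluate on the held-out test set and serve predictions
preds = model.predict(X_test)
print("Test MSE:", mean_squared_error(y_test, preds))
```

The same steps map onto the no-code (AutoML) and low-code (BigQuery ML) paths the book covers; only the tooling changes.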
Author(s): Gwendolyn Stripling, Michael Abel
Edition: 1
Publisher: O'Reilly Media
Year: 2023
Language: English
Commentary: Publisher's PDF
Pages: 325
City: Sebastopol, CA
Tags: Machine Learning; Data Analysis; Regression; Keras; Hyperparameter Tuning; Linear Regression; Logistic Regression; scikit-learn; matplotlib; pandas; AutoML; Seaborn; Fraud Detection; Data Collection; Google BigQuery; Elementary; Workflows; Data Preprocessing; Model Training
Cover
Copyright
Table of Contents
Preface
Who Should Read This Book?
What Is and Isn’t in This Book
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
Chapter 1. How Data Drives Decision Making in Machine Learning
What Is the Goal or Use Case?
An Enterprise ML Workflow
Defining the Business Objective or Problem Statement
Data Collection
Data Preprocessing
Data Analysis
Data Transformation and Feature Selection
Researching the Model Selection or Using AutoML (a No-Code Solution)
Model Training, Evaluation, and Tuning
Model Testing
Model Deployment (Serving)
Maintaining Models
Summary
Chapter 2. Data Is the First Step
Overview of Use Cases and Datasets Used in the Book
1. Retail: Product Pricing
2. Healthcare: Heart Disease Campaign
3. Energy: Utility Campaign
4. Insurance: Advertising Media Channel Sales Prediction
5. Financial: Fraud Detection
6. Energy: Power Production Prediction
7. Telecommunications: Customer Churn Prediction
8. Automotive: Improve Custom Model Performance
Data and File Types
Quantitative and Qualitative Data
Structured, Unstructured, and Semistructured Data
Data File Types
How Data Is Processed
An Overview of GitHub and Google’s Colab
Use GitHub to Create a Data Repository for Your Projects
Using Google’s Colaboratory for Low-Code AI Projects
Summary
Chapter 3. Machine Learning Libraries and Frameworks
No-Code AutoML
How AutoML Works
Machine Learning as a Service
Low-Code ML Frameworks
SQL ML Frameworks
Open Source ML Libraries
Summary
Chapter 4. Use AutoML to Predict Advertising Media Channel Sales
The Business Use Case: Media Channel Sales Prediction
Project Workflow
Project Dataset
Exploring the Dataset Using Pandas, Matplotlib, and Seaborn
Load Data into a Pandas DataFrame in a Google Colab Notebook
Explore the Advertising Dataset
Use AutoML to Train a Linear Regression Model
No-Code Using Vertex AI
Create a Managed Dataset in Vertex AI
Select the Model Objective
Build the Training Model
Evaluate Model Performance
Model Feature Importance (Attribution)
Get Predictions from Your Model
Summary
Chapter 5. Using AutoML to Detect Fraudulent Transactions
The Business Use Case: Fraud Detection for Financial Transactions
Project Workflow
Project Dataset
Exploring the Dataset Using Pandas, Matplotlib, and Seaborn
Loading Data into a Pandas DataFrame in a Google Colab Notebook
Exploring the Dataset
Exporting the Dataset
Classification Models and Metrics
Using AutoML to Train a Classification Model
Creating a Managed Dataset and Selecting the Model Objective
Exploring Dataset Statistics
Training the Model
Evaluating Model Performance
Model Feature Importances
Getting Predictions from Your Model
Summary
Chapter 6. Using BigQuery ML to Train a Linear Regression Model
The Business Use Case: Power Plant Production
Cleaning the Dataset Using SQL in BigQuery
Loading a Dataset into BigQuery
Exploring Data in BigQuery Using SQL
Linear Regression Models
Feature Selection and Correlation
Google Colaboratory
Plotting Feature Relationships to the Label
The CREATE MODEL Statement in BigQuery ML
Introducing Explainable AI
Explainable AI in BigQuery ML
Exercises
Neural Networks in BigQuery ML
Brief Overview of Neural Networks
Activation Functions and Nonlinearity
Training a Deep Neural Network in BigQuery ML
Exercises
Deep Dive: Using Cloud Shell to View Your Cloud Storage File
Summary
Chapter 7. Training Custom ML Models in Python
The Business Use Case: Customer Churn Prediction
Choosing Among No-Code, Low-Code, or Custom Code ML Solutions
Exploring the Dataset Using Pandas, Matplotlib, and Seaborn
Loading Data into a Pandas DataFrame in a Google Colab Notebook
Understanding and Cleaning the Customer Churn Dataset
Transforming Features Using Pandas and Scikit-Learn
Building a Logistic Regression Model Using Scikit-Learn
Logistic Regression
Training and Evaluating a Model in Scikit-Learn
Classification Evaluation Metrics
Serving Predictions with a Trained Model in Scikit-Learn
Pipelines in Scikit-Learn: An Introduction
Building a Neural Network Using Keras
Introduction to Keras
Training a Neural Network Classifier Using Keras
Building Custom ML Models on Vertex AI
Summary
Chapter 8. Improving Custom Model Performance
The Business Use Case: Used Car Auction Prices
Model Improvement in Scikit-Learn
Loading the Notebook with the Preexisting Model
Loading the Datasets and the Training-Validation-Test Data Split
Exploring the Scikit-Learn Linear Regression Model
Feature Engineering and Improving the Preprocessing Pipeline
Hyperparameter Tuning
Model Improvement in Keras
Introduction to Preprocessing Layers in Keras
Creating the Dataset and Preprocessing Layers for Your Model
Building a Neural Network Model
Hyperparameter Tuning in Keras
Hyperparameter Tuning in BigQuery ML
Loading and Transforming Car Auction Data
Training a Linear Regression Model and Using the TRANSFORM Clause
Configure a Hyperparameter Tuning Job in BigQuery ML
Options for Hyperparameter Tuning Large Models
Vertex AI Training and Tuning
Automatic Model Tuning with Amazon SageMaker
Azure Machine Learning
Summary
Chapter 9. Next Steps in Your AI Journey
Going Deeper into Data Science
Working with Unstructured Data
Generative AI
Explainable AI
ML Operations
Continuous Training and Evaluation
Summary
Index
About the Authors
Colophon