The Complete Beginner’s Guide to Understanding and Building Machine Learning Systems with Python

Machine Learning with Python for Everyone will help you master the processes, patterns, and strategies you need to build effective learning systems, even if you’re an absolute beginner. If you can write some Python code, this book is for you, no matter how little college-level math you know. Principal instructor Mark E. Fenner relies on plain-English stories, pictures, and Python examples to communicate the ideas of machine learning.

Mark begins by discussing machine learning and what it can do; introducing key mathematical and computational topics in an approachable manner; and walking you through the first steps in building, training, and evaluating learning systems. Step by step, you’ll fill out the components of a practical learning system, broaden your toolbox, and explore some of the field’s most sophisticated and exciting techniques. Whether you’re a student, analyst, scientist, or hobbyist, this guide’s insights will be applicable to every learning system you ever build or use.

- Understand machine learning algorithms, models, and core machine learning concepts
- Classify examples with classifiers, and quantify examples with regressors
- Realistically assess performance of machine learning systems
- Use feature engineering to smooth rough data into useful forms
- Chain multiple components into one system and tune its performance
- Apply machine learning techniques to images and text
- Connect the core concepts to neural networks and graphical models
- Leverage the Python scikit-learn library and other powerful tools (see the brief sketch after this description)

Register your book for convenient access to downloads, updates, and/or corrections as they become available. See inside book for details.
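As a taste of the scikit-learn workflow the book develops in Part I (load a dataset, hold out a test set, fit a classifier, and evaluate it), here is a minimal sketch. The dataset, the k-NN classifier, and the parameter values below are illustrative assumptions for this page, not code reproduced from the book.

    # Minimal sketch of a train/evaluate workflow with scikit-learn.
    # Assumptions: the iris dataset, a 3-neighbor k-NN classifier, and a 25% test
    # split are illustrative choices, not the book's own example.
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.metrics import accuracy_score

    # Load a small example dataset and split it into training and testing portions.
    iris = load_iris()
    train_ftrs, test_ftrs, train_tgt, test_tgt = train_test_split(
        iris.data, iris.target, test_size=0.25, random_state=42)

    # Fit a simple nearest-neighbors classifier and score it on the held-out data.
    knn = KNeighborsClassifier(n_neighbors=3)
    knn.fit(train_ftrs, train_tgt)
    preds = knn.predict(test_ftrs)
    print(accuracy_score(test_tgt, preds))

The book’s own examples gather their shared imports and helpers in the mlwpy.py module reproduced in Appendix A.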
Author(s): Mark E. Fenner
Series: Addison-Wesley Data & Analytics Series
Edition: 1
Publisher: Addison-Wesley Professional/Pearson Education
Year: 2020
Language: English
Commentary: TruePDF
Pages: 588
Tags: Machine Learning, Python (Programming Language)
Cover
Title Page
Copyright Page
Contents
Foreword
Preface
About the Author
Part I: First Steps
1 Let’s Discuss Learning
1.1 Welcome
1.2 Scope, Terminology, Prediction, and Data
1.2.1 Features
1.2.2 Target Values and Predictions
1.3 Putting the Machine in Machine Learning
1.4 Examples of Learning Systems
1.4.1 Predicting Categories: Examples of Classifiers
1.4.2 Predicting Values: Examples of Regressors
1.5 Evaluating Learning Systems
1.5.1 Correctness
1.5.2 Resource Consumption
1.6 A Process for Building Learning Systems
1.7 Assumptions and Reality of Learning
1.8 End-of-Chapter Material
1.8.1 The Road Ahead
1.8.2 Notes
2 Some Technical Background
2.1 About Our Setup
2.2 The Need for Mathematical Language
2.3 Our Software for Tackling Machine Learning
2.4 Probability
2.4.1 Primitive Events
2.4.2 Independence
2.4.3 Conditional Probability
2.4.4 Distributions
2.5 Linear Combinations, Weighted Sums, and Dot Products
2.5.1 Weighted Average
2.5.2 Sums of Squares
2.5.3 Sum of Squared Errors
2.6 A Geometric View: Points in Space
2.6.1 Lines
2.6.2 Beyond Lines
2.7 Notation and the Plus-One Trick
2.8 Getting Groovy, Breaking the Straight-Jacket, and Nonlinearity
2.9 NumPy versus “All the Maths”
2.9.1 Back to 1D versus 2D
2.10 Floating-Point Issues
2.11 EOC
2.11.1 Summary
2.11.2 Notes
3 Predicting Categories: Getting Started with Classification
3.1 Classification Tasks
3.2 A Simple Classification Dataset
3.3 Training and Testing: Don’t Teach to the Test
3.4 Evaluation: Grading the Exam
3.5 Simple Classifier #1: Nearest Neighbors, Long Distance Relationships, and Assumptions
3.5.1 Defining Similarity
3.5.2 The k in k-NN
3.5.3 Answer Combination
3.5.4 k-NN, Parameters, and Nonparametric Methods
3.5.5 Building a k-NN Classification Model
3.6 Simple Classifier #2: Naive Bayes, Probability, and Broken Promises
3.7 Simplistic Evaluation of Classifiers
3.7.1 Learning Performance
3.7.2 Resource Utilization in Classification
3.7.3 Stand-Alone Resource Evaluation
3.8 EOC
3.8.1 Sophomore Warning: Limitations and Open Issues
3.8.2 Summary
3.8.3 Notes
3.8.4 Exercises
4 Predicting Numerical Values: Getting Started with Regression
4.1 A Simple Regression Dataset
4.2 Nearest-Neighbors Regression and Summary Statistics
4.2.1 Measures of Center: Median and Mean
4.2.2 Building a k-NN Regression Model
4.3 Linear Regression and Errors
4.3.1 No Flat Earth: Why We Need Slope
4.3.2 Tilting the Field
4.3.3 Performing Linear Regression
4.4 Optimization: Picking the Best Answer
4.4.1 Random Guess
4.4.2 Random Step
4.4.3 Smart Step
4.4.4 Calculated Shortcuts
4.4.5 Application to Linear Regression
4.5 Simple Evaluation and Comparison of Regressors
4.5.1 Root Mean Squared Error
4.5.2 Learning Performance
4.5.3 Resource Utilization in Regression
4.6 EOC
4.6.1 Limitations and Open Issues
4.6.2 Summary
4.6.3 Notes
4.6.4 Exercises
Part II: Evaluation
5 Evaluating and Comparing Learners
5.1 Evaluation and Why Less Is More
5.2 Terminology for Learning Phases
5.2.1 Back to the Machines
5.2.2 More Technically Speaking . . .
5.3 Major Tom, There’s Something Wrong: Overfitting and Underfitting
5.3.1 Synthetic Data and Linear Regression
5.3.2 Manually Manipulating Model Complexity
5.3.3 Goldilocks: Visualizing Overfitting, Underfitting, and “Just Right”
5.3.4 Simplicity
5.3.5 Take-Home Notes on Overfitting
5.4 From Errors to Costs
5.4.1 Loss
5.4.2 Cost
5.4.3 Score
5.5 (Re)Sampling: Making More from Less
5.5.1 Cross-Validation
5.5.2 Stratification
5.5.3 Repeated Train-Test Splits
5.5.4 A Better Way and Shuffling
5.5.5 Leave-One-Out Cross-Validation
5.6 Break-It-Down: Deconstructing Error into Bias and Variance
5.6.1 Variance of the Data
5.6.2 Variance of the Model
5.6.3 Bias of the Model
5.6.4 All Together Now
5.6.5 Examples of Bias-Variance Tradeoffs
5.7 Graphical Evaluation and Comparison
5.7.1 Learning Curves: How Much Data Do We Need?
5.7.2 Complexity Curves
5.8 Comparing Learners with Cross-Validation
5.9 EOC
5.9.1 Summary
5.9.2 Notes
5.9.3 Exercises
6 Evaluating Classifiers
6.1 Baseline Classifiers
6.2 Beyond Accuracy: Metrics for Classification
6.2.1 Eliminating Confusion from the Confusion Matrix
6.2.2 Ways of Being Wrong
6.2.3 Metrics from the Confusion Matrix
6.2.4 Coding the Confusion Matrix
6.2.5 Dealing with Multiple Classes: Multiclass Averaging
6.2.6 F1
6.3 ROC Curves
6.3.1 Patterns in the ROC
6.3.2 Binary ROC
6.3.3 AUC: Area-Under-the-(ROC)-Curve
6.3.4 Multiclass Learners, One-versus-Rest, and ROC
6.4 Another Take on Multiclass: One-versus-One
6.4.1 Multiclass AUC Part Two: The Quest for a Single Value
6.5 Precision-Recall Curves
6.5.1 A Note on Precision-Recall Tradeoff
6.5.2 Constructing a Precision-Recall Curve
6.6 Cumulative Response and Lift Curves
6.7 More Sophisticated Evaluation of Classifiers: Take Two
6.7.1 Binary
6.7.2 A Novel Multiclass Problem
6.8 EOC
6.8.1 Summary
6.8.2 Notes
6.8.3 Exercises
7 Evaluating Regressors
7.1 Baseline Regressors
7.2 Additional Measures for Regression
7.2.1 Creating Our Own Evaluation Metric
7.2.2 Other Built-in Regression Metrics
7.2.3 R²
7.3 Residual Plots
7.3.1 Error Plots
7.3.2 Residual Plots
7.4 A First Look at Standardization
7.5 Evaluating Regressors in a More Sophisticated Way: Take Two
7.5.1 Cross-Validated Results on Multiple Metrics
7.5.2 Summarizing Cross-Validated Results
7.5.3 Residuals
7.6 EOC
7.6.1 Summary
7.6.2 Notes
7.6.3 Exercises
Part III: More Methods and Fundamentals
8 More Classification Methods
8.1 Revisiting Classification
8.2 Decision Trees
8.2.1 Tree-Building Algorithms
8.2.2 Let’s Go: Decision Tree Time
8.2.3 Bias and Variance in Decision Trees
8.3 Support Vector Classifiers
8.3.1 Performing SVC
8.3.2 Bias and Variance in SVCs
8.4 Logistic Regression
8.4.1 Betting Odds
8.4.2 Probabilities, Odds, and Log-Odds
8.4.3 Just Do It: Logistic Regression Edition
8.4.4 A Logistic Regression: A Space Oddity
8.5 Discriminant Analysis
8.5.1 Covariance
8.5.2 The Methods
8.5.3 Performing DA
8.6 Assumptions, Biases, and Classifiers
8.7 Comparison of Classifiers: Take Three
8.7.1 Digits
8.8 EOC
8.8.1 Summary
8.8.2 Notes
8.8.3 Exercises
9 More Regression Methods
9.1 Linear Regression in the Penalty Box: Regularization
9.1.1 Performing Regularized Regression
9.2 Support Vector Regression
9.2.1 Hinge Loss
9.2.2 From Linear Regression to Regularized Regression to Support Vector Regression
9.2.3 Just Do It — SVR Style
9.3 Piecewise Constant Regression
9.3.1 Implementing a Piecewise Constant Regressor
9.3.2 General Notes on Implementing Models
9.4 Regression Trees
9.4.1 Performing Regression with Trees
9.5 Comparison of Regressors: Take Three
9.6 EOC
9.6.1 Summary
9.6.2 Notes
9.6.3 Exercises
10 Manual Feature Engineering: Manipulating Data for Fun and Profit
10.1 Feature Engineering Terminology and Motivation
10.1.1 Why Engineer Features?
10.1.2 When Does Engineering Happen?
10.1.3 How Does Feature Engineering Occur?
10.2 Feature Selection and Data Reduction: Taking out the Trash
10.3 Feature Scaling
10.4 Discretization
10.5 Categorical Coding
10.5.1 Another Way to Code and the Curious Case of the Missing Intercept
10.6 Relationships and Interactions
10.6.1 Manual Feature Construction
10.6.2 Interactions
10.6.3 Adding Features with Transformers
10.7 Target Manipulations
10.7.1 Manipulating the Input Space
10.7.2 Manipulating the Target
10.8 EOC
10.8.1 Summary
10.8.2 Notes
10.8.3 Exercises
11 Tuning Hyperparameters and Pipelines
11.1 Models, Parameters, Hyperparameters
11.2 Tuning Hyperparameters
11.2.1 A Note on Computer Science and Learning Terminology
11.2.2 An Example of Complete Search
11.2.3 Using Randomness to Search for a Needle in a Haystack
11.3 Down the Recursive Rabbit Hole: Nested Cross-Validation
11.3.1 Cross-Validation, Redux
11.3.2 GridSearch as a Model
11.3.3 Cross-Validation Nested within Cross-Validation
11.3.4 Comments on Nested CV
11.4 Pipelines
11.4.1 A Simple Pipeline
11.4.2 A More Complex Pipeline
11.5 Pipelines and Tuning Together
11.6 EOC
11.6.1 Summary
11.6.2 Notes
11.6.3 Exercises
Part IV: Adding Complexity
12 Combining Learners
12.1 Ensembles
12.2 Voting Ensembles
12.3 Bagging and Random Forests
12.3.1 Bootstrapping
12.3.2 From Bootstrapping to Bagging
12.3.3 Through the Random Forest
12.4 Boosting
12.4.1 Boosting Details
12.5 Comparing the Tree-Ensemble Methods
12.6 EOC
12.6.1 Summary
12.6.2 Notes
12.6.3 Exercises
13 Models That Engineer Features for Us
13.1 Feature Selection
13.1.1 Single-Step Filtering with Metric-Based Feature Selection
13.1.2 Model-Based Feature Selection
13.1.3 Integrating Feature Selection with a Learning Pipeline
13.2 Feature Construction with Kernels
13.2.1 A Kernel Motivator
13.2.2 Manual Kernel Methods
13.2.3 Kernel Methods and Kernel Options
13.2.4 Kernelized SVCs: SVMs
13.2.5 Take-Home Notes on SVM and an Example
13.3 Principal Components Analysis: An Unsupervised Technique
13.3.1 A Warm Up: Centering
13.3.2 Finding a Different Best Line
13.3.3 A First PCA
13.3.4 Under the Hood of PCA
13.3.5 A Finale: Comments on General PCA
13.3.6 Kernel PCA and Manifold Methods
13.4 EOC
13.4.1 Summary
13.4.2 Notes
13.4.3 Exercises
14 Feature Engineering for Domains: Domain-Specific Learning
14.1 Working with Text
14.1.1 Encoding Text
14.1.2 Example of Text Learning
14.2 Clustering
14.2.1 k-Means Clustering
14.3 Working with Images
14.3.1 Bag of Visual Words
14.3.2 Our Image Data
14.3.3 An End-to-End System
14.3.4 Complete Code of BoVW Transformer
14.4 EOC
14.4.1 Summary
14.4.2 Notes
14.4.3 Exercises
15 Connections, Extensions, and Further Directions
15.1 Optimization
15.2 Linear Regression from Raw Materials
15.2.1 A Graphical View of Linear Regression
15.3 Building Logistic Regression from Raw Materials
15.3.1 Logistic Regression with Zero-One Coding
15.3.2 Logistic Regression with Plus-One Minus-One Coding
15.3.3 A Graphical View of Logistic Regression
15.4 SVM from Raw Materials
15.5 Neural Networks
15.5.1 A NN View of Linear Regression
15.5.2 A NN View of Logistic Regression
15.5.3 Beyond Basic Neural Networks
15.6 Probabilistic Graphical Models
15.6.1 Sampling
15.6.2 A PGM View of Linear Regression
15.6.3 A PGM View of Logistic Regression
15.7 EOC
15.7.1 Summary
15.7.2 Notes
15.7.3 Exercises
Appendix A: mlwpy.py Listing
Index
A
B
C
D
E
F
G
H
I
J
K
L
M
N
O
P
Q
R
S
T
U
V
W
X
Y
Z