Hands-On Machine Learning with R

Hands-On Machine Learning with R provides a practical, applied approach to learning and developing intuition for today's most popular machine learning methods. The book serves as a practitioner's guide to the machine learning process and is meant to help readers apply the machine learning stack within R, using packages such as glmnet, h2o, ranger, xgboost, and keras to effectively model and gain insight from their data. It favors a hands-on approach, building an intuitive understanding of machine learning concepts through concrete examples and just a little bit of theory.

Throughout the book, readers work through the entire machine learning process, including feature engineering, resampling, hyperparameter tuning, model evaluation, and interpretation. They are introduced to powerful algorithms such as regularized regression, random forests, gradient boosting machines, deep learning, and generalized low rank models. By favoring a hands-on approach and using real-world data, readers gain an intuitive understanding of the architectures and engines that drive these algorithms and packages, learn when and how to tune the various hyperparameters, and become able to interpret model results. By the end of the book, readers should have a firm grasp of R's machine learning stack and be able to implement a systematic approach for producing high-quality modeling results.
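
The workflow described above can be sketched in a few lines of R. The snippet below is not taken from the book; it is a minimal illustrative sketch of the split/fit/evaluate cycle the chapters cover in depth, assuming the AmesHousing, rsample, and ranger packages are installed.

# A minimal sketch of the split -> fit -> evaluate workflow (not from the book).
# Assumes the AmesHousing, rsample, and ranger packages are installed.
library(AmesHousing)  # Ames housing data used throughout the book
library(rsample)      # data splitting
library(ranger)       # fast random forest engine

ames <- make_ames()

# Stratified 70/30 train/test split on the response
set.seed(123)
split <- initial_split(ames, prop = 0.7, strata = "Sale_Price")
train <- training(split)
test  <- testing(split)

# Fit a default random forest
fit <- ranger(Sale_Price ~ ., data = train, seed = 123)

# Evaluate with RMSE on the held-out test set
pred <- predict(fit, data = test)$predictions
sqrt(mean((test$Sale_Price - pred)^2))

The same pattern (stratified split, fit with a chosen engine, evaluate on held-out data) carries over to the other engines the book covers, such as glmnet, xgboost, and h2o.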

Features:

Offers a practical and applied introduction to the most popular machine learning methods.

Takes readers through the entire modeling process, from data preparation to hyperparameter tuning, model evaluation, and interpretation.

Introduces readers to a wide variety of packages that make up R's machine learning stack.

Uses a hands-on approach and real-world data.

Author(s): Brad Boehmke; Brandon M. Greenwell
Series: Chapman & Hall/CRC The R Series
Publisher: CRC Press
Year: 2020

Language: English
Pages: xxiv+459

Cover
Half Title
Title Page
Copyright Page
Dedication
Table of Contents
Preface
I: Fundamentals
Chapter 1: Introduction to Machine Learning
1.1 Supervised learning
1.1.1 Regression problems
1.1.2 Classification problems
1.2 Unsupervised learning
1.3 Roadmap
1.4 The data sets
Chapter 2: Modeling Process
2.1 Prerequisites
2.2 Data splitting
2.2.1 Simple random sampling
2.2.2 Stratified sampling
2.2.3 Class imbalances
2.3 Creating models in R
2.3.1 Many formula interfaces
2.3.2 Many engines
2.4 Resampling methods
2.4.1 k-fold cross validation
2.4.2 Bootstrapping
2.4.3 Alternatives
2.5 Bias variance trade-off
2.5.1 Bias
2.5.2 Variance
2.5.3 Hyperparameter tuning
2.6 Model evaluation
2.6.1 Regression models
2.6.2 Classification models
2.7 Putting the processes together
Chapter 3: Feature & Target Engineering
3.1 Prerequisites
3.2 Target engineering
3.3 Dealing with missingness
3.3.1 Visualizing missing values
3.3.2 Imputation
3.4 Feature filtering
3.5 Numeric feature engineering
3.5.1 Skewness
3.5.2 Standardization
3.6 Categorical feature engineering
3.6.1 Lumping
3.6.2 One-hot & dummy encoding
3.6.3 Label encoding
3.6.4 Alternatives
3.7 Dimension reduction
3.8 Proper implementation
3.8.1 Sequential steps
3.8.2 Data leakage
3.8.3 Putting the process together
II: Supervised Learning
Chapter 4: Linear Regression
4.1 Prerequisites
4.2 Simple linear regression
4.2.1 Estimation
4.2.2 Inference
4.3 Multiple linear regression
4.4 Assessing model accuracy
4.5 Model concerns
4.6 Principal component regression
4.7 Partial least squares
4.8 Feature interpretation
4.9 Final thoughts
Chapter 5: Logistic Regression
5.1 Prerequisites
5.2 Why logistic regression
5.3 Simple logistic regression
5.4 Multiple logistic regression
5.5 Assessing model accuracy
5.6 Model concerns
5.7 Feature interpretation
5.8 Final thoughts
Chapter 6: Regularized Regression
6.1 Prerequisites
6.2 Why regularize?
6.2.1 Ridge penalty
6.2.2 Lasso penalty
6.2.3 Elastic nets
6.3 Implementation
6.4 Tuning
6.5 Feature interpretation
6.6 Attrition data
6.7 Final thoughts
Chapter 7: Multivariate Adaptive Regression Splines
7.1 Prerequisites
7.2 The basic idea
7.2.1 Multivariate adaptive regression splines
7.3 Fitting a basic MARS model
7.4 Tuning
7.5 Feature interpretation
7.6 Attrition data
7.7 Final thoughts
Chapter 8: K-Nearest Neighbors
8.1 Prerequisites
8.2 Measuring similarity
8.2.1 Distance measures
8.2.2 Preprocessing
8.3 Choosing k
8.4 MNIST example
8.5 Final thoughts
Chapter 9: Decision Trees
9.1 Prerequisites
9.2 Structure
9.3 Partitioning
9.4 How deep?
9.4.1 Early stopping
9.4.2 Pruning
9.5 Ames housing example
9.6 Feature interpretation
9.7 Final thoughts
Chapter 10: Bagging
10.1 Prerequisites
10.2 Why and when bagging works
10.3 Implementation
10.4 Easily parallelize
10.5 Feature interpretation
10.6 Final thoughts
Chapter 11: Random Forests
11.1 Prerequisites
11.2 Extending bagging
11.3 Out-of-the-box performance
11.4 Hyperparameters
11.4.1 Number of trees
11.4.2 mtry
11.4.3 Tree complexity
11.4.4 Sampling scheme
11.4.5 Split rule
11.5 Tuning strategies
11.6 Feature interpretation
11.7 Final thoughts
Chapter 12: Gradient Boosting
12.1 Prerequisites
12.2 How boosting works
12.2.1 A sequential ensemble approach
12.2.2 Gradient descent
12.3 Basic GBM
12.3.1 Hyperparameters
12.3.2 Implementation
12.3.3 General tuning strategy
12.4 Stochastic GBMs
12.4.1 Stochastic hyperparameters
12.4.2 Implementation
12.5 XGBoost
12.5.1 XGBoost hyperparameters
12.5.2 Tuning strategy
12.6 Feature interpretation
12.7 Final thoughts
Chapter 13: Deep Learning
13.1 Prerequisites
13.2 Why deep learning
13.3 Feedforward DNNs
13.4 Network architecture
13.4.1 Layers and nodes
13.4.2 Activation
13.5 Backpropagation
13.6 Model training
13.7 Model tuning
13.7.1 Model capacity
13.7.2 Batch normalization
13.7.3 Regularization
13.7.4 Adjust learning rate
13.8 Grid search
13.9 Final thoughts
Chapter 14: Support Vector Machines
14.1 Prerequisites
14.2 Optimal separating hyperplanes
14.2.1 The hard margin classifier
14.2.2 The soft margin classifier
14.3 The support vector machine
14.3.1 More than two classes
14.3.2 Support vector regression
14.4 Job attrition example
14.4.1 Class weights
14.4.2 Class probabilities
14.5 Feature interpretation
14.6 Final thoughts
Chapter 15: Stacked Models
15.1 Prerequisites
15.2 The idea
15.2.1 Common ensemble methods
15.2.2 Super learner algorithm
15.2.3 Available packages
15.3 Stacking existing models
15.4 Stacking a grid search
15.5 Automated machine learning
Chapter 16: Interpretable Machine Learning
16.1 Prerequisites
16.2 The idea
16.2.1 Global interpretation
16.2.2 Local interpretation
16.2.3 Model-specific vs. model-agnostic
16.3 Permutation-based feature importance
16.3.1 Concept
16.3.2 Implementation
16.4 Partial dependence
16.4.1 Concept
16.4.2 Implementation
16.4.3 Alternative uses
16.5 Individual conditional expectation
16.5.1 Concept
16.5.2 Implementation
16.6 Feature interactions
16.6.1 Concept
16.6.2 Implementation
16.6.3 Alternatives
16.7 Local interpretable model-agnostic explanations
16.7.1 Concept
16.7.2 Implementation
16.7.3 Tuning
16.7.4 Alternative uses
16.8 Shapley values
16.8.1 Concept
16.8.2 Implementation
16.8.3 XGBoost and built-in Shapley values
16.9 Localized step-wise procedure
16.9.1 Concept
16.9.2 Implementation
16.10 Final thoughts
III: Dimension Reduction
Chapter 17: Principal Components Analysis
17.1 Prerequisites
17.2 The idea
17.3 Finding principal components
17.4 Performing PCA in R
17.5 Selecting the number of principal components
17.5.1 Eigenvalue criterion
17.5.2 Proportion of variance explained criterion
17.5.3 Scree plot criterion
17.6 Final thoughts
Chapter 18: Generalized Low Rank Models
18.1 Prerequisites
18.2 The idea
18.3 Finding the lower ranks
18.3.1 Alternating minimization
18.3.2 Loss functions
18.3.3 Regularization
18.3.4 Selecting k
18.4 Fitting GLRMs in R
18.4.1 Basic GLRM model
18.4.2 Tuning to optimize for unseen data
18.5 Final thoughts
Chapter 19: Autoencoders
19.1 Prerequisites
19.2 Undercomplete autoencoders
19.2.1 Comparing PCA to an autoencoder
19.2.2 Stacked autoencoders
19.2.3 Visualizing the reconstruction
19.3 Sparse autoencoders
19.4 Denoising autoencoders
19.5 Anomaly detection
19.6 Final thoughts
IV: Clustering
Chapter 20: K-means Clustering
20.1 Prerequisites
20.2 Distance measures
20.3 Defining clusters
20.4 k-means algorithm
20.5 Clustering digits
20.6 How many clusters?
20.7 Clustering with mixed data
20.8 Alternative partitioning methods
20.9 Final thoughts
Chapter 21: Hierarchical Clustering
21.1 Prerequisites
21.2 Hierarchical clustering algorithms
21.3 Hierarchical clustering in R
21.3.1 Agglomerative hierarchical clustering
21.3.2 Divisive hierarchical clustering
21.4 Determining optimal clusters
21.5 Working with dendrograms
21.6 Final thoughts
Chapter 22: Model-based Clustering
22.1 Prerequisites
22.2 Measuring probability and uncertainty
22.3 Covariance types
22.4 Model selection
22.5 My basket example
22.6 Final thoughts
Bibliography
Index