Build Machine Learning models with a sound statistical understanding.

About This Book
- Learn about the statistics behind powerful predictive models with p-values, ANOVA, and F-statistics.
- Implement statistical computations programmatically for supervised learning and, through K-means clustering, for unsupervised learning.
- Master the statistical aspects of Machine Learning with the help of this example-rich guide to R and Python.

Who This Book Is For
This book is intended for developers with little to no background in statistics who want to implement Machine Learning in their systems. Some programming knowledge in R or Python will be useful.

What You Will Learn
- Understand the statistical and Machine Learning fundamentals necessary to build models
- Understand the major differences and parallels between the statistical approach and the Machine Learning approach to solving problems
- Learn how to prepare data and feed models using the appropriate Machine Learning algorithms from the extensive R and Python package ecosystems
- Analyze the results and tune the model to suit your own predictive goals
- Understand the statistical concepts required for Machine Learning
- Learn the fundamentals required for building supervised and unsupervised deep learning models
- Learn reinforcement learning and its applications in the field of artificial intelligence

In Detail
Complex statistics in Machine Learning worry a lot of developers. Knowing statistics helps you build strong Machine Learning models that are optimized for a given problem statement. This book teaches you everything it takes to perform the complex statistical computations required for Machine Learning. You will learn the statistics behind supervised learning, unsupervised learning, reinforcement learning, and more. You will work through real-world examples that discuss the statistical side of Machine Learning and become familiar with it. You will also design programs for tasks such as model and parameter fitting, regression, classification, density estimation, and more. By the end of the book, you will have mastered the statistics required for Machine Learning and will be able to apply your new skills to any industry problem.

Style and Approach
This practical, step-by-step guide will give you an understanding of the statistical and Machine Learning fundamentals you'll need to build models.
Author: Pratap Dangeti
Year: 2017
Cover
Copyright
Credits
About the Author
About the Reviewer
www.PacktPub.com
Customer Feedback
Table of Contents
Preface
Chapter 1: Journey from Statistics to Machine Learning
Statistical terminology for model building and validation
Machine learning
Major differences between statistical modeling and machine learning
Steps in machine learning model development and deployment
Statistical fundamentals and terminology for model building and validation
Bias versus variance trade-off
Train and test data
Machine learning terminology for model building and validation
Linear regression versus gradient descent
Machine learning losses
When to stop tuning machine learning models
Train, validation, and test data
Cross-validation
Grid search
Machine learning model overview
Summary
Chapter 2: Parallelism of Statistics and Machine Learning
Comparison between regression and machine learning models
Compensating factors in machine learning models
Assumptions of linear regression
Steps applied in linear regression modeling
Example of simple linear regression from first principles
Example of simple linear regression using the wine quality data
Example of multiple linear regression - step-by-step methodology of model building
Backward and forward selection
Machine learning models - ridge and lasso regression
Example of ridge regression machine learning
Example of lasso regression machine learning model
Regularization parameters in linear regression and ridge/lasso regression
Summary
Chapter 3: Logistic Regression Versus Random Forest
Maximum likelihood estimation
Logistic regression – introduction and advantages
Terminology involved in logistic regression
Applying steps in logistic regression modeling
Example of logistic regression using German credit data
Random forest
Example of random forest using German credit data
Grid search on random forest
Variable importance plot
Comparison of logistic regression with random forest
Summary
Chapter 4: Tree-Based Machine Learning Models
Introducing decision tree classifiers
Terminology used in decision trees
Decision tree working methodology from first principles
Comparison between logistic regression and decision trees
Comparison of error components across various styles of models
Remedial actions to push the model towards the ideal region
HR attrition data example
Decision tree classifier
Tuning class weights in decision tree classifier
Bagging classifier
Random forest classifier
Random forest classifier - grid search
AdaBoost classifier
Gradient boosting classifier
Comparison between AdaBoost and gradient boosting
Extreme gradient boosting - XGBoost classifier
Ensemble of ensembles - model stacking
Ensemble of ensembles with different types of classifiers
Ensemble of ensembles with bootstrap samples using a single type of classifier
Summary
Chapter 5: K-Nearest Neighbors and Naive Bayes
K-nearest neighbors
KNN voter example
Curse of dimensionality
Curse of dimensionality with 1D, 2D, and 3D example
KNN classifier with breast cancer Wisconsin data example
Tuning of k-value in KNN classifier
Naive Bayes
Probability fundamentals
Joint probability
Understanding Bayes' theorem with conditional probability
Naive Bayes classification
Laplace estimator
Naive Bayes SMS spam classification example
Summary
Chapter 6: Support Vector Machines and Neural Networks
Support vector machines working principles
Maximum margin classifier
Support vector classifier
Support vector machines
Kernel functions
SVM multiclass classifier with letter recognition data example
Maximum margin classifier - linear kernel
Polynomial kernel
RBF kernel
Artificial neural networks - ANN
Activation functions
Forward propagation and backpropagation
Optimization of neural networks
Stochastic gradient descent - SGD
Momentum
Nesterov accelerated gradient - NAG
Adagrad
Adadelta
RMSprop
Adaptive moment estimation - Adam
Limited-memory Broyden-Fletcher-Goldfarb-Shanno - L-BFGS optimization algorithm
Dropout in neural networks
ANN classifier applied on handwritten digits using scikit-learn
Introduction to deep learning
Solving methodology
Deep learning software
Deep neural network classifier applied on handwritten digits using Keras
Summary
Chapter 7: Recommendation Engines
Content-based filtering
Cosine similarity
Collaborative filtering
Advantages of collaborative filtering over content-based filtering
Matrix factorization using the alternating least squares algorithm for collaborative filtering
Evaluation of recommendation engine model
Hyperparameter selection in recommendation engines using grid search
Recommendation engine application on MovieLens data
User-user similarity matrix
Movie-movie similarity matrix
Collaborative filtering using ALS
Grid search on collaborative filtering
Summary
Chapter 8: Unsupervised Learning
K-means clustering
K-means working methodology from first principles
Optimal number of clusters and cluster evaluation
The elbow method
K-means clustering with the iris data example
Principal component analysis - PCA
PCA working methodology from first principles
PCA applied on handwritten digits using scikit-learn
Singular value decomposition - SVD
SVD applied on handwritten digits using scikit-learn
Deep autoencoders
Model building technique using encoder-decoder architecture
Deep autoencoders applied on handwritten digits using Keras
Summary
Chapter 9: Reinforcement Learning
Introduction to reinforcement learning
Comparing supervised, unsupervised, and reinforcement learning in detail
Characteristics of reinforcement learning
Reinforcement learning basics
Category 1 - value based
Category 2 - policy based
Category 3 - actor-critic
Category 4 - model-free
Category 5 - model-based
Fundamental categories in sequential decision making
Markov decision processes and Bellman equations
Dynamic programming
Algorithms to compute optimal policy using dynamic programming
Grid world example using value and policy iteration algorithms with basic Python
Monte Carlo methods
Comparison between dynamic programming and Monte Carlo methods
Key advantages of MC over DP methods
Monte Carlo prediction
The suitability of Monte Carlo prediction on grid-world problems
Modeling Blackjack example of Monte Carlo methods using Python
Temporal difference learning
Comparison between Monte Carlo methods and temporal difference learning
TD prediction
Driving office example for TD learning
SARSA on-policy TD control
Q-learning - off-policy TD control
Cliff walking example of on-policy and off-policy TD control
Applications of reinforcement learning with integration of machine learning and deep learning
Automotive vehicle control - self-driving cars
Google DeepMind's AlphaGo
Robo soccer
Further reading
Summary
Index