Author(s): George Kyriakides; Konstantinos G. Margaritis
Year: 2019
Language: English
Pages: 298
Cover
Title Page
Copyright and Credits
About Packt
Contributors
Table of Contents
Preface
Section 1: Introduction and Required Software Tools
Chapter 1: A Machine Learning Refresher
Technical requirements
Learning from data
Popular machine learning datasets
Diabetes
Breast cancer
Handwritten digits
Supervised and unsupervised learning
Supervised learning
Unsupervised learning
Dimensionality reduction
Performance measures
Cost functions
Mean absolute error
Mean squared error
Cross entropy loss
Metrics
Classification accuracy
Confusion matrix
Sensitivity, specificity, and area under the curve
Precision, recall, and the F1 score
Evaluating models
Machine learning algorithms
Python packages
Supervised learning algorithms
Regression
Support vector machines
Neural networks
Decision trees
K-Nearest Neighbors
K-means
Summary
Chapter 2: Getting Started with Ensemble Learning
Technical requirements
Bias, variance, and the trade-off
What is bias?
What is variance?
Trade-off
Ensemble learning
Motivation
Identifying bias and variance
Validation curves
Learning curves
Ensemble methods
Difficulties in ensemble learning
Weak or noisy data
Understanding interpretability
Computational cost
Choosing the right models
Summary
Section 2: Non-Generative Methods
Chapter 3: Voting
Technical requirements
Hard and soft voting
Hard voting
Soft voting
Python implementation
Custom hard voting implementation
Analyzing our results using Python
Using scikit-learn
Hard voting implementation
Soft voting implementation
Analyzing our results
Summary
Chapter 4: Stacking
Technical requirements
Meta-learning
Stacking
Creating metadata
Deciding on an ensemble's composition
Selecting base learners
Selecting the meta-learner
Python implementation
Stacking for regression
Stacking for classification
Creating a stacking regressor class for scikit-learn
Summary
Section 3: Generative Methods
Chapter 5: Bagging
Technical requirements
Bootstrapping
Creating bootstrap samples
Bagging
Creating base learners
Strengths and weaknesses
Python implementation
Implementation
Parallelizing the implementation
Using scikit-learn
Bagging for classification
Bagging for regression
Summary
Chapter 6: Boosting
Technical requirements
AdaBoost
Weighted sampling
Creating the ensemble
Implementing AdaBoost in Python
Strengths and weaknesses
Gradient boosting
Creating the ensemble
Further reading
Implementing gradient boosting in Python
Using scikit-learn
Using AdaBoost
Using gradient boosting
XGBoost
Using XGBoost for regression
Using XGBoost for classification
Other boosting libraries
Summary
Chapter 7: Random Forests
Technical requirements
Understanding random forest trees
Building trees
Illustrative example
Extra trees
Creating forests
Analyzing forests
Strengths and weaknesses
Using scikit-learn
Random forests for classification
Random forests for regression
Extra trees for classification
Extra trees regression
Summary
Section 4: Clustering
Chapter 8: Clustering
Technical requirements
Consensus clustering
Hierarchical clustering
K-means clustering
Strengths and weaknesses
Using scikit-learn
Using voting
Using OpenEnsembles
Using graph closure and co-occurrence linkage
Graph closure
Co-occurrence matrix linkage
Summary
Section 5: Real World Applications
Chapter 9: Classifying Fraudulent Transactions
Technical requirements
Getting familiar with the dataset
Exploratory analysis
Evaluation methods
Voting
Testing the base learners
Optimizing the decision tree
Creating the ensemble
Stacking
Bagging
Boosting
XGBoost
Using random forests
Comparative analysis of ensembles
Summary
Chapter 10: Predicting Bitcoin Prices
Technical requirements
Time series data
Bitcoin data analysis
Establishing a baseline
The simulator
Voting
Improving voting
Stacking
Improving stacking
Bagging
Improving bagging
Boosting
Improving boosting
Random forests
Improving random forest
Summary
Chapter 11: Evaluating Sentiment on Twitter
Technical requirements
Sentiment analysis tools
Stemming
Getting Twitter data
Creating a model
Classifying tweets in real time
Summary
Chapter 12: Recommending Movies with Keras
Technical requirements
Demystifying recommendation systems
Neural recommendation systems
Using Keras for movie recommendations
Creating the dot model
Creating the dense model
Creating a stacking ensemble
Summary
Chapter 13: Clustering World Happiness
Technical requirements
Understanding the World Happiness Report
Creating the ensemble
Gaining insights
Summary
Another Book You May Enjoy
Index