Link to the GitHub repository containing the code examples and additional material:
https://github.com/rasbt/python-machi...
Many of the most innovative breakthroughs and exciting new technologies can be attributed to applications of machine learning. We are living in an age where data comes in abundance, and thanks to the self-learning algorithms from the field of machine learning, we can turn this data into knowledge. Automated speech recognition on our smartphones, web search engines, e-mail spam filters, the recommendation systems of our favorite movie streaming services – machine learning makes it all possible.
Thanks to the many powerful open-source libraries that have been developed in recent years, machine learning is now right at our fingertips. Python provides the perfect environment to build machine learning systems productively.
This book will teach you the fundamentals of machine learning and how to apply them to real-world applications using Python. Step by step, you will expand your skill set with best practices for transforming raw data into useful information, developing learning algorithms efficiently, and evaluating results.
You will discover the different problem categories that machine learning can solve and explore how to classify objects, predict continuous outcomes with regression analysis, and find hidden structures in data via clustering. You will build your own machine learning system for sentiment analysis and, finally, learn how to embed your model into a web app to share it with the world.
Authors: Sebastian Raschka, Vahid Mirjalili
Edition: 3
Year: 2019
Cover
Copyright
Packt Page
Contributors
Table of Contents
Preface
Chapter 1: Giving Computers the Ability to Learn from Data
Building intelligent machines to transform data into knowledge
The three different types of machine learning
Making predictions about the future with supervised learning
Classification for predicting class labels
Regression for predicting continuous outcomes
Solving interactive problems with reinforcement learning
Discovering hidden structures with unsupervised learning
Finding subgroups with clustering
Dimensionality reduction for data compression
Introduction to the basic terminology and notations
Notation and conventions used in this book
Machine learning terminology
A roadmap for building machine learning systems
Preprocessing – getting data into shape
Training and selecting a predictive model
Evaluating models and predicting unseen data instances
Using Python for machine learning
Installing Python and packages from the Python Package Index
Using the Anaconda Python distribution and package manager
Packages for scientific computing, data science, and machine learning
Summary
Chapter 2: Training Simple Machine Learning Algorithms for Classification
Artificial neurons – a brief glimpse into the early history of machine learning
The formal definition of an artificial neuron
The perceptron learning rule
Implementing a perceptron learning algorithm in Python
An object-oriented perceptron API
Training a perceptron model on the Iris dataset
Adaptive linear neurons and the convergence of learning
Minimizing cost functions with gradient descent
Implementing Adaline in Python
Improving gradient descent through feature scaling
Large-scale machine learning and stochastic gradient descent
Summary
Chapter 3: A Tour of Machine Learning Classifiers Using scikit-learn
Choosing a classification algorithm
First steps with scikit-learn – training a perceptron
Modeling class probabilities via logistic regression
Logistic regression and conditional probabilities
Learning the weights of the logistic cost function
Converting an Adaline implementation into an algorithm for logistic regression
Training a logistic regression model with scikit-learn
Tackling overfitting via regularization
Maximum margin classification with support vector machines
Maximum margin intuition
Dealing with a nonlinearly separable case using slack variables
Alternative implementations in scikit-learn
Solving nonlinear problems using a kernel SVM
Kernel methods for linearly inseparable data
Using the kernel trick to find separating hyperplanes in a high-dimensional space
Decision tree learning
Maximizing IG – getting the most bang for your buck
Building a decision tree
Combining multiple decision trees via random forests
K-nearest neighbors – a lazy learning algorithm
Summary
Chapter 4: Building Good Training Datasets – Data Preprocessing
Dealing with missing data
Identifying missing values in tabular data
Eliminating training examples or features with missing values
Imputing missing values
Understanding the scikit-learn estimator API
Handling categorical data
Categorical data encoding with pandas
Mapping ordinal features
Encoding class labels
Performing one-hot encoding on nominal features
Partitioning a dataset into separate training and test datasets
Bringing features onto the same scale
Selecting meaningful features
L1 and L2 regularization as penalties against model complexity
A geometric interpretation of L2 regularization
Sparse solutions with L1 regularization
Sequential feature selection algorithms
Assessing feature importance with random forests
Summary
Chapter 5: Compressing Data via Dimensionality Reduction
Unsupervised dimensionality reduction via principal component analysis
The main steps behind principal component analysis
Extracting the principal components step by step
Total and explained variance
Feature transformation
Principal component analysis in scikit-learn
Supervised data compression via linear discriminant analysis
Principal component analysis versus linear discriminant analysis
The inner workings of linear discriminant analysis
Computing the scatter matrices
Selecting linear discriminants for the new feature subspace
Projecting examples onto the new feature space
LDA via scikit-learn
Using kernel principal component analysis for nonlinear mappings
Kernel functions and the kernel trick
Implementing a kernel principal component analysis in Python
Example 1 – separating half-moon shapes
Example 2 – separating concentric circles
Projecting new data points
Kernel principal component analysis in scikit-learn
Summary
Chapter 6: Learning Best Practices for Model Evaluation and Hyperparameter Tuning
Streamlining workflows with pipelines
Loading the Breast Cancer Wisconsin dataset
Combining transformers and estimators in a pipeline
Using k-fold cross-validation to assess model performance
The holdout method
K-fold cross-validation
Debugging algorithms with learning and validation curves
Diagnosing bias and variance problems with learning curves
Addressing over- and underfitting with validation curves
Fine-tuning machine learning models via grid search
Tuning hyperparameters via grid search
Algorithm selection with nested cross-validation
Looking at different performance evaluation metrics
Reading a confusion matrix
Optimizing the precision and recall of a classification model
Plotting a receiver operating characteristic
Scoring metrics for multiclass classification
Dealing with class imbalance
Summary
Chapter 7: Combining Different Models for Ensemble Learning
Learning with ensembles
Combining classifiers via majority vote
Implementing a simple majority vote classifier
Using the majority voting principle to make predictions
Evaluating and tuning the ensemble classifier
Bagging – building an ensemble of classifiers from bootstrap samples
Bagging in a nutshell
Applying bagging to classify examples in the Wine dataset
Leveraging weak learners via adaptive boosting
How boosting works
Applying AdaBoost using scikit-learn
Summary
Chapter 8: Applying Machine Learning to Sentiment Analysis
Preparing the IMDb movie review data for text processing
Obtaining the movie review dataset
Preprocessing the movie dataset into a more convenient format
Introducing the bag-of-words model
Transforming words into feature vectors
Assessing word relevancy via term frequency-inverse document frequency
Cleaning text data
Processing documents into tokens
Training a logistic regression model for document classification
Working with bigger data – online algorithms and out-of-core learning
Topic modeling with Latent Dirichlet Allocation
Decomposing text documents with LDA
LDA with scikit-learn
Summary
Chapter 9: Embedding a Machine Learning Model into a Web Application
Serializing fitted scikit-learn estimators
Setting up an SQLite database for data storage
Developing a web application with Flask
Our first Flask web application
Form validation and rendering
Setting up the directory structure
Implementing a macro using the Jinja2 templating engine
Adding style via CSS
Creating the result page
Turning the movie review classifier into a web application
Files and folders – looking at the directory tree
Implementing the main application as app.py
Setting up the review form
Creating a results page template
Deploying the web application to a public server
Creating a PythonAnywhere account
Uploading the movie classifier application
Updating the movie classifier
Summary
Chapter 10: Predicting Continuous Target Variables with Regression Analysis
Introducing linear regression
Simple linear regression
Multiple linear regression
Exploring the Housing dataset
Loading the Housing dataset into a data frame
Visualizing the important characteristics of a dataset
Looking at relationships using a correlation matrix
Implementing an ordinary least squares linear regression model
Solving regression for regression parameters with gradient descent
Estimating the coefficient of a regression model via scikit-learn
Fitting a robust regression model using RANSAC
Evaluating the performance of linear regression models
Using regularized methods for regression
Turning a linear regression model into a curve – polynomial regression
Adding polynomial terms using scikit-learn
Modeling nonlinear relationships in the Housing dataset
Dealing with nonlinear relationships using random forests
Decision tree regression
Random forest regression
Summary
Chapter 11: Working with Unlabeled Data – Clustering Analysis
Grouping objects by similarity using k-means
K-means clustering using scikit-learn
A smarter way of placing the initial cluster centroids using k-means++
Hard versus soft clustering
Using the elbow method to find the optimal number of clusters
Quantifying the quality of clustering via silhouette plots
Organizing clusters as a hierarchical tree
Grouping clusters in bottom-up fashion
Performing hierarchical clustering on a distance matrix
Attaching dendrograms to a heat map
Applying agglomerative clustering via scikit-learn
Locating regions of high density via DBSCAN
Summary
Chapter 12: Implementing a Multilayer Artificial Neural Network from Scratch
Modeling complex functions with artificial neural networks
Single-layer neural network recap
Introducing the multilayer neural network architecture
Activating a neural network via forward propagation
Classifying handwritten digits
Obtaining and preparing the MNIST dataset
Implementing a multilayer perceptron
Training an artificial neural network
Computing the logistic cost function
Developing your understanding of backpropagation
Training neural networks via backpropagation
About the convergence in neural networks
A few last words about the neural network implementation
Summary
Chapter 13: Parallelizing Neural Network Training with TensorFlow
TensorFlow and training performance
Performance challenges
What is TensorFlow?
How we will learn TensorFlow
First steps with TensorFlow
Installing TensorFlow
Creating tensors in TensorFlow
Manipulating the data type and shape of a tensor
Applying mathematical operations to tensors
Split, stack, and concatenate tensors
Building input pipelines using tf.data – the TensorFlow Dataset API
Creating a TensorFlow Dataset from existing tensors
Combining two tensors into a joint dataset
Shuffle, batch, and repeat
Creating a dataset from files on your local storage disk
Fetching available datasets from the tensorflow_datasets library
Building an NN model in TensorFlow
The TensorFlow Keras API (tf.keras)
Building a linear regression model
Model training via the .compile() and .fit() methods
Building a multilayer perceptron for classifying flowers in the Iris dataset
Evaluating the trained model on the test dataset
Saving and reloading the trained model
Choosing activation functions for multilayer neural networks
Logistic function recap
Estimating class probabilities in multiclass classification via the softmax function
Broadening the output spectrum using a hyperbolic tangent
Rectified linear unit activation
Summary
Chapter 14: Going Deeper – The Mechanics of TensorFlow
The key features of TensorFlow
TensorFlow's computation graphs: migrating to TensorFlow v2
Understanding computation graphs
Creating a graph in TensorFlow v1.x
Migrating a graph to TensorFlow v2
Loading input data into a model: TensorFlow v1.x style
Loading input data into a model: TensorFlow v2 style
Improving computational performance with function decorators
TensorFlow Variable objects for storing and updating model parameters
Computing gradients via automatic differentiation and GradientTape
Computing the gradients of the loss with respect to trainable variables
Computing gradients with respect to non-trainable tensors
Keeping resources for multiple gradient computations
Simplifying implementations of common architectures via the Keras API
Solving an XOR classification problem
Making model building more flexible with Keras' functional API
Implementing models based on Keras' Model class
Writing custom Keras layers
TensorFlow Estimators
Working with feature columns
Machine learning with pre-made Estimators
Using Estimators for MNIST handwritten digit classification
Creating a custom Estimator from an existing Keras model
Summary
Chapter 15: Classifying Images with Deep Convolutional Neural Networks
The building blocks of CNNs
Understanding CNNs and feature hierarchies
Performing discrete convolutions
Discrete convolutions in one dimension
Padding inputs to control the size of the output feature maps
Determining the size of the convolution output
Performing a discrete convolution in 2D
Subsampling layers
Putting everything together – implementing a CNN
Working with multiple input or color channels
Regularizing an NN with dropout
Loss functions for classification
Implementing a deep CNN using TensorFlow
The multilayer CNN architecture
Loading and preprocessing the data
Implementing a CNN using the TensorFlow Keras API
Configuring CNN layers in Keras
Constructing a CNN in Keras
Gender classification from face images using a CNN
Loading the CelebA dataset
Image transformation and data augmentation
Training a CNN gender classifier
Summary
Chapter 16: Modeling Sequential Data Using Recurrent Neural Networks
Introducing sequential data
Modeling sequential data – order matters
Representing sequences
The different categories of sequence modeling
RNNs for modeling sequences
Understanding the RNN looping mechanism
Computing activations in an RNN
Hidden-recurrence versus output-recurrence
The challenges of learning long-range interactions
Long short-term memory cells
Implementing RNNs for sequence modeling in TensorFlow
Project one: predicting the sentiment of IMDb movie reviews
Preparing the movie review data
Embedding layers for sentence encoding
Building an RNN model
Building an RNN model for the sentiment analysis task
Project two: character-level language modeling in TensorFlow
Preprocessing the dataset
Building a character-level RNN model
Evaluation phase: generating new text passages
Understanding language with the Transformer model
Understanding the self-attention mechanism
A basic version of self-attention
Parameterizing the self-attention mechanism with query, key, and value weights
Multi-head attention and the Transformer block
Summary
Chapter 17: Generative Adversarial Networks for Synthesizing New Data
Introducing generative adversarial networks
Starting with autoencoders
Generative models for synthesizing new data
Generating new samples with GANs
Understanding the loss functions of the generator and discriminator networks in a GAN model
Implementing a GAN from scratch
Training GAN models on Google Colab
Implementing the generator and the discriminator networks
Defining the training dataset
Training the GAN model
Improving the quality of synthesized images using a convolutional and Wasserstein GAN
Transposed convolution
Batch normalization
Implementing the generator and discriminator
Dissimilarity measures between two distributions
Using EM distance in practice for GANs
Gradient penalty
Implementing WGAN-GP to train the DCGAN model
Mode collapse
Other GAN applications
Summary
Chapter 18: Reinforcement Learning for Decision Making in Complex Environments
Introduction: learning from experience
Understanding reinforcement learning
Defining the agent-environment interface of a reinforcement learning system
The theoretical foundations of RL
Markov decision processes
The mathematical formulation of Markov decision processes
Visualization of a Markov process
Episodic versus continuing tasks
RL terminology: return, policy, and value function
The return
Policy
Value function
Dynamic programming using the Bellman equation
Reinforcement learning algorithms
Dynamic programming
Policy evaluation – predicting the value function with dynamic programming
Improving the policy using the estimated value function
Policy iteration
Value iteration
Reinforcement learning with Monte Carlo
State-value function estimation using MC
Action-value function estimation using MC
Finding an optimal policy using MC control
Policy improvement – computing the greedy policy from the action-value function
Temporal difference learning
TD prediction
On-policy TD control (SARSA)
Off-policy TD control (Q-learning)
Implementing our first RL algorithm
Introducing the OpenAI Gym toolkit
Working with the existing environments in OpenAI Gym
A grid world example
Implementing the grid world environment in OpenAI Gym
Solving the grid world problem with Q-learning
Implementing the Q-learning algorithm
A glance at deep Q-learning
Training a DQN model according to the Q-learning algorithm
Implementing a deep Q-learning algorithm
Chapter and book summary
Other Books You May Enjoy
Index