Deploy deep learning solutions in production with ease using TensorFlow. You'll also develop the mathematical understanding and intuition required to invent new deep learning architectures and solutions on your own.
Pro Deep Learning with TensorFlow provides practical, hands-on expertise so you can learn deep learning from scratch and deploy meaningful deep learning solutions. This book will allow you to get up to speed quickly using TensorFlow and to optimize different deep learning architectures.
The book emphasizes the practical aspects of deep learning that are relevant across industries, and you can use the prototypes it demonstrates as the basis for new deep learning applications. The code presented in the book is available as IPython notebooks and scripts, which allow you to try out the examples and extend them in interesting ways.
You will be equipped with the mathematical foundation and scientific knowledge to pursue research in this field and give back to the community.
What You'll Learn
Understand full-stack deep learning using TensorFlow and gain a solid mathematical foundation for deep learning
Deploy complex deep learning solutions in production using TensorFlow
Carry out research on deep learning and perform experiments using TensorFlow
Who This Book Is For
Data scientists and machine learning professionals, software developers, graduate students, and open source enthusiasts
Author(s): Santanu Pattanayak
Publisher: Apress
Year: 2017
Language: English
Pages: 398
Contents
About the Author
About the Technical Reviewer
Acknowledgments
Introduction
Chapter 1: Mathematical Foundations
Linear Algebra
Vector
Scalar
Matrix
Tensor
Matrix Operations and Manipulations
Addition of Two Matrices
Subtraction of Two Matrices
Product of Two Matrices
Transpose of a Matrix
Dot Product of Two Vectors
Matrix Working on a Vector
Linear Independence of Vectors
Rank of a Matrix
Identity Matrix or Operator
Determinant of a Matrix
Interpretation of Determinant
Inverse of a Matrix
Norm of a Vector
Pseudo Inverse of a Matrix
Unit Vector in the Direction of a Specific Vector
Projection of a Vector in the Direction of Another Vector
Eigenvectors
Characteristic Equation of a Matrix
Power Iteration Method for Computing Eigenvectors
Calculus
Differentiation
Gradient of a Function
Successive Partial Derivatives
Hessian Matrix of a Function
Maxima and Minima of Functions
Rules for Maxima and Minima for a Univariate Function
Local Minima and Global Minima
Positive Semi-Definite and Positive Definite
Convex Set
Convex Function
Non-convex Function
Multivariate Convex and Non-convex Functions Examples
Taylor Series
Probability
Unions, Intersection, and Conditional Probability
Chain Rule of Probability for the Intersection of Events
Mutually Exclusive Events
Independence of Events
Conditional Independence of Events
Bayes Rule
Probability Mass Function
Probability Density Function
Expectation of a Random Variable
Variance of a Random Variable
Skewness and Kurtosis
Covariance
Correlation Coefficient
Some Common Probability Distributions
Uniform Distribution
Normal Distribution
Multivariate Normal Distribution
Bernoulli Distribution
Binomial Distribution
Poisson Distribution
Likelihood Function
Maximum Likelihood Estimate
Hypothesis Testing and p Value
Formulation of Machine-Learning Algorithms and Optimization Techniques
Supervised Learning
Linear Regression as a Supervised Learning Method
Linear Regression Through Vector Space Approach
Classification
Hyperplanes and Linear Classifiers
Unsupervised Learning
Optimization Techniques for Machine Learning
Gradient Descent
Gradient Descent for a Multivariate Cost Function
Contour Plot and Contour Lines
Steepest Descent
Stochastic Gradient Descent
Newton’s Method
Linear Curve
Negative Curvature
Positive Curvature
Constrained Optimization Problem
A Few Important Topics in Machine Learning
Dimensionality Reduction Methods
Principal Component Analysis
When Will PCA Be Useful in Data Reduction?
How Do You Know How Much Variance Is Retained by the Selected Principal Components?
Singular Value Decomposition
Regularization
Regularization Viewed as a Constrained Optimization Problem
Summary
Chapter 2: Introduction to Deep-Learning Concepts and TensorFlow
Deep Learning and Its Evolution
Perceptrons and Perceptron Learning Algorithm
Geometrical Interpretation of Perceptron Learning
Limitations of Perceptron Learning
Need for Non-linearity
Hidden Layer Perceptrons’ Activation Function for Non-linearity
Different Activation Functions for a Neuron/Perceptron
Linear Activation Function
Binary Threshold Activation Function
Sigmoid Activation Function
SoftMax Activation Function
Rectified Linear Unit (ReLU) Activation Function
Tanh Activation Function
Learning Rule for Multi-Layer Perceptron Networks
Backpropagation for Gradient Computation
Generalizing the Backpropagation Method for Gradient Computation
Deep Learning Versus Traditional Methods
TensorFlow
Common Deep-Learning Packages
TensorFlow Installation
TensorFlow Basics for Development
Gradient-Descent Optimization Methods from a Deep-Learning Perspective
Elliptical Contours
Non-convexity of Cost Functions
Saddle Points in the High-Dimensional Cost Functions
Learning Rate in Mini-batch Approach to Stochastic Gradient Descent
Optimizers in TensorFlow
GradientDescentOptimizer
Usage
AdagradOptimizer
Usage
RMSprop
Usage
AdadeltaOptimizer
Usage
AdamOptimizer
Usage
MomentumOptimizer and Nesterov Algorithm
Usage
Epoch, Number of Batches, and Batch Size
XOR Implementation Using TensorFlow
TensorFlow Computation Graph for the XOR Network
Linear Regression in TensorFlow
Multi-class Classification with SoftMax Function Using Full-Batch Gradient Descent
Multi-class Classification with SoftMax Function Using Stochastic Gradient Descent
GPU
Summary
Chapter 3: Convolutional Neural Networks
Convolution Operation
Linear Time Invariant (LTI) / Linear Shift Invariant (LSI) Systems
Convolution for Signals in One Dimension
Analog and Digital Signals
2D and 3D Signals
2D Convolution
Two-dimensional Unit Step Function
2D Convolution of a Signal with an LSI System Unit Step Response
2D Convolution of an Image to Different LSI System Responses
Common Image-Processing Filters
Mean Filter
Median Filter
Gaussian Filter
Gradient-based Filters
Sobel Edge-Detection Filter
Identity Transform
Convolutional Neural Networks
Components of Convolutional Neural Networks
Input Layer
Convolution Layer
TensorFlow Usage
Pooling Layer
TensorFlow Usage
Backpropagation Through the Convolutional Layer
Backpropagation Through the Pooling Layers
Weight Sharing Through Convolution and Its Advantages
Translation Equivariance
Translation Invariance Due to Pooling
Dropout Layers and Regularization
Convolutional Neural Network for Digit Recognition on the MNIST Dataset
Convolutional Neural Network for Solving Real-World Problems
Batch Normalization
Different Architectures in Convolutional Neural Networks
LeNet
AlexNet
VGG16
ResNet
Transfer Learning
Guidelines for Using Transfer Learning
Transfer Learning with Google’s InceptionV3
Transfer Learning with Pre-trained VGG16
Summary
Chapter 4: Natural Language Processing Using Recurrent Neural Networks
Vector Space Model (VSM)
Vector Representation of Words
Word2Vec
Continuous Bag of Words (CBOW)
Continuous Bag of Words Implementation in TensorFlow
Skip-Gram Model for Word Embedding
Skip-gram Implementation in TensorFlow
Global Co-occurrence Statistics–based Word Vectors
GloVe
Word Analogy with Word Vectors
Introduction to Recurrent Neural Networks
Language Modeling
Predicting the Next Word in a Sentence Through RNN Versus Traditional Methods
Backpropagation Through Time (BPTT)
Vanishing and Exploding Gradient Problem in RNN
Solution to the Vanishing and Exploding Gradient Problem in RNNs
Gradient Clipping
Smart Initialization of the Memory-to-Memory Weight Connection Matrix and ReLU Units
Long Short-Term Memory (LSTM)
LSTM in Reducing Exploding- and Vanishing-Gradient Problems
MNIST Digit Identification in TensorFlow Using Recurrent Neural Networks
Next-Word Prediction and Sentence Completion in TensorFlow Using Recurrent Neural Networks
Gated Recurrent Unit (GRU)
Bidirectional RNN
Summary
Chapter 5: Unsupervised Learning with Restricted Boltzmann Machines and Auto-encoders
Boltzmann Distribution
Bayesian Inference: Likelihood, Priors, and Posterior Probability Distribution
Markov Chain Monte Carlo Methods for Sampling
Metropolis Algorithm
Restricted Boltzmann Machines
Training a Restricted Boltzmann Machine
Gibbs Sampling
Block Gibbs Sampling
Burn-in Period and Generating Samples in Gibbs Sampling
Using Gibbs Sampling in Restricted Boltzmann Machines
Contrastive Divergence
A Restricted Boltzmann Machine Implementation in TensorFlow
Collaborative Filtering Using Restricted Boltzmann Machines
Deep Belief Networks (DBNs)
Auto-encoders
Feature Learning Through Auto-encoders for Supervised Learning
Kullback-Leibler (KL) Divergence
Sparse Auto-encoders
Sparse Auto-encoder Implementation in TensorFlow
Denoising Auto-encoder
A Denoising Auto-encoder Implementation in TensorFlow
PCA and ZCA Whitening
Summary
Chapter 6: Advanced Neural Networks
Image Segmentation
Binary Thresholding Method Based on Histogram of Pixel Intensities
Otsu’s Method
Watershed Algorithm for Image Segmentation
Image Segmentation Using K-means Clustering
Semantic Segmentation
Sliding-Window Approach
Fully Convolutional Network (FCN)
Fully Convolutional Network with Downsampling and Upsampling
Unpooling
Max Unpooling
Transpose Convolution
U-Net
Semantic Segmentation in TensorFlow with Fully Convolutional Neural Networks
Image Classification and Localization Network
Object Detection
R-CNN
Fast and Faster R-CNN
Generative Adversarial Networks
Maximin and Minimax Problem
Zero-sum Game
Minimax and Saddle Points
GAN Cost Function and Training
Vanishing Gradient for the Generator
TensorFlow Implementation of a GAN Network
TensorFlow Models’ Deployment in Production
Summary
Index