"It is like a voyage of discovery, seeking not for new territory but new knowledge. It should appeal to those with a good sense of adventure," Dr. Frederick Sanger.
I hope every reader enjoys this voyage in deep learning and finds their own adventure.
Think of deep learning as the art of cooking. One way to cook is to follow a recipe. But when we learn how the food, the spices, and the fire behave, we make our own creations. And an understanding of the "how" transcends the creation.
Likewise, an understanding of the "how" transcends deep learning recipes. In this spirit, this book presents the deep learning constructs, their fundamentals, and how they behave. Baseline models are developed alongside, and concepts to improve them are exemplified.
Author: Chitta Ranjan, Ph.D.
Publisher: Independently published
Year: 2023
Language: English
Pages: 428
Contents
Preface
Acknowledgment
Website
Introduction
Examples of Application
Rare Diseases
Fraud Detection
Network Intrusion Detection
Detecting Emergencies
Click vis-à-vis Churn Prediction
Failures in Manufacturing
A Working Example
Problem Motivation
Paper Manufacturing Process
Data Description
Machine Learning vs. Deep Learning
In This Book
Rare Event Prediction
Rare Event Problem
Underlying Statistical Process
Problem Definition
Objective
Loss Function
Accuracy Measures
Challenges
High-dimensional Multivariate Time Series
Early Prediction
Imbalanced Data
Setup
TensorFlow
Prerequisites
Install Python
Install Virtual Environment
TensorFlow 2.x Installation
Testing
Sheet Break Problem Dataset
Multi-layer Perceptrons
Background
Fundamentals of MLP
Initialization and Data Preparation
Imports and Loading Data
Data Pre-processing
Curve Shifting
Data Splitting
Feature Scaling
MLP Modeling
Sequential
Input Layer
Dense Layer
Output Layer
Model Summary
Model Compile
Model Fit
Results Visualization
Dropout
What is Co-Adaptation?
What is Dropout?
Dropout Layer
Class Weights
Activation
What are Vanishing and Exploding Gradients?
Cause Behind Vanishing and Exploding Gradients
Gradients and the Story of Activations
Self-normalization
SELU Activation
Novel Ideas Implementation
Activation Customization
Metrics Customization
Model Evaluation
Rules-of-thumb
Exercises
Long Short-Term Memory Networks
Background
Fundamentals of LSTM
Input to LSTM
LSTM Cell
State Mechanism
Cell Operations
Activations in LSTM
Parameters
Iteration Levels
Stabilized Gradient
LSTM Layer and Network Structure
Input Processing
Stateless versus Stateful
Return Sequences vs. Last Output
Initialization and Data Preparation
Imports and Data
Temporalizing the Data
Data Splitting
Scaling Temporalized Data
Baseline Model—A Restricted Stateless LSTM
Input Layer
LSTM Layer
Output Layer
Model Summary
Compile and Fit
Model Improvements
Unrestricted LSTM Network
Dropout and Recurrent Dropout
Go Backwards
Bi-directional
Longer Lookback/Timesteps
History of LSTMs
Summary
Rules-of-thumb
Exercises
Convolutional Neural Networks
Background
The Concept of Convolution
Convolution Properties
Parameter Sharing
Weak Filters
Equivariance to Translation
Pooling
Regularization via Invariance
Modulating between Equivariance and Invariance
Multi-channel Input
Kernels
Convolutional Variants
Padding
Stride
Dilation
1×1 Convolution
Convolutional Network
Structure
Conv1D, Conv2D, and Conv3D
Convolution Layer Output Size
Pooling Layer Output Size
Parameters
Multivariate Time Series Modeling
Convolution on Time Series
Imports and Data Preparation
Baseline
Learn Longer-term Dependencies
Multivariate Time Series Modeled as an Image
Conv1D and Conv2D Equivalence
Neighborhood Model
Summary Statistics for Pooling
Definitions
(Minimal) Sufficient Statistics
Complete Statistics
Ancillary Statistics
Pooling Discoveries
Reason Behind Max-Pool Superiority
Preserve Convolution Distribution
Maximum Likelihood Estimators for Pooling
Uniform Distribution
Normal Distribution
Gamma Distribution
Weibull Distribution
Advanced Pooling
Adaptive Distribution Selection
Complete Statistics for Exponential Family
Multivariate Distribution
History of Pooling
Rules-of-thumb
Exercises
Autoencoders
Background
Architectural Similarity between PCA and Autoencoder
Encoding—Projection to Lower Dimension
Decoding—Reconstruction to Original Dimension
Autoencoder Family
Undercomplete
Overcomplete
Denoising Autoencoder (DAE)
Contractive Autoencoder (CAE)
Sparse Autoencoder
Anomaly Detection with Autoencoders
Anomaly Detection Approach
Data Preparation
Model Fitting
Diagnostics
Inferencing
Feed-forward MLP on Sparse Encodings
Sparse Autoencoder Construction
MLP Classifier on Encodings
Temporal Autoencoder
LSTM Autoencoder
Convolutional Autoencoder
Autoencoder Customization
Well-posed Autoencoder
Model Construction
Orthonormal Weights
Sparse Covariance
Rules-of-thumb
Exercises
Appendices
Appendix Importance of Nonlinear Activation
Appendix Curve Shifting
Appendix Simple Plots
Appendix Backpropagation Gradients
Appendix Data Temporalization
Appendix Stateful LSTM
Appendix Null-Rectified Linear Unit
Appendix 1×1 Convolutional Network
Appendix CNN: Visualization for Interpretation
Appendix Multiple (Maximum and Range) Pooling Statistics in a Convolution Network
Appendix Convolutional Autoencoder-Classifier
Appendix Oversampling
SMOTE