Math for Deep Learning provides the essential math you need to understand deep learning discussions, explore more complex implementations, and make better use of deep learning toolkits. It teaches the essential mathematics that deep learning is built on.
You'll work through Python examples to learn key deep learning topics in probability, statistics, linear algebra, differential calculus, and matrix calculus, as well as how to implement data flow in a neural network, backpropagation, and gradient descent. You'll also use Python to work through the mathematics that underlies those algorithms and even build a fully functional neural network.
In addition, you'll find coverage of gradient descent, including the variations commonly used by the deep learning community: SGD, Adam, RMSprop, and Adagrad/Adadelta.
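To give a flavor of the material, here is a minimal sketch of plain gradient descent in NumPy, the simplest of the optimizers listed above. This is not code from the book; the quadratic objective, starting point, and learning rate are illustrative assumptions chosen so the loop visibly converges.

    import numpy as np

    # Minimal gradient descent sketch: minimize f(x, y) = x**2 + 2*y**2.
    # The gradient of f is (2x, 4y); each step moves against the gradient,
    # scaled by the learning rate eta.

    def grad(p):
        x, y = p
        return np.array([2.0 * x, 4.0 * y])

    eta = 0.1                    # learning rate (step size), chosen for illustration
    p = np.array([3.0, -2.0])    # arbitrary starting point

    for step in range(100):
        p = p - eta * grad(p)    # the basic gradient descent update

    print(p)                     # approaches the minimum at (0, 0)

The variants the book covers (SGD with momentum, RMSprop, Adagrad/Adadelta, Adam) all modify this single update line, for example by averaging past gradients or scaling the step size per parameter.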
Author: Ronald T. Kneusel
Edition: 1
Publisher: No Starch Press
Year: 2021
Language: English
Commentary: Vector PDF
Pages: 344
City: San Francisco, CA
Tags: Machine Learning; Neural Networks; Deep Learning; Python; Statistics; Gradient Descent; scikit-learn; NumPy; matplotlib; Linear Algebra; Probability Theory; Calculus; Backpropagation
Brief Contents
Contents in Detail
Foreword
Acknowledgments
Introduction
Who Is This Book For?
About This Book
Chapter 1: Setting the Stage
Installing the Toolkits
Linux
macOS
Windows
NumPy
Defining Arrays
Data Types
2D Arrays
Zeros and Ones
Advanced Indexing
Reading and Writing to Disk
SciPy
Matplotlib
Scikit-Learn
Summary
Chapter 2: Probability
Basic Concepts
Sample Space and Events
Random Variables
Humans Are Bad at Probability
The Rules of Probability
Probability of an Event
Sum Rule
Product Rule
Sum Rule Revisited
The Birthday Paradox
Conditional Probability
Total Probability
Joint and Marginal Probability
Joint Probability Tables
Chain Rule for Probability
Summary
Chapter 3: More Probability
Probability Distributions
Histograms and Probabilities
Discrete Probability Distributions
Continuous Probability Distributions
Central Limit Theorem
The Law of Large Numbers
Bayes' Theorem
Cancer or Not Redux
Updating the Prior
Bayes' Theorem in Machine Learning
Summary
Chapter 4: Statistics
Types of Data
Nominal Data
Ordinal Data
Interval Data
Ratio Data
Using Nominal Data in Deep Learning
Summary Statistics
Means and Median
Measures of Variation
Quantiles and Box Plots
Missing Data
Correlation
Pearson Correlation
Spearman Correlation
Hypothesis Testing
Hypotheses
The t-test
The Mann-Whitney U Test
Summary
Chapter 5: Linear Algebra
Scalars, Vectors, Matrices, and Tensors
Scalars
Vectors
Matrices
Tensors
Arithmetic with Tensors
Array Operations
Vector Operations
Matrix Multiplication
Kronecker Product
Summary
Chapter 6: More Linear Algebra
Square Matrices
Why Square Matrices?
Transpose, Trace, and Powers
Special Square Matrices
The Identity Matrix
Determinants
Inverses
Symmetric, Orthogonal, and Unitary Matrices
Definiteness of a Symmetric Matrix
Eigenvectors and Eigenvalues
Finding Eigenvalues and Eigenvectors
Vector Norms and Distance Metrics
L-Norms and Distance Metrics
Covariance Matrices
Mahalanobis Distance
Kullback-Leibler Divergence
Principal Component Analysis
Singular Value Decomposition and Pseudoinverse
SVD in Action
Two Applications
Summary
Chapter 7: Differential Calculus
Slope
Derivatives
A Formal Definition
Basic Rules
Rules for Trigonometric Functions
Rules for Exponentials and Logarithms
Minima and Maxima of Functions
Partial Derivatives
Mixed Partial Derivatives
The Chain Rule for Partial Derivatives
Gradients
Calculating the Gradient
Visualizing the Gradient
Summary
Chapter 8: Matrix Calculus
The Formulas
A Vector Function by a Scalar Argument
A Scalar Function by a Vector Argument
A Vector Function by a Vector
A Matrix Function by a Scalar
A Scalar Function by a Matrix
The Identities
A Scalar Function by a Vector
A Vector Function by a Scalar
A Vector Function by a Vector
A Scalar Function by a Matrix
Jacobians and Hessians
Concerning Jacobians
Concerning Hessians
Some Examples of Matrix Calculus Derivatives
Derivative of Element-Wise Operations
Derivative of the Activation Function
Summary
Chapter 9: Data Flow in Neural Networks
Representing Data
Traditional Neural Networks
Deep Convolutional Networks
Data Flow in Traditional Neural Networks
Data Flow in Convolutional Neural Networks
Convolution
Convolutional Layers
Pooling Layers
Fully Connected Layers
Data Flow Through a Convolutional Neural Network
Summary
Chapter 10: Backpropagation
What Is Backpropagation?
Backpropagation by Hand
Calculating the Partial Derivatives
Translating into Python
Training and Testing the Model
Backpropagation for Fully Connected Networks
Backpropagating the Error
Calculating Partial Derivatives of the Weights and Biases
A Python Implementation
Using the Implementation
Computational Graphs
Summary
Chapter 11: Gradient Descent
The Basic Idea
Gradient Descent in One Dimension
Gradient Descent in Two Dimensions
Stochastic Gradient Descent
Momentum
What Is Momentum?
Momentum in 1D
Momentum in 2D
Training Models with Momentum
Nesterov Momentum
Adaptive Gradient Descent
RMSprop
Adagrad and Adadelta
Adam
Some Thoughts About Optimizers
Summary
Epilogue
Appendix: Going Further
Probability and Statistics
Linear Algebra
Calculus
Deep Learning
Index