Understanding Deep Learning

An authoritative, accessible, and up-to-date treatment of deep learning that strikes a pragmatic middle ground between theory and practice.

Deep learning is a fast-moving field with sweeping relevance in today's increasingly digital world. Understanding Deep Learning provides an authoritative, accessible, and up-to-date treatment of the subject, covering all the key topics along with recent advances and cutting-edge concepts. Many deep learning texts are crowded with technical details that obscure fundamentals, but Simon Prince ruthlessly curates only the most important ideas to provide a high density of critical information in an intuitive and digestible form. From machine learning basics to advanced models, each concept is presented in lay terms and then detailed precisely in mathematical form and illustrated visually. The result is a lucid, self-contained textbook suitable for anyone with a basic background in applied mathematics.

- Up-to-date treatment of deep learning covers cutting-edge topics not found in existing texts, such as transformers and diffusion models
- Short, focused chapters progress in complexity, easing students into difficult concepts
- Pragmatic approach straddling theory and practice gives readers the level of detail required to implement naive versions of models
- Streamlined presentation separates critical ideas from background context and extraneous detail
- Minimal mathematical prerequisites, extensive illustrations, and practice problems make challenging material widely accessible
- Programming exercises offered in accompanying Python notebooks

Author(s): Simon J.D. Prince
Publisher: The MIT Press
Year: 2023

Language: English
Pages: 541

Contents

Preface
Acknowledgements
Introduction
Supervised learning
Unsupervised learning
Reinforcement learning
Ethics
Structure of book
Other books
How to read this book
Supervised learning
Supervised learning overview
Linear regression example
Summary
Shallow neural networks
Neural network example
Universal approximation theorem
Multivariate inputs and outputs
Shallow neural networks: general case
Terminology
Summary
Deep neural networks
Composing neural networks
From composing networks to deep networks
Deep neural networks
Matrix notation
Shallow vs. deep neural networks
Summary
Loss functions
Maximum likelihood
Recipe for constructing loss functions
Example 1: univariate regression
Example 2: binary classification
Example 3: multiclass classification
Multiple outputs
Cross-entropy loss
Summary
Fitting models
Gradient descent
Stochastic gradient descent
Momentum
Adam
Training algorithm hyperparameters
Summary
Gradients and initialization
Problem definitions
Computing derivatives
Toy example
Backpropagation algorithm
Parameter initialization
Example training code
Summary
Measuring performance
Training a simple model
Sources of error
Reducing error
Double descent
Choosing hyperparameters
Summary
Regularization
Explicit regularization
Implicit regularization
Heuristics to improve performance
Summary
Convolutional networks
Invariance and equivariance
Convolutional networks for 1D inputs
Convolutional networks for 2D inputs
Downsampling and upsampling
Applications
Summary
Residual networks
Sequential processing
Residual connections and residual blocks
Exploding gradients in residual networks
Batch normalization
Common residual architectures
Why do nets with residual connections perform so well?
Summary
Transformers
Processing text data
Dot-product self-attention
Extensions to dot-product self-attention
Transformers
Transformers for natural language processing
Encoder model example: BERT
Decoder model example: GPT3
Encoder-decoder model example: machine translation
Transformers for long sequences
Transformers for images
Summary
Graph neural networks
What is a graph?
Graph representation
Graph neural networks, tasks, and loss functions
Graph convolutional networks
Example: graph classification
Inductive vs. transductive models
Example: node classification
Layers for graph convolutional networks
Edge graphs
Summary
Unsupervised learning
Taxonomy of unsupervised learning models
What makes a good generative model?
Quantifying performance
Summary
Generative adversarial networks
Discrimination as a signal
Improving stability
Progressive growing, minibatch discrimination, and truncation
Conditional generation
Image translation
StyleGAN
Summary
Normalizing flows
1D example
General case
Invertible network layers
Multi-scale flows
Applications
Summary
Variational autoencoders
Latent variable models
Nonlinear latent variable model
Training
ELBO properties
Variational approximation
The variational autoencoder
The reparameterization trick
Applications
Summary
Diffusion models
Overview
Encoder (forward process)
Decoder model (reverse process)
Training
Reparameterization of loss function
Implementation
Summary
Reinforcement learning
Markov decision processes, returns, and policies
Expected return
Tabular reinforcement learning
Fitted Q-learning
Policy gradient methods
Actor-critic methods
Offline reinforcement learning
Summary
Why does deep learning work?
The case against deep learning
Factors that influence fitting performance
Properties of loss functions
Factors that determine generalization
Do we need so many parameters?
Do networks have to be deep?
Summary
Deep learning and ethics
Value alignment
Intentional misuse
Other social, ethical, and professional issues
Case study
The value-free ideal of science
Responsible AI research as a collective action problem
Ways forward
Summary
Notation
Mathematics
Functions
Binomial coefficients
Vectors, matrices, and tensors
Special types of matrix
Matrix calculus
Probability
Random variables and probability distributions
Expectation
Normal probability distribution
Sampling
Distances between probability distributions
Bibliography
Index