Discover the essential building blocks of the most common forms of deep belief networks. At each step this book provides intuitive motivation and a summary of the most important equations relevant to the topic, then concludes with heavily commented code for threaded computation on modern CPUs as well as massively parallel processing on computers with CUDA-capable video display cards.
The first of three volumes in a series on C++ and CUDA C deep learning and belief nets, Deep Belief Nets in C++ and CUDA C: Volume 1 shows how the structure of these elegant models is much closer to that of human brains than that of traditional neural networks; they have a thought process capable of learning abstract concepts built from simpler primitives. As a result, a typical deep belief net can learn to recognize complex patterns by optimizing millions of parameters, yet remain resistant to overfitting.
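To make that layered structure concrete, the following is a minimal, purely illustrative C++ sketch, not code from the book or its download, of a small stack of fully connected layers in which each layer computes more abstract features from the outputs of the layer below it. The Layer class, the logistic activation, and the layer sizes are assumptions chosen only for this example.

// Illustrative sketch only: a tiny stack of fully connected layers,
// each building its activations from the layer below.
#include <cstdlib>
#include <cmath>
#include <cstdio>
#include <vector>

// One fully connected layer: activations = logistic(weights * inputs + bias).
struct Layer {
    int n_in, n_out;
    std::vector<double> weights;  // n_out x n_in, row-major
    std::vector<double> bias;     // n_out

    Layer(int nin, int nout) : n_in(nin), n_out(nout),
        weights(nout * nin), bias(nout) {
        for (double &w : weights)               // small random initial weights
            w = 0.1 * ((double) std::rand() / RAND_MAX - 0.5);
    }

    std::vector<double> activate(const std::vector<double> &in) const {
        std::vector<double> out(n_out);
        for (int i = 0; i < n_out; i++) {
            double sum = bias[i];
            for (int j = 0; j < n_in; j++)
                sum += weights[i * n_in + j] * in[j];
            out[i] = 1.0 / (1.0 + std::exp(-sum));   // logistic activation
        }
        return out;
    }
};

int main() {
    // A small "deep" stack: 8 inputs -> 6 -> 4 -> 2 increasingly abstract features.
    std::vector<Layer> net = { Layer(8, 6), Layer(6, 4), Layer(4, 2) };

    std::vector<double> activation(8, 0.5);     // one dummy input case
    for (const Layer &layer : net)              // feed forward, layer by layer
        activation = layer.activate(activation);

    printf("Top-layer activations: %f %f\n", activation[0], activation[1]);
    return 0;
}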
All the routines and algorithms presented in the book are available in the code download, which also contains some libraries of related routines.
What You Will Learn
Employ deep learning using C++ and CUDA C
Work with supervised feedforward networks
Implement restricted Boltzmann machines (see the sketch after this list)
Use generative sampling
Discover why these techniques are important
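As a taste of the restricted Boltzmann machine material, here is a minimal, purely illustrative sketch of one contrastive divergence (CD-1) update for a binary RBM. It is not the book's implementation; the variable names, learning rate, and toy sizes are assumptions made only for this example. The visible-hidden-visible reconstruction pass shown here is the same step that generative sampling extends into a longer Gibbs chain.

// Illustrative sketch only: one CD-1 update for a tiny binary RBM.
#include <cstdlib>
#include <cmath>
#include <cstdio>
#include <vector>

static double logistic(double x) { return 1.0 / (1.0 + std::exp(-x)); }
static double uniform() { return (double) std::rand() / RAND_MAX; }

int main() {
    const int n_vis = 6, n_hid = 3;
    const double rate = 0.05;                         // learning rate (assumed)

    std::vector<double> w(n_hid * n_vis), b(n_vis, 0.0), c(n_hid, 0.0);
    for (double &wij : w) wij = 0.1 * (uniform() - 0.5);

    std::vector<double> v0 = {1.0, 0.0, 1.0, 1.0, 0.0, 1.0};  // one binary training case

    // Positive phase: hidden probabilities given the data.
    std::vector<double> h0(n_hid);
    for (int j = 0; j < n_hid; j++) {
        double sum = c[j];
        for (int i = 0; i < n_vis; i++) sum += w[j * n_vis + i] * v0[i];
        h0[j] = logistic(sum);
    }

    // Sample hidden states, then reconstruct visible probabilities.
    std::vector<double> h_samp(n_hid), v1(n_vis), h1(n_hid);
    for (int j = 0; j < n_hid; j++) h_samp[j] = (uniform() < h0[j]) ? 1.0 : 0.0;
    for (int i = 0; i < n_vis; i++) {
        double sum = b[i];
        for (int j = 0; j < n_hid; j++) sum += w[j * n_vis + i] * h_samp[j];
        v1[i] = logistic(sum);
    }

    // Negative phase: hidden probabilities given the reconstruction.
    for (int j = 0; j < n_hid; j++) {
        double sum = c[j];
        for (int i = 0; i < n_vis; i++) sum += w[j * n_vis + i] * v1[i];
        h1[j] = logistic(sum);
    }

    // CD-1 updates: data statistics minus reconstruction statistics.
    for (int j = 0; j < n_hid; j++)
        for (int i = 0; i < n_vis; i++)
            w[j * n_vis + i] += rate * (v0[i] * h0[j] - v1[i] * h1[j]);
    for (int i = 0; i < n_vis; i++) b[i] += rate * (v0[i] - v1[i]);
    for (int j = 0; j < n_hid; j++) c[j] += rate * (h0[j] - h1[j]);

    // Reconstruction error for this case; it tends to shrink as training proceeds.
    double err = 0.0;
    for (int i = 0; i < n_vis; i++) err += (v0[i] - v1[i]) * (v0[i] - v1[i]);
    printf("Reconstruction error after one CD-1 step: %f\n", err);
    return 0;
}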
Who This Book Is For
Readers with at least a basic knowledge of neural networks and some prior programming experience; familiarity with C++ and CUDA C is recommended.
Author(s): Timothy Masters
Series: Deep Belief Nets
Edition: 1
Publisher: Apress
Year: 2018
Table of Contents
About the Author
About the Technical Reviewer
Chapter 1: Introduction
Review of Multiple-Layer Feedforward Networks
What Are Deep Belief Nets, and Why Do We Like Them?
Chapter 2: Supervised Feedforward Networks
Backpropagation of Errors
SoftMax Outputs for Classification
Code for Gradient Calculation
Weight Penalties
Multithreading Gradient Computation
Gradient Computation with CUDA
Basic Architecture
A Simple Example
Initialization
Hidden Neuron Activation
Output Neuron Activation
SoftMax Output
Output Delta
Output Gradient
Gradient of the First Hidden Layer
Gradient of Subsequent Hidden Layers
Fetching the Gradient
Mean Squared Error by Reduction
Log Likelihood by Reduction
Putting It All Together
Basic Training Algorithms
Simulated Annealing for Starting Weights
Singular Value Decomposition for Optimal Output Weights
Stochastic Gradient Descent
Conjugate Gradient Optimization
Chapter 3: Restricted Boltzmann Machines
What Is a Restricted Boltzmann Machine?
Reconstruction Error
Maximum Likelihood Training, Sort Of
Contrastive Divergence
Weight Penalties
Encouraging Sparsity
Finding Initial Weights
Hidden Neuron Bias
Visible Neuron Bias
Code for Reconstruction Error
Multithreading Initial Weight Selection
Stochastic Gradient Descent Basic Principles
The Core Algorithm
Dividing Epochs into Batches
Shuffling Epochs
Updating the Learning Rate and Momentum
Determining Convergence
Code for Multithreaded RBM Training
CUDA Code for RBM Training
Initialization and Cache Line Matching
Fetching Training Cases
Visible-to-Hidden Layer
Hidden-to-Visible Layer
Gradient Length and Dot Product by Reduction
Updating the Input Bias
Updating the Hidden Neuron Bias
Updating the Weights
Putting It All Together
Timing
Updating Weights Analysis
Visible-to-Hidden Analysis
Hidden-to-Visible Analysis
Advanced Training and Future Versions
Chapter 4: Greedy Training
Generative Sampling
Chapter 5: DEEP Operating Manual
Menu Options
File Menu Options
Test Menu Options
Display Menu Options
The “Read a database” Option
The “Read MNIST image” Option
The “Read MNIST labels” Option
The “Write activation file” Option
The “Clear all data” Option
Model Architecture
Database Inputs and Targets
RBM Training Params
Supervised Training Params
Train
Test
Analyze
Receptive Field
Generative Sample
The DEEP.LOG File
Index