"ImageNet Bundle: The complete deep learning for computer vision experience. In this bundle, I demonstrate how to train large-scale neural networks on the massive ImageNet dataset. You just can't beat this bundle if you want to master deep learning for computer vision." [found on the publisher's page].
Author(s): Adrian Rosebrock
Year: 2017
Language: English
Pages: 321
Tags: coding, programming, computer science, mathematics, logic, math, maths, AI, deep learning, artificial intelligence, machine learning
1 Introduction
2 Introduction
3 Training Networks Using Multiple GPUs
3.1 How Many GPUs Do I Need?
3.2 Performance Gains Using Multiple GPUs
3.3 Summary
4 What Is ImageNet?
4.1 The ImageNet Dataset
4.1.1 ILSVRC
4.2 Obtaining ImageNet
4.2.1 Requesting Access to the ILSVRC Challenge
4.2.2 Downloading Images Programmatically
4.2.3 Using External Services
4.2.4 ImageNet Development Kit
4.2.5 ImageNet Copyright Concerns
4.3 Summary
5 Preparing the ImageNet Dataset
5.1 Understanding the ImageNet File Structure
5.1.1 ImageNet “test” Directory
5.1.2 ImageNet “train” Directory
5.1.3 ImageNet “val” Directory
5.1.4 ImageNet “ImageSets” Directory
5.1.5 ImageNet “DevKit” Directory
5.2 Building the ImageNet Dataset
5.2.1 Your First ImageNet Configuration File
5.2.2 Our ImageNet Helper Utility
5.2.3 Creating List and Mean Files
5.2.4 Building the Compact Record Files
5.3 Summary
6 Training AlexNet on ImageNet
6.1 Implementing AlexNet
6.2 Training AlexNet
6.2.1 What About Training Plots?
6.2.2 Implementing the Training Script
6.3 Evaluating AlexNet
6.4 AlexNet Experiments
6.4.1 AlexNet: Experiment #1
6.4.2 AlexNet: Experiment #2
6.4.3 AlexNet: Experiment #3
6.5 Summary
7 Training VGGNet on ImageNet
7.1 Implementing VGGNet
7.2 Training VGGNet
7.3 Evaluating VGGNet
7.4 VGGNet Experiments
7.5 Summary
8 Training GoogLeNet on ImageNet
8.1 Understanding GoogLeNet
8.1.1 The Inception Module
8.1.2 GoogLeNet Architecture
8.1.3 Implementing GoogLeNet
8.1.4 Training GoogLeNet
8.2 Evaluating GoogLeNet
8.3 GoogLeNet Experiments
8.3.1 GoogLeNet: Experiment #1
8.3.2 GoogLeNet: Experiment #2
8.3.3 GoogLeNet: Experiment #3
8.4 Summary
9 Training ResNet on ImageNet
9.1 Understanding ResNet
9.2 Implementing ResNet
9.3 Training ResNet
9.4 Evaluating ResNet
9.5 ResNet Experiments
9.5.1 ResNet: Experiment #1
9.5.2 ResNet: Experiment #2
9.5.3 ResNet: Experiment #3
9.6 Summary
10 Training SqueezeNet on ImageNet
10.1 Understanding SqueezeNet
10.1.1 The Fire Module
10.1.2 SqueezeNet Architecture
10.1.3 Implementing SqueezeNet
10.2 Training SqueezeNet
10.3 Evaluating SqueezeNet
10.4 SqueezeNet Experiments
10.4.1 SqueezeNet: Experiment #1
10.4.2 SqueezeNet: Experiment #2
10.4.3 SqueezeNet: Experiment #3
10.4.4 SqueezeNet: Experiment #4
10.5 Summary
11 Case Study: Emotion Recognition
11.1 The Kaggle Facial Expression Recognition Challenge
11.1.1 The FER13 Dataset
11.1.2 Building the FER13 Dataset
11.2 Implementing a VGG-like Network
11.3 Training Our Facial Expression Recognizer
11.3.1 EmotionVGGNet: Experiment #1
11.3.2 EmotionVGGNet: Experiment #2
11.3.3 EmotionVGGNet: Experiment #3
11.3.4 EmotionVGGNet: Experiment #4
11.4 Evaluating our Facial Expression Recognizer
11.5 Emotion Detection in Real-time
11.6 Summary
12 Case Study: Correcting Image Orientation
12.1 The Indoor CVPR Dataset
12.1.1 Building the Dataset
12.2 Extracting Features
12.3 Training an Orientation Correction Classifier
12.4 Correcting Orientation
12.5 Summary
13 Case Study: Vehicle Identification
13.1 The Stanford Cars Dataset
13.1.1 Building the Stanford Cars Dataset
13.2 Fine-tuning VGG on the Stanford Cars Dataset
13.2.1 VGG Fine-tuning: Experiment #1
13.2.2 VGG Fine-tuning: Experiment #2
13.2.3 VGG Fine-tuning: Experiment #3
13.3 Evaluating our Vehicle Classifier
13.4 Visualizing Vehicle Classification Results
13.5 Summary
14 Case Study: Age and Gender Prediction
14.1 The Ethics of Gender Identification in Machine Learning
14.2 The Adience Dataset
14.2.1 Building the Adience Dataset
14.3 Implementing Our Network Architecture
14.4 Measuring “One-off” Accuracy
14.5 Training Our Age and Gender Predictor
14.6 Evaluating Age and Gender Prediction
14.7 Age and Gender Prediction Results
14.7.1 Age Results
14.7.2 Gender Results
14.8 Visualizing Results
14.8.1 Visualizing Results from Inside Adience
14.8.2 Understanding Face Alignment
14.8.3 Applying Age and Gender Prediction to Your Own Images
14.9 Summary
15 Faster R-CNNs
15.1 Object Detection and Deep Learning
15.1.1 Measuring Object Detector Performance
15.2 The (Faster) R-CNN Architecture
15.2.1 A Brief History of R-CNN
15.2.2 The Base Network
15.2.3 Anchors
15.2.4 Region Proposal Network (RPN)
15.2.5 Region of Interest (ROI) Pooling
15.2.6 Region-based Convolutional Neural Network
15.2.7 The Complete Training Pipeline
15.3 Summary
16 Training a Faster R-CNN From Scratch
16.1 The LISA Traffic Signs Dataset
16.2 Installing the TensorFlow Object Detection API
16.3 Training Your Faster R-CNN
16.3.1 Project Directory Structure
16.3.2 Configuration
16.3.3 A TensorFlow Annotation Class
16.3.4 Building the LISA + TensorFlow Dataset
16.3.5 A Critical Pre-Training Step
16.3.6 Configuring the Faster R-CNN
16.3.7 Training the Faster R-CNN
16.3.8 Suggestions When Working with the TFOD API
16.3.9 Exporting the Frozen Model Graph
16.3.10 Faster R-CNN on Images and Videos
16.4 Summary
17 Single Shot Detectors (SSDs)
17.1 Understanding Single Shot Detectors (SSDs)
17.1.1 Motivation
17.1.2 Architecture
17.1.3 MultiBox, Priors, and Fixed Priors
17.1.4 Training Methods
17.2 Summary
18 Training a SSD From Scratch
18.1 The Vehicle Dataset
18.2 Training Your SSD
18.2.1 Directory Structure and Configuration
18.2.2 Building the Vehicle Dataset
18.2.3 Training the SSD
18.2.4 SSD Results
18.2.5 Potential Problems and Limitations
18.3 Summary
19 Conclusions
19.1 Where to Now?