Synthetic Data for Deep Learning

This is the first book on synthetic data for deep learning, and its breadth of coverage may make it the default reference on the subject for years to come. The book can also serve as an introduction to several other important subfields of machine learning that are seldom touched upon in other books. Machine learning as a discipline would not be possible without optimization, and the book covers the necessary optimization background, though the discussion centers on an increasingly popular tool for training deep learning models: synthetic data. The field of synthetic data is expected to undergo exponential growth in the near future, and this book serves as a comprehensive survey of it.

In the simplest case, synthetic data refers to computer-generated graphics used to train computer vision models, but there are many more facets to consider. In the section on basic computer vision, the book discusses fundamental computer vision problems, both low-level (e.g., optical flow estimation) and high-level (e.g., object detection and semantic segmentation). It then covers synthetic environments and datasets for outdoor and urban scenes (autonomous driving), indoor scenes (indoor navigation), and aerial navigation, as well as simulation environments for robotics. Additionally, it touches upon applications of synthetic data outside computer vision (in neural programming, bioinformatics, NLP, and more). It also surveys work on improving synthetic data development and on alternative ways to produce synthetic data, such as GANs.

The book introduces and reviews several different approaches to synthetic data in various domains of machine learning, most notably domain adaptation, for making synthetic data more realistic and/or adapting models to be trained on synthetic data, and differential privacy, for generating synthetic data with privacy guarantees. This discussion is accompanied by introductions to generative adversarial networks (GANs) and to differential privacy.

Author(s): Sergey I. Nikolenko
Series: Springer Optimization and Its Applications, 174
Publisher: Springer
Year: 2021

Language: English
Pages: 360
City: Cham

Preface
Contents
Acronyms
1 Introduction: The Data Problem
1.1 Are Machine Learning Models Hitting a Wall?
1.2 One-Shot Learning and Beyond: Less Data for More Classes
1.3 Weakly Supervised Training: Trading Labels for Computation
1.4 Machine Learning Without Data: Leaving Moore's Law in the Dust
1.5 Why Synthetic Data?
1.6 The Plan
2 Deep Learning and Optimization
2.1 The Deep Learning Revolution
2.2 A (Very) Brief Introduction to Machine Learning
2.3 Introduction to Deep Learning
2.4 First-Order Optimization in Deep Learning
2.5 Adaptive Gradient Descent Algorithms
2.6 Conclusion
3 Deep Neural Networks for Computer Vision
3.1 Computer Vision and Convolutional Neural Networks
3.2 Modern Convolutional Architectures
3.3 Case Study: Neural Architectures for Object Detection
3.4 Data Augmentations: The First Step to Synthetic Data
3.5 Conclusion
4 Generative Models in Deep Learning
4.1 Introduction to Generative Models
4.2 Taxonomy of Generative Models in Deep Learning and Tractable …
4.3 Approximate Explicit Density Models: VAE
4.4 Generative Adversarial Networks
4.5 Loss Functions in GANs
4.6 GAN-Based Architectures
4.7 Case Study: GAN-Based Style Transfer
4.8 Conclusion
5 The Early Days of Synthetic Data
5.1 Line Drawings: The First Steps of Computer Vision
5.2 Synthetic Data as a Testbed for Quantitative Comparisons
5.3 ALVINN: A Self-Driving Neural Network in 1989
5.4 Early Simulation Environments: Robots and the Critique of Simulation
5.5 Case Study: MOBOT and The Problems of Simulation
5.6 Conclusion
6 Synthetic Data for Basic Computer Vision Problems
6.1 Introduction
6.2 Low-Level Computer Vision
6.3 Datasets of Basic Objects
6.4 Case Study: Object Detection With Synthetic Data
6.5 Other High-Level Computer Vision Problems
6.6 Synthetic People
6.7 Other Vision-Related Tasks: OCR and Visual Reasoning
6.8 Conclusion
7 Synthetic Simulated Environments
7.1 Introduction
7.2 Urban and Outdoor Environments: Learning to Drive
7.3 Datasets and Simulators of Indoor Scenes
7.4 Robotic Simulators
7.5 Vision-Based Applications in Unmanned Aerial Vehicles
7.6 Computer Games as Virtual Environments
7.7 Conclusion
8 Synthetic Data Outside Computer Vision
8.1 Synthetic System Logs for Fraud and Intrusion Detection
8.2 Synthetic Data for Neural Programming
8.3 Synthetic Data in Bioinformatics
8.4 Synthetic Data in Natural Language Processing
8.5 Conclusion
9 Directions in Synthetic Data Development
9.1 Domain Randomization
9.2 Improving CGI-Based Generation
9.3 Compositing Real Data to Produce Synthetic Datasets
9.4 Synthetic Data Produced by Generative Models
10 Synthetic-to-Real Domain Adaptation and Refinement
10.1 Synthetic-to-Real Domain Adaptation and Refinement
10.2 Case Study: GAN-Based Refinement for Gaze Estimation
10.3 Refining Synthetic Data with GANs
10.4 Making Synthetic Data from Real with GANs
10.5 Domain Adaptation at the Feature/Model Level
10.6 Domain Adaptation for Control and Robotics
10.7 Case Study: GAN-Based Domain Adaptation for Medical Imaging
10.8 Conclusion
11 Privacy Guarantees in Synthetic Data
11.1 Why is Privacy Important?
11.2 Introduction to Differential Privacy
11.3 Differential Privacy in Deep Learning
11.4 Differential Privacy Guarantees for Synthetic Data Generation
11.5 Case Study: Synthetic Data in Economics, Healthcare, and Social Sciences
11.6 Conclusion
12 Promising Directions for Future Work
12.1 Procedural Generation of Synthetic Data
12.2 From Domain Randomization to the Generation Feedback Loop
12.3 Improving Domain Adaptation with Domain Knowledge
12.4 Additional Modalities for Domain Adaptation Architectures
12.5 Conclusion
Appendix References