One of the grand challenges of artificial intelligence is to enable computers to interpret 3D scenes and objects from imagery. This book organizes and introduces major concepts in 3D scene and object representation and inference from still images, with a focus on recent efforts to fuse models of geometry and perspective with statistical machine learning. The book is organized into three sections: (1) Interpretation of Physical Space; (2) Recognition of 3D Objects; and (3) Integrated 3D Scene Interpretation. The first section discusses representations of spatial layout and techniques to interpret physical scenes from images. The second section introduces representations for 3D object categories that account for the intrinsically 3D nature of objects and provide robustness to changes in viewpoint. The third section discusses strategies to unite inference of scene geometry and object pose and identity into a coherent scene interpretation. Each section broadly surveys important ideas from cognitive science and artificial intelligence research, organizes and discusses key concepts and techniques from recent work in computer vision, and describes a few sample approaches in detail. Newcomers to computer vision will benefit from introductions to basic concepts, such as single-view geometry and image classification, while experts and novices alike may find inspiration from the book's organization and discussion of the most recent ideas in 3D scene understanding and 3D object recognition. Specific topics include: mathematics of perspective geometry; visual elements of the physical scene; structural 3D scene representations; techniques and features for image and region categorization; historical perspective, computational models, and datasets and machine learning techniques for 3D object recognition; inference of geometrical attributes of objects, such as size and pose; and probabilistic and feature-passing approaches for contextual reasoning about 3D objects and scenes.
Table of Contents: Background on 3D Scene Models / Single-view Geometry / Modeling the Physical Scene / Categorizing Images and Regions / Examples of 3D Scene Interpretation / Background on 3D Recognition / Modeling 3D Objects / Recognizing and Understanding 3D Objects / Examples of 2D 1/2 Layout Models / Reasoning about Objects and Scenes / Cascades of Classifiers / Conclusion and Future Directions
Author(s): Derek Hoiem, Silvio Savarese
Series: Synthesis Lectures on Artificial Intelligence and Machine Learning
Edition: 1
Publisher: Morgan & Claypool Publishers
Year: 2011
Language: English
Pages: 171
Tags: Computer Science and Engineering;Media Data Processing;Image Processing;
Preface
Acknowledgments
Figure Credits
Interpretation of Physical Space from an Image
Theories of Vision
Depth and Surface Perception
A Well-Organized Scene
Early Computer Vision and AI
Modern Computer Vision
Perspective Projection with Pinhole Camera: 3D to 2D
3D Measurement from a 2D Image
Automatic Estimation of Vanishing Points
Summary of Key Concepts
Elements
Representations of Scene Space
Retinotopic Maps
Highly Structured 3D Models
Loosely Structured Models: 3D Point Clouds and Meshes
Summary
Overview of Image Labeling
Creating Regions
Classifiers
Datasets
Color
Texture
Gradient-based
Region Shape
Summary
Intuition
Approach to Estimate Surface Layout
3D Reconstruction using the Surface Layout
Make3D: Depth from an Image
Predicting Depth and Orientation
Local Constraints and Priors
The Room as a Box
Algorithm
Results
Summary
Recognition of 3D Objects from an Image
The Geon Theory
2D-view specific templates
Aspect graphs
Early Computational Models
Overview
Single instance 3D category models
Single instance 2D view-template models
Single instance 3D models
Mixture of Single-View Models
2-1/2D Layout Models
2-1/2D Layout by ISM models
2-1/2D Layout by view-invariant parts
3D Layout Models
3D Layout Models constructed upon 3D prototypes
3D Layout Models without 3D prototypes
Datasets
Supervision and Initialization
Modeling, Learning and Inference Strategies
Linkage Structure of Canonical Parts
View-morphing models
Learning the model
Detection and viewpoint classification
Results
Conclusions
Integrated 3D Scene Interpretation
Object Size
Appearance Features
Interaction Between Objects and Scene via Object Scale and Pose
Occlusion
Summary
Intrinsic Image Representation
Contextual Interactions
Experiments
Feedback-Enabled Cascaded Classification Models
Algorithm
Experiments
Summary
Conclusion and Future Directions
Bibliography
Authors' Biographies