Author(s): Marius Leordeanu
Series: Advances in Computer Vision and Pattern Recognition
Publisher: Springer
Year: 2020
Preface
Acknowledgements
Endorsements
Contents
1 Unsupervised Visual Learning: From Pixels to Seeing
1.1 What Does It Mean to See?
1.2 What Is Unsupervised Visual Learning?
1.3 Visual Learning in Space and Time
1.3.1 Current Trends in Unsupervised Learning
1.3.2 Relation to Gestalt Psychology
1.4 Principles of Unsupervised Learning
1.4.1 Object Versus Context
1.4.2 Learning with Highly Probable Positive Features
1.5 Unsupervised Learning for Graph Matching
1.5.1 Graph Matching: Problem Formulation
1.5.2 Spectral Graph Matching
1.5.3 Integer Projected Fixed Point Method for Graph Matching
1.5.4 Learning Graph Matching
1.5.5 Supervised Learning for Graph Matching
1.5.6 Unsupervised Learning for Graph Matching
1.6 Unsupervised Clustering Meets Classifier Learning
1.6.1 Integer Projected Fixed Point Method for Graph Clustering
1.6.2 Feature Selection as a Graph Clustering Problem
1.7 Unsupervised Learning for Object Segmentation in Video
1.8 Space-Time Graph
1.8.1 Optimization Algorithm
1.8.2 Learning Unsupervised Segmentation over Multiple Teacher-Student Generations
1.8.3 Concluding Remarks
1.9 Next Steps
References
2 Unsupervised Learning of Graph and Hypergraph Matching
2.1 Introduction
2.1.1 Relation to Principles of Unsupervised Learning
2.2 Graph Matching
2.3 Hypergraph Matching
2.4 Solving Graph Matching
2.4.1 Spectral Matching
2.4.2 Integer Projected Fixed Point Algorithm
2.5 Theoretical Analysis
2.6 Solving Hypergraph Matching
2.7 Learning Graph Matching
2.7.1 Theoretical Analysis
2.7.2 Supervised Learning for Graph Matching
2.7.3 Unsupervised and Semi-supervised Learning for Graph Matching
2.7.4 Learning Pairwise Conditional Random Fields
2.8 Learning Hypergraph Matching
2.9 Experiments on Graph Matching
2.9.1 Learning with Unlabeled Correspondences
2.9.2 Learning for Different Graph Matching Algorithms
2.9.3 Experiments on Conditional Random Fields
2.10 Experiments on Hypergraph Matching
2.10.1 Synthetic Data
2.10.2 Experiments on Real Images
2.10.3 Matching People
2.10.4 Supervised Versus Unsupervised Learning
2.11 Conclusions and Future Work
2.11.1 Efficient Optimization
2.11.2 Higher Order Relationships
References
3 Unsupervised Learning of Graph and Hypergraph Clustering
3.1 Introduction
3.2 Problem Formulation
3.3 Algorithm: IPFP for Hypergraph Clustering
3.4 Theoretical Analysis
3.4.1 Computational Complexity
3.5 Learning Graph and Hypergraph Clustering
3.6 Experiments on Third-Order Hypergraph Clustering
3.6.1 Line Clustering
3.6.2 Affine-Invariant Matching
3.7 Conclusions and Future Work
References
4 Feature Selection Meets Unsupervised Learning
4.1 Introduction
4.1.1 Relation to Principles of Unsupervised Learning
4.1.2 Why Labeling the Features and Not the Samples
4.2 Mathematical Formulation
4.2.1 Supervised Learning
4.2.2 Unsupervised Case: Labeling the Features Not the Samples
4.2.3 Intuition
4.3 Feature Selection and Learning by Clustering with IPFP
4.3.1 Theoretical Analysis
4.4 Experimental Analysis
4.4.1 Comparative Experiments
4.4.2 Additional Comparisons with SVM
4.5 The Effect of Limited Training Data
4.5.1 Estimating Feature Signs from Limited Data
4.5.2 Varying the Amount of Unsupervised Data
4.6 Intuition Regarding the Selected Features
4.6.1 Location Distribution of Selected Features
4.7 Concluding Remarks and Future Work
References
5 Unsupervised Learning of Object Segmentation in Video with Highly Probable Positive Features
5.1 From Simple Features to Unsupervised Segmentation in Video
5.2 A Simple Approach to Unsupervised Image Segmentation
5.2.1 A Fast Color Segmentation Algorithm
5.3 VideoPCA: Unsupervised Background Subtraction in Video
5.3.1 Soft Foreground Segmentation with VideoPCA
5.4 Unsupervised Segmentation in Video Using HPP Features
5.4.1 Learning with Highly Probable Positive Features
5.4.2 Descriptor Learning with IPFP
5.4.3 Combining Appearance and Motion
5.5 Experimental Analysis
5.5.1 Tests on YouTube-Objects Dataset
5.5.2 Tests on SegTrack V2 Dataset
5.5.3 Computation Time
5.6 Conclusions and Future Work
References
6 Coupling Appearance and Motion: Unsupervised Clustering for Object Segmentation Through Space and Time
6.1 Introduction
6.1.1 Relation to Principles of Unsupervised Learning
6.1.2 Scientific Context
6.2 Our Spectral Approach to Segmentation
6.2.1 Creating the Space-Time Graph
6.2.2 Segmentation as Spectral Clustering
6.2.3 Optimization by Power Iteration Method
6.3 Theoretical Properties
6.3.1 Convergence Analysis
6.3.2 Feature-Motion Matrix
6.4 Experimental Analysis
6.4.1 The Role of Segmentation Initialization
6.4.2 The Role of Node Features
6.4.3 The Role of Optical Flow
6.4.4 Complexity Analysis and Computational Cost
6.4.5 Results
6.5 Concluding Remarks
References
7 Unsupervised Learning in Space and Time over Several Generations of Teacher and Student Networks
7.1 Introduction
7.1.1 Relation to Unsupervised Learning Principles
7.2 Scientific Context
7.3 Learning over Multiple Teacher-Student Generations
7.4 Our Teacher-Student System Architecture
7.4.1 Student Pathway: Single-Image Segmentation
7.4.2 Teacher Pathway: Unsupervised Object Discovery
7.4.3 Unsupervised Soft Masks Selection
7.4.4 Implementation Pipeline
7.5 Experimental Analysis
7.5.1 Ablation Study
7.5.2 Tests on Foreground Segmentation
7.5.3 Tests on Transfer Learning
7.5.4 Concluding Remarks on Experiments
7.6 Concluding Discussion on Unsupervised Learning
7.7 Overall Conclusions and Future Work
7.7.1 Towards a Universal Visual Learning Machine
References
8 Unsupervised Learning Towards the Future
8.1 Introduction
8.2 Recurrent Graph Neural Networks in Space and Time
8.2.1 Scientific Context
8.2.2 Recurrent Space-Time Graph Model
8.2.3 Experiments: Learning Patterns of Movements and Shapes
8.2.4 Experiments: Learning Complex Human-Object Interactions
8.3 Putting Things Together
8.3.1 Agreements at the Geometric Level
8.3.2 Agreements at the Semantic Level
8.3.3 Agreements as Highly Probable Positive (HPP) Features
8.3.4 Motion Patterns as HPP Features
8.3.5 Learning over Multiple Teacher-Student Generations
8.3.6 Building Blocks of the Visual Story Network
8.4 The Dawn of the Visual Story Graph Neural Network
8.4.1 Classifiers Should Be Highly Interconnected
8.4.2 Relation to Adaptive Resonance Theory
8.4.3 Multiple Layers of Interpretation: Depth, Motion, and Meaning
8.4.4 Local Objects and Their Global Roles in the Story
8.4.5 Unsupervised Learning in the Visual Story Network
8.4.6 Learning Evolution over Multiple Generations
8.4.7 Learning New Categories
8.5 Visual Stories Towards Language and Beyond
8.5.1 Learning from Language
8.5.2 Unsupervised Learning by Surprise
8.5.3 Discover Itself
8.5.4 Dreams of Tomorrow
References
Index