Pattern Recognition: 5th Asian Conference, ACPR 2019, Auckland, New Zealand, November 26–29, 2019, Revised Selected Papers, Part I (Lecture Notes in Computer Science, 12046)

This two-volume set constitutes the proceedings of the 5th Asian Conference on Pattern Recognition, ACPR 2019, held in Auckland, New Zealand, in November 2019.
The 9 full papers presented in this volume were carefully reviewed and selected from 14 submissions. They cover topics such as: classification; action and video and motion; object detection and anomaly detection; segmentation, grouping and shape; face and body and biometrics; adversarial learning and networks; computational photography; learning theory and optimization; applications, medical and robotics; computer vision and robot vision; pattern recognition and machine learning; multi-media and signal processing; and interaction.

Author(s): Shivakumara Palaiahnakote (editor), Gabriella Sanniti di Baja (editor), Liang Wang (editor), Wei Qi Yan (editor)
Publisher: Springer
Year: 2020

Language: English
Pages: 956

Preface
Organization
Contents – Part I
Contents – Part II
Classification
Integrating Domain Knowledge: Using Hierarchies to Improve Deep Classifiers
1 Introduction
2 Related Work
3 Method
3.1 Class Hierarchy
3.2 Probabilistic Model
3.3 Inference
3.4 Training
4 Experiments
4.1 Experimental Setup
4.2 Overall Improvement—ImageNet
4.3 Speedup—CIFAR-100
4.4 Fine-Grained Recognition—NABirds
4.5 Overview and Discussion
5 Conclusion
References
Label-Smooth Learning for Fine-Grained Visual Categorization
1 Introduction
2 Related Work
2.1 Fine-Grained Visual Categorization
2.2 Label-Smooth Learning
3 Approach
3.1 Label-Smoothing Learning
3.2 Cross-Category Cross-Semantic Regularization
4 Experiments
4.1 Fine-Grained Visual Categorization (FGVC) Datasets
4.2 Implementation
4.3 Ablation Studies
4.4 Comparison with State-of-the-Art
5 Conclusion
References
ForestNet – Automatic Design of Sparse Multilayer Perceptron Network Architectures Using Ensembles of Randomized Trees
1 Introduction
2 Related Work
3 Model to Build a Sparse Multilayer Perceptron Network
3.1 Building the Ensemble of Randomized Trees
3.2 Defining the SMLP Network Architecture
3.3 Initialization of the Weights
4 Experiment Settings
4.1 Hyperparameter Settings for Tree Ensembles
4.2 Hyperparameter Settings for Networks
5 Results and Discussion
5.1 Classification Accuracy
5.2 Amount of Connections
5.3 Amount of Hidden Neurons
5.4 Training and Validation Losses
5.5 Discussion
6 Conclusion and Future Work
References
Clustering-Based Adaptive Dropout for CNN-Based Classification
1 Introduction
2 The Proposed Algorithm
2.1 Clustering-Based Dropout with Adaptive Probability
2.2 Clustering Algorithm and Network Configuration
2.3 Implementation Details
3 Experimental Results
4 Conclusion
References
Action and Video and Motion
Real-Time Detection and Tracking Using Hybrid DNNs and Space-Aware Color Feature: From Algorithm to System
1 Introduction
2 Related Work
2.1 Multiple Object Tracking
2.2 Color Feature
3 Whole Framework
4 Detection
5 Matching and Tracking
5.1 Similarity Sorting-Based Matching
5.2 Space-Aware Color-Based Bounding Box Refinement
5.3 Retire-Recall Mechanism
5.4 Whole Matching and Tracking Algorithm
6 Experiment Evaluation
6.1 Overall Performance
6.2 Technique Effect
6.3 Platform-Specific Exploration
6.4 Discussion
7 Conclusion
References
Continuous Motion Numeral Recognition Using RNN Architecture in Air-Writing Environment
1 Introduction
2 Related Works
3 Proposed Methodology
3.1 Motion Tracking Method
3.2 Numeral Recognition
4 Results and Discussion
4.1 Data Collection
4.2 Experimental Setup
4.3 Result
4.4 Error Analysis
5 Conclusion
References
Using Motion Compensation and Matrix Completion Algorithm to Remove Rain Streaks and Snow for Video Sequences
1 Introduction
2 Challenges
3 The Proposed Method
3.1 Rain Streaks Detection
3.2 Rain Map Refinement
3.3 Rain Streaks Reconstruction
4 Experimental Results
4.1 Synthetic Video Sequences
4.2 Real Video Sequences
4.3 Challenge Tasks
5 Conclusion
References
Assessing the Impact of Video Compression on Background Subtraction
1 Introduction
2 Related Work
3 Problem Formulation
4 Study Design
4.1 Background Subtraction
4.2 Dataset
4.3 Quality Metrics for Videos
4.4 Performance Metric for Background Subtraction Algorithms
5 Results
5.1 Relation Between Percentage Drop in Quality and Performance
5.2 Identify the Encoding Parameters to Employ in Surveillance Scenarios
5.3 Predicting Performance Based on Video Quality
6 Conclusion
References
Object Detection and Anomaly Detection
Learning Motion Regularity for Temporal Video Segmentation and Anomaly Detection
1 Introduction
2 Related Work
3 Anomaly Detection Method
3.1 Temporal Video Segmentation
3.2 Anomaly Detection
4 Experiments
4.1 Dataset
4.2 Implementation Details
4.3 Results
4.4 Analysis of the Impact of Temporal Video Segmentation
5 Conclusion
References
A New Forged Handwriting Detection Method Based on Fourier Spectral Density and Variation
1 Introduction
2 Related Work
3 Proposed Method
3.1 Fourier Spectral Distributions
3.2 Spectral Density and Variance Based Features for Forged Handwriting Classification
4 Experimental Results
4.1 Experiments on Forged Handwriting Word Detection
4.2 Experiments on Forged Caption Text Detection in Video Images
4.3 Experiments on Forged IMEI Number Detection
5 Conclusion and Future Work
References
Robust Pedestrian Detection: Faster Deployments with Fusion of Models
1 Introduction
2 Related Works
3 Methods
3.1 State-of-the-Art Object Detectors
3.2 Non-maxima Suppression
3.3 Detectors Divergences
3.4 Missed Detections
3.5 Proposed Architecture for Fusion of Models
4 Experiments and Results
4.1 Datasets
4.2 Metrics
5 Conclusion
References
Perceptual Image Anomaly Detection
1 Introduction
2 Related Work
3 Perceptual Image Anomaly Detection
3.1 Relative-perceptual-L1 Loss
3.2 Training Objective
3.3 Gradient-Normalizing Weight Policy
4 Experiments
4.1 Results
4.2 Ablation Study
5 Conclusion
References
Segmentation, Grouping and Shape
Deep Similarity Fusion Networks for One-Shot Semantic Segmentation
1 Introduction
2 Related Work
3 Problem Setup
4 Method
4.1 Dataset and Metric
4.2 Deep Similarity Fusion Network
4.3 Similarity Feature Fusion Module
5 Experiments
5.1 Implement Details
5.2 Results
5.3 Ablation Study
6 Conclusion
References
Seeing Things in Random-Dot Videos
1 Introduction
2 Signal Modelling
3 Literature Review
4 The A Contrario Framework
5 A Contrario on Random-Dot Videos
5.1 A Merging Strategy for Temporal Integration
5.2 Designing the A Contrario Algorithm
6 A Priori Analysis of A Contrario Performance
7 Empirical Results of the A Contrario Algorithm
7.1 Static Edge Case
7.2 Dynamic Edge Case
8 Human Performance Versus the A Contrario Process
8.1 Evaluating the Visual Angle
8.2 Evaluating the Time Integration
9 Concluding Remarks
References
Boundary Extraction of Planar Segments from Clouds of Unorganised Points
1 Introduction
2 Hough Voting Analysis
2.1 Hough Space
2.2 Voting Analysis
3 Plane Boundary Extraction
3.1 Top and Bottom Boundary Detection
3.2 Left and Right Boundary Detection
4 Experimental Results
4.1 Test on Synthetic Point Clouds
4.2 Test on Real-World Point Clouds
5 Conclusions
References
Real-Time Multi-class Instance Segmentation with One-Time Deep Embedding Clustering
1 Introduction
2 Related Works
3 One-Time Clustering Method
3.1 Encoder-Decoder Network
3.2 Loss Function
3.3 One-Time Clustering Algorithm
4 Experiments
4.1 Cityscapes Dataset
4.2 Evaluation Metrics
4.3 Training Environment Setup
4.4 Result and Discussion
5 Conclusion
References
Face and Body and Biometrics
Multi-person Pose Estimation with Mid-Points for Human Detection under Real-World Surveillance
1 Introduction
2 Methodology
2.1 Generating Body Region Points and Mid-Points
2.2 Generating Core-of-Pose
2.3 Generating Pose Vectors and Human Bounding Boxes
3 Evaluation
4 Discussion
4.1 Associating Parts Rather Than Detecting the Whole Target
4.2 Training Mid-Points
4.3 Triangles in Core-of-Pose
5 Conclusion and Future Work
References
Gaze from Head: Gaze Estimation Without Observing Eye
1 Introduction
2 Eye-Head Coordination
3 Methods
4 Datasets
4.1 Real Dataset
4.2 VR Dataset
5 Experiment
5.1 Qualitative Evaluation
5.2 Quantitative Evaluation
5.3 Compatibility of the Real and VR Datasets
6 Conclusion
References
Interaction Recognition Through Body Parts Relation Reasoning
1 Introduction
2 Related Work
2.1 Action Recognition
2.2 Human Interaction Recognition
2.3 Relational Network
3 Relational Network Overview
4 Interaction Relational Network
5 Experiments
5.1 Datasets
5.2 Implementation Details
5.3 Experimental Results
6 Conclusion
References
DeepHuMS: Deep Human Motion Signature for 3D Skeletal Sequences
1 Introduction
2 Related Work
3 DeepHuMS: Our Method
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Evaluation Metrics
4.4 Comparison with State of the Art
4.5 Discussion
4.6 Limitations
5 Conclusion
References
Adversarial Learning and Networks
Towards a Universal Appearance for Domain Generalization via Adversarial Learning
1 Introduction
2 Related Work
3 Proposed Method
3.1 Problem Statement and Motivations
3.2 Network Architecture
3.3 Appearance Generalizer
3.4 Adversarial Learning
4 Experiment
4.1 Evaluation of Appearance Generalizer
4.2 Comparison with Existing Methods
4.3 Ablation Study
5 Conclusion
References
Pre-trained and Shared Encoder in Cycle-Consistent Adversarial Networks to Improve Image Quality
1 Introduction
2 Related Work
2.1 Cycle-Consistent Adversarial Networks
2.2 Super-Resolution
3 Proposed Method
3.1 Encoder-Decoder Block
3.2 Partly Frozen CycleGAN
4 Experiments
4.1 Training Setting
4.2 Training Epoch of ED-Block
4.3 Results
4.4 Evaluation and Discussion
5 Conclusions
References
MobileGAN: Compact Network Architecture for Generative Adversarial Network
1 Introduction
2 Related Work
2.1 MobileNetV1: Separable Convolution
2.2 MobileNetV2: Inverted Residuals
3 Proposed Method
3.1 Compact Generator Network
3.2 Asymmetric GAN Architecture
4 Experiments
5 Conclusions
References
Denoising and Inpainting of Sea Surface Temperature Image with Adversarial Physical Model Loss
1 Introduction
2 Related Work
2.1 Data Assimilation
2.2 Image Restoration by Inpainting
3 Our Method
3.1 Loss Functions
3.2 Using Spatio-Temporal SST Images
3.3 Generator Function
3.4 Network Architecture
4 Experiments
5 Conclusion
References
Computational Photography
Confidence Map Based 3D Cost Aggregation with Multiple Minimum Spanning Trees for Stereo Matching
1 Introduction
2 Framework
3 Proposed Method
3.1 3D Cost Aggregation with Forest Structure
3.2 Reliable Disparity Extraction with Distinctiveness Analysis
3.3 Anchored Rolling Refinement
4 Experiments
4.1 Parameter Setting
4.2 Evaluation Based on Middlebury 3.0 Benchmark
4.3 Parameter Analysis
5 Conclusion
References
Optical Flow Assisted Monocular Visual Odometry
1 Introduction
2 Related Work
2.1 Supervised Approaches
2.2 Unsupervised Approaches
3 Methodology
3.1 Overview
3.2 Flow Prediction Part
3.3 Motion Estimation Part
3.4 Loss Function
4 Experiments
4.1 Datasets
4.2 Training Setup
4.3 How Our Flow Prediction Module Helps VO Learning
4.4 VO Performances
5 Conclusions
References
Coarse-to-Fine Deep Orientation Estimator for Local Image Matching
1 Introduction
2 Related Work
2.1 Hand-Crafted Keypoint Detectors
2.2 CNN-Based Methods
3 Proposed Method
3.1 Coarse Orientation Estimator
3.2 Fine Orientation Estimator
3.3 The Calculation of Final Orientation
4 Experiments
4.1 Experimental Settings
4.2 Accuracies over Different Loss Functions
4.3 Evaluation of Area Under the Precision-Recall Curve
4.4 Comparison with Other Local Feature Descriptors
5 Conclusion
References
Geometric Total Variation for Image Vectorization, Zooming and Pixel Art Depixelizing
1 Introduction
2 Geometric Total Variation
3 Contour Regularization
4 Raster Image Zooming with Smooth Contours
5 Experimental Evaluation and Discussion
6 Conclusion
References
Learning Theory and Optimization
Information Theory-Based Curriculum Learning Factory to Optimize Training
1 Introduction
2 Related Work
3 Proposed Method
3.1 Stage I: Assessing Content of Training Samples
3.2 Stage II: Sorting Batches of Samples
3.3 Stage III: Syllabus Evaluation
4 Experiments
4.1 Datasets
4.2 Training
4.3 Networks
5 Results and Analysis
5.1 Training Trends
5.2 Regularization Loss
5.3 Sparsity
5.4 Classification Results
5.5 Conclusion
References
A Factorization Strategy for Tensor Robust PCA
1 Introduction
2 Notations and Preliminaries
3 TriFac for Tensor Robust PCA
3.1 Model Formulation
3.2 Optimization Algorithm
4 Experiments
4.1 Synthetic Data Experiments
4.2 Real Data Experiments
5 Conclusion
References
Speeding up of the Nelder-Mead Method by Data-Driven Speculative Execution
1 Introduction
2 Related Work
2.1 Problem Setting of Hyperparameter Optimization
2.2 Parallel HPO Methods
2.3 The Nelder-Mead (NM) Method
3 Speculative Execution for the NM Method
3.1 The Future Iterates of the NM Method
3.2 The Frequency of Each Operation
3.3 The Data-Driven Speculative NM Method
4 Experimental Evaluation
4.1 Experimental Conditions
4.2 The Comparison with Parallel HPO Methods
4.3 The Comparison with Other Parallel NM Methods
5 Conclusions
References
Efficient Bayesian Optimization Based on Parallel Sequential Random Embeddings
1 Introduction
2 Related Works
3 Settings and Random Embedding
4 Proposed Method
5 Experiment
5.1 Experimental Method
5.2 Results and Discussion
6 Conclusion
References
Applications, Medical and Robotics
Fundus Image Classification and Retinal Disease Localization with Limited Supervision
1 Introduction
2 Related Work
3 Method
3.1 Revisiting Grad-CAM
3.2 Heatmap Selection by Physician Intervention (PI)
3.3 Guidance with Attention Mining
4 Experiments
4.1 Datasets and Experimental Settings
4.2 Performance on Classification
4.3 Physician Intervention Is Necessary
4.4 Qualitative Results on Lesion Localization
5 Conclusion
References
OSTER: An Orientation Sensitive Scene Text Recognizer with CenterLine Rectification
1 Introduction
2 Related Works
3 Methodology
3.1 Centerline Rectification Network
3.2 Orientation Sensitive Network
3.3 Attention Selective Sequence-to-Sequence Network
3.4 Training
4 Experiments
4.1 Datasets Setup
4.2 Implementation Details
4.3 Performance of the ASN
4.4 Performances of the CRN
4.5 Performances on Public Datasets
5 Conclusion
References
On Fast Point Cloud Matching with Key Points and Parameter Tuning
1 Introduction
2 Related Work
3 Keypoint Selection
3.1 ISS3D
3.2 Harris3D
3.3 SIFT
4 Methodology
4.1 Keypoint Evaluation
4.2 Registration Pipeline
4.3 Parameter Tuning
5 Evaluation and Discussion
5.1 Data Sets
5.2 Results
6 Conclusion
References
Detection of Critical Camera Configurations for Structure from Motion Using Random Forest
1 Introduction
2 Related Work
2.1 Model Selection
2.2 Conjugate Rotation
2.3 Quality Measure
3 Classification
3.1 Features
3.2 Detection
4 Evaluation
4.1 Datasets
4.2 Analysis of Parameters
4.3 Analysis of Features
4.4 Integration into a Structure from Motion Framework
5 Conclusion
References
Computer Vision and Robot Vision
Visual Counting of Traffic Flow from a Car via Vehicle Detection and Motion Analysis
1 Introduction
2 Vehicle Detection in Video Frame via YOLO3
3 Motion Trace Generation and Vehicle Counting
3.1 Vehicles Motion Characteristics in Driving Video
3.2 Counting Vehicles Without Tracking Traces
4 Dense Vehicle Counting for Traffic Flow Estimation
5 Experiments
5.1 Naturalistic Driving Data for Testing
5.2 Counting Accuracy Along a Route
6 Conclusion
References
Directly Optimizing IoU for Bounding Box Localization
1 Introduction
2 Related Work
3 Methodology
3.1 Smooth IoU Loss
4 Experiments
4.1 Evaluation and Discussion
5 Conclusion
References
Double Refinement Network for Room Layout Estimation
1 Introduction
2 Related Works
3 Heat Map Estimation via Double Refinement Net
3.1 Network Architecture
4 Post-processing
4.1 Key Points Coordinate Estimation
4.2 Layout Ranking
5 Results
6 Conclusions
References
Finger Tracking Based Tabla Syllable Transcription
1 Introduction
2 The Instrument
2.1 The Instrument Bols
3 Literature Survey
4 Dataset Used
5 Proposed Methodology
5.1 Feature Extraction and Processing
5.2 Feature Classification
6 Results and Discussions
7 Conclusion and Future Work
References
Dual Templates Siamese Tracking
1 Introduction
2 Related Works
2.1 SiameseFC Networks
2.2 Difference Hash Algorithm
3 The Proposed Approach
3.1 Dual Templates Siamese Networks
3.2 Similarity Estimate
3.3 Adaptive Gaussian Background Suppression
3.4 Algorithm Flow
4 Experiments
4.1 OTB Datasets
4.2 VOT Datasets
5 Conclusion
References
Structure Function Based Transform Features for Behavior-Oriented Social Media Image Classification
1 Introduction
2 Related Work
3 Proposed Method
3.1 Foreground and Background Region Detection
3.2 Structural Function Based Transform (SFBT) for Feature Extraction
3.3 Behavior Oriented Social Media Image Classification
4 Experimental Results
4.1 Experiments for Two Class Classification
4.2 Experiments for Multi-class Classification
5 Conclusion and Future Work
References
Single Image Reflection Removal Based on GAN with Gradient Constraint
1 Introduction
2 Related Work
2.1 Optimization Based Methods
2.2 Deep Learning Based Methods
3 Supporting Methods
3.1 Synthetic Reflection Image
3.2 Generative Adversarial Networks
4 Proposed Method
4.1 Network Model
4.2 Loss Functions for Generator
4.3 Loss Function for Discriminator
4.4 Training Dataset
4.5 Training
4.6 Rotate Averaging Process
5 Experimental Results
5.1 Results
5.2 The Effectiveness of Our Loss Function and Training Method
6 Conclusion
References
SSA-GAN: End-to-End Time-Lapse Video Generation with Spatial Self-Attention
1 Introduction
2 Related Work
2.1 Generative Adversarial Networks
2.2 Video Generation
2.3 Video Prediction
2.4 Self Attention Mechanism
3 Our Approach
3.1 Spatial Self-Attention GAN
3.2 Spatial Self-Attention Module
3.3 Spatial Self-Attention GAN Objectives
4 Implementation Details
5 Experiments
5.1 Datasets
5.2 Experiments on the Cloud Time-Lapse Dataset
5.3 Experiments on the Beach Dataset
5.4 Discussion
6 Conclusion
References
Semantic Segmentation of Railway Images Considering Temporal Continuity
1 Introduction
2 Semantic Segmentation Considering Temporal Continuity
2.1 Overview of the Proposed Method
2.2 Semantic Segmentation Using Multiple Frames
3 Experimental Evaluation
3.1 Class Settings of the Railway Environment
3.2 Datasets
3.3 Experiment
4 Discussion and Applications
4.1 Improvement of Semantic Segmentation Accuracy
4.2 Effects of Using Class Likelihoods
4.3 Accuracy of Flow Estimation
4.4 Possible Applications to the Railway Environment
5 Conclusion
References
Real Full Binary Neural Network for Image Classification and Object Detection
1 Introduction
2 Related Work
2.1 Design Efficient Model
2.2 Quantized Neural Network
3 Real Full Binary Neural Network
3.1 Binary Weight Filters
3.2 Network Architecture
4 Experiments
4.1 Image Classification
4.2 Detection Result
5 Conclusion
References
Progressive Scale Expansion Network with Octave Convolution for Arbitrary Shape Scene Text Detection
1 Introduction
2 Related Work
2.1 Scene Text Detection
2.2 Methods for Enlarging Receptive Fields
3 Proposed Method
3.1 Overall Pipeline
3.2 Octave Convolution
3.3 Label Generation
3.4 Loss Function
3.5 Inference
4 Experimental Results
4.1 Datasets
4.2 Implementation Details
4.3 Comparisons with State-of-the-Art Methods
5 Conclusion
References
SFLNet: Direct Sports Field Localization via CNN-Based Regression
1 Introduction
2 Related Works
3 Proposed Approach
3.1 SFLNet Architecture
3.2 SFLNet Training
4 Experimental Evaluation
4.1 Dataset
4.2 Implementation Details
4.3 Ablation/Parameter Study
4.4 Comparison to Baselines
5 Conclusion
References
Trained Model Fusion for Object Detection Using Gating Network
1 Introduction
2 Related Work
2.1 Ensemble of Experts
2.2 Gating Network
2.3 Object Detection
3 Proposed Method
3.1 Overall Structure
3.2 Gating Network for Object Detection
3.3 Training Gating Network
4 Experiments
4.1 Setup
4.2 Results and Discussion
5 Conclusion
Appendices
A UA-DETRAC Dataset
B Accuracy of Trained Model on Test Dataset
References
Category Independent Object Transfiguration with Domain Aware GAN
1 Introduction
2 Related Work
3 Preliminaries
4 Domain Region and Magnitude Aware GAN
4.1 Architecture
4.2 Loss Functions
5 Implementation Details
6 Experiments
6.1 Few Object Categories Training Data
6.2 Wide Variety of Object Categories Training Data
7 Conclusion
References
New Moments Based Fuzzy Similarity Measure for Text Detection in Distorted Social Media Images
1 Introduction
2 Related Work
3 Proposed Method
3.1 Text Detection for Distorted Images
3.2 Moments Based Fuzzy Logic Similarity Measure for Text Detection
4 Experimental Results
4.1 Experiments on Distorted Social Media Images
4.2 Experiments on Benchmark Dataset of Natural Scene Images
5 Conclusion and Future Work
References
Fish Detection Using Convolutional Neural Networks with Limited Training Data
1 Introduction
2 Related Work
2.1 Background Subtraction
2.2 Convolutional Neural Network
3 Approach
3.1 Network Architecture
3.2 Data Augmentation Pre-processing
3.3 Activation Function and Loss Function
3.4 Post-processing
4 Experiments
4.1 Data Collection
4.2 Results
5 Conclusion
References
A New U-Net Based License Plate Enhancement Model in Night and Day Images
1 Introduction
2 Related Work
3 Proposed Model
3.1 The Proposed U-Net Based Enhancement Model
3.2 Text Detection for the Enhancement Images
4 Experimental Results
4.1 Evaluating the Proposed Enhancement Model
4.2 Validating the Proposed Enhancement Model Through Text Detection
5 Conclusion and Future Work
References
Integration of Biologically Inspired Pixel Saliency Estimation and IPDA Filters for Multi-target Tracking
1 Introduction
1.1 Target Detection via Bio-inspired Vision
1.2 Multi-target Tracking in Cluttered Environments
1.3 The Tracking Filter: Integrated Probabilistic Data Association
1.4 Statement of Contribution
2 Data Collection
3 Methodology
3.1 Target Extraction
3.2 Dropout Reduction via Linear Prediction
4 Experimental Validation
5 Conclusion
References
The Effectiveness of Noise in Data Augmentation for Fine-Grained Image Classification
1 Introduction
2 Related Work
3 Methods
3.1 Data Augmentation from the Web
3.2 NTS Network
3.3 Maximum Entropy Learning
4 Experiments
4.1 Datasets
4.2 Implementations
4.3 Results
4.4 Ablation Studies
5 Conclusion
References
Attention Recurrent Neural Networks for Image-Based Sequence Text Recognition
1 Introduction
2 Related Work
2.1 Attention Based RNNs
2.2 Attention Models for Image-Based Sequence Text Recognition
3 ARNNs
3.1 The Attention Gate
3.2 Attention Recurrent Neural Networks
4 ARNNs for Image-Based Sequence Text Recognition
5 Experiments and Results
5.1 House Number Recognition on the SVHN Dataset
5.2 Handwritten Word Recognition on the IAM Off-Line Dataset
5.3 Scene Text Recognition Based on the CRNN Architecture
6 Conclusion
References
Nighttime Haze Removal with Glow Decomposition Using GAN
1 Introduction
2 Preliminary Knowledge
2.1 Conditional Generative Adversarial Network
2.2 Atrous Convolution
3 Single Image Nighttime Haze Removal
3.1 Nighttime Haze Optical Model
3.2 Glow Removal with Generative Adversarial Network
3.3 Computation of Atmospheric Light
3.4 Estimation of Transmission Map
3.5 Scene Recovery
4 Experiment Results
4.1 Qualitative Analysis
4.2 Quantitative Analysis
5 Conclusion
References
Finding Logo and Seal in Historical Document Images - An Object Detection Based Approach
1 Introduction
2 Related Work
3 Motivation and Dataset
3.1 Data Acquisition and Annotation
4 Methodology
4.1 Object Detection - Brief Overview
4.2 YOLOv3 - A Fully Connected Network
4.3 Experimental Framework
5 Experimental Results and Discussions
5.1 YOLOv3 Network Learning Details and Obtained Results
5.2 Results on Tobacco800 Dataset
6 Conclusion and Future Work
References
KIL: Knowledge Interactiveness Learning for Joint Depth Estimation and Semantic Segmentation
1 Introduction
2 Related Work
3 Framework
3.1 Network Architecture
3.2 Knowledge Interactiveness Learning Module
3.3 Network Training
4 Experiment
4.1 Ablation Study
4.2 Comparisons with the State-of-the-Art Methods
5 Conclusion
References
Road Scene Risk Perception for Intelligent Vehicles Using End-to-End Affordance Learning and Visual Reasoning
1 Introduction
2 Related Work on Road Scene Understanding
2.1 Modular Methods
2.2 End-to-End Learning
2.3 Affordance Learning
2.4 Visual Reasoning
3 Proposed Method
3.1 Road Scene Risk Perception Using End-to-End Affordance Learning
3.2 Weakly Supervised Visual Reasoning of Underlying Risk Factors
4 Experimental Results and Evaluation
4.1 Using Saliency Maps for Visual Reasoning
4.2 Different Evaluation Strategy to Assess Safety and Efficiency of the Model
4.3 Comparison with Other Studies
5 Applications of Road Scene Risk Perception
6 Conclusion and Future Work
References
Image2Height: Self-height Estimation from a Single-Shot Image
1 Introduction
2 Related Work
2.1 Geometric Approach with Dynamic Scene Information
2.2 Recognition-Based Approach
2.3 Visualization Techniques for CNNs
3 Approach
3.1 Model
3.2 Fine Tuning and Preprocessing
3.3 Visualization and Sensitivity Analysis
4 Experiments
4.1 Data Collection
4.2 Accuracy Evaluation
4.3 Visualization of Important Regions
4.4 Sensitivity Analysis
5 Discussions
6 Conclusions
References
Segmentation of Foreground in Image Sequence with Foveated Vision Concept
1 Introduction
2 Foreground Detection Framework
3 Intensity and Motion Features
4 Photometric and Textural Features
5 Result
6 Conclusion
References
Bi-direction Feature Pyramid Temporal Action Detection Network
1 Introduction
2 Related Work
3 Methods
3.1 Network
3.2 Train
3.3 Inference
4 Experiments
4.1 Dataset and Setup
4.2 Ablation Study
4.3 Compare with State-of-the-Art
5 Conclusion
References
Hand Segmentation for Contactless Palmprint Recognition
1 Introduction
2 Related Work
2.1 U-Net
2.2 TernausNet
2.3 PSPNet
2.4 DeepLab v3
3 Proposed Method
3.1 Encoder
3.2 PPM
3.3 Decoder
3.4 Loss Function for Hand Segmentation
4 Experiments and Discussion
5 Conclusion
References
Selecting Discriminative Features for Fine-Grained Visual Classification
1 Introduction
2 Related Work
2.1 Fine-Grained Visual Classification
2.2 Part Localization
3 Proposed Method
3.1 Feature Augmentation
3.2 Optimization
4 Experiments
4.1 Datasets
4.2 Implementation Details
4.3 Comparison with State-of-the-Art Methods
5 Conclusions
References
Author Index