Explainable Deep Learning AI: Methods and Challenges presents recent work by leading researchers in XAI, offering an overview of the field along with several novel technical methods and applications that address explainability challenges for Deep Learning AI systems. The book first surveys XAI and then covers a range of specific technical contributions for Deep Learning, from general XAI methods through domain-specific applications to user-oriented evaluation approaches. It also explores the main categories of explainable Deep Learning, which has become a necessary condition for many applications of Artificial Intelligence (AI).
Artificial Intelligence (AI) techniques, especially those based on Deep Learning (DL), have become extremely effective on a very large variety of tasks, sometimes performing even better than human experts. However, they also have a number of problems: they generally operate in opaque and/or intractable ways, their very good performance is only statistical and they can fail even on apparently obvious cases, they can make biased decisions, and they can be quite easily manipulated through adversarial attacks, to name a few. These limitations prevent their adoption in applications of great economic or societal interest, especially in critical or sensitive applications like autonomous driving, medical diagnosis, or loan approvals.
Considering this, a lot of research has been conducted to increase the trustworthiness of DL-based AI systems by providing explanations, understandable by human users, for the decisions these systems make. The aim of this book is to present recent and original contributions covering the main approaches in the domain of explainable DL, for both expert and lay users. Two main types of approaches are presented: the “post hoc” or “model agnostic” ones, in which the operation of an already available “black box” system is modeled and explained, and the intrinsic ones, in which systems are specifically designed as “white boxes” with an interpretable mode of operation.
Groups of methods such as backpropagation-based and perturbation-based methods are explained, and their application to various kinds of data classification is presented.
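To make this concrete, the minimal sketch below shows the general shape of a perturbation-based attribution (occlusion): patches of the input are masked and the drop in the classifier's score is recorded as importance. The `predict` function here is a hypothetical stand-in classifier introduced only for illustration, not a method from the book.

```python
# Minimal sketch of a perturbation-based attribution (occlusion), assuming a
# generic image classifier `predict` mapping an image to a class score.
import numpy as np

def predict(image):
    # Hypothetical stand-in classifier: its "score" rises with brightness in
    # the image centre, so the centre should receive high attribution.
    h, w = image.shape
    centre = image[h // 4: 3 * h // 4, w // 4: 3 * w // 4]
    return float(centre.mean())

def occlusion_map(image, patch=8, baseline=0.0):
    """Attribute the prediction by masking patches and recording the score drop."""
    base_score = predict(image)
    heatmap = np.zeros_like(image, dtype=float)
    h, w = image.shape
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            masked = image.copy()
            masked[y:y + patch, x:x + patch] = baseline
            # A large score drop means the occluded region was important.
            heatmap[y:y + patch, x:x + patch] = base_score - predict(masked)
    return heatmap

if __name__ == "__main__":
    img = np.zeros((32, 32))
    img[8:24, 8:24] = 1.0  # bright centre the stand-in classifier relies on
    attribution = occlusion_map(img)
    print(attribution.max(), attribution.min())
```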
Provides an overview of main approaches to Explainable Artificial Intelligence (XAI) in the Deep Learning realm, including the most popular techniques and their use, concluding with challenges and exciting future directions of XAI
Explores the latest developments in general XAI methods for Deep Learning
Explains how XAI for Deep Learning is applied to various domains such as images, medicine, and natural language processing (NLP)
Provides an overview of how XAI systems are tested and evaluated, especially with real users, a critical need in XAI
Author(s): Jenny Benois-Pineau, Romain Bourqui, Dragutin Petković
Publisher: Academic Press
Year: 2023
Language: English
Pages: 348
City: London
Front Cover
Explainable Deep Learning AI
Copyright
Contents
List of contributors
Preface
1 Introduction
2 Explainable deep learning: concepts, methods, and new developments
2.1 Introduction
2.2 Concepts
2.2.1 Explaining, interpreting, and understanding
2.2.2 Desiderata and explanation quality
2.2.3 Explaining linear models
2.2.4 Explaining signal vs. explaining noise
2.3 Methods
2.3.1 Overview of attribution methods
2.3.1.1 Perturbation-based methods
2.3.1.2 Gradient-based methods
2.3.1.3 Surrogate-based methods
2.3.1.4 Propagation-based methods
2.3.2 Other types of XAI methods
2.4 New developments
2.4.1 XAI-based model improvement
2.4.2 The neuralization trick
2.5 Limitations and future work
Acknowledgment
References
3 Compact visualization of DNN classification performances for interpretation and improvement
3.1 Introduction
3.2 Previous works
3.2.1 Visualization for the interpretation of deep neural networks
3.2.2 Hilbert curve in information visualization
3.3 Proposed method for compact visualization of DNN classification performances
3.3.1 Domain level
3.3.2 Abstraction level
3.3.3 Technique level
3.3.4 Algorithm level
3.3.5 Output interpretation
3.4 Experimental protocol
3.4.1 Scenarios
3.4.2 Implementation and execution infrastructure
3.5 Results and discussion
3.5.1 Visual analysis of method results
3.5.2 Improvement to simplification scenario
3.5.3 Discussion
3.5.4 Future work
3.6 Conclusion
Acknowledgments
References
4 Characterizing a scene recognition model by identifying the effect of input features via semantic-wise attribution
4.1 Introduction
4.1.1 Model interpretability
4.1.2 Perturbation methods
4.1.3 Scope of perturbation methods
4.1.4 Overview of the proposed interpretation method
4.2 Semantic-wise attribution
4.2.1 Preliminaries
Scene recognition model
Semantic segmentation model
4.2.2 Score deviation
Interpretation of score deviation
4.2.3 Perturbation strategies
4.2.4 Score deviation map
4.2.5 Class-wise statistics
Relevant semantic classes
Irrelevant semantic classes
Distracting semantic classes
4.3 Experimental results
4.3.1 Overview of the experiments
4.3.2 Score deviation maps
4.3.3 Relevant, irrelevant, and distracting semantic classes
4.3.3.1 Relevant semantic classes for scene prediction
4.3.3.2 Irrelevant semantic classes for scene prediction
4.3.3.3 Distracting semantic classes for scene prediction
4.4 Conclusions
Acknowledgments
References
5 A feature understanding method for explanation of image classification by convolutional neural networks
5.1 Introduction
5.2 Principles of white-box explanation methods
5.3 Explanation methods
5.3.1 Gradient backpropagation
5.3.2 SmoothGrad
5.3.3 Grad-CAM
5.3.4 Layer-wise Relevance Propagation (LRP)
5.3.5 Feature-based Explanation Method (FEM)
5.4 The proposed improvement – modified FEM
5.4.1 Squeeze-Excitation (SE) block
5.4.2 Modified FEM
5.4.3 Application of FEM and modified FEM for COVID-19 classification
5.4.4 Evaluation metrics for the evaluation of explanation maps
5.5 Experimental results
5.5.1 Dataset details
5.5.2 Binary classifier explanation maps
5.5.3 FEM vs. modified FEM
5.6 Conclusion
Acknowledgments
References
6 Explainable deep learning for decrypting disease signatures in multiple sclerosis
6.1 Introduction
6.2 State-of-the-art
6.2.1 EXplainable Artificial Intelligence (XAI)
6.2.1.1 Backpropagation
6.2.1.2 Guided backpropagation
6.2.1.3 Layerwise relevance propagation
6.2.2 EXplainable AI: application to multiple sclerosis
6.3 Materials and methods
6.3.1 Population, data acquisition, and image processing
6.3.2 3D-CNN network architecture
6.3.2.1 Confounding variables influence assessment
6.3.3 Convolutional neural networks visualization methods
6.3.4 Relevance heatmap analysis
6.4 Results
6.4.1 Qualitative assessment of the relevance heatmaps
6.4.2 Quantitative assessment of the heatmaps
6.5 Discussion
6.5.1 Limitations and future works
6.6 Conclusions
Acknowledgments
References
7 Explanation of CNN image classifiers with hiding parts
7.1 Introduction
7.2 Explanation methods
7.3 Recursive division approach
7.3.1 Division
7.3.2 Complementary images
7.3.3 RD algorithm
7.4 Quality of the model
7.5 Experimental modeling
7.5.1 CNN for The Oxford-IIIT Pet Dataset
7.5.2 CNN for food and crack datasets
7.5.2.1 Food-5K dataset
7.5.2.2 UEC FOOD 100/256 dataset
7.5.2.3 Crack dataset
7.5.3 CNN for image scene classification problem dataset
7.5.4 The quality of black-box model
7.5.5 Explanation success rate for different clustering methods
7.5.6 Quality of explanation: LIME vs RD
7.5.7 Time performance
7.5.8 Examples
7.6 Conclusion
Acknowledgments
References
8 Remove to improve?
8.1 Introduction
8.2 Previous work
8.2.1 Neural network visualization and understanding
8.2.2 Pruning
8.3 Definitions
8.4 Experiments
8.4.1 Experimental settings
8.4.2 Class-wise accuracy changes
8.4.3 Filters' contribution for each class recognition
8.4.4 Class-wise pruned filter similarity and semantic similarity
8.4.5 Groups of classes G
8.4.5.1 Nonoverlapping groups of classes
8.4.5.2 Overlapping groups of classes
Mapping to lower-dimensional space
8.4.6 Pruned filter similarity and semantic similarity between k-closest neighbors
8.4.7 Changes in pruned filter similarity with k-closest neighbors with pruning
8.5 Model selection
8.6 Discussion and conclusions
Glossary
Acknowledgments
References
9 Explaining CNN classifier using association rule mining methods on time-series
9.1 Introduction
9.2 Related work
9.3 Background
9.3.1 Classification
9.3.1.1 Convolutional Neural Network — CNN
9.3.2 Data preprocessing
9.3.3 Association Rule Mining — ARM
9.4 Methods
9.4.1 Scalable Bayesian Rule Lists — SBRL
9.4.2 Rule-based regularization method
9.4.3 Gini regularization method
9.5 Evaluation metrics
9.6 Experimental results
9.6.1 Datasets
9.6.2 Experimental setup
9.6.3 Results
9.7 Conclusion and future work
References
10 A methodology to compare XAI explanations on natural language processing
10.1 Introduction
10.1.1 Mathematical notations
10.2 Related works
10.2.1 Types of approaches
10.2.2 Local explanation with LIME and anchors
10.2.3 Attention mechanism
10.2.4 Evaluate explanations
10.3 Generating explanations
10.3.1 Yelp use case
10.3.2 LEGO use case
10.3.3 Generating human attention ground truth
10.4 Evaluation without end users
10.4.1 Quantitative analysis
10.4.2 Qualitative analysis
10.5 Psychometric user study
10.5.1 Experimental protocol
10.5.2 Collecting users' preferences
10.5.3 Computing IOU
10.5.4 Users' preference
10.5.5 Analysis
10.6 Conclusion
Glossary
References
11 Improving malware detection with explainable machine learning
11.1 Introduction
11.2 Background
11.2.1 Android
11.2.2 Android ransomware
11.2.3 Ransomware detection
11.3 Explanation methods
11.3.1 Gradient-based explanation methods
11.4 Explaining Android ransomware
11.4.1 Challenges
11.4.2 Approach
11.5 Experimental analysis
11.5.1 Setting
11.5.2 Explanation distribution
11.5.3 Explanation analysis
11.5.3.1 Evaluation by class
11.5.3.2 Evaluation by ransomware family
11.5.3.3 Evaluation by ransomware date
11.5.3.4 Evaluation with a reduced feature set
11.6 Discussion
11.6.1 Explanation baseline
11.6.2 Feature cardinality
11.6.3 Feature granularity
11.6.4 Feature robustness
11.7 Conclusion
References
12 Explainability in medical image captioning
12.1 Introduction
12.2 Related work
12.2.1 Medical image captioning
12.2.2 AI explainability
12.3 Methodology
12.3.1 Data preprocessing
12.3.2 Encoder–decoder with attention for caption prediction
12.3.3 Caption generation explainability
12.4 Experimental results
12.4.1 Dataset
12.4.2 Experimental setup
12.4.3 Evaluation metrics
12.4.4 Results
12.5 Conclusion and future work
Acknowledgment
References
13 User tests & techniques for the post-hoc explanation of deep learning
13.1 Introduction
13.1.1 What is an explanation? Pre-hoc versus post-hoc
13.1.2 Post-hoc explanations: four approaches
13.1.3 Example-based explanations: factual, counterfactual, and semifactual
13.1.4 Outline of chapter
13.2 Post-hoc explanations using factual examples
13.2.1 Factual explanations of images
13.2.2 Factual explanations of time series
13.2.3 User studies of post-hoc factual explanations
13.3 Counterfactual & semifactual explanations: images
13.3.1 PIECE: generating contrastive explanations for images
13.3.2 PIECE+: designing a better generative method
13.3.2.1 PIECE+: the method
13.3.2.2 Results: PIECE+
Sample explanations
Automatically selected counterfactuals
Conclusion: PIECE improvements
13.4 Contrastive explanations: time series
13.4.1 Native guide: generating contrastive explanations for time series
13.4.2 Extending native guide: using Gaussian noise
13.5 User studies on contrastive explanations
13.6 Conclusions
Acknowledgments
References
14 Theoretical analysis of LIME
14.1 Introduction
14.2 LIME for images
14.2.1 Overview of the method
14.2.2 Theoretical analysis
14.3 LIME for text data
14.3.1 Overview of the method
14.3.2 Theoretical analysis
14.4 LIME for tabular data
14.4.1 Overview of the method
14.4.2 Theoretical analysis
14.5 Conclusion
Acknowledgments
References
15 Conclusion
Index
Back Cover