Machine Learning Safety

Machine Learning algorithms allow computers to learn without being explicitly programmed. Their application now extends to highly sophisticated tasks across multiple domains, such as medical diagnostics and fully autonomous vehicles. While this development holds great potential, it also raises new safety concerns, as Machine Learning has many specificities that make predicting and assessing its behaviour very different from doing so for explicitly programmed software systems. This book addresses the main safety and security concerns of Machine Learning, including its susceptibility to environmental noise and to adversarial attacks. Such vulnerabilities have become a major roadblock to the deployment of Machine Learning in safety-critical applications. The book presents up-to-date techniques for adversarial attacks, which are used to assess the vulnerabilities of Machine Learning models; formal verification, which is used to determine whether a trained Machine Learning model is free of vulnerabilities; and adversarial training, which is used to enhance the training process and reduce vulnerabilities.

Machine Learning has achieved human-level performance in long-standing tasks such as image classification, game playing, and natural language processing (NLP). However, like other complex software systems, it is not without shortcomings, and a number of hidden issues have been identified in recent years. We first cover falsification techniques that identify safety vulnerabilities in various Machine Learning models, and then move on to different solutions to evaluate, verify, and reduce those vulnerabilities. Falsification is mainly carried out through attacks such as robustness attacks and data poisoning attacks. Compared with the popularity of attacks, solutions are less mature, and we consider solutions that have been broadly discussed and recognised (such as formal verification, adversarial training, and privacy enhancement), together with several new directions (such as testing, safety assurance, and reliability assessment).

Specifically, this book comprises four technical parts. Part I introduces basic concepts of Machine Learning, as well as definitions of its safety and security issues. Part II introduces techniques to identify safety and security issues in Machine Learning models, covering both traditional Machine Learning models and Deep Learning models. Part III presents two categories of safety solutions: those that verify (i.e. determine with provable guarantees) the robustness of Deep Learning, and those that enhance the robustness, generalisation, and privacy of Deep Learning. Part IV discusses several extended safety solutions that consider either other Machine Learning models or other safety assurance techniques. Technical appendices are also included.

The book aims to raise readers' awareness, as future developers of Machine Learning models, of the potential safety and security issues of such models. More importantly, it includes up-to-date content on solutions for dealing with these issues. While these solution techniques are not yet sufficiently mature, we expect that they will be further developed, or will inspire new ideas and solutions, towards the ultimate goal of making Machine Learning safe. We hope this book paves the way for readers to become researchers and leaders in this new area of Machine Learning safety, and that readers will not only learn technical knowledge but also gain hands-on practical skills. Source code and teaching materials are available on GitHub.
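As a small taster of the kind of robustness attack covered in Part II (see Sect. 10.7.2 on the Fast Gradient Sign Method), below is a minimal, illustrative sketch in PyTorch. The model, input, label, and epsilon budget are placeholders chosen for illustration; this is not the book's released code.

    # Illustrative FGSM sketch: perturb an input in the direction of the sign
    # of the loss gradient to probe a classifier's robustness.
    import torch
    import torch.nn as nn

    def fgsm_attack(model, x, y, epsilon=0.1):
        """Return an adversarial copy of x under an L-infinity budget epsilon."""
        x_adv = x.clone().detach().requires_grad_(True)
        loss = nn.CrossEntropyLoss()(model(x_adv), y)
        loss.backward()
        # Step so as to increase the loss, then clip back to the valid pixel range.
        x_adv = x_adv + epsilon * x_adv.grad.sign()
        return x_adv.clamp(0.0, 1.0).detach()

    # Toy usage: a tiny linear classifier and random data stand in for a real model.
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x = torch.rand(1, 1, 28, 28)   # placeholder "image"
    y = torch.tensor([3])          # placeholder label
    x_adv = fgsm_attack(model, x, y)
    print(model(x).argmax(dim=1).item(), model(x_adv).argmax(dim=1).item())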

Author(s): Xiaowei Huang, Gaojie Jin, Wenjie Ruan
Series: Artificial Intelligence: Foundations, Theory, and Algorithms
Publisher: Springer
Year: 2023

Language: English
Pages: 319

Preface
Contents
Acronyms
Part I Safety Properties
1 Machine Learning Basics
1.1 What Is Machine Learning?
1.2 Data Representation
1.2.1 Representing Instances Using Feature Vectors
1.2.2 Feature Space
1.3 Datasets
1.3.1 Independent and Identically Distributed (i.i.d.) Assumption
1.3.2 Training, Test, and Validation Datasets
1.4 Hypothesis Space and Inductive Bias
1.5 Learning Tasks
1.5.1 Supervised Learning
1.5.2 Unsupervised Learning
1.5.3 Semi-supervised Learning
1.6 Learning Schemes
1.7 Density Estimation
1.7.1 Maximum Likelihood Estimation (MLE)
1.7.2 Maximum A Posteriori (MAP) Queries
1.7.3 Expected A Posteriori (EAP)
1.8 Ground Truth and Underlying Data Distribution
2 Model Evaluation Methods
2.1 Test Accuracy and Error
2.2 Accuracy w.r.t. Training Set Size
2.2.1 How to Plot the Curve?
2.3 Multiple Training/Test Partitions
2.3.1 Random Resampling
2.3.2 Cross Validation
2.4 Confusion Matrix
2.4.1 Binary Classification Problem
2.4.2 Other Accuracy Metrics
2.5 Receiver Operating Characteristic (ROC) Curve
2.5.1 How to Plot ROC Curve?
2.5.2 How to Read the ROC Curves?
2.6 Precision/Recall (PR) Curves
3 Safety and Security Properties
3.1 Generalisation Error
3.2 Uncertainty
Sources of Uncertainty for Classification
3.3 Robustness and Adversarial Examples
3.3.1 Measurement of Adversarial Examples
3.3.2 Sources of Adversarial Examples
3.3.3 Robustness
3.4 Poisoning and Backdoor Attacks
3.4.1 Data Poisoning Attack
3.4.2 Backdoor Attacks
3.4.3 Success Criteria of Attack
3.5 Model Stealing
3.6 Membership Inference and Model Inversion
3.6.1 Membership Inference
3.6.2 Model Inversion
3.7 Discussion: Attacker Knowledge and Attack Occasions
4 Practice
4.1 Installation of Experimental Environment
4.1.1 Use Pip
4.1.2 Create Conda Virtual Environment
4.1.3 Test Installation
4.2 Basic Python Operations
4.3 Visualising a Synthetic Dataset
4.4 Confusion Matrix
Exercises
Part II Safety Threats
5 Decision Tree
5.1 Learning Algorithm
5.1.1 Determine Splitting Feature
Occam's Razor
Computational Complexity for the Optimal Tree
Heuristics for Greedy Search
5.1.1.1 Gain Ratio
5.1.2 Find Best Split
On Numeric Features
5.1.3 Stopping Criteria
5.2 Classification Probability
5.3 Robustness and Adversarial Attack
5.3.1 Is x' an Adversarial Example?
5.3.2 Is this Approach Complete?
5.3.3 Sub-Optimality
5.4 Backdoor Attack
5.4.1 General Idea
5.4.2 Algorithm
5.5 Data Poisoning Attack
5.6 Model Stealing
5.7 Membership Inference
5.8 Model Inversion
5.9 Practice
5.9.1 Decision Tree
5.9.2 Command to Run
5.9.2.1 Decision Tree Construction
5.9.2.2 Adversarial Attack on Decision Tree Construction
6 K-Nearest Neighbor
6.1 Basic Learning Algorithm
6.1.1 When to Consider?
6.1.2 Classification
6.1.3 Regression
6.1.4 Issues
6.1.5 Irrelevant Features in Instance-Based Learning
6.2 Speeding up K-nn
6.2.1 Reduction of Memory Usage
6.2.2 Reduction of Computational Time Through k-d Tree
6.3 Classification Probability
6.4 Robustness and Adversarial Attack
6.4.1 Is x' an Adversarial Example?
6.4.2 Is this Approach Complete?
6.4.3 Optimality
6.5 Poisoning Attack
6.6 Model Stealing
6.7 Membership Inference
6.8 Practice
7 Linear Regression
7.1 Linear Regression
7.1.1 Variant: Linear Regression with Bias
7.1.2 Variant: Linear Regression with Lasso Penalty
7.2 Linear Classification
7.2.1 What is the Corresponding Optimisation Problem?
7.2.2 Difficulty of Finding Optimal Solution
7.3 Logistic Regression
7.3.1 Sigmoid Function
7.3.2 Force σ(wᵀx⁽ⁱ⁾) into Probability Value
7.4 Robustness and Adversarial Attack
7.5 Poisoning Attack
7.6 Model Stealing
7.7 Membership Inference
7.8 Practice
8 Naive Bayes
8.1 Learning Algorithm
Bayes Theorem
Estimation of P(Y)
Difficulty of Estimating P(X₁,…,Xₙ|Y) Directly
Assumption
Algorithm
For Continuous Features
8.2 Robustness and Adversarial Attack
8.2.1 Is x' an Adversarial Example?
8.2.2 Is this Approach Complete?
8.2.3 Sub-Optimality
8.3 Poisoning Attack
8.4 Model Stealing
8.5 Membership Inference
8.6 Model Inversion
8.7 Practice
9 Loss Function and Gradient Descent
9.1 Loss Functions
9.1.1 Mean Absolute Error
9.1.2 Root Mean Squared Error (RMSE)
9.1.3 Binary Cross Entropy Cost Function
9.1.4 Categorical Cross Entropy Cost Function
9.2 Gradient Descent
9.2.1 Derivative of a Function
9.2.2 Gradient
9.2.3 Critical Points
9.2.4 Gradient Descent on Linear Regression
9.2.5 Analytical Solution on Linear Regression
10 Deep Learning
10.1 Perceptron
10.1.1 Biological vs Artificial Neurons
10.1.2 Learning Algorithm of Perceptron
10.1.3 Expressivity of Perceptron
10.1.4 Linearly Separable Function
10.1.5 Linearly Inseparable Function
10.1.6 Multi-Layer Perceptron
10.1.7 Practice
10.1.7.1 Train a Perceptron
10.1.7.2 Display Class Regions for Boolean Functions
10.2 Functional View
10.2.1 Mappings Between High-Dimensional Spaces
10.2.2 Training Objective
10.2.3 Recurrent Neural Networks
10.2.4 Learning Representation and Features
10.2.4.1 Raw Digital Representation
10.2.4.2 Feature Extraction
10.2.4.3 Data Manifold
10.2.4.4 Difficulties of Simply Using Dimensionality Reduction or Kernel
10.2.4.5 End-to-End Learning of Feature Hierarchies
10.2.4.6 Why Learn the Features?
10.2.5 Practice
10.2.5.1 Train a Fully Connected Model
10.3 Forward and Backward Computation
10.3.1 Running Example
10.3.2 Forward Computation
10.3.3 Backward Computation
10.3.3.1 Weights of Output Neurons
10.3.3.2 Weights of Hidden Neurons
10.3.3.3 Weight Update
10.3.4 Regularisation as Constraints
10.3.4.1 Overfitting
10.3.4.2 Regularization as Hard Constraint
10.3.4.3 Regularization as Soft Constraint
10.3.5 Practice
10.4 Convolutional Neural Networks
10.4.1 Functional Layers
10.4.1.1 Fully-Connected Layer
10.4.1.2 Convolutional Layer
10.4.1.3 Zero-Padding
10.4.1.4 Pooling Layer
10.4.2 Activation Functions
10.4.2.1 ReLU
10.4.2.2 Sigmoid
10.4.2.3 Softmax
10.4.3 Data Preprocessing
10.4.3.1 Mean Normalization
10.4.3.2 Standardization or Normalization
10.4.3.3 Whitening
10.4.4 Practice
10.5 Regularisation Techniques
10.5.1 Ridge Regularisation
10.5.2 Lasso Regularisation
10.5.3 Dropout
10.5.4 Early Stopping
10.5.5 Batch-Normalisation
10.6 Uncertainty Estimation
10.6.1 Estimating Total Uncertainty
10.6.2 Separating Aleatoric and Epistemic Uncertainties
10.6.3 Estimating Posterior Distribution
10.6.3.1 Monte Carlo Sampling
10.6.3.2 Variational Inference
10.6.3.3 Laplace Approximation
10.6.4 Predicting Aleatoric Uncertainty for Regression Task
10.6.5 Measuring the Quality of Uncertainty Estimation
10.6.6 Confidence Calibration based Methods
10.6.6.1 Expected Calibration Error (ECE)
10.6.6.2 Maximum Calibration Error (MCE)
10.6.7 Selective Prediction Method
10.7 Robustness and Adversarial Attack
10.7.1 Limited-Memory BFGS Algorithm
10.7.2 Fast Gradient Sign Method
10.7.3 Jacobian Saliency Map Based Attack (JSMA)
10.7.4 DeepFool
10.7.5 Carlini & Wagner Attack
10.7.6 Adversarial Attacks by Natural Transformations
10.7.6.1 Rotation and Translation
10.7.6.2 Spatially Transformed Adversarial Examples
10.7.6.3 Towards Practical Verification of Machine Learning: The Case of Computer Vision Systems (VeriVis)
10.7.7 Input-Agnostic Adversarial Attacks
10.7.7.1 Universal Adversarial Perturbations
10.7.7.2 Generative Adversarial Perturbations
10.7.8 Practice
10.8 Poisoning Attack
10.8.1 Heuristic Method
10.8.1.1 Feature Collision
10.8.1.2 Convex Polytope
10.8.1.3 Hidden Trigger Backdoor
10.8.1.4 Clean Label Backdoor
10.8.2 An Alternating Optimisation Method
10.9 Model Stealing
10.9.1 An Iterative Algorithm
10.9.2 Initiating Surrogate Model
10.9.3 Synthesising Further Instances
10.9.4 Updating Surrogate Model
10.10 Membership Inference
10.10.1 Metric Based Method
10.10.1.1 Prediction Label Metric
10.10.1.2 Prediction Loss Metric
10.10.1.3 Prediction Entropy Metric
10.10.2 Binary Classifier Based Method
10.10.2.1 Predicting Whether New Sample x is in the Training Dataset of f_victim
10.10.2.2 Enhancement of f_a Through Ensemble Method
10.10.2.3 White-Box Attack
10.10.2.4 Discussion
10.11 Model Inversion
10.11.1 Optimisation Based Method
10.11.2 Training Based Method
Exercises
Part III Safety Solutions
11 Verification of Deep Learning
11.1 Robustness Properties for Verification
11.1.1 Robustness as an Optimisation Problem
11.2 Reduction to Mixed Integer Linear Programming (MILP)
11.2.1 Reduction to MILP
11.2.2 Method One for Layers
11.2.3 Method Two for Layers
11.2.4 Optimisation Objective
11.2.5 Over-Approximation with Linear Programming
11.2.6 Computation of Lower and Upper Bounds Through Lipschitz Approximation
11.2.7 Approximation of Lipschitz Constant
11.2.8 Computation of Lower and Upper Bounds
11.3 Robustness Verification via Reachability Analysis
11.3.1 Lipschitz Continuity of Deep Learning
11.3.2 Reachability Analysis of Deep Learning
11.3.3 Confidence Reachability with Guarantees
11.3.3.1 One-Dimensional Case
11.3.3.2 Dynamically Improving the Lipschitz Constant
11.3.3.3 Multi-Dimensional Case
11.3.4 A Running Numerical Example
11.3.4.1 Constraint-Solver Based Approach
11.3.4.2 Abstract Interpretation Based Approach
11.3.4.3 Reachability Analysis by DeepGO
12 Enhancement to Safety and Security of Deep Learning
12.1 Robustness Enhancement Through Min-Max Optimisation
12.1.1 Normal Adversarial Training
12.1.2 State-of-the-Art Adversarial Training Technology
12.2 Generalisation Enhancement Through PAC Bayesian Theory
12.2.1 Weight Correlation in Fully Connected Neural Network (FCN)
12.2.2 Weight Correlation in Convolutional Neural Network (CNN)
12.3 Privacy Enhancement Through Differential Privacy
12.3.1 Differential Privacy
12.3.2 Private Algorithms for Training
12.3.3 Model Agnostic Private Learning
Exercises
Part IV Extended Safety Solutions
13 Deep Reinforcement Learning
13.1 Interaction of Agent with Environment
13.2 Training a Reinforcement Agent
13.3 Sufficiency of Training Data
13.4 Statistical Evaluation Methods
13.4.1 Evaluation of DRL Training Algorithm
13.4.2 Evaluation of DRL Model
13.5 Safety Properties Through Probabilistic Computational Tree Logic
13.6 Verification of Policy Generalisation
13.6.1 Construction of a DTMC Describing the Failure Process
13.6.1.1 Mapping MDP States onto DTMC States
13.6.1.2 Estimating Transition Probabilities
13.6.1.3 Construction of Failure Process DTMC
13.7 Verification of State-Based Policy Robustness
13.7.1 Modelling Noise
13.7.2 Example Properties
13.7.3 Expansion of State Space
13.7.4 Evaluation of Atomic Propositions
13.7.5 Probabilistic Model Checking Lifted with New Atomic Propositions
13.8 Verification of Temporal Policy Robustness
13.8.1 State Space
13.8.2 Transition Relation
13.8.3 LTL Model Checking
13.9 Addressing Sim-to-Real Challenge
13.9.1 Model Construction
13.9.2 Properties
14 Testing Techniques
14.1 A General Testing Framework
14.2 Coverage Metrics for Neural Networks
14.3 Test Case Generation
14.4 Discussion
15 Reliability Assessment
15.1 Reliability Assessment for Convolutional Neural Networks
15.2 Reliability Assessment for Deep Reinforcement Learning
16 Assurance of Machine Learning Lifecycle
16.1 Assurance on Data
Expected Outcome
16.2 Sufficiency of Training Data
VC-Dimension
Learning Curve
16.3 Optimised Model Construction and Training
16.3.1 Neural Architecture Search
16.3.2 Best Practice
16.3.3 Expected Outcome
16.4 Adequacy of Verification and Validation
16.4.1 Coverage of Machine Learning Behaviours
16.4.2 Coverage of Operational Use
16.4.3 ALARP ("As Low As Reasonably Practicable")
16.4.4 Expected Outcome
16.5 Assured Partnership of AI and Humans
16.5.1 Uncertainty Estimation
16.5.2 Explainability and Interpretability
16.5.3 Contextualisation
16.5.4 Expected Outcome
17 Probabilistic Graph Models for Feature Robustness
17.1 Definition of Probabilistic Graphical Models
17.2 A Running Example
Where Does Machine Learning Play a Role?
17.3 Abstraction of Neural Network as Probabilistic Graphical Model
Extraction of Hidden Features
Discretisation of Hidden Feature Space
Construction of Bayesian Network Abstraction
Preserved Property
Exercises
Part V Appendix: Mathematical Foundations and Competition
A Probability Theory
A.1 Random Variables
A.2 Joint and Conditional Distributions
A.2.1 Joint Probability P(X,Y)
A.2.2 Conditional Probability P(X|Y)
A.3 Independence and Conditional Independence
A.3.1 Independence
A.3.2 Conditional Independence
A.4 Querying Joint Probability Distributions
A.4.1 Probability Queries
A.4.2 Maximum a Posteriori (MAP) Queries
B Linear Algebra
B.1 Scalars, Vectors, Matrices, Tensors
B.1.1 Scalar
B.1.2 Vector
B.1.3 Matrix
B.1.4 Tensor
B.2 Matrix Operations
B.2.1 Transpose of a Matrix
B.2.2 Linear Transformation
B.2.3 Identity Matrix
B.2.4 Matrix Inverse
B.3 Norms
B.4 Variance, Covariance, and Covariance Matrix
B.4.1 Variance
B.4.2 Covariance
B.4.3 Covariance Matrix
C Probabilistic Graph Models
C.1 I-Maps
C.1.1 Naive Bayes and Joint Probability
C.1.2 Independencies in a Distribution
C.1.3 Markov Assumption
C.1.4 I-Map of Graph and Factorisation of Joint Distribution
C.1.5 Perfect Map
C.2 Reasoning Patterns
C.2.1 Causal Reasoning
C.2.2 Evidential Reasoning
C.2.3 Inter-Causal Reasoning
C.2.4 Practice
C.3 D-Separation
C.3.1 Four Local Triplets
C.3.2 Indirect Causal Effect X→Z→Y
C.3.3 Indirect Evidential Effect Y→Z→X
C.3.4 Common Cause X←Z→Y
C.3.5 Common Effect X→Z←Y
C.3.6 General Case: Active Trail and D-Separation
C.3.7 I-Equivalence
C.4 Structure Learning
C.4.1 Criteria of Structure Learning
C.4.2 Overview of Structure Learning Algorithms
C.4.3 Local: Independence Tests
C.4.4 Global: Structure Scoring
C.4.5 Practice
D Competition: Resilience to Adversarial Attack
D.1 Submissions
D.2 Source Code
D.2.1 Load Packages
D.2.2 Define Competition ID
D.2.3 Set Training Parameters
D.2.4 Toggle GPU/CPU
D.2.5 Loading Dataset and Define Network Structure
D.2.6 Adversarial Attack
D.2.7 Evaluation Functions
D.2.8 Adversarial Training
D.2.9 Define Distance Metrics
D.2.10 Supplementary Code for Test Purpose
D.3 Implementation Actions
D.3.1 Sanity Check
D.4 Evaluation Criteria
D.5 Q&A
Glossary
References