With the increasing use of AI in high-stakes domains such as medicine, law, and defense, organizations invest significant time and money in making ML models trustworthy. Many books on the subject offer deep dives into theories and concepts. This guide provides a practical starting point to help development teams produce models that are more secure, more robust, less biased, and more explainable.
Authors Yada Pruksachatkun, Matthew McAteer, and Subhabrata Majumdar translate best practices from the academic literature on dataset curation and model building into a blueprint for creating industry-grade trustworthy ML systems. With this book, engineers and data scientists will gain a much-needed foundation for releasing trustworthy ML applications into a noisy, messy, and often hostile world.
You'll learn:
• Methods to explain ML models and their outputs to stakeholders
• How to recognize and fix fairness concerns and privacy leaks in an ML pipeline
• How to develop ML systems that are robust and secure against malicious attacks
• Important systemic considerations, like how to manage trust debt and which ML obstacles require human intervention
Author(s): Yada Pruksachatkun, Matthew McAteer, Subhabrata Majumdar
Edition: 1
Publisher: O'Reilly Media
Year: 2023
Language: English
Commentary: Publisher's PDF
Pages: 300
City: Sebastopol, CA
Tags: Anonymity; Artificial Intelligence; Machine Learning; Security; Ethics; Privacy; Model Explainability; Robustness; Fairness
Copyright
Table of Contents
Preface
Implementing Machine Learning in Production
The Transformer Convergence
An Explosion of Large and Highly Capable ML Models
Why We Wrote This Book
Who This Book Is For
AI Safety and Alignment
Use of HuggingFace PyTorch for AI Models
Foundations
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
Chapter 1. Privacy
Attack Vectors for Machine Learning Pipelines
Improperly Implemented Privacy Features in ML: Case Studies
Case 1: Apple’s CSAM
Case 2: GitHub Copilot
Case 3: Model and Data Theft from No-Code ML Tools
Definitions
Definition of Privacy
Proxies and Metrics for Privacy
Legal Definitions of Privacy
k-Anonymity
Types of Privacy-Invading Attacks on ML Pipelines
Membership Attacks
Model Inversion
Model Extraction
Stealing a BERT-Based Language Model
Defenses Against Model Theft from Output Logits
Privacy-Testing Tools
Methods for Preserving Privacy
Differential Privacy
Stealing a Differentially Privately Trained Model
Further Differential Privacy Tooling
Homomorphic Encryption
Secure Multi-Party Computation
SMPC Example
Further SMPC Tooling
Federated Learning
Conclusion
Chapter 2. Fairness and Bias
Case 1: Social Media
Case 2: Triaging Patients in Healthcare Systems
Case 3: Legal Systems
Key Concepts in Fairness and Fairness-Related Harms
Individual Fairness
Parity Fairness
Calculating Parity Fairness
Scenario 1: Language Generation
Scenario 2: Image Captioning
Fairness Harm Mitigation
Mitigation Methods in the Pre-Processing Stage
Mitigation Methods in the In-Processing Stage
Mitigation Methods in the Post-Processing Stage
Fairness Tool Kits
How Can You Prioritize Fairness in Your Organization?
Conclusion
Further Reading
Chapter 3. Model Explainability and Interpretability
Explainability Versus Interpretability
The Need for Interpretable and Explainable Models
A Possible Trade-off Between Explainability and Privacy
Evaluating the Usefulness of Interpretation or Explanation Methods
Definitions and Categories
“Black Box”
Global Versus Local Interpretability
Model-Agnostic Versus Model-Specific Methods
Interpreting GPT-2
Methods for Explaining Models and Interpreting Outputs
Inherently Explainable Models
Local Model-Agnostic Interpretability Methods
Global Model-Agnostic Interpretability Methods
Explaining Neural Networks
Saliency Mapping
Deep Dive: Saliency Mapping with CLIP
Adversarial Counterfactual Examples
Overcome the Limitations of Interpretability with a Security Mindset
Limitations and Pitfalls of Explainable and Interpretable Methods
Risks of Deceptive Interpretability
Conclusion
Chapter 4. Robustness
Evaluating Robustness
Non-Adversarial Robustness
Step 1: Applying Perturbations
Step 2: Defining and Applying Constraints
Deep Dive: Word Substitution with Cosine Similarity Constraints
Adversarial Robustness
Deep Dive: Adversarial Attacks in Computer Vision
Creating Adversarial Examples
Improving Robustness
Conclusion
Chapter 5. Secure and Trustworthy Data Generation
Case 1: Unsecured AWS Buckets
Case 2: Clearview AI Scraping Photos from Social Media
Case 3: Improperly Stored Medical Data
Issues in Procuring Real-World Data
Using the Right Data for the Modeling Goal
Consent
PII, PHI, and Secrets
Proportionality and Sampling Techniques
Undescribed Variation
Unintended Proxies
Failures of External Validity
Data Integrity
Setting Reasonable Expectations
Tools for Addressing Data Collection Issues
Synthetically Generated Data
DALL·E, GPT-3, and Synthetic Data
Improving Pattern Recognition with Synthetic Data
Deep Dive: Pre-Training a Model with a Process-Driven Synthetic Dataset
Facial Recognition, Pose Detection, and Human-Centric Tasks
Object Recognition and Related Tasks
Environment Navigation
Unity and Unreal Environments
Limitations of Synthetic Data in Healthcare
Limitations of Synthetic Data in NLP
Self-Supervised Learned Models Versus Giant Natural Datasets
Repurposing Quality Control Metrics for Security Purposes
Conclusion
Chapter 6. More State-of-the-Art Research Questions
Making Sense of Overhyped Research Claims
Shallow Human-AI Comparison Antipattern
Downplaying the Limitations of the Technique Antipattern
Uncritical PR Piece Antipattern
Hyperbolic or Just Plain Wrong Antipattern
Getting Past These Antipatterns
Quantized ML
Tooling for Quantized ML
Privacy, Bias, Interpretability, and Stability in Quantized ML
Diffusion-Based Energy Models
Homomorphic Encryption
Simulating Federated Learning
Quantum Machine Learning
Tooling and Resources for Quantum Machine Learning
Why QML Will Not Solve Your Regular ML Problems
Making the Leap from Theory to Practice
Chapter 7. From Theory to Practice
Part I: Additional Technical Factors
Causal Machine Learning
Sparsity and Model Compression
Uncertainty Quantification
Part II: Implementation Challenges
Motivating Stakeholders to Develop Trustworthy ML Systems
Trust Debts
Important Aspects of Trust
Evaluation and Feedback
Trustworthiness and MLOps
Conclusion
Chapter 8. An Ecosystem of Trust
Tooling
LiFT
Datasheets
Model Cards
DAG Cards
Human-in-the-Loop Steps
Oversight Guidelines
Stages of Assessment
The Need for a Cross-Project Approach
MITRE ATLAS
Benchmarks
AI Incident Database
Bug Bounties
Deep Dive: Connecting the Dots
Data
Pre-Processing
Model Training
Model Inference
Trust Components
Conclusion
Appendix A. Synthetic Data Generation Tools
Appendix B. Other Interpretability and Explainability Tool Kits
Interpretable or Fair Modeling Packages
Other Python Packages for General Explainability
Index
About the Authors
Colophon