Algorithms for Data and Computation Privacy

This document was uploaded by one of our users. The uploader already confirmed that they had the permission to publish it. If you are author/publisher or own the copyright of this documents, please report to us by using this DMCA report form.

Simply click on the Download Book button.

Yes, Book downloads on Ebookily are 100% Free.

Sometimes the book is free on Amazon As well, so go ahead and hit "Search on Amazon"

This book introduces the state-of-the-art algorithms for data and computation privacy. It mainly focuses on searchable symmetric encryption algorithms and privacy preserving multi-party computation algorithms. This book also introduces algorithms for breaking privacy, and gives intuition on how to design algorithm to counter privacy attacks. Some well-designed differential privacy algorithms are also included in this book.

Driven by lower cost, higher reliability, better performance, and faster deployment, data and computing services are increasingly outsourced to clouds. In this computing paradigm, one often has to store privacy sensitive data at parties, that cannot fully trust and perform privacy sensitive computation with parties that again cannot fully trust. For both scenarios, preserving data privacy and computation privacy is extremely important. After the Facebook–Cambridge Analytical data scandal and the implementation of the General Data Protection Regulation by European Union, users are becoming more privacy aware and more concerned with their privacy in this digital world.

This book targets database engineers, cloud computing engineers and researchers working in this field. Advanced-level students studying computer science and electrical engineering will also find this book useful as a reference or secondary text.

Author(s): Alex X. Liu, Rui Li
Publisher: Springer
Year: 2021

Language: English
Pages: 404
City: Cham

Contents
Acronyms
Part I Privacy Preserving Queries
1 Range Queries over Encrypted Data
1.1 Introduction
1.1.1 Background and Motivation
1.1.2 Threat Model
1.1.3 Security Model
1.1.4 Summary and Limitation of Prior Art
1.1.5 Proposed Approach
1.1.6 Technical Challenges and Solutions
1.1.7 Key Contributions
1.2 Related Work
1.3 PBtree Construction
1.3.1 Prefix Encoding
1.3.2 Tree Construction
1.3.3 Node Randomization Using Bloom Filters
1.3.4 Trapdoor Computation
1.3.5 Query Processing
1.3.6 False Positive Analysis
1.4 PBtree Search Optimization
1.4.1 Traversal Width Optimization
1.4.2 Traversal Depth Optimization
1.5 PBtree Update
1.5.1 PBtree Insertion Algorithm
1.5.2 PBtree Modification Algorithm
1.5.3 PBtree Deletion Algorithm
1.6 Security Analysis
1.6.1 Security Model
1.6.2 Security Proof
1.7 Experimental Evaluation
1.7.1 Experimental Methodology
1.7.1.1 Data Sets
1.7.1.2 PBtree Types
1.7.1.3 Query Types
1.7.1.4 Implementation Details
1.7.2 Evaluation of PBtree Construction
1.7.3 Query Evaluation Performance
1.7.3.1 Prefix Query Evaluation
1.7.3.2 Range Query Evaluation
1.7.4 Experimental Results on Updating
1.7.4.1 Average Time of Updating
1.7.4.2 Communication Cost of Updating
1.8 Conclusions
References
2 Fast and Scalable Range and Keyword Query Processing Over Encrypted Data with Provable Adaptive Security
2.1 Introduction
2.1.1 Motivation and Problem Statement
2.1.2 Threat Model
2.1.3 Security Model
2.1.4 Limitation of Prior Art
2.1.5 Proposed Approach
2.1.6 Novelty and Advantages Over Prior Art
2.2 Related Work
2.3 Basic IBtree Algorithms
2.3.1 Index Element Encoding
2.3.2 IBF Construction
2.3.3 IBtree Construction
2.3.4 Trapdoor Computation
2.3.5 Query Processing
2.4 Optimized IBtree Algorithms
2.4.1 IBtree Traversal Width Minimization
2.4.2 IBtree Traversal Depth Minimization
2.4.3 IBtree Compression
2.5 Security Analysis
2.6 Experimental Evaluation
2.6.1 Experimental Methodology
2.6.2 Index Size
2.6.3 Index Construction Time
2.6.4 Query Processing Time
2.6.5 Compared with PBtree and KRB
2.7 Conclusions
References
3 Nearest Neighbor Queries over Encrypted Data
3.1 Introduction
3.2 Insecurity of ASPE
3.2.1 ASPE I and II
3.2.2 Attack Method
3.2.3 Experimental Results
3.3 Hardness Analysis
3.4 Conclusions
References
4 K-Nearest Neighbor Queries Over Encrypted Data
4.1 Introduction
4.1.1 Motivations
4.1.2 Problem Formulation
4.1.3 Service Model and Design Goals
4.1.4 Comparison with Prior Arts
4.1.5 Technical Challenges and Proposed Solutions
4.1.6 SecEQP Scheme Overview
4.1.7 Main Contributions
4.2 Space Encoding
4.2.1 Projection Function Introduction
4.2.2 Space Encoding via a Single Primitive Projection Function
4.2.3 Projection Function Composition Introduction
4.2.4 Space Encoding via Projection Function Composition
4.3 kNN Protocol for Plaintext Domain
4.3.1 kNN Protocol Design
4.3.2 Analysis of kNN Protocol Parameters
4.4 Transforming kNN to Secure kNN
4.4.1 Prefix-Free Encoding
4.4.2 Operation Transformation
4.4.3 Indistinguishable Bloom Filter Tree Based Secure Index
4.4.4 SkNN Protocol (SecEQP) Design
4.4.5 Security Analysis
4.5 Performance Evaluation
4.5.1 Parameters Settings
4.5.2 Datasets, Metrics, and Implementation
4.5.3 Experiment Results
4.5.4 Improve Result Accuracy
4.6 Related Work
4.7 Conclusions
References
5 Top-k Queries for Two-Tiered Sensor Networks
5.1 Introduction
5.1.1 Motivation
5.1.2 Problem Statement
5.1.3 Adversary and Security Model
5.1.4 Limitations of Prior Art
5.1.5 Technical Challenges and Proposed Approach
5.1.6 Key Contributions
5.2 Related Work
5.3 System Model and Assumptions
5.4 Sensor Data Pre-Processing: Mapping and Partitioning
5.4.1 Approximating Uniform Distribution
5.4.2 Data Partitioning for Integrity Verification
5.4.3 Embedding Intervals with Data
5.4.4 Index Selection
5.5 Privacy Preserving Index Generation
5.5.1 Prefix Encoding and Bloom Filter Indexing
5.5.2 Randomizing Bloom Filter Indexes
5.6 Trapdoor Computation and Query Processing
5.6.1 Top-k to Top-Range Query
5.6.2 Trapdoor Computation
5.6.3 Query Execution
5.6.4 Integrity Verification for Query Results
5.6.5 False Positive Rate Analysis
5.7 Security Analysis
5.8 Performance Evaluation
5.8.1 Experimental Setup
5.8.2 Summary for Experimental Results
5.8.3 Comparison with Prior Art
5.9 Conclusions
References
Part II Privacy Preserving Computation
6 Collaborative Enforcement of Firewall Policies in Virtual Private Networks
6.1 Introduction
6.1.1 Background and Motivation
6.1.2 Technical Challenges
6.1.3 Limitations of Prior Art
6.1.4 Our Solution
6.1.5 Key Contributions
6.2 Threat Model
6.3 Background
6.4 Oblivious Comparison
6.5 Bootstrapping Protocol
6.5.1 FDD Construction
6.5.2 Range Conversion
6.5.3 Prefix Numericalization
6.5.4 Applying XOR by MSU
6.5.5 Applying XOR and HMAC by IBM
6.6 Filtering Protocol
6.6.1 Address Translation
6.6.2 Prefix Membership Verification
6.6.3 Packet Preprocessing by IBM
6.6.4 Packet Preprocessing by The Third Party
6.6.5 Packet Processing by MSU
6.7 VGuard for Deep Packet Inspection
6.7.1 The Bootstrapping Protocol
6.7.2 The Filtering Protocol
6.8 Discussion
6.8.1 Firewall Updates
6.8.2 Decision Caching
6.8.3 Decision Obfuscation vs. Decision Encryption
6.8.4 Special Treatment of IP Addresses
6.8.5 Securing Keys of MSU
6.8.6 Stateful Firewalls
6.8.7 Statistical Analysis Attack and Countermeasures
6.8.8 Hash Collision
6.9 Related Work
6.9.1 Secure Function Evaluation
6.9.2 CDCF Framework
6.9.3 Secure Queries
6.10 Experimental Results
6.10.1 Efficiency on Real-Life Firewall Policies
6.10.2 Efficiency on Synthetic Firewall Policies
6.11 Concluding Remarks
References
7 Privacy Preserving Quantification of Cross-Domain Network Reachability
7.1 Introduction
7.1.1 Background and Motivation
7.1.2 Limitation of Prior Art
7.1.3 Cross-Domain Quantification of Reachability
7.1.4 Technical Challenges
7.1.5 Our Approach
7.1.6 Summary of Experimental Results
7.1.7 Key Contributions
7.2 Related Work
7.2.1 Network Reachability
7.2.2 Privacy Preserving Set Operation
7.2.3 Privacy Preserving Collaborative Firewall Enforcement in VPN
7.3 Problem Statement and Threat Model
7.3.1 Access Control Lists (ACLs)
7.3.2 Problem Statement
7.3.3 Threat Model
7.4 Privacy-Preserving Quantification of Network Reachability
7.4.1 Privacy-Preserving Range Intersection
7.4.1.1 [Pi]: Range Transformation
7.4.1.2 [Pi]: Range to Prefix Set Conversion
7.4.1.3 [Pj]: Prefix Family Generation
7.4.1.4 [Pi, Pj]: Prefix Numericalization
7.4.1.5 [Pi, Pj]: Private Set Intersection
7.4.2 ACL Preprocessing
7.4.3 ACL Encoding and Encryption
7.4.3.1 Encoding and Encryption of ACL Aj (1≤j ≤n-1)
7.4.3.2 Encoding and Encryption of ACL An
7.4.4 ACL Comparison
7.5 Incremental Updates of ACLs
7.5.1 Addition of Rules with Accept Decision
7.5.2 Addition of Rules with Discard Decision
7.5.3 Addition of New Routers
7.6 Stateful Firewalls
7.7 Security and Complexity Analysis
7.7.1 Security Analysis
7.7.2 Complexity Analysis
7.8 Protocol Optimization
7.9 Experimental Results
7.9.1 Efficiency on Real ACLs
7.9.2 Efficiency on Synthetic ACLs
7.9.3 Efficiency of Incremental Updates of ACLs
7.10 Conclusions
References
8 Cross-Domain Privacy-Preserving Cooperative Firewall Optimization
8.1 Introduction
8.1.1 Background and Motivation
8.1.2 Limitation of Prior Work
8.1.3 Cross-Domain Inter-Firewall Optimization
8.1.4 Technical Challenges and Our Approach
8.1.5 Key Contributions
8.2 Related Work
8.2.1 Firewall Redundancy Removal
8.2.2 Collaborative Firewall Enforcement in VPN
8.3 System and Threat Models
8.3.1 System Model
8.3.2 Threat Model
8.4 Privacy-Preserving Inter-Firewall Redundancy Removal
8.4.1 Privacy-Preserving Range Comparison
8.4.2 Processing Firewall FW1
8.4.3 Processing Firewall FW2
8.4.4 Single-Rule Coverage Redundancy Detection
8.4.5 Multi-Rule Coverage Redundancy Detection
8.4.6 Identification and Removal of Redundant Rules
8.5 Firewall Update After Optimization
8.6 Security and Complexity Analysis
8.6.1 Security Analysis
8.6.2 Complexity Analysis
8.7 Experimental Results
8.7.1 Evaluation Setup
8.7.2 Methodology
8.7.3 Effectiveness and Efficiency on Real Policies
8.7.4 Efficiency on Synthetic Policies
8.8 Conclusions and Future Work
References
9 Privacy Preserving String Matching for Cloud Computing
9.1 Introduction
9.1.1 Motivation
9.1.2 Problem Statement
9.1.3 Adversary and Security Model
9.1.4 Limitation of Prior Art
9.1.5 Proposed Approach
9.1.6 Technical Challenges and Solutions
9.1.7 Key Contributions
9.2 Related Work
9.3 Pattern Aware Secure Search Tree
9.3.1 String Pattern Matching
9.3.2 PASStree Structure
9.3.3 Preserving Privacy of Bloom Filters
9.3.4 Query Trapdoor Generation and Processing
9.4 PASStree+
9.4.1 Challenge in Search Optimization
9.4.2 Optimizing PASStree
9.5 Ranking Search Results
9.5.1 Recording Matching Positions
9.5.2 Ranking Algorithm
9.6 Security Analysis
9.6.1 Security Model
9.6.2 Security Proof
9.7 Performance Evaluation
9.7.1 Experimental Methodology
9.7.1.1 Data Sets
9.7.1.2 Implementation Details
9.7.1.3 Query Types
9.7.2 PASStree Construction and Size
9.7.3 Query Processing Speed and Accuracy
9.7.4 Ranking Precision
9.8 Conclusion and Future Work
References
10 Privacy Preserving Information Hub Identification in Social Networks
10.1 Introduction
10.1.1 Background and Motivation
10.1.2 Limitations of Prior Art
10.1.3 Proposed Solution
10.1.4 Results and Findings
10.1.5 Key Contributions
10.2 Related Work
10.3 Proposed Solution
10.3.1 Eigenvector Centrality
10.3.2 Motivation for Principal Component Centrality
10.3.3 Definition of PCC
10.3.4 Generalized PCC
10.3.5 Selection of Number of Eigenvectors
10.3.6 Decentralized Eigendecomposition Algorithm
10.4 Performance Evaluation
10.4.1 Data Sets
10.4.1.1 Facebook A and Facebook B
10.4.1.2 Twitter
10.4.2 Selection of PCC Parameter
10.4.3 Comparison with Ground Truth
10.4.3.1 Verification of Optimal PCC Parameter
10.4.3.2 Accuracy of PCC in Predicting Top-2000 Users
10.4.3.3 Accuracy of PCC in Predicting Top-k Users
10.4.3.4 Accuracy of PCC in Rank Prediction
10.5 Conclusions
References
Part III Differential Privacy
11 Publishing Social Network Data with Privacy Guarantees
11.1 Introduction
11.1.1 Background and Motivation
11.1.2 Problem Statement
11.1.3 Limitations of Prior Art
11.1.4 Proposed Approach
11.1.5 Technical Challenges
11.1.6 Key Contributions
11.2 Related Work
11.2.1 Differential Privacy
11.2.2 Differential Privacy in Data Publishing
11.3 Random Matrix Approach
11.3.1 Theoretical Guarantee on Differential Privacy
11.3.2 Theoretical Guarantee on Eigenvector Approximation
11.4 Experimental Results
11.4.1 Dataset
11.4.2 Node Clustering
11.4.3 Node Ranking
11.5 Utility Comparison
11.6 Conclusions
References
12 Predictable Privacy-Preserving Mobile Crowd Sensing
12.1 Introduction
12.2 Related Work
12.3 Privacy of MCS
12.3.1 Threat Model
12.3.2 Data Reconstruction Attack
12.4 Differentially Private Mechanisms for Privacy-Preserving MCS
12.4.1 Models and Definitions
12.4.1.1 Overview of the System Model
12.4.1.2 Differential Privacy
12.4.1.3 Sensitivity
12.4.1.4 Laplacian Random Variables and Vectors
12.4.2 The Basic Laplacian Mechanism
12.4.3 The Salus Algorithm
12.4.3.1 Enhanced Feature #1: Value Protection
12.4.3.2 Enhanced Feature #2: Trend Protection
12.4.3.3 The Final Salus Algorithm
12.4.3.4 Privacy Analysis of Salus Algorithm
12.5 Role User: The Privacy Quantification
12.5.1 Data Reconstruction Error: A Quantitative Analysis
12.5.2 Data Reconstruction Error: A Lower Bound
12.6 Role Application Publisher: The Utility Prediction
12.6.1 Average (AVG)
12.6.2 Histogram (HIST)
12.6.3 Classifiers (CLS)
12.7 The P3 Framework for Predictable Privacy-Preserving MCS
12.8 Performance Evaluation
12.8.1 Privacy Protection
12.8.2 System Overhead
12.8.3 Case Studies
12.8.3.1 Community Health Survey
12.8.3.2 Collaborative Emotion Classification
12.9 Conclusions
References
13 Differentially Private and Budget Limited Bandit Learning over Matroids
13.1 Introduction
13.1.1 Limitations of Prior Art
13.1.2 Proposed Approach
13.1.3 Advantages over Prior Art
13.2 Related Work
13.2.1 MAB Algorithms Without Budgets
13.2.2 MAB Algorithms with Budgets
13.2.3 Privacy-Aware Online Learning
13.3 Problem Statement
13.3.1 Matroids
13.3.2 DP-Aware BMAB Over Matroids
13.3.3 An Example in Crowdsourcing
13.4 Bounding the Optimal Policy
13.5 Algorithm Design
13.5.1 Ensuring Differential Privacy
13.5.2 The OPBM Algorithm
13.6 Regret Analysis
13.7 Performance Evaluation
13.7.1 Experimental Setup
13.7.2 Metrics
13.7.3 Regret Performance
13.7.4 Time Efficiency
13.8 Conclusion
Appendix: Missing Definitions, Lemmas and Proofs
Proof of Lemma 13.1
Proof of Theorem 13.1
Proof of Lemma 13.6
Proof of Lemma 13.8
Proof of Lemma 13.9
Proof of Lemma 13.10
Proof of Lemma 13.11
Proof of Theorem 13.4
Proof of Theorem 13.5
References
Part IV Breaking Privacy
14 Breaching Privacy in Encrypted Instant Messaging Networks
14.1 Introduction
14.1.1 Chapter Organization
14.2 Related Work
14.2.1 Mix Network De-anonymization
14.2.2 Social Network De-anonymization
14.3 Problem Description and Attack Scenarios
14.3.1 IM Service Architecture
14.3.2 Attack Scenarios
14.3.2.1 Scenario #1: Near IM Relay Servers
14.3.2.2 Scenario #2: Border Gateway
14.4 COLD: COmmunication Link De-anonymization
14.4.1 Architecture
14.4.2 Details
14.4.2.1 Discrete Wavelet Transform
14.4.2.2 Choosing the Optimal Number of Decomposition Levels
14.4.2.3 Coefficient Feature Vector
14.4.2.4 Correlation
14.4.2.5 Candidate Set Generation
14.4.3 Example
14.5 Experimental Results
14.5.1 Data Set
14.5.1.1 Input Data
14.5.1.2 Ground Truth Data
14.5.2 Evaluation Metrics
14.5.3 Results
14.5.4 Discussions
14.6 Evasion and Countermeasures
14.7 Conclusions
References