This unique volume is an introduction for computer scientists, including a formal study of theoretical algorithms for Big Data applications, which allows them to work on such algorithms in the future. It also serves as a useful reference guide for the general computer science population, providing a comprehensive overview of the fascinating world of such algorithms.To achieve these goals, the algorithmic results presented have been carefully chosen so that they demonstrate the important techniques and tools used in Big Data algorithms, and yet do not require tedious calculations or a very deep mathematical background.
Author(s): Feldman, Moran
Publisher: World Scientific Publishing Co. Pte. Ltd.
Year: 2021
Language: English
Commentary: Computers Algorithms For Big Data
Pages: 458
Tags: Computers Algorithms For Big Data
Contents
Preface
About the Author
Part I: Data Stream Algorithms
Chapter 1. Introduction to Data Stream Algorithms
1.1 The Data Stream Model
1.2 Evaluating Data Stream Algorithms
1.3 Bibliographic Notes
Exercise Solutions
Chapter 2. Basic Probability and Tail Bounds
2.1 Discrete Probability Spaces
2.2 Random Variables
2.3 Indicators and the Binomial Distribution
2.4 Tail Bounds
Exercise Solutions
Chapter 3. Estimation Algorithms
3.1 Morris’s Algorithm for Estimating the Length of the Stream
3.2 Improving the Estimation
3.3 Concluding Remarks
3.4 Bibliographic Notes
Exercise Solutions
Chapter 4. Reservoir Sampling
4.1 Uniform Sampling
4.2 Approximate Median and Quantiles
4.3 Weighted Sampling
4.4 Bibliographic Notes
Exercise Solutions
Chapter 5. Pairwise Independent Hashing
5.1 Pairwise Hash Functions Families
5.2 Simple Construction of a Pairwise Independent Hash Family
5.3 Advanced Constructions of Pairwise and k-wise Independent Hash Families
5.4 Bibliographic Notes
Exercise Solutions
Chapter 6. Counting Distinct Tokens
6.1 The AMS Algorithm
6.2 An Improved Algorithm
6.3 Impossibility Result
6.4 Bibliographic Notes
Exercise Solutions
Chapter 7. Sketches
7.1 Generalizations of the Data Stream Model
7.2 The Count-Min Sketch
7.3 The Count Sketch
7.4 Linear Sketches
7.5 Bibliographic Notes
Exercise Solutions
Chapter 8. Graph Data Stream Algorithms
8.1 Graph Data Stream Algorithms
8.2 Maximum Weight Matching
8.3 Triangle Counting
8.4 Bibliographic Notes
Exercise Solutions
Chapter 9. The Sliding Window Model
9.1 The Sliding Window Model
9.2 Graph Connectivity in the Sliding Window Model
9.3 Smooth Histograms
9.4 Bibliographic Notes
Exercise Solutions
Part II: Sublinear Time Algorithms
Chapter 10. Introduction to Sublinear Time Algorithms
10.1 A Naıve Example
10.2 Estimating the Diameter
10.3 Query Complexity
10.4 Bibliographic Notes
Exercise Solutions
Chapter 11. Property Testing
11.1 Formal View on Property Testing Algorithms
11.2 Testing a List of n Numbers for Being Free of Duplicates
11.3 The List Model and Testing Lists for Being Sorted
11.4 The Pixel Model and Testing for Half-Planes
11.5 Concluding Remark
11.6 Bibliographic Notes
Exercise Solutions
Chapter 12. Algorithms for Bounded Degree Graphs
12.1 Counting Connected Components
12.2 Minimum Weight Spanning Trees
12.3 Minimum Vertex Cover
12.4 Testing Whether a Graph is Connected
12.5 Bibliographic Notes
Exercise Solutions
Chapter 13. An Algorithm for Dense Graphs
13.1 Model
13.2 Algorithm for Testing Bipartiteness
13.3 Reducing the Number of Partitions to Check
13.4 Removing the Assumption
13.5 Bibliographic Notes
Exercise Solutions
Chapter 14. Algorithms for Boolean Functions
14.1 Model
14.2 Testing Linearity
14.3 Testing Monotonicity
14.4 Bibliographic Notes
Exercise Solutions
Part III: Map-Reduce
Chapter 15. Introduction to Map-Reduce
15.1 Some Details about Map-Reduce
15.2 Theoretical Model of Map-Reduce
15.3 Performance Measures
15.4 A Different Theoretical Model
15.5 Bibliographic Notes
Exercise Solutions
Chapter 16. Algorithms for Lists
16.1 Calculating Word Frequencies
16.2 Prefix Sums
16.3 Indexing
16.4 Bibliographic Notes
Exercise Solutions
Chapter 17. Graph Algorithms
17.1 Minimum Weight Spanning Tree
17.2 Listing Triangles
17.3 Bibliographic Notes
Exercise Solutions
Chapter 18. Locality-Sensitive Hashing
18.1 Main Idea
18.2 Examples of Locality-Sensitive Hash Functions Families
18.3 Amplifying Locality-Sensitive Hash Functions Families
18.4 Bibliographic Notes
Exercise Solutions
Index