This book examines the recent trend of extending data dependencies to rich data types in order to address the variety and veracity issues of big data. Readers are guided through the full range of rich data types to which data dependencies have been successfully applied, including categorical data with equality relationships, heterogeneous data with similarity relationships, numerical data with order relationships, sequential data with timestamps, and graph data with complicated structures. The text also discusses the constraints on ordering or similarity relationships captured by novel classes of data dependencies, in addition to the equality relationships considered in, e.g., functional dependencies (FDs). Beyond the concepts of these data dependency notations, the book investigates the extension relationships between data dependencies, such as conditional functional dependencies (CFDs) that extend conventional FDs. Together these form a family tree of extensions, mostly rooted in FDs, that helps illuminate the expressive power of the various data dependencies. Moreover, the book points to work on the discovery of dependencies from data, since data dependencies are unlikely to be specified manually given the huge volume and high variety of big data. It further outlines the applications of the extended data dependencies, in particular in data quality practice. Altogether, the book provides a comprehensive guide for readers to select data dependencies for their applications that have sufficient expressive power and reasonable discovery cost. Finally, the book concludes with several directions for future studies on emerging data.
Author(s): Shaoxu Song, Lei Chen
Series: Synthesis Lectures on Data Management
Publisher: Springer
Year: 2023
Language: English
Pages: 153
City: Cham
Acknowledgements
Contents
About the Authors
1 Introduction
1.1 Background
1.2 Motivation
1.3 Categorization on Data Types
1.4 Perspectives on Data Dependencies
1.5 Related Studies in Dependency Survey
1.6 Organization
2 Categorical Data
2.1 Functional Dependencies (FDs)
2.2 Equality Generating Dependencies (EGDs)
2.3 Soft Functional Dependencies (SFDs)
2.4 Probabilistic Functional Dependencies (PFDs)
2.5 Approximate Functional Dependencies (AFDs)
2.6 Numerical Dependencies (NUDs)
2.7 Conditional Functional Dependencies (CFDs)
2.8 Extended Conditional Functional Dependencies (eCFDs)
2.9 Multivalued Dependencies (MVDs)
2.10 Full Hierarchical Dependencies (FHDs)
2.11 Approximate Multivalued Dependencies (AMVDs)
2.12 Inclusion Dependencies (INDs)
2.13 Approximate Inclusion Dependencies (AINDs)
2.14 Conditional Inclusion Dependencies (CINDs)
2.15 Summary and Discussion
3 Heterogeneous Data
3.1 Metric Functional Dependencies (MFDs)
3.2 Neighborhood Dependencies (NEDs)
3.3 Differential Dependencies (DDs)
3.4 Conditional Differential Dependencies (CDDs)
3.5 Comparable Dependencies (CDs)
3.6 Probabilistic Approximate Constraints (PACs)
3.7 Fuzzy Functional Dependencies (FFDs)
3.8 Ontology Functional Dependencies (ONFDs)
3.9 Matching Dependencies (MDs)
3.10 Conditional Matching Dependencies (CMDs)
3.11 Summary and Discussion
4 Ordered Data
4.1 Ordered Functional Dependencies (OFDs)
4.2 Order Dependencies (ODs)
4.3 Band Order Dependencies (BODs)
4.4 Denial Constraints (DCs)
4.5 Sequential Dependencies (SDs)
4.6 Conditional Sequential Dependencies (CSDs)
4.7 Summary and Discussion
5 Temporal Data
5.1 Temporal Functional Dependencies (TFDs)
5.2 Trend Dependencies (TDs)
5.3 Speed Constraints (SCs)
5.4 Multi-speed Constraints (MSCs)
5.5 Acceleration Constraints (ACs)
5.6 Temporal Constraints (TCs)
5.7 Petri Nets (PNs)
5.8 Summary and Discussion
6 Graph Data
6.1 Neighborhood Constraints (NCs)
6.2 Node Label Constraints (NLCs)
6.3 Path Label Constraints (PLCs)
6.4 XML Functional Dependencies (XFDs)
6.5 XML Conditional Functional Dependencies (XCFDs)
6.6 Keys for Graph (GKs)
6.7 Graph-Patterns Association Rules (GPARs)
6.8 Functional Dependencies for Graph (GFDs)
6.9 Graph Entity Dependencies (GEDs)
6.10 Graph Association Rules (GARs)
6.11 Graph Differential Dependencies (GDDs)
6.12 Graph Denial Constraints (GDCs)
6.13 Temporal Dependencies for Graph (TGFDs)
6.14 Summary and Discussion
7 Conclusions and Directions
Index of Data Dependencies
References