This book provides a comprehensive overview of core concepts and technological foundations for continuous engineering of Web streams. It presents various systems and applications and includes real-world examples. Last but not least, it introduces readers to RSP4J, a novel open-source project that aims to gather community efforts in software engineering and empirical research.
The book starts with an introductory chapter that positions the work by explaining what motivates the design of specific techniques for processing data streams using Web technologies. Chapter 2 briefly summarizes the background concepts and models needed to understand the remaining content of the book. Chapter 3 then focuses on processing RDF streams, taming data velocity in an open environment characterized by high data variety. It introduces query answering algorithms with RSP-QL and analytics functions over streaming data. Chapter 4 presents the life cycle of streaming linked data, focusing on publishing streams on the Web as a prerequisite for making data findable and accessible to applications. Chapter 5 touches on the problems of benchmarks and systems that analyze Web streams to foster technological progress. It surveys existing benchmarks and introduces guidelines that may support new practitioners in approaching the issue of continuous analytics. Finally, Chapter 6 presents a list of examples and exercises that will help the reader to approach the area, get used to its practices, and become confident in its technological possibilities.
Overall, this book is mainly written for graduate students and researchers in Web and stream data management. It collects research results and will guide the next generation of researchers and practitioners.
Author(s): Riccardo Tommasini, Pieter Bonte, Fabiano Spiga, Emanuele Della Valle
Publisher: Springer
Year: 2023
Language: English
Pages: 169
City: Cham
Foreword
Preface
Contents
Acronyms
1 General Introduction
1.1 Web Stream Processing at Glance
1.2 The ColorWave Running Example
1.3 RDF Stream Processing with RSP4J
1.3.1 Challenges and Requirements
1.3.2 RSP4J's Architecture
1.4 How to Read This Book
1.5 Outline
References
2 Preliminaries
2.1 Introduction
2.2 Data Velocity
2.2.1 Data Streams and Continuous Queries
2.2.2 Relational Stream Processing
2.2.3 Stream Processing Engines
2.2.4 Complex Event Processing
2.3 Data Variety
2.3.1 RDF and RDF Schema
2.3.2 Reasoning Techniques
2.3.3 Web Ontology Language
2.3.4 SPARQL Protocol and RDF Query Language
2.3.4.1 SPARQL Query Language
2.3.4.2 SPARQL Evaluation Semantics
2.4 Chapter Summary
References
3 Taming Variety and Velocity
3.1 Introduction
3.2 RDF Stream Processing
3.2.1 From CQL to RSP-QL
3.2.2 RSP-QL Query Language
3.2.3 Reporting Strategies
3.3 Reasoning over Web Streams
3.3.1 Problems with Standard Reasoning Techniques
3.3.1.1 Forward Chaining
3.3.1.2 Backward Chaining
3.3.2 Efficient Hierarchical Reasoning with C-Sprite
3.3.2.1 C-Sprite Under RSP-QL
3.3.2.2 A Data Structure for Efficient Hierarchical Reasoning
3.3.2.3 C-Sprite Algorithm
3.3.3 Incremental Maintenance Approaches
3.4 RSP4J: An API for RSP Engines
3.4.1 Querying
3.4.2 Streams
3.4.3 Operators
3.4.4 SDS and Time-Varying Graphs
3.4.5 Engine and Query Execution
3.4.6 Composing Operators Using RSP4J's Operator API
3.4.7 Reasoning in RSP4J
3.5 Chapter Summary
References
4 Streaming Linked Data Life Cycle
4.1 Introduction
4.2 Linked and FAIR Data
4.3 The Steps of the Life Cycle
4.3.1 Identify
4.3.2 Model
4.3.3 Shape
4.3.4 Annotate
4.3.5 Describe
4.3.6 Serve
4.3.7 Discovery and Access
4.3.8 Process
4.4 Chapter Summary
References
5 Web Stream Processing Systems and Benchmarks
5.1 Introduction
5.2 RSP Engines (And Their Alignment with RSP-QL)
5.2.1 Evaluation Time
5.2.2 C-SPARQL
5.2.3 CQELS
5.2.4 SPARQLstream
5.2.5 Strider
5.2.6 Comparison
5.2.7 Related Works
5.2.8 CEP Enabled RSP Engines
5.3 Benchmarking Web Stream Processing
5.3.1 Benchmarking Challenges and Key Performance Indicators
5.3.2 LSBench
5.3.3 SRBench and CSRBench
5.3.4 CityBench
5.3.5 Related Works
5.4 Methods and Tools
5.4.1 Heaven
5.4.2 CSRBench Oracle and YABench Driver
5.4.3 CityBench Testbed
5.4.4 RSPLab
5.4.5 Related Works
5.5 Chapter Summary
References
6 Exercise Book
6.1 Introduction
6.2 Location-Detection Example
6.2.1 The Setup
6.2.2 The Data
6.2.3 The Queries
6.2.4 Solving Query 1
6.2.5 Solving Query 2
6.2.6 Solving Query 3
6.3 Publishing and Processing Wild Streams
6.3.1 DBpedia Live
6.3.2 Wikimedia EventStreams
6.3.3 Global Database of Events, Language, and Tone (GDELT)
6.4 Chapter Summary
References