Data Spaces: Design, Deployment and Future Directions

This open access book aims to help data space designers understand what is required to create a successful data space. It explores cutting-edge theory, technologies, methodologies, and best practices for data spaces for both industrial and personal data and provides the reader with a basis for understanding the design, deployment, and future directions of data spaces.

The book captures the early lessons and experience in creating data spaces. It arranges these contributions into three parts covering design, deployment, and future directions respectively.

  • The first part explores the design space of data spaces. The individual chapters detail the organisational design for data spaces, data platforms, data governance, federated learning, personal data sharing, data marketplaces, and hybrid artificial intelligence for data spaces.
  • The second part describes the use of data spaces within real-world deployments. Its chapters are co-authored with industry experts and include case studies of data spaces in sectors including industry 4.0, food safety, FinTech, health care, and energy.
  • The third and final part details future directions for data spaces, including challenges and opportunities for common European data spaces and privacy-preserving techniques for trustworthy data sharing.

The book is of interest to two primary audiences: first, researchers interested in data management and data sharing, and second, practitioners and industry experts engaged in data-driven systems where the sharing and exchange of data within an ecosystem are critical.



Author(s): Edward Curry, Simon Scerri, Tuomo Tuikka
Publisher: Springer
Year: 2022

Language: English
Pages: 366
City: Cham

Preface
Acknowledgments
Contents
About the Editors and Contributors
About the Editors
Contributors
Data Spaces: Design, Deployment, and Future Directions
1 Introduction
2 Data Ecosystems
3 Data Spaces
3.1 Data Spaces: A Platform for Data Sharing
3.1.1 Industrial Data Spaces (IDS)
3.1.2 Personal Data Spaces (PDS)
4 Common European Data Spaces
4.1 The Big Data Value PPP (BDV PPP)
4.2 Big Data Value Association
4.3 Data Platform Project Portfolio
5 Book Overview
5.1 Chapter Analysis
6 Summary
References
Part I Design
An Organizational Maturity Model for Data Spaces: A Data Sharing Wheel Approach
1 Introduction
2 Background and Context
2.1 Data Ecosystems
2.2 Data Value Chains and Data-Driven AI
2.3 High-Level Europe Opportunity and Challenges
3 Data Spaces and Organizational Capabilities
3.1 BDVA Data Sharing Value Wheel
3.2 Organizational Capabilities
3.3 Maturity Models
4 A Maturity Model for Data Spaces
4.1 Model Design Methodology
4.2 Capabilities
4.3 Maturity Curve
4.4 Assessment Approach
4.4.1 Defining the Scope and Goal
4.4.2 Assessment Data Collection and Analysis
4.4.3 Using the Assessment Results to Develop and Manage Capabilities
5 Illustrative Benchmarking Example
5.1 Benchmark Results
5.1.1 Capability Gap Analysis
5.1.2 Capability Importance
6 Conclusion
References
Data Platforms for Data Spaces
1 Introduction
2 Big Data Value Ecosystems
2.1 Data Spaces and Data Platforms
2.2 Gaia-X Ecosystem
3 Data Platform Project Portfolio
3.1 DataPorts Project
3.2 TheFSM Project
3.3 i3-MARKET Project
3.4 OpertusMundi
3.5 TRUSTS Project
3.6 smashHit Project
3.7 PimCity Project
3.8 KRAKEN Project
3.9 DataVaults Project
4 Comparison of Data Management Services
5 Key Enabling Technologies
5.1 Semantics
5.2 Blockchain and Smart Contracts
5.3 AI and Machine Learning
6 Common Challenges and Lessons Learned
6.1 AI and Machine Learning Challenges
6.2 Legal Challenges
6.3 Ethical Challenges
6.4 Sustainability Challenges
6.5 User Engagement Challenges
7 Conclusion
References
Technological Perspective of Data Governance in Data Space Ecosystems
1 Introduction
2 Data Governance: General Concepts
3 Big Data Life-Cycle
3.1 DataOps
4 Big Data Technologies Under Governance Perspective
4.1 Architectures and Paradigms
4.1.1 Architectures for Data Sharing
4.1.2 Architectures for Data Storage
4.1.3 Architectures for Data Processing
4.2 Current Tool Ecosystem for Big Data Governance
5 Conclusion and Future Challenges
References
Increasing Trust for Data Spaces with Federated Learning
1 Introduction
2 Industrial Data Platform, an Architecture Perspective
2.1 Client Connector
2.2 Micro-Services
3 Industrial Data Platform, a Legal Perspective
3.1 The Broader Policy Context
3.2 Data Sharing Platforms
3.3 Federated Learning as a Trust Enabler: Some Data Protection Considerations
4 Industrial Data Platform, Objective Data Value Estimation for Increased Trust in Data Spaces
5 Conclusion
References
KRAKEN: A Secure, Trusted, Regulatory-Compliant, and Privacy-Preserving Data Sharing Platform
1 KRAKEN Overview
2 Architectures for Data Platform
2.1 KRAKEN Data Platform Architecture Overview
2.2 Enabling Decentralized Privacy-Preserving Decision-Making Using Permissioned Blockchain Technology and SSI
3 Real-Time Data Sharing Using Streamr: A Decentralized Peer-to-Peer Network
4 Privacy, Trust, and Data Protection
5 Sharing by Design, Ownership, and Usage Control
5.1 User-Centric Data Sharing
6 Compliance with Data Protection Framework
6.1 Data Protection Principles and Their Implementation
6.1.1 Lawfulness, Fairness, and Transparency
6.1.2 Purpose Limitation, Data Minimization, and Storage Limitation
6.1.3 Accuracy, Integrity, and Confidentiality
6.1.4 Accountability
6.2 The Exercise of Data Subject Rights
6.3 The KRAKEN Approach Toward Data Monetization
7 Business Challenges
References
Connecting Data Spaces and Data Marketplaces and the Progress Toward the European Single Digital Market with Open-Source Software
1 Introduction
2 Challenges in Data Marketplace Design and Data Economy
2.1 Data Marketplace Openness and Fairness
2.2 High Demands on Security and Privacy
2.3 Data Marketplace Interoperability
3 Advancing the State of the Art on Security, Privacy, and Trust
3.1 Security
3.2 Data Privacy
3.3 Trust
4 The i3-MARKET Backplane Innovations for the Data Economy
4.1 Privacy and Data Protection
4.2 Trust and Security Platform
4.3 Secure Sharing of Personal Data and Industrial Data
4.4 Large-Scale Federated Data Platform
4.5 Policy and Regulation for Data Marketplace Backplane
5 i3-MARKET Backplane at a Glance
5.1 i3-MARKET High-Level Architecture
5.2 i3-MARKET Data Flow as Reference Implementation
6 Industrial Innovation for a Data-Driven European Ecosystem
6.1 Data Sharing/Brokerage/Trading Build on Existing Computing Platforms
6.2 Data Privacy in Industrial Data Marketplace Platforms
6.3 Industrial Data Marketplace Platforms
7 Conclusions
References
AI-Based Hybrid Data Platforms
1 Introduction
2 Brief Overview of Architectures, Frameworks, and Platforms
2.1 Reference Architectures for Big Data Processing
2.2 Component Frameworks
2.3 Data Platforms
3 Requirements for the GATE Data Platform
3.1 Requirements from Research Perspective
3.2 Data-Driven Requirements
3.3 Service Provisioning Requirements
3.4 Data Governance Requirements
4 Hybridization of Data Platforms
4.1 Multilevel and Service-Oriented Architectures
4.2 Levels of Intelligence
4.3 System Architecture
5 Implementation
5.1 Enabling Technologies
5.2 Data Services
5.3 Engineering Roadmap
6 City Digital Twin Pilot
7 Conclusion and Future Work
References
Part II Deployment
A Digital Twin Platform for Industrie 4.0
1 Introduction
2 User Stories
3 Architectural Style and Interprocess Communication
3.1 Architectural Style
3.2 Interprocess Communication
3.3 Time Series Integration
3.4 Time Series Messaging Experiment
4 A Digital Twin Platform for Industrie 4.0
4.1 The Continuous Deployment Layer
4.2 The Data Infrastructure Layer
4.3 The Business Services Layer
4.4 The Cross-Company Connectors
5 Digital Twin Controlled Manufacturing
6 Conclusion and Outlook
References
A Framework for Big Data Sovereignty: The European Industrial Data Space (EIDS)
1 Introduction
2 The European Industrial Data Space: Design Principles
3 EIDS-Based Infrastructure
3.1 HPC as Foundation for Resource-Intensive Big Data Applications in the EIDS
3.2 Open-Source-Based FIWARE's EIDS Connector
4 Data Space Commonalities: Semantics as Prerequisite for Data Economy
4.1 The IDS Information Model
4.2 Domain-Specific Ontologies in EIDS: QIF
5 Data Treasures and Big Data Services in EIDS: Big Data Applications and Use Cases
5.1 Types of Connection for Big Data Analytics Services and Platforms in the EIDS
5.2 Example of Integration of an Open-Source-Based Connector Within a Big Data Analytics Platform Connected: The CERTH Cognitive Analytics Platform
5.2.1 Manufacturing Scenario
5.2.2 Digital Business Process
5.2.3 EIDS Integration Approach
5.3 Example of Integration of the Open-Source-Based Connector with Manufacturing Supermarkets 4.0: Volkswagen Autoeuropa
5.3.1 Manufacturing Scenarios
5.3.2 Digital Business Process
5.3.3 EIDS Integration Approach
5.4 Example of Integration of the Domain-Specific Ontologies with Predictive Maintenance Processes: OTIS
5.4.1 Manufacturing Scenarios
5.4.2 Digital Business Process
5.4.3 EIDS Integration Approach
5.5 Example of Integration of EIDS-Based Infrastructure with Warehouse Management Processes—Gestamp
5.5.1 Manufacturing Scenarios
5.5.2 Digital Business Process
5.5.3 EIDS Integration Approach
5.6 Example of Integration of EIDS-Based Infrastructure with Logistics Processes: ASTI
5.6.1 Manufacturing Scenarios
5.6.2 Digital Business Process
5.6.3 EIDS Integration Approach
6 Certification as Base for Trust in the EIDS
6.1 IDS Evaluation Facility
6.2 Integration Camp
7 Data Space for the Future of Data Economy
References
Deploying a Scalable Big Data Platform to Enable a Food Safety Data Space
1 Introduction
2 Big Data Platform Architecture
3 Data Modeling
4 Data Standards Used
5 Software Stack Used in Data Platform
5.1 Data Ingestion Components
5.2 Collection and Processing Components
5.3 Storage Components
5.4 Data Processing Components
5.5 Data Enrichment Components
5.6 Monitoring Components
5.7 Intelligence Layer
6 Operational Instance of the Data Platform
7 Identifying Records in the Big Data Platform
7.1 Hash Function over Crawled URLs
7.2 Internal Identification Process
7.3 Remote Source Identification
8 Orchestrating the Big Data Platform
9 Monitoring the Big Data Platform
9.1 Ensure Uptime for Every Component Deployed to the Stack
9.2 Problems That Are Not Related to the Infrastructure
10 Performance and Scalability of the Data Platform
11 Discussion
12 Conclusions
References
Data Space Best Practices for Data Interoperability in FinTechs
1 Introduction
2 Challenges in Data Space Design
2.1 Data Fragmentation and Interoperability Barriers
2.2 Limitations for Cost-Effective Real-Time Analytics
2.3 Regulatory Barriers
2.4 Data Availability Barriers
2.5 Lack of a Blueprint Architectures for Big Data Applications
2.6 No Validated Business Models
3 Best Practices for Data Space Design and Implementation
3.1 Technical/Technological Developments
3.2 Development of Experimentation Infrastructures (Testbeds)
3.3 Validation of Novel Business Models
4 The INFINITECH Way to Design/Support FinTech Data Spaces
4.1 Technological Building Blocks for Big Data, IoT, and AI
4.2 Tailored Experimentation Infrastructures
4.3 Large-Scale Innovative Pilots in Finance and Insurance
4.4 Business Model Development and Validation
5 Technology Capabilities for Convergence and Interoperability
5.1 Semantic Interoperability and Analytics
5.2 INFINITECH Building Blocks for Big Data, IoT, and AI
6 Scalability and Security Considerations for FinTech and InsuranceTech
7 Conclusions
References
TIKD: A Trusted Integrated Knowledge Dataspace for Sensitive Data Sharing and Collaboration
1 Introduction
2 Use Case—Sensitive Data Sharing and Collaboration for Healthcare in the ARK-Virus Project
3 Related Work
4 Description of the TIKD
4.1 Knowledge Graph Integration
4.2 Security Control
4.2.1 Personal Data Handling
4.2.2 Data Classification
4.2.3 Access Control
4.2.4 Policy Specification
4.2.5 Policy Enforcement
4.2.6 Privacy Protecting User Logs
4.3 Data Interlinking
4.4 Data Sharing
4.5 Subgraph Sharing
5 Security and Privacy Evaluations of the ARK Platform
5.1 Security Evaluation
5.2 Privacy Information Evaluation
6 Conclusions
References
Toward an Energy Data Platform Design: Challenges and Perspectives from the SYNERGY Big Data Platform and AI Analytics Marketplace
1 Introduction
2 Data Platforms
2.1 Generic-Purpose Data Hubs and Marketplaces
2.2 Energy Data Hubs and Marketplaces
3 SYNERGY Reference Architecture
3.1 SYNERGY Cloud Infrastructure Layer
3.2 SYNERGY On-Premise Environments Layer
4 Discussion
5 Conclusions
References
Part III Future Directions
Privacy-Preserving Techniques for Trustworthy Data Sharing: Opportunities and Challenges for Future Research
1 Introduction
1.1 Data Sharing Now: A Legal Patchwork
1.2 Data Marketplaces
1.3 Data Governance Act (“DGA”)
2 Legal Perspective on Privacy-Preserving Techniques for Enhancing Trust in Data Sharing
2.1 What Is Trust?
2.2 The Role of Trust in Data Markets
2.3 Privacy-Preserving Techniques as a Means to Bring More Trust in Data Sharing
3 Methods for Privacy-Preserving Analytics
3.1 Homomorphic Encryption
3.2 Secure Multi-Party Computation
3.2.1 Private Set Intersection
4 Privacy-Preserving Technologies for Smart Contracts
4.1 Encrypted On-Chain Data with Homomorphic Encryption
4.2 Smart Contracts Based on Multi-party Computation
4.3 Secure Enclaves
5 Conclusion: Opportunities and Future Challenges
References
Common European Data Spaces: Challenges and Opportunities
1 Introduction
2 Data Spaces
3 Common European Data Spaces Vision
4 Challenges
4.1 Technical Challenges
4.2 Business and Organizational Challenges
4.3 Legal Compliance Challenges
4.4 National and Regional Challenges
5 Opportunities
5.1 Opportunities for Business
5.2 Opportunities for Citizens
5.3 Opportunities for Science
5.4 Opportunities for Government and Public Bodies
6 Call to Action
7 Conclusion
References