Data Mesh in Action

Revolutionize the way your organization approaches data with a data mesh! This new decentralized architecture outpaces monolithic lakes and warehouses and can work for a company of any size.

In Data Mesh in Action you will learn how to:
• Implement a data mesh in your organization
• Turn data into a data product
• Move from your current data architecture to a data mesh
• Identify data domains, and decompose an organization into smaller, manageable domains
• Set up the central governance and local governance levels over data
• Balance responsibilities between the two levels of governance
• Establish a platform that allows efficient connection of distributed data products and automated governance

Data Mesh in Action reveals how this groundbreaking architecture looks for both small startups and large enterprises. You won’t need any new technology: this book shows you how to start implementing a data mesh with flexible processes and organizational change. You’ll explore both an extended case study and multiple real-world examples. As you go, you’ll be expertly guided through discussions around Socio-Technical Architecture and Domain-Driven Design with the goal of building a sleek data-as-a-product system. Plus, dozens of workshop techniques for both in-person and remote meetings help you onboard colleagues and drive a successful transition.

About the technology
Business increasingly relies on efficiently storing and accessing large volumes of data. The data mesh is a new way to decentralize data management that radically improves security and discoverability. A well-designed data mesh simplifies self-service data consumption and reduces the bottlenecks created by monolithic data architectures.

About the book
Data Mesh in Action teaches you pragmatic ways to decentralize your data and organize it into an effective data mesh. You’ll start by building a minimum viable data product, which you’ll expand into a self-service data platform, chapter-by-chapter. You’ll love the book’s unique “sliders” that adjust the mesh to meet your specific needs. You’ll also learn processes and leadership techniques that will change the way you and your colleagues think about data.

What's inside
• Decompose an organization into manageable domains
• Turn data into a data product
• Set up central and local governance levels
• Build a fit-for-purpose data platform
• Improve management, initiation, and support techniques

About the reader
For data professionals. Requires no specific programming stack or data platform.

About the author
Jacek Majchrzak is a hands-on lead data architect. Dr. Sven Balnojan manages data products and teams. Dr. Marian Siwiak is a data scientist and a management consultant for IT, scientific, and technical projects.

Author(s): Jacek Majchrzak, Sven Balnojan, Marian Siwiak
Edition: 1
Publisher: Manning Publications
Year: 2023

Language: English
Commentary: Publisher's PDF
Pages: 328
City: Shelter Island, NY
Tags: Google Cloud Platform; Amazon Web Services; Distributed Systems; Apache Kafka; Software Architecture; Domain-Driven Design; Data Governance; Databricks; Data Mesh; Minimum Viable Product

Data Mesh in Action
brief contents
contents
foreword
preface
acknowledgments
about this book
Who should read this book?
How this book is organized: A road map
Part 1: Foundations
Part 2: The four principles in practice
Part 3: Infrastructure and technical architecture
How to use this book
The Messflix case study
liveBook discussion forum
about the authors
about the cover illustration
Part 1: Foundations
Chapter 1: The what and why of the data mesh
1.1 Data mesh 101
1.2 Why the data mesh?
1.2.1 Alternatives
1.2.2 Data warehouses and data lakes inside the data mesh
1.2.3 Data mesh benefits
1.3 Use case: A snow-shoveling business
1.4 Data mesh principles
1.4.1 Domain-oriented decentralized data ownership and architecture
1.4.2 Data as a product
1.4.3 Federated computational governance
1.4.4 Self-serve data infrastructure as a platform
1.5 Back to snow shoveling
1.6 Socio-technical architecture
1.6.1 Conway’s law
1.6.2 Team topologies
1.6.3 Cognitive load
1.7 Data mesh challenges
1.7.1 Technological challenges
1.7.2 Data management challenges
1.7.3 Organizational challenges
Chapter 2: Is a data mesh right for you?
2.1 Analyzing data mesh drivers
2.1.1 Business drivers
2.1.2 Organizational drivers
2.1.3 Domain-data drivers
2.1.4 Minor organizational drivers
2.1.5 Is a data mesh a good fit for me?
2.2 Data mesh alternatives and complementary solutions
2.2.1 Enterprise data warehouse
2.2.2 Data lake
2.2.3 Data lakehouse
2.2.4 Data fabric
2.2.5 Data mesh vs. the rest of the world
2.3 Understanding a data mesh implementation effort
2.3.1 The data mesh development cycle
2.3.2 Development cycle in the shoveling example
2.3.3 Enabling the team
2.3.4 Development cycle in detail
Chapter 3: Kickstart your data mesh MVP in a month
3.1 Getting the lay of the land
3.1.1 Drawing a system landscape diagram
3.1.2 Performing stakeholder analysis
3.2 Identifying candidates for the MVP implementation team
3.2.1 Choosing development teams
3.2.2 Choosing the cooperation model
3.2.3 Choosing a data governance team
3.3 Setting up MVP governance
3.3.1 Defining data mesh value statement(s)
3.3.2 Defining data governance policies
3.3.3 Federating data governance
3.4 Developing minimal data products
3.4.1 Identifying domain-oriented datasets
3.4.2 Choosing data product owners
3.4.3 Deciding on the minimum viable data product description
3.4.4 Developing the simplest tools to expose your data
3.5 Setting up the minimal platform
3.5.1 Ensuring platform-forced governability
3.5.2 Ensuring platform security
Part 2: The four principles in practice
Chapter 4: Domain ownership
4.1 Capturing and analyzing domains
4.1.1 Domain-driven design 101
4.1.2 Invite the right people
4.1.3 Choose the correct workshop technique
4.2 Applying ownership using domain decomposition
4.2.1 Domain, subdomain, and business capability
4.2.2 Decompose domains using business capability modeling
4.2.3 How are domains and business capabilities related to data?
4.2.4 Assign responsibilities to the data-product-owning team
4.2.5 Choose the right team to own data
4.3 Applying ownership using data use cases
4.3.1 Data use cases
4.3.2 Model and bounded context
4.3.3 Set up boundaries of use-case-driven data products
4.3.4 Choose the right team to own data
4.4 Applying ownership using design heuristics
4.4.1 What is a heuristic?
4.4.2 Using design heuristics
4.4.3 Designing heuristics and possible boundaries
4.5 Final landscape: The mesh of interconnected data products
4.5.1 Messflix data mesh
4.5.2 Data products form a mesh
4.5.3 Is it already a data mesh?
Chapter 5: Data as a product
5.1 Applying product thinking
5.1.1 Product thinking analysis
5.1.2 Data product canvas
5.2 What is a data product?
5.2.1 Data product definition
5.2.2 Product, not project
5.2.3 What can be a data product?
5.3 Data product ownership
5.3.1 Data product owner
5.3.2 Data product owner responsibilities
5.3.3 An Agile DevOps team as a base for data product dev team
5.3.4 Data product owner and product owner
5.4 Conceptual architecture of a data product
5.4.1 External architecture view
5.4.2 Internal architecture view
5.5 Data product fundamental characteristics
5.5.1 Self-described data product
5.5.2 Introduction to metadata
5.5.3 Metadata as code
5.5.4 Data product metadata
5.5.5 Domain dataset metadata
5.5.6 Other kinds of metadata
5.6 Additional data product characteristics: FAIR and immutability
5.6.1 Findability
5.6.2 Accessibility
5.6.3 Interoperable
5.6.4 Reusable
5.6.5 Immutable
5.7 Data contracts and sharing agreements inside the data mesh
5.7.1 Data contracts and sharing agreements
5.7.2 Implementing data contracts and sharing agreements
Chapter 6: Federated computational governance
6.1 Data governance in a nutshell
6.2 Benefits of data governance
6.2.1 Business value perspective
6.2.2 Data usability perspective
6.2.3 Data control perspective
6.3 Planning data governance outcomes
6.3.1 Hierarchy of data governance outcomes
6.3.2 Strategic-level outcomes
6.3.3 Tactical-level outcomes
6.3.4 Implementation-level outcomes
6.4 Federating data governance
6.4.1 Thinking of data governance in terms of “sliders”
6.4.2 Extreme ends of data governance models
6.4.3 Federated data governance model
6.4.4 Setting up governance team operations
6.5 Making data governance computational
6.5.1 Making policies computational
6.5.2 Automating policy checks
Chapter 7: The self-serve data platform
7.1 The MVP platform
7.1.1 Platform definition
7.1.2 Platform thinking
7.2 Improvements with X as a service
7.2.1 X as a service explained
7.2.2 X as a service applied
7.3 Improvements with platform architecture
7.3.1 Platform architecture explained
7.3.2 Platform architecture applied
7.4 Improvements for the data producers
Part 3: Infrastructure and technical architecture
Chapter 8: Comparing self-serve data platforms
8.1 Data mesh on Google Cloud Platform
8.1.1 Self-serve data platform architecture
8.1.2 Identifying the components of the platform
8.1.3 Identifying the components of the data product
8.1.4 Workflows
8.1.5 Variations
8.1.6 Relation to data mesh ideas
8.1.7 GCP architecture summary
8.2 Data mesh on AWS
8.2.1 Self-serve data platform architecture
8.2.2 Identifying the components of the platform
8.2.3 Identifying the components of the data products
8.2.4 Workflows
8.2.5 Relation to data mesh ideas
8.2.6 Variations
8.2.7 AWS architecture summary
8.3 Data mesh on Databricks
8.3.1 Self-serve data platform architecture
8.3.2 Identifying the components of the platform
8.3.3 Identifying the components of the data product
8.3.4 Workflow considerations
8.3.5 Variations
8.3.6 Databricks architecture summary
8.4 Data mesh on Kafka
8.4.1 Self-serve data platform architecture
8.4.2 Identifying the components
8.4.3 Considerations
8.4.4 Kafka architecture summary
Chapter 9: Solution architecture design
9.1 Capturing and understanding the current state
9.1.1 What is software architecture?
9.1.2 How to document architecture: The C4 model
9.2 Understanding architectural drivers of a data product design
9.2.1 Architectural drivers
9.2.2 Capturing architectural drivers for a data-product design
9.3 Designing the future architecture of a data product and related systems
9.3.1 Design session
9.3.2 File-based data product: Spreadsheet
9.3.3 From monolith and microservice to a data product
9.3.4 Exposing data for stream processing and batch processing
appendix A
appendix B
appendix C
appendix D
D.1 Notes on thinnest viable platforms
D.2 Note on phasing out interfaces
index