Practical Data Quality: Learn real-world techniques to transform data quality management in your organization

Poor data quality can increase costs, hinder revenue growth, compromise decision-making, and introduce risk into organizations. As a result, employees, customers, and suppliers can find every interaction with the organization frustrating. Practical Data Quality provides a comprehensive view of managing data quality within your organization, covering everything from the business case through to permanently embedding the improvements you make in the organization. Each chapter explains a key element of data quality management, from linking strategy and data together to profiling data and designing business rules that reveal bad data. The book outlines a suite of tried-and-tested reports that highlight bad data and allow you to develop a plan to make corrections. Throughout the book, you'll work with real-world examples and use reusable templates to accelerate your initiatives. By the end of this book, you'll have a clear understanding of every stage of a data quality initiative and be able to drive tangible results for your organization at pace.

Author(s): Robert Hawker
Edition: 1
Publisher: Packt Publishing Pvt Ltd
Year: 2023

Language: English
Pages: 461

Practical Data Quality
Foreword
Contributors
About the author
About the reviewers
Preface
Who this book is for
What this book covers
To get the most out of this book
Templates and diagrams
Get in touch
Share Your Thoughts
Download a free PDF copy of this book
Part 1 – Getting Started
1
The Impact of Data Quality on Organizations
The value of this book
Importance of executive support
Detailed definition of bad data
Bad data versus perfect data
Impact of bad data quality
Quantification of the impact of bad data
Impacts of bad data in depth
Process and efficiency impacts
Reporting and analytics impacts
Compliance impacts
Data differentiation impacts
Causes of bad data
Lack of a data culture
Prioritizing process speed over data governance
Mergers and acquisitions
Summary
References
2
The Principles of Data Quality
Data quality in the wider context of data governance
Data governance as a discipline
Data governance tools and MDM
How data quality fits into data governance and MDM
Generally accepted principles and terminology of data quality
The basic terms of data quality defined
Data quality dimensions
Stakeholders in data quality initiatives
Different stakeholder types and their roles
The data quality improvement cycle
Business case
Data discovery
Rule development
Monitoring
Remediation
Embedding into BAU
Summary
References
3
The Business Case for Data Quality
Activities, components, and costs
Activities in a data quality initiative
Early phases
Planning and business case phase
Developing quantitative benefit estimates
Example – the difficulty of calculating quantitative benefits
Strategies for quantification
Developing qualitative benefits
Surveys and focus groups
Outlining data quality qualitative risks in depth
Anticipating leadership challenges
The “Excel will do the job” challenge
Ownership of ongoing costs challenge
The excessive cost challenge
The “Why do we need a data quality tool?” challenge
Summary
4
Getting Started with a Data Quality Initiative
The first few weeks after budget approval
Key activities in those early weeks
Understanding data quality workstreams
Workstreams required early on
Identifying the right people for your team
Mapping resources to the workstreams
Summary
Part 2 – Understanding and Monitoring the Data That Matters
5
Data Discovery
An overview of the data discovery process
Understanding business strategy, objectives, and challenges
Approaches to stakeholder identification
Content of stakeholder conversations
The hierarchy of strategy, objectives, processes, analytics, and data
Prioritizing using strategy
Linking challenges to processes, data, and reporting
Basics of data profiling
Typical tool data profiling capabilities
Using these capabilities
Connecting to data
Summary
6
Data Quality Rules
An introduction to data quality rules
Rule scope
The key features of data quality rules
Rule weightings
Rule dimensions
Rule priorities
Rule thresholds
Cost per failure
Implementing data quality rules
Designing rules
Building data quality rules
Testing data quality rules
Summary
7
Monitoring Data Against Rules
Introduction to data quality reporting
Different levels of reporting
Data security considerations
Designing a high-level data quality dashboard
Dimensions and filters
Designing a Rule Results Report
Typical features of the Rule Results Report
Designing Failed Data Reports
Typical features of the Failed Data Reports
Re-using Failed Data Reports
Multiple Failed Data Reports
Exporting Failed Data Reports
Managing inactive and duplicate data
Managing inactive data
Managing duplicate data
Detecting duplicates
Presenting findings to stakeholders
Launching data quality reporting successfully
Embedding reports into governance
Summary
Part 3 – Improving Data Quality for the Long Term
8
Data Quality Remediation
Overall remediation process
Prioritizing remediation activities
Revisiting benefits
Approach to determining priorities
Identifying the approach to remediation
Typical remediation approaches
Matching issues to the correct approach
Moving remediation to business as usual
Understanding the effort and cost
Types of cost in remediation
Governing remediation activities
Key governance activities
Tracking benefits
Quantitative example
Qualitative benefit tracking
Summary
9
Embedding Data Quality in Organizations
Preventing issue re-occurrence
Methods to prevent re-occurrence
The ongoing impact of human error
Short-horizon reporting
Ongoing data quality rule improvement
Strategies to identify rule changes
Updating data quality rules
Transitioning to day-to-day remediation
Requirements for success
Planning for a successful transition
Indications that the transition has been successful
Continuing the data quality journey
Roadmap of data quality initiatives
Identifying the next initiative
Obtaining support
What if no further initiative is sanctioned?
Summary
10
Best Practices and Common Mistakes
Best practices
Selecting the best practices
Manage data quality primarily at the source
Implementing supporting governance meetings
Including data quality in an organization-wide education program
Leveraging the data steward and producer relationship
Best practices throughout this book
Common mistakes
Failure to implement best practices
A lack of practicality
Technically driven data quality rules
One-off remediation activity
Restricting access to data quality results
Avoid silos in data quality work
The future of data quality work
LLMs
Greater emphasis on high-quality data in organizations
Summary
Index
Why subscribe?
Other Books You May Enjoy
Packt is searching for authors like you
Share Your Thoughts
Download a free PDF copy of this book