Data in Context: Models as Enablers for Managing and Using Data

This book is about data fundamentals. The author describes in an accessible manner what data is, the difference between information and data, and the pragmatic application of abstract data modeling techniques in day-to-day business. The data management techniques are described from both a scientific, theoretical perspective and a practical, application-related perspective. The theoretical concepts are illustrated with concrete examples. This combination of theory and practical application gives this data management book its own unique signature, which distinguishes it from many other books.

The field of Data Science is extensive and pulls together several disciplines, as I will demonstrate shortly. A full account of what Data Science is, how it works, and all the techniques that are involved is beyond the scope of this chapter. My objective is to capture the highlights in line with the purpose of this book: to present an overview of understanding, managing, and using data, and of getting value from it in a specific business context.

Several definitions of Data Science are currently in use. Wikipedia defines it as follows: Data Science is an interdisciplinary academic field that uses statistics, scientific computing, scientific methods, processes, algorithms, and systems to extract or extrapolate knowledge and insights from noisy, structured, and unstructured data. A definition by IBM states that: Data Science combines math and statistics, specialized programming, advanced analytics, Artificial Intelligence (AI), and Machine Learning with specific subject matter expertise to uncover actionable insights hidden in an organization’s data. These insights can be used to guide decision making and strategic planning.

Author(s): Bas van Gils
Publisher: Springer
Year: 2023

Language: English
Pages: 220

Preface
Foreword by Hans Mulder
Foreword by Raymond Slot
Foreword by Ronald Baan
Contents
Chapter 1 Introduction
Part I Data
Chapter 2 Understanding Data
2.1 Data and Information
2.2 Data and Semiotics
2.3 The Relational Model
2.3.1 Variables and values
2.3.2 Tuples
2.3.3 Relations
2.4 Relational Algebra
2.4.1 Union
2.4.2 Intersection
2.4.3 Difference
2.4.4 Projection
2.4.5 Product
2.4.6 Divide
2.4.7 Restriction
2.4.8 Join
2.5 Normalization
2.5.1 First Normal Form (1NF)
2.5.2 Second Normal Form (2NF)
2.5.3 Third Normal Form (3NF)
2.5.4 Boyce-Codd Normal Form (BCNF)
2.6 Conclusion
2.7 Reflection Questions
Chapter 3 Designing Data Structures
3.1 Notation
3.2 Entities
3.2.1 Oneness
3.2.2 Sameness
3.2.3 Categories
3.2.4 Specialization and Generalization
3.2.5 Naming Entities
3.2.6 Entities and the Semiotic Triangle
3.3 Attributes
3.3.1 Modeling Approaches
3.3.2 Attribution
3.4 Relationships
3.4.1 Relationships and Relations
3.4.2 Naming of Relationships
3.4.3 Self-relations
3.5 Worked Example
3.6 Reflection Questions
Chapter 4 Context
4.1 The Meanings of Context
4.2 Relevance of Context
4.3 Reflection Questions
Chapter 5 Related Approaches
5.1 Trees
5.2 Graphs
5.3 Facts
5.3.1 ORM2
5.3.2 DEMO
5.4 Ontology Modeling
5.5 Reflection Questions
Part II Data Management
Chapter 6 Managing Data as an Asset
Chapter 7 Data Modeling and Design
7.1 Data at Rest/in Motion
7.2 Modeling Styles for Data at Rest
7.2.1 Document Stores
7.2.2 Graph Databases
7.2.3 Business Intelligence Data
7.3 Reflection Questions
Chapter 8 Data Architecture
8.1 Defining Architecture
8.2 Data Landscape
8.3 Standardization and Integration
8.4 Data Integration Challenges
8.5 Data Architecture Example
8.6 Reflection Questions
Chapter 9 Data Storage and Operations
9.1 Database Management Systems
9.2 Data Life Cycle
9.3 Performance
9.4 Backups
9.5 Cloud Databases
9.6 Streaming Data
9.7 Reflection Questions
Chapter 10 Data Security
10.1 Risk
10.2 Confidentiality, Integrity, and Availability
10.2.1 Confidentiality
10.2.2 Integrity
10.2.3 Availability
10.3 Synthesis
10.4 Example
10.5 Reflection Questions
Chapter 11 Data Integration
11.1 Integration Concerns
11.2 Integration Patterns
11.2.1 Extract, Transform, Load (ETL)
11.2.2 Change Data Capture (CDC)
11.2.3 Data as a Service (DaaS)
11.2.4 Considerations
11.3 Reflection Questions
Chapter 12 Document and Content Management
12.1 Similarities with ‘Normal’ Data
12.2 Differences with ‘Normal’ Data
12.3 Integration of Structured/Unstructured Data
12.4 Growing Importance of Unstructured Data
12.5 Conclusion
12.6 Reflection Questions
Chapter 13 Reference and Master Data Management
13.1 Reference Data
13.2 Master Data
13.3 Considerations
13.4 Reflection Questions
Chapter 14 Data Warehousing and Business Intelligence
14.1 Classic DWH Architectures
14.1.1 Kimball Architecture
14.1.2 Inmon Architecture
14.2 Modern DWH Architectures
14.2.1 DIAL
14.2.2 Data Virtualization
14.2.3 Data Lake
14.3 Reflection Questions
Chapter 15 Data Science
15.1 Defining Data Science
15.2 Data Science and Artificial Intelligence
15.3 Results and Deployment
15.4 Critical Evaluation
15.5 Reflection Questions
Chapter 16 Metadata
16.1 Business Metadata
16.2 Technical Metadata
16.3 Operational Metadata
16.4 Horizontal Lineage
16.5 Vertical Lineage
16.6 Reflection Questions
Chapter 17 Data Quality
17.1 Data Quality Dimensions
17.1.1 Accuracy
17.1.2 Completeness of Records
17.1.3 Credibility of Data Values
17.1.4 Validity of Data Values
17.2 Data Quality Management
17.3 Data Quality: Example
17.4 Reflection Questions
Chapter 18 Data Governance
18.1 Definition
18.2 Common Implementation Model
18.3 Data Governance Board and Data (Management) Strategy
18.4 Non-Invasive Data Governance
18.5 Data Mesh
18.6 Data Governance and Data Management
18.7 Reflection Questions
Part III Parting Thoughts
Chapter 19 Conclusion
19.1 Nested Means-End
19.2 Understanding Data and Designing Data Structures
19.3 Data Management
19.4 Implications and Call to Action
19.5 Critical Reflection
Appendix A Some Musings on SQL
A.1 Setting the Scene
A.2 Querying the Database
A.3 Parting Thoughts
About the author
Literature
Semiotics and the definition of ‘data’
Relational model/databases
ERD/Modeling
Analysis and design
Data management
Architecture
Index