Data Science for Entrepreneurship: Principles and Methods for Data Engineering, Analytics, Entrepreneurship, and the Society

Fast-paced technological development and the plethora of data create numerous opportunities waiting to be exploited by entrepreneurs. This book provides a detailed yet practical introduction to the fundamental principles of data science and shows how entrepreneurs and would-be entrepreneurs can take advantage of them. It walks the reader through sections on data engineering and data analytics, as well as sections on data entrepreneurship and data use in relation to society. The book also offers ways to close the research and practice gaps between data science and entrepreneurship. After reading this book, students of entrepreneurship courses will be better able to commercialize data-driven ideas that may be solutions to real-life problems. Chapters contain detailed examples and cases for a better understanding. Discussion points or questions at the end of each chapter help readers reflect deeply on the learning material.



Author(s): Werner Liebregts, Willem-Jan van den Heuvel, Damian A. Tamburri, Arjan van den Born, Florian Böing-Messing, Anne J. F. Lafarre
Series: Classroom Companion: Business
Publisher: Springer
Year: 2023

Language: English
Pages: 531
City: Cham

Preface
Acknowledgments
About the Book
Contents
About the Editors
Contributors
Editors and Contributors
1: The Unlikely Wedlock Between Data Science and Entrepreneurship
1.1 Introduction
1.2 Defining Data Science and Entrepreneurship
1.3 Towards a Definition of Data Entrepreneurship
1.4 Processes of Data Science and Entrepreneurship
1.4.1 The Data Science Process
1.4.2 The Entrepreneurial Process
1.4.3 Comparing Data Science and Entrepreneurial Processes
1.5 The Data Entrepreneurship Framework
References
I: Data Engineering
2: Big Data Engineering
2.1 Introduction: The Big Data Engineering Realm
2.1.1 Data Engineering Challenges in Theory and Practice
2.2 (Big) Data Engineering to Leverage Analytics
2.2.1 Value-Driven Big Data Engineering
2.2.2 Key Fabric of Data Engineering
2.2.2.1 Intelligent Enterprise Application Architecture (iA)²
2.2.2.2 Data Pipelines
2.2.2.3 Data Lakes and Data Warehouses
2.2.3 MLOps: Data Engineering (Finally) Meets AI/Machine Learning
Take-Home Messages
References
3: Data Governance
3.1 Introduction
3.2 Motivational Case Studies
3.2.1 SODALITE Vehicle IoT
3.2.2 SODALITE Clinical Trials
3.3 Data Governance in a Nutshell
3.4 Data Governance Dimensions
3.4.1 Data Principles
3.4.2 Data Quality
3.4.3 Metadata
3.4.4 Data Access
3.4.5 Data Life Cycle
3.5 Data Governance Structure
3.5.1 Executive Sponsor
3.5.2 Data Governance Council
3.5.3 Data Custodian
3.5.4 Data Steward
3.5.5 Data User Groups
3.6 Contemporary Data Governance
3.6.1 Big Data Governance
3.6.2 IoT Data Governance
3.7 Case Studies with Data Governance
3.7.1 SODALITE Vehicle IoT Architecture
3.7.2 SODALITE Clinical Trial Architecture
Take-Home Messages
References
4: Big Data Architectures
4.1 Introduction
4.2 Background
4.2.1 Key Attributes of Big Data Systems
4.2.2 From Structured Data to Semi-structured Data
4.3 Lambda Architecture
4.4 Kappa Architecture
4.5 SEI-CMU Reference Architecture
References
5: Data Engineering in Action
5.1 Introduction
5.2 The ANITA Project for the Fighting of Cybercrime
5.2.1 Data Collection
5.2.2 ANITA Architecture
5.2.3 Data Extraction
5.2.4 Data Management and Analysis
5.3 The PRoTECT Project for the Protection of Public Spaces
5.3.1 Objectives of PRoTECT
5.3.2 PRoTECT and the Data Fusion Approach
5.4 The Beehives Project for the Quality of Urban Biodiversity
5.4.1 Problem Description
5.4.2 Objectives
5.4.3 Data Gathering
5.4.4 Big Data Analytics for Biodiversity
5.4.5 Systemic Change
5.4.6 Bringing It All Together: The IoT Beehive Stratified Architecture
Take-Home Messages
References
II: Data Analytics
6: Supervised Machine Learning in a Nutshell
6.1 Introduction
6.2 Supervised Learning: Classification
6.2.1 Motivating Example: Credit Card Fraud Detection
6.2.2 An Overview of Classifiers
6.2.3 Evaluating a Classification Model
6.2.4 Designing a Pipeline for Machine Learning Classification
6.2.4.1 Data Mining
6.2.4.2 Data Preprocessing
6.2.4.3 Data Classification
6.3 Supervised Learning: Regression
6.3.1 Simple Linear Regression
6.3.2 Regression Methods: An Overview
6.3.3 Evaluating a Regression Model
6.3.4 Designing a Pipeline for Machine Learning Regression
6.3.4.1 Data Mining
6.3.4.2 Data Preprocessing
Take-Home Messages
References
Further Reading
7: An Intuitive Introduction to Deep Learning
7.1 Brief Historical Overview
7.2 Datasets, Instances, and Features
7.3 The Perceptron
7.3.1 The Decision Boundary
7.3.2 The Delta Learning Rule
7.3.3 Strengths and Limitations of the Perceptron
7.4 The Multilayer Perceptron
7.4.1 Combining Decision Boundaries
7.4.2 The Generalized Delta Learning Rule
7.5 Deep Neural Networks
7.5.1 Combinations of Combinations of … Decision Boundaries
7.5.2 The Generalized Delta Learning Rule in Deep Networks
7.5.3 From Two- to High-Dimensional Feature Vectors
7.6 Convolution: Shifting a Perceptron Over an Image
7.6.1 The Basic Convolution Operation
7.7 Convolutional Neural Networks
7.7.1 Convolutional Layers
7.7.2 Pooling Layers
7.7.3 Combinations of Combinations of … Features
7.7.4 Dense Layers
7.7.5 From AlexNet to Modern CNNs
7.8 Skin Cancer Diagnosis: A CNN Application
7.8.1 Introduction
7.8.2 Data Collection and Preparation
7.8.3 Baseline and Multitask CNN
7.8.4 Experiments and Results
7.8.5 Conclusion on the CNN Application
References
8: Sequential Experimentation and Learning
8.1 Introduction
8.2 The Multi-Armed Bandit Problem
8.3 Solutions to Bandit Problems: Allocation Policies
8.3.1 ϵ-First
8.3.2 ϵ-Greedy
8.3.3 Upper Confidence Bound Methods
8.3.4 Thompson Sampling
8.3.5 Bootstrapped Thompson Sampling
8.3.6 Policies for the Contextual MAB Problem
8.4 Evaluating Contextual Bandit Policies: The Contextual Package
8.4.1 Formalization of the cMAB Problem for Its Use in Contextual
8.4.2 Class Diagram and Structure
8.4.3 Context-Free Versus Contextual Policies
8.4.4 Offline Policy Evaluation with Unbalanced Logging Data
8.5 Experimenting with Bandit Policies: StreamingBandit
8.5.1 Basic Example
8.5.2 StreamingBandit in Action
References
9: Advanced Analytics on Complex Industrial Data
9.1 Introduction
9.2 Data Analytics for Fault Diagnosis
9.2.1 Maintenance of Equipment
9.2.2 Preparing the Data
9.2.3 Machine Learning Classifiers
9.2.4 Deep Learning Techniques
9.2.5 Fault Diagnosis in Practice
9.2.6 Simulating a Real-World Situation
9.2.7 Summary
9.3 Graph Signal Processing (GSP)
9.3.1 GSP Background
9.3.2 GSP Applications
9.3.3 Summary
9.4 Local Pattern Mining on Complex Graph Data
9.4.1 Overview
9.4.2 Local Pattern Mining on Graphs
9.4.3 Local Pattern Mining on Attributed Graphs
9.4.4 MinerLSD: Local Pattern Mining on Attributed Graphs
9.4.5 Application Example
9.4.6 Summary
Take-Home Messages
References
10: Data Analytics in Action
10.1 Introduction
10.2 BagsID: AI-Powered Software System to Reidentify Baggage
10.2.1 Business Proposition
10.2.2 System Overview
10.2.3 AI Engine
10.2.4 Software Engineering Aspects
10.3 Understanding Employee Communication with Longitudinal Social Network Analysis of Email Flows
10.3.1 Digital Innovation Communication Networks
10.3.2 The Relational Event Modeling Framework
10.4 Using Vehicle Sensor Data for Pay-How-You-Drive Insurance
10.4.1 Time Series
10.4.2 Driving Behavior Analysis
References
III: Data Entrepreneurship
11: Data-Driven Decision-Making
11.1 Introduction
11.2 Introduction to Decision-Making
11.2.1 Decision-Making Characteristics
11.2.2 The Decision-Making Process and Decision Rules
11.2.3 Decision-Making for Entrepreneurs
11.3 Data-Driven Decision-Making
11.3.1 What Is Data-Driven Decision-Making?
11.3.2 Maturity Levels of Data-Driven Decision-Making
11.3.3 Methodology Options for Data-Driven Decision-Making
11.3.4 Data-Driven Decision-Making by Entrepreneurs
11.4 Data-Driven Decision-Making: Why?
11.4.1 Quality Reasons for Data-Driven Decision-Making
11.4.1.1 DDDM for Decision Quality
11.4.2 Capacity Reasons for Data-Driven Decision-Making
11.4.2.1 DDDM for Reducing Information Overload
11.4.3 Mental Reasons for Less Data-Driven Decision-Making
11.5 Data-Driven Decision-Making: How?
11.5.1 Overview of Data-Driven Decision-Making Solutions
11.5.2 Data-Driven Decision-Making Solutions for Programmed Decision-Making
11.5.2.1 Operations Research Solutions
11.5.2.2 Data Science Solutions
11.5.2.3 Recommender Systems
11.5.3 Data-Driven Decision-Making Solutions for Nonprogrammed Decision-Making
11.5.3.1 Agent-Based Modeling (ABM)
11.5.3.2 Case-Based Reasoning/Decision Analysis
11.5.3.3 Technology-Assisted Reviews (TAR)
11.5.3.4 Scenario-Based Decision-Making
11.5.3.5 Competitive Benchmarking
References
12: Digital Entrepreneurship
12.1 Introduction
12.2 What Is Digital Entrepreneurship?
12.3 What Is Different in the Digital Economy?
12.3.1 How Do Digitization and Digital Artifacts Affect the Nature of Business and of New Venture Creation?
12.3.2 What Are the Implications for Entrepreneurship of the Nature of the Digital Economy?
12.4 Digital Platforms and Digital Entrepreneurship
12.4.1 Creating and Growing a Digital Platform Firm
12.4.2 Competing on Digital Platforms
12.5 Supporting and Regulating Digital Entrepreneurship
12.5.1 Understanding and Supporting Digital Entrepreneurial Ecosystems
12.5.2 Regulating Digital Entrepreneurship
Take-Home Messages
References
13: Strategy in the Era of Digital Disruption
13.1 Introduction
13.2 Disruption Driven by Business Model Innovations
13.2.1 Freemium Business Models
13.2.2 Sharing Economy Business Models
13.2.3 Usage-Based Business Models
13.3 Disruption Driven by Innovation Ecosystems
13.3.1 Supply-Side Synergies
13.3.2 Demand-Side Synergies
13.4 Disruption Driven by Platforms and Network Effects
13.5 Discussion
13.5.1 Trend 1: Industry Crossover Trends in a Digital World
13.5.2 Trend 2: Changing Competitive Landscape
13.5.3 Trend 3: Rising Customer Expectations
References
14: Digital Servitization in Agriculture
14.1 Introduction
14.2 Servitization
14.3 Types of Services
14.4 Servitization in Agriculture
14.5 Digital Servitization
14.6 Digital Servitization in Agriculture
References
15: Entrepreneurial Finance
15.1 Introduction
15.2 Pre-seed Financing and Support
15.2.1 Family, Friends, and Fools
15.2.2 Accelerators, Incubators, and Startup Studios
15.2.2.1 Incubators
15.2.2.2 Accelerators
15.2.2.3 Startup Studios: Venture Builders
15.3 Early Sources of Funding (Seed and Startup Stage)
15.3.1 Business Angels
15.3.2 Crowdfunding
15.3.3 Initial Coin Offerings (ICOs)
15.4 Venture Capital and Private Equity (Growth Stage) (Da Rin & Hellmann, 2019)
15.4.1 Ownership and Valuation
15.4.2 Preferred Shares
15.4.3 Staged Financing
15.4.4 Corporate Governance
15.4.5 Exit Routes
15.5 Tech Startup Financing in Practice
15.6 Answers to the Cases
References
16: Entrepreneurial Marketing
16.1 Introduction
16.2 Defining Marketing and Sales
16.3 Customers Buy Solutions Rather Than Products
16.3.1 The Means-End Chain
16.3.2 Trade-Offs Regarding Radically New Products and Services
16.4 Co-developing and Positioning a New Product or Service
16.5 Organizing Customer Development as a Separate Process
16.6 A One-Page Marketing and Sales Plan
16.6.1 The General Motivation and Objectives
16.6.2 Three Main Pillars
16.6.3 Building a Marketing Information System
16.7 Leveraging Your Growing Customer Base
16.7.1 Segmentation and Targeting
16.7.2 Efficient A/B Testing of Value Proposition
Take-Home Messages
References
IV: Data and Society
17: Data Protection Law and Responsible Data Science
17.1 Introduction
17.2 A Few Words on the Meaning of Privacy and Data Protection
17.3 Material Scope of Data Protection Law: Defining Processing and Personal Data
17.3.1 Defining Processing
17.3.2 Defining Personal Data
17.3.2.1 “Any Information”
17.3.2.2 “Relating to”
17.3.2.3 “Identified or Identifiable”
17.3.2.4 “Natural Person (Data Subject)”
17.3.3 Conclusion: Personal Data and Non-personal Data
17.4 Personal Scope of Data Protection: Controller and Processor
17.4.1 The Three Main Actors of Data Protection
17.4.2 Data Controllers
17.4.3 Data Processors
17.4.4 Problematic Situations
17.4.4.1 Controller or Processor?
17.4.4.2 Multiple Controllers
17.5 Art. 6, GDPR: The Need for a Legitimate Ground of Processing
17.5.1 Consent
17.5.1.1 Consent Must Be Given in Relation to a Specific Purpose
17.5.1.2 Consent Must Be Informed
17.5.1.3 Consent Must Be Unambiguous
17.5.1.4 Consent Must Be Free
17.5.1.5 Special Categories of Data: Explicit Consent
17.5.2 Contract
17.5.3 Vital Interests of the Data Subject
17.5.4 Performance of a Task Carried Out in the Public Interest or in the Exercise of Official Authority Vested in the Controller
17.5.5 Compliance with a Legal Obligation to Which the Controller Is Subject
17.5.6 Legitimate Interests of the Data Controller or a Third Party
17.5.6.1 The Interest of the Data Controller: A Legitimate One
17.5.6.2 Interests or Fundamental Rights of Data Subject
17.5.6.3 Balancing of Interests
Step 1: Qualify the Interests
Step 2: Impact(s) on the Data Subject
Step 3: Factors for Appraising the Impacts
Step 4: Provisional Balance
Step 5: Additional Safeguards
Step 6: Final Balance
17.6 Art. 5 GDPR: Principles to Be Applied to the Processing of Data
17.6.1 Purpose Limitation Principle
17.6.1.1 Purpose Specification: Why?
17.6.1.2 Specific Purpose
17.6.1.3 Explicit Purpose
17.6.1.4 Legitimate Purpose
17.6.1.5 Different Purpose
17.6.2 Data Minimisation
17.6.3 Storage Limitation
17.6.4 Additional Obligations
17.6.4.1 Data Accuracy
17.6.4.2 Lawfulness, Fairness, and Transparency
17.6.4.3 Integrity and Confidentiality
References
18: Perspectives from Intellectual Property Law
18.1 Introduction
18.2 Meeting the Criteria
18.2.1 The Formal Requirements of Copyright
18.2.2 Sui Generis Database Right
18.2.3 Trade Secret Right
18.2.4 Summary
18.3 The Scope of Protection
18.3.1 Copyright: Protected Subject Matter
18.3.2 Sui Generis Database Protection
18.3.3 Trade Secret Right
18.3.4 Summary
18.4 Exceptions and Limitations
18.4.1 Limitations of the Rights
18.4.2 Exceptions: Common Ground
18.4.3 Exceptions Specific to the Right
18.5 Alternative Sources
Further Reading
19: Liability and Contract Issues Regarding Data
19.1 Introduction
19.2 General Characteristics of Private Law
19.3 What Is Data?
19.4 Contracts and Data
19.4.1 Formation of Contracts
19.4.2 Content of Contracts
19.4.3 Contractual Remedies
19.4.3.1 Prerequisites for Invoking a Remedy
19.4.3.2 The Available Remedies
19.5 Tort Law and Data
19.5.1 Fault Liability
19.5.2 Strict Liability
19.5.3 Causality and Defenses
19.5.4 Damages and Other Remedies in Tort
Take-Home Messages
References
20: Data Ethics and Data Science: An Uneasy Marriage?
20.1 Introduction
20.2 Data Ethics in Academia
20.2.1 Moral Theories
20.2.2 Consequentialism
20.2.3 Deontological Ethics
20.2.4 Virtue Ethics
20.2.5 The Focus of Academic Data Ethics
20.3 Data Ethics in the Commercial Domain
20.3.1 Technological Level
20.3.2 Individual Level
20.3.3 Organizational Level
20.4 Law and Data Ethics
20.5 Data Ethics and Data Science: Are They in It for the Long Run?
References
21: Value-Sensitive Software Design
21.1 Introduction
21.2 The Good, the Bad, and the Never Neutral
21.2.1 Non-neutrality
21.2.2 Impact on a Micro-level
21.2.3 Impact on a Macro-level
21.2.4 In Sum
21.3 Employing the Never Neutral
21.3.1 A Challenge for Designers
21.3.2 Value-Sensitive Design
21.3.3 Values
21.3.4 Legal Values and Design
References
22: Data Science for Entrepreneurship: The Road Ahead
22.1 Introduction
22.2 The Road Ahead
22.2.1 AI Software
22.2.2 MLOps
22.2.3 Edge Computing
22.2.4 Digital Twins
22.2.5 Large-Scale Experimentation
22.2.6 Big Data and AI Opportunities
22.2.7 Government Regulation
References