This textbook aims to point out the most important principles of data analysis from a mathematical point of view. Specifically, it explores two questions: Which principles are necessary to understand the implications of an application, and which are necessary to understand the conditions for the success of the methods used? Theory is presented only to the degree necessary to apply it properly, striving for a balance between excessive complexity and oversimplification. The primary focus is on principles crucial for application success.
Topics and features:
- Focuses on approaches supported by mathematical arguments, rather than on computing experience alone
- Investigates conditions under which numerical algorithms used in data science operate, and what performance can be expected from them
- Considers key data science problems: problem formulation, including the optimality measure; learning and generalization in relation to training set size and number of free parameters; and convergence of numerical algorithms
- Examines original mathematical disciplines (statistics, numerical mathematics, system theory) as they are specifically relevant to a given problem
- Addresses the trade-off between model size and volume of data available for its identification and its consequences for model parametrization
- Investigates the mathematical principles involved in natural language processing and computer vision
- Keeps subject coverage intentionally compact, focusing on key issues of each topic to encourage full comprehension of the entire book
Although this core textbook is aimed directly at students of computer science and/or data science, it will also appeal to researchers in the field who want to gain a proper understanding of the mathematical foundations “beyond” mere computing experience.
Author(s): Tomas Hrycej, Bernhard Bermeitinger, Matthias Cetto, Siegfried Handschuh
Series: Texts in Computer Science
Edition: 1
Publisher: Springer
Year: 2023
Language: English
Pages: 226
City: Cham
Tags: Data Science; Big Data; Statistical Learning; Machine Learning; Deep Learning; Artificial Neural Networks; Data Processing; Pattern Recognition; Learning and Generalization; Numerical Algorithms; Natural Language Processing; Computer Vision
Preface
For Whom Is This Book Written?
What Makes This Book Different?
Comprehension Checks
Acknowledgments
Contents
Acronyms
1 Data Science and Its Tasks
Mathematical Foundations
2 Application-Specific Mappings and Measuring the Fit to Data
2.1 Continuous Mappings
2.1.1 Nonlinear Continuous Mappings
2.1.2 Mappings of Probability Distributions
2.2 Classification
2.2.1 Special Case: Two Linearly Separable Classes
2.2.2 Minimum Misclassification Rate for Two Classes
2.2.3 Probabilistic Classification
2.2.4 Generalization to Multiple Classes
2.3 Dynamical Systems
2.4 Spatial Systems
2.5 Mappings Received by “Unsupervised Learning”
2.5.1 Representations with Reduced Dimensionality
2.5.2 Optimal Encoding
2.5.3 Clusters as Unsupervised Classes
2.6 Chapter Summary
2.7 Comprehension Check
3 Data Processing by Neural Networks
3.1 Feedforward and Feedback Networks
3.2 Data Processing by Feedforward Networks
3.3 Data Processing by Feedback Networks
3.4 Feedforward Networks with External Feedback
3.5 Interpretation of Network Weights
3.6 Connectivity of Layered Networks
3.7 Shallow Networks Versus Deep Networks
3.8 Chapter Summary
3.9 Comprehension Check
4 Learning and Generalization
4.1 Algebraic Conditions for Fitting Error Minimization
4.2 Linear and Nonlinear Mappings
4.3 Overdetermined Case with Noise
4.4 Noise and Generalization
4.5 Generalization in the Underdetermined Case
4.6 Statistical Conditions for Generalization
4.7 Idea of Regularization and Its Limits
4.7.1 Special Case: Ridge Regression
4.8 Cross-Validation
4.9 Parameter Reduction Versus Regularization
4.10 Chapter Summary
4.11 Comprehension Check
5 Numerical Algorithms for Data Science
5.1 Classes of Minimization Problems
5.1.1 Quadratic Optimization
5.1.2 Convex Optimization
5.1.3 Non-convex Local Optimization
5.1.4 Global Optimization
5.2 Gradient Computation in Neural Networks
5.3 Algorithms for Convex Optimization
5.4 Non-convex Problems with a Single Attractor
5.4.1 Methods with Adaptive Step Size
5.4.2 Stochastic Gradient Methods
5.5 Addressing the Problem of Multiple Minima
5.5.1 Momentum Term
5.5.2 Simulated Annealing
5.6 Section Summary
5.7 Comprehension Check
Applications
6 Specific Problems of Natural Language Processing
6.1 Word Embeddings
6.2 Semantic Similarity
6.3 Recurrent Versus Sequence Processing Approaches
6.4 Recurrent Neural Networks
6.5 Attention Mechanism
6.6 Autocoding and Its Modification
6.7 Transformer Encoder
6.7.1 Self-attention
6.7.2 Position-Wise Feedforward Networks
6.7.3 Residual Connection and Layer Normalization
6.8 Section Summary
6.9 Comprehension Check
7 Specific Problems of Computer Vision
7.1 Sequence of Convolutional Operators
7.1.1 Convolutional Layer
7.1.2 Pooling Layers
7.1.3 Implementations of Convolutional Neural Networks
7.2 Handling Invariances
7.3 Application of Transformer Architecture to Computer Vision
7.3.1 Attention Mechanism for Computer Vision
7.3.2 Division into Patches
7.4 Section Summary
7.5 Comprehension Check
Index