Mathematical Methods in Data Science

Mathematical Methods in Data Science introduces a new approach based on network analysis to integrate big data into the framework of ordinary and partial differential equations for data analysis and prediction. The mathematics is accompanied by examples and problems arising in data science that demonstrate how advanced mathematics, in particular data-driven differential equations, is used. Chapters also cover network analysis and ordinary and partial differential equations, based on recent published and unpublished results.

There are a number of books on mathematical methods in data science. Currently, all of these books focus primarily on linear algebra, optimization, and statistical methods. However, network analysis and ordinary and partial differential equation models play an increasingly important role in data science. With the availability of an unprecedented amount of clinical, epidemiological, and social COVID-19 data, data-driven differential equation models have become more useful for infection prediction and analysis.

Author(s): Jingli Ren, Haiyan Wang
Publisher: Elsevier
Year: 2023

Language: English
Pages: 258
City: Amsterdam

Front Cover
Mathematical Methods in Data Science
Copyright
Contents
Preface
Acknowledgments
1 Linear algebra
1.1 Introduction
1.2 Elements of linear algebra
1.2.1 Linear spaces
1.2.1.1 Linear combinations
1.2.1.2 Linear independence and dimension
1.2.2 Orthogonality
1.2.2.1 Orthonormal bases
1.2.2.2 Best approximation theorem
1.2.3 Gram–Schmidt process
1.2.4 Eigenvalues and eigenvectors
1.2.4.1 Diagonalization of symmetric matrices
1.2.4.2 Constrained optimization
1.3 Linear regression
1.3.1 QR decomposition
1.3.2 Least-squares problems
1.3.3 Linear regression
1.4 Principal component analysis
1.4.1 Singular value decomposition
1.4.2 Low-rank matrix approximations
1.4.3 Principal component analysis
1.4.3.1 Covariance matrix
1.4.3.2 Principal component analysis
1.4.3.3 Total variance
2 Probability
2.1 Introduction
2.2 Probability distribution
2.2.1 Probability axioms
2.2.1.1 Sample spaces and events
2.2.2 Conditional probability
2.2.3 Discrete random variables
2.2.3.1 The expected value and variance of X
2.2.4 Continuous random variables
2.2.4.1 Expected values and variances
2.2.4.2 The normal distribution
2.3 Independent variables and random samples
2.3.1 Joint probability distributions
2.3.1.1 Two discrete random variables
2.3.1.2 Two continuous random variables
2.3.1.3 Independent random variables
2.3.2 Correlation and dependence
2.3.2.1 Correlation for random variables
2.3.2.2 Correlation for samples
2.3.3 Random samples
2.3.3.1 Random samples
2.3.3.2 The central limit theorem
2.4 Maximum likelihood estimation
2.4.1 MLE for random samples
2.4.2 Linear regression
3 Calculus and optimization
3.1 Introduction
3.2 Continuity and differentiation
3.2.1 Limits and continuity
3.2.2 Derivatives
3.2.2.1 Single-variable case
3.2.2.2 General case
3.2.2.3 Further derivatives
3.2.3 Taylor's theorem
3.3 Unconstrained optimization
3.3.1 Necessary and sufficient conditions of local minimizers
3.3.1.1 Sufficient conditions for local minimizers
3.3.2 Convexity and global minimizers
3.3.2.1 Convex sets and functions
3.3.2.2 Global minimizers of convex functions
3.3.3 Gradient descent
3.3.3.1 Steepest descent
3.4 Logistic regression
3.5 K-means
3.6 Support vector machine
3.7 Neural networks
3.7.1 Mathematical formulation
3.7.2 Activation functions
3.7.2.1 Step function
3.7.2.2 ReLU function
3.7.2.3 Sigmoid
3.7.2.4 Softmax function
3.7.3 Cost function
3.7.4 Backpropagation
3.7.5 Backpropagation algorithm
4 Network analysis
4.1 Introduction
4.2 Graph modeling
4.3 Spectral graph bipartitioning
4.4 Network embedding
4.5 Network based influenza prediction
4.5.1 Introduction
4.5.2 Data analysis with spatial networks
4.5.2.1 Collection of data
4.5.2.2 Regression models for spatial network
4.5.3 ANN method for prediction
4.5.3.1 Dynamical complexity of ILI
4.5.3.2 DRBFNN method for prediction
4.5.3.3 Performance analysis
5 Ordinary differential equations
5.1 Introduction
5.2 Basic differential equation models
5.2.1 Logistic differential equations
5.2.2 Epidemical model
5.3 Prediction of daily PM2.5 concentration
5.3.1 Introduction
5.3.2 Genetic programming for ODE
5.3.2.1 Construction of ODEs for prediction of PM2.5 concentration
5.3.2.2 Genetic programming for ODE structure
5.3.2.3 Fitness function
5.3.2.4 Parameter identification
5.3.3 Experimental results and prediction analysis
5.3.3.1 Statistical model
5.3.3.2 ODE models obtained by our data-driven method
5.3.3.3 Prediction comparison between statistical model and the ODE models
5.3.3.4 Remarks
5.4 Analysis of COVID-19
5.4.1 Introduction
5.4.2 Modeling and parameter estimation
5.4.2.1 Model hypothesis
5.4.2.2 Model establishment
5.4.2.3 Parameter estimation
5.4.3 Model simulations
5.4.4 Conclusion and perspective
5.5 Analysis of COVID-19 in Arizona
5.5.1 Introduction
5.5.2 Data sources and collection
5.5.3 Model simulations
5.5.4 Remarks
6 Partial differential equations
6.1 Introduction
6.2 Formulation of partial differential equation models
6.3 Bitcoin price prediction
6.3.1 Network analysis for bitcoin
6.3.2 PDE modeling
6.3.3 Bitcoin price prediction
6.3.4 Remarks
6.4 Prediction of PM2.5 in China
6.4.1 Introduction
6.4.2 PDE model for PM2.5
6.4.3 Data collection and clustering
6.4.4 PM2.5 prediction
6.4.5 Remarks
6.5 Prediction of COVID-19 in Arizona
6.5.1 Introduction
6.5.2 Arizona COVID data
6.5.3 PDE modeling of Arizona COVID-19
6.5.4 Model prediction
6.5.5 Remarks
6.6 Compliance with COVID-19 mitigation policies in the US
6.6.1 Introduction
6.6.2 Data set sources and collection
6.6.3 PDE model for quantifying compliance with COVID-19 policies
6.6.4 Model prediction
6.6.5 Analysis of compliance with the US COVID-19 mitigation policy
6.6.6 Remarks
Bibliography
Index
Back Cover