To make the best use of the enormous quantities of data being generated across the most diverse disciplines, the data sciences are gaining importance worldwide. This book provides the mathematical foundations needed to handle data properly. It introduces the basics and functionality of the R programming language, which has become an indispensable tool for data science. It thus gives readers the skills they need to build their own toolkit as modern data scientists.
Author(s): Frank Emmert-Streib, Salissou Moutari, Matthias Dehmer
Publisher: de Gruyter
Year: 2020
Language: English
Tags: Statistics, Data Science, R, Machine Learning
Preface
1 Introduction
1.1 Relationships between mathematical subjects and data science
1.2 Structure of the book
1.2.1 Part one
1.2.2 Part two
1.2.3 Part three
1.3 Our motivation for writing this book
1.4 Examples and listings
1.5 How to use this book
Part I Introduction to R
2 Overview of programming paradigms
2.1 Introduction
2.2 Imperative programming
2.3 Functional programming
2.4 Object-oriented programming
2.5 Logic programming
2.6 Other programming paradigms
2.7 Compiler versus interpreter languages
2.8 Semantics of programming languages
2.9 Further reading
2.10 Summary
3 Setting up and installing the R program
3.1 Installing R on Linux
3.2 Installing R on MAC OS X
3.3 Installing R on Windows
3.4 Using R
3.5 Summary
4 Installation of R packages
4.1 Installing packages from CRAN
4.2 Installing packages from Bioconductor
4.3 Installing packages from GitHub
4.4 Installing packages manually
4.5 Activation of a package in an R session
4.6 Summary
5 Introduction to programming in R
5.1 Basic elements of R
5.2 Basic programming
5.3 Data structures
5.4 Handling character strings
5.5 Sorting vectors
5.6 Writing functions
5.7 Writing and reading data
5.8 Useful commands
5.9 Practical usage of R
5.10 Summary
6 Creating R packages
6.1 Requirements
6.2 R code optimization
6.3 S3, S4, and RC object-oriented systems
6.4 Creating an R package based on the S3 class system
6.5 Checking the package
6.6 Installation and usage of the package
6.7 Loading and using a package
6.8 Summary
Part II Graphics in R
7 Basic plotting functions
7.1 Plot
7.2 Histograms
7.3 Bar plots
7.4 Pie charts
7.5 Dot plots
7.6 Strip and rug plots
7.7 Density plots
7.8 Combining a scatterplot with histograms: the layout function
7.9 Three-dimensional plots
7.10 Contour and image plots
7.11 Summary
8 Advanced plotting functions: ggplot2
8.1 Introduction
8.2 qplot()
8.3 ggplot()
8.4 Summary
9 Visualization of networks
9.1 Introduction
9.2 igraph
9.3 NetBioV
9.4 Summary
Part III Mathematical basics of data science
10 Mathematics as a language for science
10.1 Introduction
10.2 Numbers and number operations
10.3 Sets and set operations
10.4 Boolean logic
10.5 Sum, product, and Binomial coefficients
10.6 Further symbols
10.7 Importance of definitions and theorems
10.8 Summary
11 Computability and complexity
11.1 Introduction
11.2 A brief history of computer science
11.3 Turing machines
11.4 Computability
11.5 Complexity of algorithms
11.6 Summary
12 Linear algebra
12.1 Vectors and matrices
12.2 Operations with matrices
12.3 Special matrices
12.4 Trace and determinant of a matrix
12.5 Subspaces, dimension, and rank of a matrix
12.6 Eigenvalues and eigenvectors of a matrix
12.7 Matrix norms
12.8 Matrix factorization
12.9 Systems of linear equations
12.10 Exercises
13 Analysis
13.1 Introduction
13.2 Limiting values
13.3 Differentiation
13.4 Extrema of a function
13.5 Taylor series expansion
13.6 Integrals
13.7 Polynomial interpolation
13.8 Root finding methods
13.9 Further reading
13.10 Exercises
14 Differential equations
14.1 Ordinary differential equations (ODE)
14.2 Partial differential equations (PDE)
14.3 Exercises
15 Dynamical systems
15.1 Introduction
15.2 Population growth models
15.3 The Lotka–Volterra or predator–prey system
15.4 Cellular automata
15.5 Random Boolean networks
15.6 Case studies of dynamical system models with complex attractors
15.7 Fractals
15.8 Exercises
16 Graph theory and network analysis
16.1 Introduction
16.2 Basic types of networks
16.3 Quantitative network measures
16.4 Graph algorithms
16.5 Network models and graph classes
16.6 Further reading
16.7 Summary
16.8 Exercises
17 Probability theory
17.1 Events and sample space
17.2 Set theory
17.3 Definition of probability
17.4 Conditional probability
17.5 Conditional probability and independence
17.6 Random variables and their distribution function
17.7 Discrete and continuous distributions
17.8 Expectation values and moments
17.9 Bivariate distributions
17.10 Multivariate distributions
17.11 Important discrete distributions
17.12 Important continuous distributions
17.13 Bayes’ theorem
17.14 Information theory
17.15 Law of large numbers
17.16 Central limit theorem
17.17 Concentration inequalities
17.18 Further reading
17.19 Summary
17.20 Exercises
18 Optimization
18.1 Introduction
18.2 Formulation of an optimization problem
18.3 Unconstrained optimization problems
18.4 Constrained optimization problems
18.5 Some applications in statistical machine learning
18.6 Further reading
18.7 Summary
18.8 Exercises
Bibliography
Subject Index