The book describes the theoretical principles of non-statistical methods of data analysis without going deep into complex mathematics. The emphasis is on solved examples that use real data, drawn either from the authors' laboratories or from the open literature. The examples cover a wide range of applications, such as quality assurance and quality control, critical analysis of experimental data, comparison of data samples from various sources, robust linear and nonlinear regression, and various tasks from financial analysis. The examples are useful primarily for chemical engineers, including those in industrial analytical and quality laboratories and designers of chemical and biological processes.
Features:
- Exclusive title on Mathematical Gnostics with multidisciplinary applications, and specific focus on chemical engineering.
- Clarifies the role of data space metrics, including the right way to aggregate uncertain data.
- Offers a new look at data probability, information, entropy, and the thermodynamics of data uncertainty.
- Enables the design of probability distributions for all real data samples, including smaller ones.
- Includes data for examples with solutions, along with exercises in R or Python.
The book is aimed at Senior Undergraduate Students, Researchers, and Professionals in Chemical/Process Engineering, Engineering Physics, Statistics, Mathematics, Materials, Geotechnical, Civil Engineering, Mining, Sales, Marketing and Service, and Finance.
Author(s): Pavel Kovanic
Publisher: CRC Press
Year: 2023
Language: English
Pages: 342
City: Boca Raton
Cover
Half Title
Title Page
Copyright Page
Contents
Preface
Introduction
Author Biography
1. Introductory Kindergarten
1.1. Elemental Notions
1.1.1. Abelian Group
1.1.2. Variability
1.1.3. Morphism and Invariant
1.1.4. Vector Space
1.1.5. Matrices
1.1.6. Probability Distribution
1.2. Sources of Inspiration for Mathematical Gnostics
1.2.1. Theory of General Systems
1.2.2. Theory of Measurement
1.2.3. Geometries
1.2.4. Maxwell's Contributions
1.2.5. Relativistic Physics
1.2.6. Thermodynamics
1.2.7. Matrix Algebra
1.3. Conclusions
2. Axioms
2.1. Axioms of the Data Model
2.2. Applications of Axiom 1
2.3. Data Aggregation as the Second Gnostic Axiom
2.4. Conclusions
3. Introduction to Non-Standard Thought
3.1. Paradigm
3.2. Statistical Paradigms
3.3. Statistical Data Weighing
3.4. Non-Statistical Paradigms of Uncertainty
3.5. On the Need of an Alternative to Statistics
3.6. Principles of Advanced Data Analysis
3.7. The Gnostic Concept
3.8. Conclusions
4. Quantification
4.1. Ideal Quantification
4.2. Real Quantification
4.3. Conclusions
5. Estimation and Ideal Gnostic Cycle
5.1. A Game with Nature
5.2. Double Numbers
5.3. Gnostic Data Characteristics
5.4. The Ideal Gnostic Cycle
5.5. Information Perpetuum Mobile?
5.6. Existence and Uniqueness of the Ideal Gnostic Cycle
5.7. Conclusions
6. Geometry
6.1. A Historical Dispute on Robustness of Statistics
6.2. Distance as a Problem
6.3. Additivity in Data Aggregation
6.3.1. Statistical Mean Value and Data Weighting
6.4. Double Robustness
6.5. The Curvature of the Space of Uncertain Data
6.6. Three Geometries
6.7. Conclusions
7. Aggregation
7.1. Why the Least Squares Method (Frequently) Works
7.2. Aggregation of Uncertain Data
7.3. The Second Axiom
7.4. Conclusions
8. Thermodynamics of Uncertain Data
8.1. Thermodynamic Interpretation of Gnostic Data Characteristics
8.2. Maxwell's Demon
8.3. Entropy ↔ Information Conversion
8.4. Albert Perez's Information
8.5. Statistical Interpretation of Gnostic Data Characteristics
8.6. Between Mediocristan and Extremistan
8.7. Conclusions
9. Kernel Estimation
9.1. Parzen's Estimating Kernel
9.2. Gnostic Kernel
9.3. Scale Parameters
9.4. Conclusions
10. Probability Distribution Functions
10.1. Probabilities
10.2. Data Domains
10.3. Tasks Solvable by Distribution Functions
10.4. The Estimating Local Distribution
10.5. Quantifying Distributions
10.6. Empirical Distribution Function and the Fit
10.7. Some Applications of Distribution Functions
10.7.1. Revealing Historical Information
10.7.2. Hypotheses Testing
10.7.3. A Large Survey of Chemical Pollutants
10.8. The Homogeneity Problem
10.9. Conclusions
11. Applications of Local Distributions
11.1. Enrichment of the EGDF-Analysis
11.2. Revealing Inner Structure of a Data Sample
11.3. Marginal Analysis
11.4. Information Capability of Data
11.5. Interval Analysis
11.6. Diversity of Samples
11.7. Conclusions
12. On the Notion of Normality
12.1. Normality of Data
12.1.1. Statistical Approach
12.1.2. Empirical Way in Clinical Practice
12.1.3. Similarity-Based Reference Values in Economy
12.1.4. Fuzzy-Set Approach
12.1.5. Automatic Warning and Emergency Systems
12.2. Requirements to Ideal Estimation of Bounds of Normality
12.3. Elements of Gnostic Solution of the Normality Problem in a One-Dimensional Analysis
12.4. Critics on the Identity Gaussian ≡ Normal
12.4.1. Re-definition of Normality
12.4.2. On a Still Daydreamed Research Project BONUS
12.5. Conclusions
13. Applications of Global Distribution Functions
13.1. Global Distribution Function
13.2. Comparison of Global with Local Distribution
13.3. Two Didactic Stories
13.4. Conclusions
14. Data Censoring
14.1. Uncensored Data
14.2. Left-Censored Data
14.3. Right-Censored Data
14.4. Interval Data
14.5. On an Unknown Limit of Detection
14.6. Examples of Surviving
14.7. Non-Standard Application of Data Censoring
14.7.1. Data and Psychology
14.7.2. Three Aspects of Data Interpretation
14.8. Conclusions
15. Gnostic Thermodynamic Analysis of Data Uncertainty
15.1. Gnostic Data Calibration
15.1.1. Real Data for Examples
15.2. Data Calibration
15.2.1. LS-Optimal Numerical Operators
15.3. Calibration of the NIST12 Data
15.4. Calibration of the NIST37 Data
15.5. Conclusions
16. Robust Estimation of a Constant
16.1. Gnostic Data Aggregation Principle Used in Estimation
16.2. Scale Parameter
16.3. More on the Gnostic Data Aggregation
16.3.1. Example
16.3.2. Example of Robust Estimation of the Mean of Multiplicative Data
16.3.3. Robust Estimation of the Mean of Simulated Data
16.4. Conclusions
17. Measuring the Data Uncertainty
17.1. Shortly on the Standard Approach
17.2. The Need of Objective Measuring the Variability
17.3. The Triplication of the Mean Values
17.4. The Need of a Unit of Uncertainty
17.5. The Error of a Mean
17.6. Examples
17.6.1. Swiss Fertility and Socioeconomic Indicators (1888) Data
17.6.2. Financial Statement Analysis
17.6.3. Weather Parameters
17.6.4. An Important Medical Parameter
17.6.5. Non-homogeneous Data
17.6.6. Parameters of Uncertainty
17.7. Discussion on Different Means
17.7.1. Re-definition of Variance
17.8. Conclusions
18. Homo- or Heteroscedastic Data
18.1. Decision Making
18.2. Examples
18.3. Conclusion
19. Gnostic Multidimensional Regression Models
19.1. Formulation of the Robust Regression Problem
19.2. Additive and Multiplicative Regression Models
19.3. Comparison of Robust Regression Models
19.3.1. Statistical Methods for Comparison
19.3.2. Robust Regression in Mathematical Gnostics
19.3.3. Data for Comparison
19.3.4. Criteria for Evaluation of Methods
19.3.5. Results of Comparison
19.3.6. Discussion of the Results
19.4. The Explicit and Implicit Regression Models
19.5. Examples
19.6. Homogeneity of an MD-Model
19.7. An Important Multidimensional Model
19.8. Applications of the Robust Regression Models
19.9. Conclusions
20. Data Filtering
20.1. Filtering
20.2. Total Data Variability and Its Components
20.3. Filtering by Regression
20.4. Filtering Effect of Proper Data Aggregation
20.5. Improving the Matrix Quality
20.6. Cleaning of Matrices
20.7. Conclusions
21. Decision Making in Mathematical Gnostics
21.1. Datacratic Decision Making in Mathematical Gnostics
21.2. Conclusions
22. Comparisons
22.1. Comparisons of Measurement of Toxicity
22.2. Comparison of Measurement of Concentration of Cannabinoids
22.2.1. Comparison of Multiplicative Errors
22.3. Requirements to the Advanced Comparison
22.4. Preparing Data for Analysis
22.5. Analysis of Measurement Errors
22.5.1. Characterization of Data
22.5.2. Comparison by Parameters
22.6. Conclusions
23. Advanced Production Quality Control
23.1. Exploratory Analysis
23.2. Automation of the Exploratory Analysis
23.3. On the Necessity of Data Inspection
23.4. Data Certification
23.5. Example of Advanced Quality Control
23.6. Homogeneity and Outliers
23.7. Estimation of Left-Censored Data
23.8. Data Certification and Interval Analysis
23.9. Comparison of Laboratories
23.10. Conclusions
24. Robust Correlation
24.1. Correlation via Distribution Functions
24.2. Correlations by Means of Regression
24.3. Correlation and Filtering
24.4. Autocorrelations
24.5. Conclusions
25. General Relations
25.1. Relations Considered in Mathematical Gnostics
25.2. Robust Curve Fitting
25.3. The Experimental Mathematics
25.4. Visualization of a Matrix
25.5. Critical Points
25.5.1. Relations in Biology
25.5.2. Relations in Technology
25.5.3. Relations in Meteorology
25.5.4. Auto-Relations
25.6. Conclusions
Bibliography
Index