This book discusses Artificial Neural Networks (ANNs) and their ability to predict outcomes using deep and shallow learning principles. The author first describes ANN implementation, which requires at least three layers of cells: an input layer, an output layer, and a hidden (intermediate) layer. For this, the author states, it is necessary to develop an architecture that models not explicit mathematical rules but only the action and response variables that govern an event and the reactions that may occur within it. The book explains the rationale for and necessity of each ANN model, considering its similarity to earlier methods and the underlying philosophical and logical principles.
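As an illustrative aside (not drawn from the book), the sketch below shows what such a minimal three-layer network can look like in code: an input layer, one hidden layer, and an output layer trained by backpropagation on a toy XOR task. The layer sizes, sigmoid activation, learning rate, and NumPy implementation are assumptions chosen here for brevity, not the author's method.

```python
# Illustrative sketch only: a minimal three-layer ANN (input, hidden, output)
# trained by backpropagation on the XOR problem. All settings are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy input-output pairs (XOR): 4 samples, 2 input cells, 1 output cell.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# Weights and biases for input->hidden and hidden->output connections.
W1 = rng.normal(scale=1.0, size=(2, 4))   # 2 inputs -> 4 hidden cells
b1 = np.zeros((1, 4))
W2 = rng.normal(scale=1.0, size=(4, 1))   # 4 hidden cells -> 1 output
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for epoch in range(10000):
    # Forward pass through the hidden and output layers.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: gradients of squared error through the sigmoid layers.
    err = out - y
    d_out = err * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # Gradient-descent updates of weights and biases.
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(np.round(out, 2))  # approaches [[0], [1], [1], [0]]
```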
Author(s): Zekâi Şen
Publisher: Springer
Year: 2023
Language: English
Pages: 677
City: Cham
Preface
Contents
Chapter 1: Introduction
1.1 General
1.2 Historical Developments
1.3 Information and Knowledge Evolution Stages
1.4 Determinism Versus Uncertainty
1.4.1 Randomness
1.5 Logic
1.5.1 Bivalent (Crisp) Logic
1.5.2 Fuzzy Logic
1.5.3 Bivalent-Fuzzy Distinction
1.6 Humans, Society, and Technology
1.7 Education and Uncertainty
1.8 Future Scientific Methodological Developments
1.8.1 Shallow Learning
1.8.2 Deep Learning
1.8.3 Shallow-Deep Learning Relations
1.9 Innovation
1.9.1 Inventions
1.10 Book Content and Reading Recommendations
1.11 Conclusions
References
Chapter 2: Artificial Intelligence
2.1 General
2.2 Artificial Intelligence History
2.2.1 Before Renaissance
2.2.2 Recent History
2.2.3 AI Education
2.3 Humans and Intelligence
2.4 Intelligence Types
2.5 Artificial Intelligence Methods
2.5.1 Rational AI
2.5.2 Methodical AI
2.6 Artificial Intelligence Methodologies
2.7 Natural and Artificial Intelligence Comparison
2.8 AI Proposal
2.9 AI in Science and Technology
2.10 Misuses in Artificial Intelligence Studies
2.11 Conclusions
References
Chapter 3: Philosophical and Logical Principles in Science
3.1 General
3.2 Human Mind
3.3 Rational Thought Models and Reasoning
3.3.1 Deductive
3.3.2 Inductive
3.3.3 Deductive and Inductive Conclusion
3.3.4 Proportionality Rule
3.3.5 Shape Rule
3.4 Philosophy
3.4.1 Ontology
3.4.2 Metaphysics
3.4.3 Epistemology
3.4.4 Aesthetics
3.4.5 Ethics
3.5 Science
3.5.1 Phenomenological
3.5.2 Logical Foundation
3.5.3 Objectivity
3.5.4 Testability
3.5.5 Selectivity
3.5.6 Falsification
3.5.7 Restrictive Assumptions
3.6 Science and Philosophy
3.6.1 Philosophy of Science
3.6.2 Implications for the Philosophy of Science
3.7 Logic
3.7.1 Logic Rules
3.7.2 Elements of Logic
3.7.2.1 Bivalent (Crisp) Logic Words
3.7.3 Logic Sentences (Propositions)
3.7.4 Propositions and Inferences
3.7.5 Logic Circuits
3.7.6 Logic Types
3.7.6.1 Crisp Logic
3.7.6.2 Multiple Logic
3.7.6.3 Probabilistic Logic
3.7.6.4 Symbolic Logic
3.7.6.5 Logic, Engineering, and Computers
3.8 Sets and Clusters
3.8.1 Crisp Sets
3.8.1.1 Subsets
3.8.2 Fuzzy Sets
3.8.2.1 Membership Functions
3.8.2.2 Fuzzy Operations
Fuzzy ANDing
Fuzzy ORing
Fuzzy NOTing
3.9 Fuzzy Logic Principles
3.9.1 Fuzziness in Daily Affairs
3.9.2 Fuzzy Logical Thinking Model
3.9.3 The Need for Fuzzy Logic
3.9.4 Mind as the Source of Fuzziness
3.9.5 Fuzzy Propositions
3.9.6 Fuzzy Inference System (FIS)
3.9.7 Fuzzy Modeling Systems
3.9.7.1 Pure Fuzzy (Mamdani) System
3.9.7.2 Partial Fuzzy (Sugeno) System
3.9.7.3 General Fuzzy System
3.10 Defuzzification
3.11 Conclusions
References
Chapter 4: Uncertainty and Modeling Principles
4.1 General
4.2 Percentages and Probability Principles
4.3 Probability Measures and Definitions
4.3.1 Frequency Definition
4.3.2 Classical Definition
4.3.3 Subjective Definition
4.4 Types of Probability
4.4.1 Common Probability
4.4.2 Conditional Probability
4.4.3 Marginal Probability
4.5 Axioms of Probability
4.5.1 Probability Dependence and Independence
4.5.2 Probability Assumptions
4.6 Numerical Uncertainties
4.6.1 Uncertainty Definitions
4.7 Forecast: Estimation
4.8 Types of Uncertainty
4.8.1 Chaotic Uncertainty
4.9 Indeterminism
4.10 Uncertainty in Science
4.11 Importance of Statistics
4.12 Basic Questions Prior to Data Treatment
4.13 Simple Probability Principles
4.14 Statistical Principles
4.15 Statistical Parameters
4.15.1 Central Measure Parameters
4.15.1.1 Arithmetic Mean
4.15.1.2 Median
4.15.1.3 Mode
4.15.1.4 Percentiles
4.15.1.5 Statistical Weighted Average
4.15.1.6 Physical Weighted Average
4.15.1.7 Geometric Average
4.15.1.8 Harmonic Average
4.15.2 Deviation Parameters
4.15.2.1 Data Range
4.15.2.2 Deviations
4.15.2.3 Sum of Square Deviations
4.15.2.4 Variance
4.15.2.5 Skewness Coefficient
4.15.2.6 Kurtosis Coefficient
4.15.2.7 Standard Deviation
4.15.2.8 Variation Coefficient
4.15.2.9 Standardized Variables
4.15.2.10 Dimensionless Standard Series
4.15.2.11 Skewness: Kurtosis Diagram
4.15.2.12 Engineering Truncations
4.16 Histogram (Percentage Frequency Diagram)
4.16.1 Data Frequency
4.16.2 Subintervals and Parameters
4.17 Normal (Gaussian) Test
4.18 Statistical Model Efficiency Formulations
4.19 Correlation Coefficient
4.19.1 Pearson Correlation
4.19.2 Nonparametric Correlation Coefficient
4.19.2.1 Spearman's Rank Correlation
4.19.2.2 Autorun (Sen) Test
4.20 Classical Regression Techniques
4.20.1 Scatter Diagrams
4.20.2 Mathematical Linear Regression Model
4.20.3 Statistical Linear Regression Model
4.20.4 Least Squares
4.20.5 Simple Linear Regression Procedure
4.20.6 Residual Properties
4.21 Cluster Regression Analysis
4.21.1 Study Area
4.21.2 Cluster Regression Model
4.21.3 Application
4.22 Trend Identification Methodologies
4.22.1 Mann-Kendall (MK) Test
4.22.2 Sen Slope (SS)
4.22.3 Regression Method (RM)
4.22.4 Spearman's Rho Test (SR)
4.22.5 Pettitt Change Point Test
4.22.6 Innovative Trend Analysis (ITA)
4.23 Future Directions and Recommendations
4.24 Conclusions
References
Chapter 5: Mathematical Modeling Principles
5.1 General
5.2 Conceptual Models
5.2.1 Knowledge and Information
5.2.2 Observation
5.2.3 Experience
5.2.4 Audiovisual
5.3 Mathematics
5.3.1 Arithmetic Operations
5.3.2 Logical Relationships
5.3.3 Equations
5.4 Geometry and Algebra
5.5 Modeling Principles
5.5.1 Square Graph for Model Output Justification
5.5.2 Model Modification
5.5.3 Discussions
5.6 Equation with Logic
5.6.1 Equation by Experiment
5.6.2 Extracting Equations from Data
5.6.3 Extracting Equations from Dimensions
5.7 Logical Mathematical Derivations
5.7.1 Logic Modeling of Electrical Circuits
5.8 Risk and Reliability
5.9 The Logic of Mathematical Functions
5.9.1 Straight Line
5.9.2 Quadratic Curve (Parabola)
5.9.3 Cubic Curve
5.9.4 Multi-degree Curve (Polynomial)
5.9.5 Equation with Decimal Exponent (Power Function)
5.9.6 Exponential Curve
5.9.7 Logarithmic Curve
5.9.8 Double Asymptotic Curve (Hyperbola)
5.9.9 Complex Curve
5.10 Mathematics Logic and Language
5.10.1 From Language to Mathematics
5.10.2 From Mathematics to Language
5.11 Mathematical Models
5.11.1 Closed Mathematics Models
5.11.2 Explicit Mathematics Models
5.11.2.1 Linear Model Through Origin
5.11.3 Polynomial Models
5.11.3.1 Parabola Model
5.12 Conclusions
Appendix A: VINAM Matlab Software
References
Chapter 6: Genetic Algorithm
6.1 General
6.2 Decimal Number System
6.3 Binary Number System
6.4 Random Numbers
6.4.1 Random Selections
6.4.1.1 Cards Drawing
6.4.1.2 Roulette Wheel
6.4.1.3 Stochastic Choice
6.5 Genetic Numbers
6.5.1 Genetic Algorithm Data Structure
6.5.1.1 Digital Converter
6.5.1.2 Target Function
6.5.1.3 Fitness Function
6.6 Methods of Optimization
6.6.1 Definition and Characteristics of Genetic Algorithms
6.6.1.1 Optimization
6.6.1.2 Optimization Stages
6.6.1.3 What Is Optimization?
6.6.1.4 Optimization Classes
6.7 Least Minimization Methods
6.7.1 Completely Systematic Research Method
6.7.2 Analytical Optimization
6.7.3 Steepest Ascent (Descent) Method
6.8 Simulated Annealing (SA) Method
6.8.1 Application
6.8.2 Random Optimization Methods
6.9 Binary Genetic Algorithms (GA)
6.9.1 Benefits and Consequences of Genetic Algorithm (GA)
6.9.2 Definition of GAs
6.9.3 Binary GA
6.9.4 Selection of Variables and Target Function
6.9.5 Target Function and Vigor Measurement
6.9.6 Representation of Variables
6.9.7 Initial Population
6.9.8 Selection Process
6.9.9 Selection of Spouses
6.9.10 Crossing
6.9.10.1 Single-Cut Crossover
6.9.10.2 Double-Cut Crossover
6.9.10.3 Multi-section Crossover
6.9.10.4 Uniform Crossover
6.9.10.5 Inversion
6.9.10.6 Mixed Crossover
6.9.10.7 Interconnected Crossover
6.9.10.8 Linear Associative Crossover
6.9.11 Mutation (Number Change)
6.10 Probabilities of GA Transactions
6.11 Gray Codes
6.12 Important Issues Regarding the Behavior of the Method
6.13 Convergence and Schema Concept
6.14 GA Parameters Selection
6.14.1 Gene Size
6.14.2 Population Size
6.14.3 Example of a Simple GA
6.14.4 Decimal Number-Based Genetic Algorithms
6.14.5 Continuous Variable GA Elements
6.14.6 Variables and Goal Function
6.14.7 Parameter Coding, Accuracy, and Limits
6.14.8 Initial Population
6.14.8.1 Natural Selection
6.14.8.2 Crossover
6.14.8.3 Waiting Pool
6.14.8.4 Mutation (Number Change)
6.15 General Applications
6.15.1 Function Maximizing
6.15.2 Geometric Weight Functions
6.15.3 Classification of Precipitation Condition
6.15.4 Two Independent Datasets
6.16 Conclusions
References
Chapter 7: Artificial Neural Networks
7.1 General
7.2 Biological Structure
7.3 ANN Definition and Characteristics
7.4 History
7.5 ANN Principles
7.5.1 ANN Terminology and Usage
7.5.2 Areas of ANN Use
7.5.3 Similarity of ANNs to Classic Methods
7.6 Vector and Matrix Similarity
7.6.1 Similarity to Kalman Filters
7.6.2 Similarity to Multiple Regression
7.6.3 Similarity to Stochastic Processes
7.6.4 Similarity to Black Box Models
7.7 ANN Structure
7.8 Perceptron (Single Linear Sensor)
7.8.1 Perceptron Principles
7.8.2 Perceptron Architecture
7.8.3 Perceptron Algorithm
7.8.4 Perceptron Implementation
7.8.4.1 Data Clustering
7.8.4.2 Pattern Separation
7.9 Single Recurrent Linear Neural Network
7.9.1 ADALINE Application
7.9.2 Multi-linear Sensors (MLS)
7.9.3 Multiple Adaptive Linear Element (MADALINE) Neural Network
7.9.4 ORing Problem
7.10 Multilayer Artificial Neural Networks and Management Principles
7.11 ANN Properties
7.11.1 ANN Architectures
7.11.2 Layered ANN
7.11.2.1 Two-Layer ANN
7.11.2.2 Multi-layer ANN
7.11.3 System Dynamics
7.12 Activation Functions
7.12.1 Linear
7.12.2 Threshold
7.12.3 Ramp
7.12.4 Sigmoid
7.12.5 Hyperbolic
7.12.6 Gaussian
7.13 Key Points Before ANN Modeling
7.13.1 ANN Audit
7.13.2 ANN Mathematical Calculations
7.13.3 Training and Modeling with Artificial Networks
7.13.4 ANN Learning Algorithm
7.13.5 ANN Education
7.14 Description of Training Rules
7.14.1 Supervised Training
7.14.1.1 Training Where the Teacher Tells the Correct Result
7.14.1.2 Education in Which the Teacher Gives Only Reward-Punishment
7.14.2 Unsupervised Training
7.14.2.1 Hamming Networks
7.14.2.2 Application
7.14.3 Compulsory Supervision
7.15 Competitive Education
7.15.1 Semi-teacher Training
7.15.2 Learning Rule Algorithms
7.15.2.1 Perceptron Learning Rule
7.15.2.2 Widrow-Hoff Learning Rule
7.15.2.3 Hebb Learning Algorithm
7.15.2.4 Delta Learning Rule
7.15.3 Back Propagation Algorithm
7.15.3.1 Unsupervised ANN
7.15.3.2 Linear Vector Piece
7.15.3.3 Linear Vector Piece ANN
7.15.3.4 Learning Rule of LVQ ANN
7.15.3.5 Training of LVQ ANN
7.16 Renovative Oscillation Theory (ROT) ANN
7.16.1 Differences of ROT ANN and Others
7.16.2 ROT ANN Architecture
7.16.3 ROT ANN Education
7.16.3.1 Application
7.17 ANN with Radial Basis Activation Function
7.17.1 K-Means
7.17.2 Radial Basis Activation Function
7.17.3 RBF ANN Architecture
7.17.4 RBF ANN Training
7.18 Recycled Artificial Neural Networks
7.18.1 Elman ANN
7.18.2 Elman ANN Training
7.19 Hopfield ANN
7.19.1 Discrete Hopfield ANN
7.19.2 Application
7.19.3 Continuous Hopfield ANN
7.20 Simple Competitive Learning Network
7.20.1 Application
7.21 Self-Organizing Mapping ANN
7.21.1 SOM ANN Training
7.22 Memory Congruent ANN
7.22.1 Matrix Fit Memory Method
7.22.1.1 Application
7.22.1.2 The Least Squares Method
7.22.1.3 Application
7.23 General Applications
7.23.1 Missing Data Complement Application
7.23.2 Classification Application
7.23.3 Temperature Prediction Application
7.24 Conclusions
References
Chapter 8: Machine Learning
8.1 General
8.2 Machine Learning-Related Topics
8.3 Historical Backgrounds of ML and AI Couple
8.4 Machine Learning Future
8.4.1 Dataset
8.5 Uncertainty Sources and Calculation Methods
8.5.1 Reduction of Uncertainties
8.5.2 Probability Density Functions (PDF)
8.5.3 Confidence Interval
8.5.4 Uncertainty Roadmap
8.6 Features
8.7 Labels
8.7.1 Numeric Labels: Regression
8.7.2 Categorical Labels: Classification
8.7.3 Ordinal Labels
8.8 Learning Through Applications
8.9 Forecast Verification
8.9.1 Parametric Validation
8.9.1.1 Frequency Distribution
8.9.2 Prediction Skill
8.9.3 The Contingency Table
8.10 Learning Methods
8.10.1 Supervised Learning
8.10.2 Unsupervised Learning
8.10.3 Reinforcement Learning
8.11 Objective and Loss Functions
8.11.1 Loss Function for Classification
8.11.1.1 Binary Cross Entropy Loss (BCEL)
8.11.1.2 Categorical Cross Entropy Loss
8.12 Optimization
8.13 ML and Simple Linear Regression
8.13.1 Least Square Technique
8.14 Classification and Categorization
8.15 Clustering
8.15.1 Clustering Goal
8.15.2 Clustering Algorithm
8.15.3 Cluster Distance Measure
8.16 k-Means Clustering
8.17 Fuzzy c-Means Clustering
8.18 Frequency-k-Means-c-Means Relationship
8.19 Dimensionality Reduction
8.20 Ensemble Methods
8.21 Neural Nets and Deep Learning
8.21.1 Learning
8.22 Conclusions
Appendix: Required Software for Data Reliability Analysis
References
Chapter 9: Deep Learning
9.1 General
9.2 Deep Learning Methods
9.2.1 Deep Learning Neural Networks
9.2.2 Limitations and Challenges
9.3 Deep Learning and Machine Learning
9.4 Different Neural Network Architectures
9.5 Convolutional Neural Network (CNN)
9.5.1 CNN Foundation
9.5.2 CNN Network Layers
9.5.2.1 Input Layer
9.5.2.2 Convolution Layer
9.5.2.3 Convolution Operation
9.5.2.4 Pooling Layer
9.6 Activation Functions
9.6.1 Sigmoid
9.6.2 Tanh
9.6.3 Rectified Linear Unit (ReLU)
9.6.4 Leaky ReLU
9.6.5 Noisy ReLU
9.6.6 Parametric Linear Units
9.7 Fully Connected (FC) Layer
9.8 Optimization (Loss) Functions
9.8.1 Soft-Max Loss Function (Cross-Entropy)
9.8.2 Euclidean Loss Function
9.8.3 Hinge Loss Function
9.9 CNN Training Process
9.10 Parameter Initialization
9.11 Regularization to CNN
9.11.1 Dropout
9.11.2 Drop-Weights
9.11.3 The ℓ2 Regularization
9.11.4 The ℓ1 Regularization
9.11.5 Early Stopping
9.12 Recurrent Neural Networks
9.12.1 RNN Architecture Structure
9.12.2 RNN Computational Time Sequence
9.13 The Problem of Long-Term Dependencies
9.13.1 LSTM Network
9.14 Autoencoders
9.14.1 Deep Convolutional AE (DCAE)
9.15 Natural Language Models
9.16 Conclusions
References
Index