In response to the exponentially increasing need to analyze vast amounts of data, Neural Networks for Applied Sciences and Engineering: From Fundamentals to Complex Pattern Recognition provides scientists with a simple but systematic introduction to neural networks. Beginning with an introductory discussion of the role of neural networks in scientific data analysis, the book lays a solid foundation of basic neural network concepts. It gives an overview of neural network architectures for practical data analysis, followed by extensive step-by-step coverage of linear networks and the multilayer perceptron for nonlinear prediction and classification, explaining all stages of processing and model development and illustrating them through practical examples and case studies. Later chapters present extensive coverage of Self-Organizing Maps for nonlinear data clustering, recurrent networks for linear and nonlinear time-series forecasting, and other network types suitable for scientific data analysis. With an easy-to-understand format that uses extensive graphical illustrations and a multidisciplinary scientific context, the book fills a gap in the literature on neural networks for multidimensional scientific data and relates neural networks to statistics.

Features:
- Explains neural networks in a multidisciplinary context
- Uses extensive graphical illustrations to explain complex mathematical concepts for quick and easy understanding
- Examines in depth neural networks for linear and nonlinear prediction, classification, clustering, and forecasting
- Illustrates all stages of model development and interpretation of results, including data preprocessing, data dimensionality reduction, input selection, model development and validation, model uncertainty assessment, and sensitivity analyses on inputs, errors, and model parameters

Sandhya Samarasinghe obtained her MSc in Mechanical Engineering from Lumumba University in Russia and an MS and PhD in Engineering from Virginia Tech, USA. Her neural networks research focuses on theoretical understanding and advancements as well as practical implementations.
Author(s): Sandhya Samarasinghe
Edition: 1
Year: 2006
Language: English
Pages: 570
Cover (ISBN 084933375X)......Page 1
Half Title......Page 2
Series Title......Page 3
Title......Page 4
Dedication......Page 6
Contents......Page 8
Preface......Page 18
Acknowledgments......Page 22
About the Author......Page 24
1.1 Introduction......Page 25
1.2 Layout of the Book......Page 28
References......Page 31
2.1 Introduction and Overview......Page 34
2.2 Neural Networks and Their Capabilities......Page 35
2.3 Inspirations from Biology......Page 39
2.4 Modeling Information Processing in Neurons......Page 41
2.5 Neuron Models and Learning Strategies......Page 42
2.5.1 Threshold Neuron as a Simple Classifier......Page 43
2.5.2.1 Hebbian Learning......Page 46
2.5.2.3 Supervised Learning......Page 49
2.5.3 Perceptron with Supervised Learning as a Classifier......Page 50
2.5.3.1 Perceptron Learning Algorithm......Page 51
2.5.3.2 A Practical Example of Perceptron on a Larger Realistic Data Set: Identifying the Origin of Fish from the Growth-Ring Diameter of Scales......Page 58
2.5.3.3 Comparison of Perceptron with Linear Discriminant Function Analysis in Statistics......Page 61
2.5.3.4 Multi-Output Perceptron for Multicategory Classification......Page 63
2.5.3.6 Perceptron Summary......Page 68
2.5.4 Linear Neuron for Linear Classification and Prediction......Page 69
2.5.4.1 Learning with the Delta Rule......Page 70
2.5.4.2 Linear Neuron as a Classifier......Page 74
2.5.4.3 Classification Properties of a Linear Neuron as a Subset of Predictive Capabilities......Page 76
2.5.4.4 Example: Linear Neuron as a Predictor......Page 77
2.5.4.5 A Practical Example of Linear Prediction: Predicting the Heat Influx in a Home......Page 84
2.5.4.6 Comparison of Linear Neuron Model with Linear Regression......Page 85
2.5.4.8 Comparison of a Multiple-Input Linear Neuron with Multiple Linear Regression......Page 86
2.5.4.9 Multiple Linear Neuron Models......Page 87
2.5.4.11 Linear Neuron and Linear Network Summary......Page 88
Problems......Page 89
References......Page 90
3.1 Overview and Introduction......Page 92
3.1.1 Multilayer Perceptron......Page 94
3.2 Nonlinear Neurons......Page 95
3.2.1 Neuron Activation Functions......Page 96
3.2.1.1 Sigmoid Functions......Page 97
3.2.1.2 Gaussian Functions......Page 99
3.2.2 Example: Population Growth Modeling Using a Nonlinear Neuron......Page 100
3.3.1 Processing with a Single Nonlinear Hidden Neuron......Page 103
3.3.2.1 Example 1: Approximating a Square Wave......Page 109
3.3.2.2 Example 2: Modeling Seasonal Species Migration......Page 117
3.4.1 Processing of Two-Dimensional Inputs by Nonlinear Neurons......Page 121
3.4.2 Network Output......Page 125
3.4.3.1 Example 1: Two-Dimensional Nonlinear Function Approximation......Page 126
3.4.3.2 Example 2: Two-Dimensional Nonlinear Classification Model......Page 128
3.5 Multidimensional Data Modeling with Nonlinear Multilayer Perceptron Networks......Page 132
Problems......Page 133
References......Page 135
4.1 Introduction and Overview......Page 136
4.2 Supervised Training of Networks for Nonlinear Pattern Recognition......Page 137
4.3 Gradient Descent and Error Minimization......Page 138
4.4 Backpropagation Learning......Page 139
4.4.1 Example: Backpropagation Training—A Hand Computation......Page 140
4.4.1.1 Error Gradient with Respect to Output Neuron Weights......Page 143
4.4.1.2 The Error Gradient with Respect to the Hidden-Neuron Weights......Page 146
4.4.1.3 Application of Gradient Descent in Backpropagation Learning......Page 150
4.4.1.4 Batch Learning......Page 151
4.4.1.5 Learning Rate and Weight Update......Page 153
4.4.1.7 Momentum......Page 157
4.4.2 Example: Backpropagation Learning Computer Experiment......Page 161
4.4.3 Single-Input Single-Output Network with Multiple Hidden Neurons......Page 164
4.4.4 Multiple-Input, Multiple-Hidden Neuron, and Single-Output Network......Page 165
4.4.5 Multiple-Input, Multiple-Hidden Neuron, Multiple-Output Network......Page 166
4.4.6 Example: Backpropagation Learning Case Study—Solving a Complex Classification Problem......Page 168
4.5 Delta-Bar-Delta Learning (Adaptive Learning Rate) Method......Page 175
4.5.1 Example: Network Training with Delta-Bar-Delta—A Hand Computation......Page 177
4.5.2 Example: Delta-Bar-Delta with Momentum—A Hand Computation......Page 180
4.5.3 Network Training with Delta-Bar-Delta—A Computer Experiment......Page 181
4.5.4 Comparison of Delta-Bar-Delta Method with Backpropagation......Page 182
4.5.5 Example: Network Training with Delta-Bar-Delta—A Case Study......Page 183
4.6.1 Example: Network Training with Steepest Descent—Hand Computation......Page 186
4.6.2 Example: Network Training with Steepest Descent—A Computer Experiment......Page 187
4.7 Second-Order Methods of Error Minimization and Weight Optimization......Page 189
4.7.1 QuickProp......Page 190
4.7.1.1 Example: Network Training with QuickProp—A Hand Computation......Page 191
4.7.1.3 Comparison of QuickProp with Steepest Descent, Delta-Bar-Delta, and Backpropagation......Page 193
4.7.2 General Concept of Second-Order Methods of Error Minimization......Page 195
4.7.3 Gauss–Newton Method......Page 197
4.7.3.1 Network Training with the Gauss–Newton Method—A Hand Computation......Page 199
4.7.3.2 Example: Network Training with Gauss–Newton Method—A Computer Experiment......Page 201
4.7.4 The Levenberg–Marquardt Method......Page 203
4.7.4.1 Example: Network Training with LM Method—A Hand Computation......Page 205
4.7.4.2 Network Training with the LM Method—A Computer Experiment......Page 206
4.7.5 Comparison of the Efficiency of the First-Order and Second-Order Methods in Minimizing Error......Page 207
4.7.6 Comparison of the Convergence Characteristics of First-Order and Second-Order Learning Methods......Page 208
4.7.6.1 Backpropagation......Page 210
4.7.6.2 Steepest Descent Method......Page 211
4.7.6.3 Gauss–Newton Method......Page 212
4.7.6.4 Levenberg–Marquardt Method......Page 213
Problems......Page 215
References......Page 216
5.1 Introduction and Overview......Page 218
5.2 Bias–Variance Tradeoff......Page 219
5.3 Improving Generalization of Neural Networks......Page 220
5.3.1 Illustration of Early Stopping......Page 222
5.3.1.1 Effect of Initial Random Weights......Page 226
5.3.1.2 Weight Structure of the Trained Networks......Page 229
5.3.1.3 Effect of Random Sampling......Page 230
5.3.1.4 Effect of Model Complexity: Number of Hidden Neurons......Page 235
5.3.1.5 Summary on Early Stopping......Page 236
5.3.2 Regularization......Page 238
5.4 Reducing Structural Complexity of Networks by Pruning......Page 244
5.4.1 Optimal Brain Damage......Page 245
5.4.1.1 Example of Network Pruning with Optimal Brain Damage......Page 246
5.4.2 Network Pruning Based on Variance of Network Sensitivity......Page 252
5.4.2.1 Illustration of Application of Variance Nullity in Pruning Weights......Page 255
5.4.2.2 Pruning Hidden Neurons Based on Variance Nullity of Sensitivity......Page 258
5.5 Robustness of a Network to Perturbation of Weights......Page 260
5.5.1 Confidence Intervals for Weights......Page 262
5.6 Summary......Page 264
Problems......Page 265
References......Page 266
6.1 Introduction and Overview......Page 268
6.1.1 Example: Thermal Conductivity of Wood in Relation to Correlated Input Data......Page 270
6.2.1 Correlation Scatter Plots and Histograms......Page 271
6.2.2 Parallel Visualization......Page 272
6.2.3 Projecting Multidimensional Data onto Two-Dimensional Plane......Page 273
6.3 Correlation and Covariance between Variables......Page 274
6.4.1 Standardization......Page 276
6.4.2 Simple Range Scaling......Page 277
6.4.3 Whitening—Normalization of Correlated Multivariate Data......Page 278
6.5 Selecting Relevant Inputs......Page 282
6.5.1.1 Partial Correlation......Page 283
6.5.1.2 Multiple Regression and Best-Subsets Regression......Page 284
6.6.1 Multicollinearity......Page 285
6.6.2 Principal Component Analysis (PCA)......Page 286
6.6.3 Partial Least-Squares Regression......Page 290
6.7 Outlier Detection......Page 291
6.9 Case Study: Illustrating Input Selection and Dimensionality Reduction for a Practical Problem......Page 293
6.9.1 Data Preprocessing and Preliminary Modeling......Page 294
6.9.2 PCA-Based Neural Network Modeling......Page 298
6.9.3 Effect of Hidden Neurons for Non-PCA- and PCA-Based Approaches......Page 301
6.9.4 Case Study Summary......Page 302
6.10 Summary......Page 303
References......Page 304
7.1 Introduction and Overview......Page 306
7.2.1 Quality Criterion......Page 308
7.2.2 Incorporating Bayesian Statistics to Estimate Weight Uncertainty......Page 311
7.2.2.1 Square Error......Page 312
7.2.3 Intrinsic Uncertainty of Targets for Multivariate Output......Page 315
7.2.4 Probability Density Function of Weights......Page 316
7.2.5.1 Estimation of Geophysical Parameters from Remote Sensing: A Case Study......Page 318
7.3 Assessing Uncertainty of Neural Network Outputs Using Bayesian Statistics......Page 323
7.3.1.1 Total Network Output Errors......Page 324
7.3.1.3 Statistical Analysis of Error Covariance......Page 325
7.3.1.4 Decomposition of Total Output Error into Model Error and Intrinsic Noise......Page 327
7.4.1.1 Methods Based on Magnitude of Weights......Page 334
7.4.1.2 Sensitivity Analysis......Page 335
7.4.2 Example: Comparison of Methods to Assess the Influence of Inputs on Outputs......Page 336
7.4.3 Uncertainty of Sensitivities......Page 337
7.4.4.1 PCA Decomposition of Inputs and Outputs......Page 338
7.4.4.2 PCA-Based Neural Network Regression......Page 343
7.4.4.3 Neural Network Sensitivities......Page 346
7.4.4.4 Uncertainty of Input Sensitivity......Page 348
7.4.4.5 PCA-Regularized Jacobians......Page 351
7.5 Summary......Page 356
Problems......Page 357
References......Page 358
8.1 Introduction and Overview......Page 360
8.2 Structure of Unsupervised Networks......Page 361
8.3 Learning in Unsupervised Networks......Page 362
8.4.1 Winner Selection Based on Neuron Activation......Page 363
8.4.2 Winner Selection Based on Distance to Input Vector......Page 364
8.4.2.1 Other Distance Measures......Page 365
8.4.3 Competitive Learning Example......Page 366
8.4.3.2 Illustration of the Calculations Involved in Winner Selection......Page 367
8.4.3.3 Network Training......Page 369
8.5.1.1 Selection of Neighborhood Geometry......Page 372
8.5.1.3 Neighbor Strength......Page 373
8.5.1.4 Example: Training Self-Organizing Networks with a Neighbor Feature......Page 374
8.5.1.5 Neighbor Matrix and Distance to Neighbors from the Winner......Page 377
8.5.1.6 Shrinking Neighborhood Size with Iterations......Page 380
8.5.1.7 Learning Rate Decay......Page 381
8.5.1.8 Weight Update Incorporating Learning Rate and Neighborhood Decay......Page 382
8.5.1.10 Two Phases of Self-Organizing Map Training......Page 383
8.5.1.11 Example: Illustrating Self-Organizing Map Learning with a Hand Calculation......Page 384
8.5.1.12 SOM Case Study: Determination of Mastitis Health Status of Dairy Herd from Combined Milk Traits......Page 391
8.5.2 Example of Two-Dimensional Self-Organizing Maps: Clustering Canadian and Alaskan Salmon Based on the Diameter of Growth Rings of the Scales......Page 394
8.5.2.1 Map Structure and Initialization......Page 395
8.5.2.2 Map Training......Page 396
8.5.2.3 U-Matrix......Page 403
8.5.4 Example: Training Two-Dimensional Maps on Multidimensional Data......Page 405
8.5.4.2 Map Structure and Training......Page 406
8.5.4.3 U-Matrix......Page 412
8.5.4.4 Point Estimates of Probability Density of Inputs Captured by the Map......Page 413
8.5.4.5 Quantization Error......Page 414
8.5.4.6 Accuracy of Retrieval of Input Data from the Map......Page 416
8.5.5 Forming Clusters on the Map......Page 418
8.5.5.1 Approaches to Clustering......Page 419
8.5.5.2 Example Illustrating Clustering on a Trained Map......Page 420
8.5.5.3 Finding Optimum Clusters on the Map with the Ward Method......Page 424
8.5.5.4 Finding Optimum Clusters by K-Means Clustering......Page 426
8.5.6.1 n-Fold Cross Validation......Page 429
8.6 Evolving Self-Organizing Maps......Page 434
8.6.1 Growing Cell Structure of Map......Page 436
8.6.1.1 Centroid Method for Mapping Input Data onto Positions between Neurons on the Map......Page 439
8.6.2 Dynamic Self-Organizing Maps with Controlled Growth (GSOM)......Page 442
8.6.2.1 Example: Application of Dynamic Self-Organizing Maps......Page 445
8.6.3 Evolving Tree......Page 450
8.7 Summary......Page 454
Problems......Page 455
References......Page 457
9.1 Introduction and Overview......Page 459
9.2 Linear Forecasting of Time-Series with Statistical and Neural Network Models......Page 462
9.2.1 Example Case Study: Regulating Temperature of a Furnace......Page 464
9.2.1.1 Multistep-Ahead Linear Forecasting......Page 466
9.3.1 Focused Time-Lagged and Dynamically Driven Recurrent Networks......Page 468
9.3.1.1 Focused Time-Lagged Feedforward Networks......Page 470
9.3.1.2 Spatio-Temporal Time-Lagged Networks......Page 472
9.3.2 Example: Spatio-Temporal Time-Lagged Network—Regulating Temperature in a Furnace......Page 474
9.3.2.1 Single-Step Forecasting with Neural NARx Model......Page 476
9.3.2.2 Multistep Forecasting with Neural NARx Model......Page 477
9.3.3 Case Study: River Flow Forecasting......Page 479
9.3.3.1 Linear Model for River Flow Forecasting......Page 482
9.3.3.2 Nonlinear Neural (NARx) Model for River Flow Forecasting......Page 485
9.3.3.3 Input Sensitivity......Page 489
9.4 Hybrid Linear (ARIMA) and Nonlinear Neural Network Models......Page 490
9.4.1 Case Study: Forecasting the Annual Number of Sunspots......Page 492
9.5 Automatic Generation of Network Structure Using Simplest Structure Concept......Page 493
9.5.1 Case Study: Forecasting Air Pollution with Automatic Neural Network Model Generation......Page 495
9.6 Generalized Neuron Network......Page 497
9.6.1 Case Study: Short-Term Load Forecasting with a Generalized Neuron Network......Page 504
9.7.1.1 Encapsulating Long-Term Memory......Page 507
9.7.1.2 Structure and Operation of the Elman Network......Page 510
9.7.1.3 Training Recurrent Networks......Page 512
9.7.1.4 Network Training Example: Hand Calculation......Page 517
9.7.1.5 Recurrent Learning Network Application Case Study: Rainfall Runoff Modeling......Page 522
9.7.1.6 Two-Step-Ahead Forecasting with Recurrent Networks......Page 525
9.7.1.7 Real-Time Recurrent Learning Case Study: Two-Step-Ahead Stream Flow Forecasting......Page 527
9.7.2.1 Encapsulating Long-Term Memory in Recurrent Networks with Output Feedback......Page 530
9.7.2.2 Application of a Recurrent Net with Output and Error Feedback and Exogenous Inputs (NARIMAx)—Case Study: Short-Term Temperature Forecasting......Page 532
9.7.2.3 Training of Recurrent Nets with Output Feedback......Page 535
9.7.3 Fully Recurrent Network......Page 537
9.7.3.1 Fully Recurrent Network Practical Application Case Study: Short-Term Electricity Load Forecasting......Page 539
9.8 Bias and Variance in Time-Series Forecasting......Page 541
9.8.1 Decomposition of Total Error into Bias and Variance Components......Page 543
9.8.2 Example Illustrating Bias–Variance Decomposition......Page 544
9.9 Long-Term Forecasting......Page 550
9.9.1 Case Study: Long-Term Forecasting with Multiple Neural Networks (MNNs)......Page 553
9.10 Input Selection for Time-Series Forecasting......Page 555
9.10.1.1 Partial Mutual Information Method......Page 557
9.10.1.2 Generalized Regression Neural Network......Page 560
9.10.1.3 Self-Organizing Maps for Input Selection......Page 561
9.10.1.4 Genetic Algorithms for Input Selection......Page 563
9.10.2 Practical Application of Input Selection Methods for Time-Series Forecasting......Page 565
9.10.3 Input Selection Case Study: Selecting Inputs for Forecasting River Salinity......Page 568
9.11 Summary......Page 571
Problems......Page 573
References......Page 574
A.1.1 Addition of Vectors......Page 577
A.1.3 The Norm of a Vector......Page 578
A.1.4 Vector Multiplication: Dot Products......Page 579
A.2.2 Matrix Multiplication......Page 580
A.2.3 Multiplication of a Matrix by a Vector......Page 581
References......Page 582