Rank-Based Methods for Shrinkage and Selection
A practical and hands-on guide to the theory and methodology of statistical estimation based on rank
Robust statistics is an important field in contemporary mathematics and applied statistics. Rank-Based Methods for Shrinkage and Selection: With Application to Machine Learning describes rank-based techniques for shrinkage and subset selection that yield parsimonious models and predictions resistant to outliers. The book is intended for statisticians, economists, biostatisticians, data scientists and graduate students.
Rank-Based Methods for Shrinkage and Selection elaborates on rank-based theory and its application in machine learning to robustify the least squares methodology. It also includes:
- Development of rank theory and its application to shrinkage and selection
- Methodology for robust data science using penalized rank estimators
- Theory and methods of penalized rank dispersion for ridge, LASSO and Enet
- Additional topics including Liu regression, high-dimensional models and AR(p) models
- Novel rank-based logistic regression and neural networks
- Problem sets with R code demonstrating the use of these methods in machine learning (a brief illustrative sketch follows this list)
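To make the rank-based idea concrete, here is a minimal base-R sketch (illustrative only; the simulated data, the Wilcoxon score function and the use of optimize() are assumptions, not code from the book). It fits a simple linear regression by minimizing Jaeckel's rank dispersion function over the slope and taking the median of the residuals as the intercept, so a single gross outlier barely moves the fit.

# Wilcoxon scores: a(i) = sqrt(12) * (i/(n+1) - 1/2)
wilcoxon_scores <- function(r, n) sqrt(12) * (r / (n + 1) - 0.5)

# Jaeckel's rank dispersion D(b) = sum a(rank(e_i)) * e_i for a candidate slope b
rank_dispersion <- function(b, x, y) {
  e <- y - x * b
  sum(wilcoxon_scores(rank(e), length(e)) * e)
}

set.seed(1)
x <- 1:30
y <- 2 + 0.5 * x + rnorm(30)
y[30] <- y[30] + 25                        # inject one gross outlier

# Slope: minimize the dispersion; intercept: median of the residuals
slope <- optimize(rank_dispersion, interval = c(-10, 10), x = x, y = y)$minimum
intercept <- median(y - slope * x)
c(intercept = intercept, slope = slope)    # compare with coef(lm(y ~ x))

Replacing the squared-error criterion with this dispersion criterion is the robustification step on which the penalized (ridge, LASSO, Enet) rank estimators of the later chapters are built.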
Author(s): A. K. Md. Ehsanes Saleh, Mohammad Arashi, Resve A. Saleh, Mina Norouzirad
Publisher: Wiley
Year: 2022
Language: English
Pages: 481
City: Hoboken
Rank-Based Methods for Shrinkage and Selection
Contents in Brief
Contents
List of Figures
List of Tables
Foreword
Preface
1 Introduction to Rank-based Regression
1.1 Introduction
1.2 Robustness of the Median
1.2.1 Mean vs. Median
1.2.2 Breakdown Point
1.2.3 Order and Rank Statistics
1.3 Simple Linear Regression
1.3.1 Least Squares Estimator (LSE)
1.3.2 Theil's Estimator
1.3.3 Belgium Telephone Data Set
1.3.4 Estimation and Standard Error Comparison
1.4 Outliers and their Detection
1.4.1 Outlier Detection
1.5 Motivation for Rank-based Methods
1.5.1 Effect of a Single Outlier
1.5.2 Using Rank for the Location Model
1.5.3 Using Rank for the Slope
1.6 The Rank Dispersion Function
1.6.1 Ranking and Scoring Details
1.6.2 Detailed Procedure for R-estimation
1.7 Shrinkage Estimation and Subset Selection
1.7.1 Multiple Linear Regression using Rank
1.7.2 Penalty Functions
1.7.3 Shrinkage Estimation
1.7.4 Subset Selection
1.7.5 Blended Approaches
1.8 Summary
1.9 Problems
2 Characteristics of Rank-based Penalty Estimators
2.1 Introduction
2.2 Motivation for Penalty Estimators
2.3 Multivariate Linear Regression
2.3.1 Multivariate Least Squares Estimation
2.3.2 Multivariate R-estimation
2.3.3 Multicollinearity
2.4 Ridge Regression
2.4.1 Ridge Applied to Least Squares Estimation
2.4.2 Ridge Applied to Rank Estimation
2.5 Example: Swiss Fertility Data Set
2.5.1 Estimation and Standard Errors
2.5.2 Parameter Variance using Bootstrap
2.5.3 Reducing Variance using Ridge
2.5.4 Ridge Traces
2.6 Selection of Ridge Parameter λ2
2.6.1 Quadratic Risk
2.6.2 K-fold Cross-validation Scheme
2.7 LASSO and aLASSO
2.7.1 Subset Selection
2.7.2 Least Squares with LASSO
2.7.3 The Adaptive LASSO and its Geometric Interpretation
2.7.4 R-estimation with LASSO and aLASSO
2.7.5 Oracle Properties
2.8 Elastic Net (Enet)
2.8.1 Naive Enet
2.8.2 Standard Enet
2.8.3 Enet in Machine Learning
2.9 Example: Diabetes Data Set
2.9.1 Model Building with R-aEnet
2.9.2 MSE vs. MAE
2.9.3 Model Building with LS-Enet
2.10 Summary
2.11 Problems
3 Location and Simple Linear Models
3.1 Introduction
3.2 Location Estimators and Testing
3.2.1 Unrestricted R-estimator of θ
3.2.2 Restricted R-estimator of θ
3.3 Shrinkage R-estimators of Location
3.3.1 Overview of Shrinkage R-estimators of θ
3.3.2 Derivation of the Ridge-type R-estimator
3.3.3 Derivation of the LASSO-type R-estimator
3.3.4 General Shrinkage R-estimators of θ
3.4 Ridge-type R-estimator of θ
3.5 Preliminary Test R-estimator of θ
3.5.1 Optimum Level of Significance of PTRE
3.6 Saleh-type R-estimators
3.6.1 Hard-Threshold R-estimator of θ
3.6.2 Saleh-type R-estimator of θ
3.6.3 Positive-rule Saleh-type (LASSO-type) R-estimator of θ
3.6.4 Elastic Net-type R-estimator of θ
3.7 Comparative Study of the R-estimators of Location
3.8 Simple Linear Model
3.8.1 Restricted R-estimator of Slope
3.8.2 Shrinkage R-estimator of Slope
3.8.3 Ridge-type R-estimation of Slope
3.8.4 Hard-Threshold R-estimator of Slope
3.8.5 Saleh-type R-estimator of Slope
3.8.6 Positive-rule Saleh-type (LASSO-type) R-estimator of Slope
3.8.7 The Adaptive LASSO (aLASSO-type) R-estimator
3.8.8 nEnet-type R-estimator of Slope
3.8.9 Comparative Study of R-estimators of Slope
3.9 Summary
3.10 Problems
4 Analysis of Variance (ANOVA)
4.1 Introduction
4.2 Model, Estimation and Tests
4.3 Overview of Multiple Location Models
4.3.1 Example: Corn Fertilizers
4.3.2 One-way ANOVA
4.3.3 Effect of Variance on Shrinkage Estimators
4.3.4 Shrinkage Estimators for Multiple Location
4.4 Unrestricted R-estimator
4.5 Test of Significance
4.6 Restricted R-estimator
4.7 Shrinkage Estimators
4.7.1 Preliminary Test R-estimator
4.7.2 The Stein–Saleh-type R-estimator
4.7.3 The Positive-rule Stein–Saleh-type R-estimator
4.7.4 The Ridge-type R-estimator
4.8 Subset Selection Penalty R-estimators
4.8.1 Preliminary Test Subset Selector R-estimator
4.8.2 Saleh-type R-estimator
4.8.3 Positive-rule Saleh Subset Selector (PRSS)
4.8.4 The Adaptive LASSO (aLASSO)
4.8.5 Elastic-net-type R-estimator
4.9 Comparison of the R-estimators
4.9.1 Comparison of URE and RRE
4.9.2 Comparison of URE and Stein–Saleh-type R-estimators
4.9.3 Comparison of URE and Ridge-type R-estimators
4.9.4 Comparison of URE and PTSSRE
4.9.5 Comparison of LASSO-type and Ridge-type R-estimators
4.9.6 Comparison of URE, RRE and LASSO
4.9.7 Comparison of LASSO with PTRE
4.9.8 Comparison of LASSO with SSRE
4.9.9 Comparison of LASSO with PRSSRE
4.9.10 Comparison of nEnetRE with URE
4.9.11 Comparison of nEnetRE with RRE
4.9.12 Comparison of nEnetRE with HTRE
4.9.13 Comparison of nEnetRE with SSRE
4.9.14 Comparison of Ridge-type vs. nEnetRE
4.10 Summary
4.11 Problems
5 Seemingly Unrelated Simple Linear Models
5.1 Introduction
5.1.1 Problem Formulation
5.2 Signed and Signed Rank Estimators of Parameters
5.2.1 General Shrinkage R-estimator of β
5.2.2 Ridge-type R-estimator of β
5.2.3 Preliminary Test R-estimator of β
5.3 Stein–Saleh-type R-estimator of β
5.3.1 Positive-rule Stein–Saleh R-estimators of β
5.4 Saleh-type R-estimator of β
5.4.1 LASSO-type R-estimator of β
5.5 Elastic-net-type R-estimators
5.6 R-estimator of Intercept When Slope Has Sparse Subset
5.6.1 General Shrinkage R-estimator of Intercept
5.6.2 Ridge-type R-estimator of θ
5.6.3 Preliminary Test R-estimators of θ
5.7 Stein–Saleh-type R-estimator of θ
5.7.1 Positive-rule Stein–Saleh-type R-estimator of θ
5.7.2 LASSO-type R-estimator of θ
5.8 Summary
5.8.1 Problems
6 Multiple Linear Regression Models
6.1 Introduction
6.2 Multiple Linear Model and R-estimation
6.3 Model Sparsity and Detection
6.4 General Shrinkage R-estimator of β
6.4.1 Preliminary Test R-estimators
6.4.2 Stein–Saleh-type R-estimator
6.4.3 Positive-rule Stein–Saleh-type R-estimator
6.5 Subset Selectors
6.5.1 Preliminary Test Subset Selector R-estimator
6.5.2 Stein–Saleh-type R-estimator
6.5.3 Positive-rule Stein–Saleh-type R-estimator (LASSO-type)
6.5.4 Ridge-type Subset Selector
6.5.5 Elastic Net-type R-estimator
6.6 Adaptive LASSO
6.6.1 Introduction
6.6.2 Asymptotics for LASSO-type R-estimator
6.6.3 Oracle Property of aLASSO
6.7 Summary
6.8 Problems
7 Partially Linear Multiple Regression Model
7.1 Introduction
7.2 Rank Estimation in the PLM
7.2.1 Penalty R-estimators
7.2.2 Preliminary Test and Stein–Saleh-type R-estimator
7.3 ADB and ADL2-risk
7.4 ADL2-risk Comparisons
7.4.1 Ridge vs. Others
7.5 Summary: L2-risk Efficiencies
7.6 Problems
8 Liu Regression Models
8.1 Introduction
8.2 Linear Unified (Liu) Estimator
8.2.1 Liu-type R-estimator
8.3 Shrinkage Liu-type R-estimators
8.4 Asymptotic Distributional Risk
8.5 Asymptotic Distributional Risk Comparisons
8.5.1 Comparison of SSLRE and PTLRE
8.5.2 Comparison of PRSLRE and PTLRE
8.5.3 Comparison of PRLRE and SSLRE
8.5.4 Comparison of Liu-Type Rank Estimators With Counterparts
8.6 Estimation of d
8.7 Diabetes Data Analysis
8.7.1 Penalty Estimators
8.7.2 Performance Analysis
8.8 Summary
8.9 Problems
9 Autoregressive Models
9.1 Introduction
9.2 R-estimation of the Parameters for the AR(p)-Model
9.3 LASSO, Ridge, Preliminary Test and Stein–Saleh-type R-estimators
9.4 Asymptotic Distributional L2-risk
9.5 Asymptotic Distributional L2-risk Analysis
9.5.1 Comparison of Unrestricted vs. Restricted R-estimators
9.5.2 Comparison of Unrestricted vs. Preliminary Test R-estimators
9.5.3 Comparison of Unrestricted vs. Stein–Saleh-type R-estimators
9.5.4 Comparison of the Preliminary Test vs. Stein–Saleh-type R-estimators
9.6 Summary
9.7 Problems
10 High-Dimensional Models
10.1 Introduction
10.2 Identifiability of β* and Projection
10.3 Parsimonious Model Selection
10.4 Some Notation and Separation
10.4.1 Special Matrices
10.4.2 Steps Towards Estimators
10.4.3 Post-selection Ridge Estimation of β*_S1 and β*_S2
10.4.4 Post-selection Ridge R-estimators for β*_S1 and β*_S2
10.5 Post-selection Shrinkage R-estimators
10.6 Asymptotic Properties of the Ridge R-estimators
10.7 Asymptotic Distributional L2-Risk Properties
10.8 Asymptotic Distributional Risk Efficiency
10.9 Summary
10.10 Problems
11 Rank-based Logistic Regression
11.1 Introduction
11.2 Data Science and Machine Learning
11.2.1 What is Robust Data Science?
11.2.2 What is Robust Machine Learning?
11.3 Logistic Regression
11.3.1 Log-likelihood Setup
11.3.2 Motivation for Rank-based Logistic Methods
11.3.3 Nonlinear Dispersion Function
11.4 Application to Machine Learning
11.4.1 Example: Motor Trend Cars
11.5 Penalized Logistic Regression
11.5.1 Log-likelihood Expressions
11.5.2 Rank-based Expressions
11.5.3 Support Vector Machines
11.5.4 Example: Circular Data
11.6 Example: Titanic Data Set
11.6.1 Exploratory Data Analysis
11.6.2 RLR vs. LLR vs. SVM
11.6.3 Shrinkage and Selection
11.7 Summary
11.8 Problems
12 Rank-based Neural Networks
12.1 Introduction
12.2 Set-up for Neural Networks
12.3 Implementing Neural Networks
12.3.1 Basic Computational Unit
12.3.2 Activation Functions
12.3.3 Four-layer Neural Network
12.4 Gradient Descent with Momentum
12.4.1 Gradient Descent
12.4.2 Momentum
12.5 Back Propagation Example
12.5.1 Forward Propagation
12.5.2 Back Propagation
12.5.3 Dispersion Function Gradients
12.5.4 RNN Algorithm
12.6 Accuracy Metrics
12.7 Example: Circular Data Set
12.8 Image Recognition: Cats vs. Dogs
12.8.1 Binary Image Classification
12.8.2 Image Preparation
12.8.3 Over-fitting and Under-fitting
12.8.4 Comparison of LNN vs. RNN
12.9 Image Recognition: MNIST Data Set
12.10 Summary
12.11 Problems
Bibliography
Author Index
Subject Index
EULA