Multivariate statistical analysis underwent a rich and varied evolution during the latter half of the 20th century. Academics and practitioners, with diverse interests and varying multidisciplinary knowledge, have produced a large literature on topics within the multivariate domain. Because multivariate algebra is of sustained interest and continues to develop, its appeal extends across multiple disciplines and acts as a catalyst for contemporary advances, while its core inferential genesis remains in statistics.
It is exactly this varied evolution, driven by growth in data production, diffusion, and understanding in scientific fields, that has blurred many lines between disciplines. The cross-pollination between statistics and biology, engineering, medical science, computer science, and even art has rapidly increased the number of questions that statistical methodology has to answer and report on. These questions are often multivariate in nature, hoping to elucidate uncertainty on more than one aspect at the same time, and it is here that statistical thinking merges mathematical design with real-life interpretation to understand this uncertainty. Statistical advances benefit from these algebraic inventions and expansions in the multivariate paradigm. This contributed volume aims to usher novel research emanating from a multivariate statistical foundation into the spotlight, with particular significance in multidisciplinary settings. The overarching spirit of this volume is to highlight current trends in, stimulate a focus on, and connect multidisciplinary dots from and within multivariate statistical analysis. Guided by these thoughts, the volume presents a collection of research at the forefront of multivariate statistical thinking, authored by globally recognized subject matter experts.
Author(s): Andriëtte Bekker, Johannes T. Ferreira, Mohammad Arashi, Ding-Geng Chen
Series: Emerging Topics in Statistics and Biostatistics
Publisher: Springer
Year: 2022
Language: English
Pages: 433
City: Cham
Preface
Contents
About the Editors
Trends in Multi- and Matrix-Variate Analysis
Association-Based Optimal Subpopulation Selection for Multivariate Data
1 Introduction
2 The Proposed Method
Averaged Absolute Association (AAA) Criterion
Efficient Algorithms
3 Simulation Study
Evaluation of Selected Subpopulation
Comparisons of the Algorithms
Comparison with the Tau-Path Method
4 Case Study
5 Discussion
References
Likelihood-Based Inference for Linear Mixed-Effects Models with Censored Response Using Skew-Normal Distribution
1 Introduction
2 The Multivariate Skew-Normal Distribution
3 The Skew-Normal Linear Mixed-Effects Model with Censored Responses
The Statistical Model
The Likelihood Function
The ECM Algorithm
Approximate Standard Errors
Estimation of the Random Effects
Prediction of Future Observations
4 Illustrative Example—UTI Data
5 Conclusions
References
Robust Estimation of Multiple Change Points in Multivariate Processes
1 Introduction
2 Methodology
Matrix Normal Distribution
Change Point Estimation
3 Experiments
4 Applications
Illustration on Crime Rates in US Cities
Effect of Colorado Amendment 64
5 Discussion
References
Some Computational Aspects of a Noncentral Dirichlet Family
1 Introduction
2 Foundations of the Dirichlet
3 Methods and Approach
Log-Likelihood
Method for Investigating λ₃
Initial Parameters for MLE Search
4 Data Fitting
Simulation Study 1
Simulation Study 2
Dataset 1—Household Expenditure Data
Dataset 2—Pekin Duckling Data
5 Final Thoughts and Future Directions
References
Modeling Handwritten Digits Dataset Using the Matrix Variate t Distribution
1 Introduction
2 Matrix Variate t Distribution
3 Parameter Estimation
Maximum Likelihood Estimation
Estimation via EM Algorithm
4 Simulation Study and Real Data Example
Simulation Study
Real Data Example
5 Conclusions
References
On the Identification of Extreme Elements in a Residual for the GMANOVA-MANOVA Model
1 Introduction
2 Background
Residuals in the GMANOVA-MANOVA Model
The GMANOVA-MANOVA Model and the Parametric Bootstrap Technique
3 Data Analysis
4 Concluding Remarks
References
Matrix-variate Smooth Transition Models for Temporal Networks
1 Introduction
2 A Smooth Transition Matrix Model
Transition Mechanisms
Nonlinear Network Models
Extensions
3 Bayesian Inference
Prior Specification
Posterior Approximation
4 Empirical Analysis
Volatility Networks
Oil Production Networks
5 Conclusion
References
A Flexible Matrix-Valued Response Regression for Skewed Data
1 Introduction
2 Background
Matrix-variate Normal Distribution
Unimodal–bimodal Normal (UBN) Distribution
Skewed Matrix-Variate UBN (MatUBN) Distribution
3 Proposed Regression Model
Model Formulation
Extending the Model Using Envelope Formulation
4 Simulation Study
5 Applications
6 Concluding
References
Multivariate Functional Singular Spectrum Analysis: A Nonparametric Approach for Analyzing Multivariate Functional Time Series
1 Background
General Scheme of Univariate Singular Spectrum Analysis
General Scheme of Functional Singular Spectrum Analysis
General Scheme of Multivariate Singular Spectrum Analysis
2 General Scheme of Multivariate Functional Singular Spectrum Analysis
Preliminaries and Notations
Multivariate Functional Singular Spectrum Analysis Algorithm
Computer Implementation Strategy
3 Generalizing Multivariate Singular Spectrum Analysis to Multivariate Functional Singular Spectrum Analysis
From Horizontal Multivariate Singular Spectrum Analysis to Horizontal Multivariate Functional Singular Spectrum Analysis
From Vertical Multivariate Singular Spectrum Analysis to Vertical Multivariate Functional Singular Spectrum Analysis
4 Numerical Studies
Simulation Study
Application to NDVI Images and Intraday Temperature Data
Application to Remote Sensing Density Curves
5 Discussion
References
Compositional Data Analysis—Linear Algebra, Visualization and Interpretation
1 Introduction
2 Basic Algebraic Definitions and Results
Logratio Transformations and Associated Pattern Matrices
Inverting Logratio Transformations
Log-Contrasts
3 Logratio Visualization
4 Summary and Discussion
References
Multivariate Count Data Regression Models and Their Applications
1 Introduction
2 Review of T-R{W} Family of Distributions
Sub-Families of Discrete T-R{W} Distributions
The Family of Generalized Geometric Distributions
3 Bivariate and Multivariate T-geometric{W} Families
Sarmanov Family of Bivariate and Multivariate Distributions
Bivariate and Multivariate T-geometric{W} Families
Multivariate T-geometric{W} Regression Model
4 Inference on Bivariate and Multivariate T-geometric{W} Regression Models
Test for Independence
Test for Dispersion
Test to Compare Nested and Non-nested Models
Goodness-Of-Fit Statistics
5 Application
Sex Partners Data
Inmates Profiling Data
6 Summary and Conclusions
7 Appendix
References
A Generalized Multivariate Gamma Distribution
1 Introduction
2 The Multivariate Gamma Distribution
3 Marginal Distributions
4 Factorizations
5 Joint Moments
6 Moment Generating Function
7 Entropies
8 Estimation
9 Simulation
10 Conclusion
References
Aspects of High-Dimensional Methodology and Bayesian Learning
A Comparison of Different Clustering Approaches for High-Dimensional Presence-Absence Data
1 Introduction
2 Data and Preprocessing
3 Clustering Methods
Latent Class Analysis
Methods Operating on Distances
Methods Operating on Euclidean Data
4 The Simulation
Data Generation
Scenarios
5 Results
General Results
More Detailed Insight
6 Conclusions
References
High-Dimensional Feature Selection for Logistic Regression Using Blended Penalty Functions
1 Introduction
2 Penalised GLM with the MEnet Penalty
Modified Elastic-Net Penalty
Penalised Likelihood Function
Reforming of the MEnet Penalty Term
Parameter Estimation
3 Simulation Study
4 Colon Cancer Classification
5 Conclusion and Future Work
References
A Generalized Quadratic Garrote Approach Towards Ridge Regression Analysis
1 Introduction
2 Quadratic Garrote
Variance and Bias
3 Simulation Study
Sparse Setting
Nearly-Sparse Setting
High Dimensional Setting
4 Example: The Boston Housing Dataset
5 Discussion
References
High-Dimensional Nonlinear Optimization Problem in Semiparametric Regression Model
1 Introduction
2 Differencing Approach to Approximate the Model
How Does the Approximation Work?
3 Ridge Estimation of Sparse Semiparametric Regression Model
4 Least Absolute Shrinkage and Selection Operator Approach
5 A Mathematical Heuristic Algorithm for Estimation of High-Dimensional SRM
6 Numerical Studies
Application to Riboflavin Production Data Set
Some Simulation Studies
7 Summary and Conclusions
References
Frontiers in Robust Analysis and Mixture Modelling
Parsimonious Finite Mixtures of Matrix-Variate Regressions
1 Introduction
2 Methodology
Parsimonious Matrix-Variate FMR
Maximum Likelihood Estimation
Computational and Operative Details
3 Data Analyses
Simulated Data
Real Data
4 Conclusions
References
Robust Multivariate Modelling for Heterogeneous Data Sets with Mixtures of Multivariate Skew Laplace Normal Distributions
1 Introduction
2 The MSLN Distribution
3 Finite Mixtures of the MSLN Distributions
ML Estimation
Initial Values
The Empirical Information Matrix
4 Applications
Simulation Study
An Illustrative Real Data Example: Old Faithful Geyser Data Set
5 Conclusions
References
Robust Estimation Through Preliminary Testing Based on the LAD-LASSO
1 Introduction
2 LAD-LASSO Estimator
3 Improvement Strategy on LAD
4 Numerical Study
Synthetic Data Analysis
Gross Domestic Product Data Analysis
5 Codes
6 Conclusion
References