Data can be extremely valuable if we are able to extract information from them. This is why multivariate data analysis is essential for business and science. This book offers an easy-to-understand introduction to the most relevant methods of multivariate data analysis. It is strictly application-oriented, requires little knowledge of mathematics and statistics, demonstrates the procedures with numerical examples, and illustrates each method via a case study solved with IBM’s statistical software package SPSS. Extensions of the methods and links to other procedures are discussed, and recommendations for application are given. An introductory chapter presents the basic ideas of the multivariate methods covered in the book and reviews the statistical basics that are relevant to all methods.
For the 2nd edition, all chapters were checked and all calculations were rerun using the current version of IBM SPSS.
Contents
- Introduction to empirical data analysis
- Regression analysis
- Analysis of variance
- Discriminant analysis
- Logistic regression
- Contingency analysis
- Factor analysis
- Cluster analysis
- Conjoint analysis
The original German version is now available in its 17th edition. In 2015, this book was honored by the Federal Association of German Market and Social Researchers as “the textbook that has shaped market research and practice in German-speaking countries”. A Chinese version is available in its 3rd edition.
On the website www.multivariate-methods.info, the authors further analyze the data with Excel and R and provide additional material to facilitate the understanding of the different multivariate methods. In addition, interactive flashcards are available to the reader for reviewing selected focal points. Download the Springer Nature Flashcards App and use exclusive content to test your knowledge.
Author(s): Klaus Backhaus, Bernd Erichson, Sonja Gensler, Rolf Weiber, Thomas Weiber
Publisher: Springer Gabler
Year: 2023
Language: English
Pages: 617
City: Wiesbaden
www.multivariate-methods.info
Preface 2nd Edition
Preface
Contents
1 Introduction to Empirical Data Analysis
1.1 Multivariate Data Analysis: Overview and Basics
1.1.1 Empirical Studies and Quantitative Data Analysis
1.1.2 Types of Data and Special Types of Variables
1.1.2.1 Scales of Empirical Data
1.1.2.2 Binary and Dummy Variables
1.1.3 Classification of Methods of Multivariate Analysis
1.1.3.1 Structure-testing Methods
1.1.3.2 Structure-discovering Methods
1.1.3.3 Summary and Typical Research Questions of the Different Methods
1.2 Basic Statistical Concepts
1.2.1 Basic Statistical Measures
1.2.2 Covariance and Correlation
1.3 Statistical Testing and Interval Estimation
1.3.1 Conducting a Test for the Mean
1.3.1.1 Critical Value Approach
1.3.1.2 Using the p-value
1.3.1.3 Type I and Type II Errors
1.3.1.4 Conducting a One-tailed t-test for the Mean
1.3.2 Conducting a Test for a Proportion
1.3.3 Interval Estimation (Confidence Interval)
1.4 Causality
1.4.1 Causality and Correlation
1.4.2 Testing for Causality
1.5 Outliers and Missing Values
1.5.1 Outliers
1.5.1.1 Detecting Outliers
1.5.1.2 Dealing with Outliers
1.5.2 Missing Values
1.6 How to Use IBM SPSS, Excel, and R
References
2 Regression Analysis
2.1 Problem
2.2 Procedure
2.2.1 Model Formulation
2.2.2 Estimating the Regression Function
2.2.2.1 Simple Regression Models
2.2.2.2 Multiple Regression Models
2.2.3 Checking the Regression Function
2.2.3.1 Standard Error of the Regression
2.2.3.2 Coefficient of Determination (R-square)
2.2.3.3 Stochastic Model and F-test
2.2.3.4 Overfitting and Adjusted R-Square
2.2.4 Checking the Regression Coefficients
2.2.4.1 Precision of the Regression Coefficient
2.2.4.2 t-test of the Regression Coefficient
2.2.4.3 Confidence Interval of the Regression Coefficient
2.2.5 Checking the Underlying Assumptions
2.2.5.1 Non-linearity
2.2.5.2 Omission of Relevant Variables
2.2.5.3 Random Errors in the Independent Variables
2.2.5.4 Heteroscedasticity
2.2.5.5 Autocorrelation
2.2.5.6 Non-normality
2.2.5.7 Multicollinearity and Precision
2.2.5.8 Influential Outliers
2.3 Case Study
2.3.1 Problem Definition
2.3.2 Conducting a Regression Analysis with SPSS
2.3.3 Results
2.3.3.1 Results of the First Analysis
2.3.3.2 Results of the Second Analysis
2.3.3.3 Checking the Assumptions
2.3.3.4 Stepwise Regression
2.3.4 SPSS Commands
2.4 Modifications and Extensions
2.4.1 Regression with Dummy Variables
2.4.2 Regression Analysis with Time-Series Data
2.4.3 Multivariate Regression Analysis
2.5 Recommendations
References
3 Analysis of Variance
3.1 Problem
3.2 Procedure
3.2.1 One-way ANOVA
3.2.1.1 Model Formulation
3.2.1.2 Variance Decomposition and Model Quality
3.2.1.3 Statistical Evaluation
3.2.1.4 Interpretation of the Results
3.2.2 Two-way ANOVA
3.2.2.1 Model Formulation
3.2.2.2 Variance Decomposition and Model Quality
3.2.2.3 Statistical Evaluation
3.2.2.4 Interpretation of the Results
3.3 Case Study
3.3.1 Problem Definition
3.3.2 Conducting a Two-way ANOVA with SPSS
3.3.3 Results
3.3.3.1 Two-way ANOVA
3.3.3.2 Post-hoc Tests for the Factor Placement
3.3.3.3 Contrast Analysis for the Factor Placement
3.3.4 SPSS Commands
3.4 Modifications and Extensions
3.4.1 Extensions of ANOVA
3.4.2 Covariance Analysis (ANCOVA)
3.4.2.1 Extension of the Case Study and Implementation in SPSS
3.4.2.2 Two-way ANCOVA with Covariates in the Case Study
3.4.3 Checking Variance Homogeneity Using the Levene Test
3.5 Recommendations
References
4 Discriminant Analysis
4.1 Problem
4.2 Procedure
4.2.1 Definition of Groups and Specification of the Discriminant Function
4.2.2 Estimation of the Discriminant Function
4.2.2.1 Discriminant Criterion
4.2.2.2 Standardization of the Discriminant Coefficients
4.2.2.3 Stepwise Estimation Procedure
4.2.2.4 Multi-group Discriminant Analysis
4.2.3 Assessment of the Discriminant Function
4.2.3.1 Assessment Based on the Discriminant Criterion
4.2.3.2 Comparing Estimated and Actual Group Membership
4.2.4 Testing the Describing Variables
4.2.5 Classification of New Observations
4.2.5.1 Distance Concept
4.2.5.2 Classification Functions Concept
4.2.5.3 Probability Concept
4.2.6 Checking the Assumptions of Discriminant Analysis
4.3 Case Study
4.3.1 Problem Definition
4.3.2 Conducting a Discriminant Analysis with SPSS
4.3.3 Results
4.3.3.1 Results of the Blockwise Estimation Procedure
4.3.3.2 Results of a Stepwise Estimation Procedure
4.3.4 SPSS Commands
4.4 Recommendations
References
5 Logistic Regression
5.1 Problem
5.2 Procedure
5.2.1 Model Formulation
5.2.1.1 The Linear Probability Model (Model 1)
5.2.1.2 Logit Model with Grouped Data (Model 2)
5.2.1.3 Logistic Regression (Model 3)
5.2.1.4 Classification
5.2.1.5 Multiple Logistic Regression (Model 4)
5.2.2 Estimation of the Logistic Regression Function
5.2.3 Interpretation of the Regression Coefficients
5.2.4 Checking the Overall Model
5.2.4.1 Likelihood Ratio Statistic
5.2.4.2 Pseudo-R-Square Statistics
5.2.4.3 Assessment of the Classification
5.2.4.4 Checking for Outliers
5.2.5 Checking the Estimated Coefficients
5.2.6 Conducting a Binary Logistic Regression with SPSS
5.3 Multinomial Logistic Regression
5.3.1 The Multinomial Logistic Model
5.3.2 Example and Interpretation
5.3.3 The Baseline Logit Model
5.3.4 Measures of Goodness-of-Fit
5.3.4.1 Pearson Goodness-of-Fit Measure
5.3.4.2 Deviance
5.3.4.3 Information Criteria for Model Selection
5.4 Case Study
5.4.1 Problem Definition
5.4.2 Conducting a Multinomial Logistic Regression with SPSS
5.4.3 Results
5.4.3.1 Blockwise Logistic Regression
5.4.3.2 Stepwise Logistic Regression
5.4.4 SPSS Commands
5.5 Modifications and Extensions
5.6 Recommendations
References
6 Contingency Analysis
6.1 Problem
6.2 Procedure
6.2.1 Creating a Cross Table
6.2.2 Interpretation of Cross Tables
6.2.3 Testing for Associations
6.2.3.1 Testing for Statistical Independence
6.2.3.2 Assessing the Strength of Associations
6.2.3.3 Role of Confounding Variables in Contingency Analysis
6.3 Case Study
6.3.1 Problem Definition
6.3.2 Conducting a Contingency Analysis with SPSS
6.3.3 Results
6.3.4 SPSS Commands
6.4 Recommendations and Relations to Other Multivariate Methods
6.4.1 Recommendations on How to Implement Contingency Analysis
6.4.2 Relation of Contingency Analysis to Other Multivariate Methods
References
7 Factor Analysis
7.1 Problem
7.2 Procedure
7.2.1 Evaluating the Suitability of Data
7.2.2 Extracting the Factors and Determining their Number
7.2.2.1 Graphical Illustration of Correlations
7.2.2.2 Fundamental Theorem of Factor Analysis
7.2.2.3 Graphical Factor Extraction
7.2.2.4 Mathematical Methods of Factor Extraction
7.2.2.4.1 Principal Component Analysis (PCA)
7.2.2.4.2 Factor-Analytical Approach
7.2.2.5 Number of Factors
7.2.2.6 Checking the Quality of a Factor Solution
7.2.3 Interpreting the Factors
7.2.4 Determining the Factor Scores
7.2.5 Summary: The Essence of Factor Analysis
7.3 Case Study
7.3.1 Problem Definition
7.3.2 Conducting a Factor Analysis with SPSS
7.3.3 Results
7.3.3.1 Prerequisite: Suitability
7.3.3.2 Results of PAF with Nine Variables
7.3.3.3 Product Positioning
7.3.3.4 Differences Between PAF and PCA
7.3.4 SPSS Commands
7.4 Extension: Confirmatory Factor Analysis (CFA)
7.5 Recommendations
References
8 Cluster Analysis
8.1 Problem
8.2 Procedure
8.2.1 Selection of Cluster Variables
8.2.2 Determination of Similarities or Distances
8.2.2.1 Overview of Proximity Measures in Cluster Analysis
8.2.2.2 Proximity Measures for Metric Data
8.2.2.2.1 Simple and Squared Euclidean Distance Metric (L2 Norm)
8.2.2.2.2 City Block Metric (L1 Norm)
8.2.2.2.3 Minkowski Metric (Generalization of the L Norms)
8.2.2.2.4 Pearson Correlation as a Measure of Similarity
8.2.3 Selection of the Clustering Method
8.2.3.1 Hierarchical Agglomerative Procedures
8.2.3.2 Single Linkage, Complete Linkage and the Ward Procedure
8.2.3.2.1 Single-Linkage Clustering (Nearest Neighbor)
8.2.3.2.2 Complete Linkage (Furthest Neighbor)
8.2.3.2.3 Ward’s Method
8.2.3.3 Clustering Properties of Selected Clustering Methods
8.2.3.4 Illustration of the Clustering Properties with an Extended Example
8.2.4 Determination of the Number of Clusters
8.2.4.1 Analysis of the Scree Plot and Elbow Criterion
8.2.4.2 Cluster Stopping Rules
8.2.4.3 Evaluation of the Robustness and Quality of a Clustering Solution
8.2.5 Interpretation of a Clustering Solution
8.2.6 Recommendations for Hierarchical Agglomerative Cluster Analyses
8.3 Case Study
8.3.1 Problem Definition
8.3.2 Conducting a Cluster Analysis with SPSS
8.3.3 Results
8.3.3.1 Outlier Analysis Using the Single-Linkage Method
8.3.3.2 Clustering Process Using Ward’s Method
8.3.3.3 Interpretation of the Two-Cluster Solution
8.3.4 SPSS Commands
8.4 Modifications and Extensions
8.4.1 Proximity Measures for Non-Metric Data
8.4.1.1 Proximity Measures for Nominally Scaled Variables
8.4.1.2 Proximity Measures for Binary Variables
8.4.1.2.1 Overview and Output Data for a Calculation Example
8.4.1.2.2 Simple Matching, Jaccard and RR Similarity Coefficients
8.4.1.2.3 Comparison of the Proximity Measures
8.4.1.3 Proximity Measures for Mixed Variables
8.4.2 Partitioning Clustering Methods
8.4.2.1 K-means Clustering
8.4.2.1.1 Procedure of KM-CA
8.4.2.1.2 Conducting KM-CA with SPSS
8.4.2.2 Two-Step Cluster Analysis
8.4.2.2.1 Procedure of TS-CA
8.4.2.2.2 Conducting a TS-CA with SPSS
8.4.2.3 Comparing KM-CA and TS-CA
8.5 Recommendations
References
9 Conjoint Analysis
9.1 Problem
9.2 Procedure
9.2.1 Selection of Attributes and Attribute Levels
9.2.2 Design of the Experimental Study
9.2.2.1 Definition of Stimuli
9.2.2.2 Number of Stimuli
9.2.3 Evaluation of the Stimuli
9.2.4 Estimation of the Utility Function
9.2.4.1 Specification of the Utility Function
9.2.4.2 Estimation of Utility Parameters
9.2.4.3 Assessment of the Estimated Utility Function
9.2.5 Interpretation of the Utility Parameters
9.2.5.1 Preference Structure and Relative Importance of an Attribute
9.2.5.2 Standardization of Utility Parameters
9.2.5.3 Aggregated Utility Parameters
9.2.5.4 Simulations Based on Utility Parameters
9.3 Case Study
9.3.1 Problem Definition
9.3.2 Conducting a Conjoint Analysis with SPSS
9.3.3 Results
9.3.3.1 Individual-level Results
9.3.3.2 Results of Joint Estimation
9.3.4 SPSS Commands
9.4 Choice-based Conjoint Analysis
9.4.1 Selection of Attributes and Attribute Levels
9.4.2 Design of the Experimental Study
9.4.2.1 Definition of Stimuli and Choice Sets
9.4.2.2 Number of Choice Sets
9.4.3 Evaluation of the Stimuli
9.4.4 Estimation of the Utility Function
9.4.4.1 Specification of the Utility Function
9.4.4.2 Specification of the Choice Model
9.4.4.3 Estimation of the Utility Parameters
9.4.4.4 Assessment of the Estimated Utility Function
9.4.5 Interpretation of the Utility Parameters
9.4.5.1 Preference Structure and Relative Importance of an Attribute
9.4.5.2 Disaggregated Utility Parameters
9.4.5.3 Simulations Based on Utility Parameters
9.4.5.4 Conclusion
9.5 Recommendations
9.5.1 Recommendations for Conducting a (Traditional) Conjoint Analysis
9.5.2 Alternatives to Conjoint Analysis
References
Index