The Significance Test Controversy Revisited: The Fiducial Bayesian Alternative

This book explains the misuses and abuses of Null Hypothesis Significance Tests, which are reconsidered in light of Jeffreys' Bayesian concept of the role of statistical inference in experimental investigations. Minimizing the technical aspects, the book focuses mainly on methodological contributions.

The first part of the book gives an overview of the major approaches to statistical testing and an enlightening discussion of the philosophies of Fisher, Neyman-Pearson and Jeffreys. The conceptual and methodological implications of current practices of reporting effect sizes and confidence intervals are also examined and challenged. This sheds new light on the "significance testing controversy" and provides an appropriate Bayesian framework for a comprehensive approach to the analysis and interpretation of experimental data.

The second part of the book provides concrete Bayesian routine procedures that bypass common misuses of significance testing and are readily applicable in a wide range of real applications. This approach addresses the need for objective reporting of experimental data that is acceptable to the scientific community, a need emphasized by the name fiducial (from the Latin fiducia = confidence). The fiducial Bayesian procedures provide the reader with a real opportunity to think sensibly about problems of statistical inference.

This book prepares students and researchers to critically read statistical analyses reported in the literature and equips them with an appropriate alternative to the use of significance testing.

Author(s): Bruno Lecoutre, Jacques Poitevineau
Edition: 2
Publisher: Springer
Year: 2022

Language: English
Pages: 205
City: Berlin

Preface
What About the Current Practice?
What If Significance Is Retired?
The Fiducial Bayesian Alternative
Contents
Acronyms
1 Introduction
1.1 The Stranglehold of Significance Tests
1.2 Beyond the Significance Test Controversy
1.3 The Bayesian Methods
1.4 The Fiducial Bayesian Inference
1.5 The Feasibility of Fiducial Bayesian Methods
Part I The Significance Test Controversy Revisited
2 Frequentist and Bayesian Inference
2.1 Two Different Approaches to Statistical Inference
2.2 The Frequentist Approach: From Unknown to Known
2.2.1 Sampling Probabilities
2.2.2 Null Hypothesis Significance Testing in Practice
2.2.3 Confidence Interval
2.3 The Bayesian Approach: From Known to Unknown
2.3.1 The Likelihood Function and the Bayesian Probabilities
2.3.2 An Opinion-Based Analysis
2.3.3 A "No Information Initially" Analysis
2.3.4 Bayesian Procedures Are No More Arbitrary
2.3.5 Epilogue
3 Three Views of Statistical Tests
3.1 The Fisher Test of Significance
3.2 The Neyman-Pearson Hypothesis Test
3.3 The Jeffreys Bayesian Approach to Testing
3.4 Different Views of Statistical Inference
3.4.1 The Aim of Statistical Inference
3.4.2 The Role of Bayesian Probabilities
3.4.3 Statistical Tests: Judgment, Action, or Decision?
3.5 One Theory or Two?
3.6 Concluding Remarks
4 GHOST: An Officially Recommended Practice
4.1 Null Hypothesis Significance Testing
4.1.1 An Amalgam
4.1.2 Misuses and Abuses
4.2 What About the Researcher's Point of View?
4.3 An Official Good Statistical Practice
4.3.1 Guidelined Hypotheses Official Significance Testing
4.3.2 A Hybrid Practice
5 The Significance Test Controversy Revisited
5.1 Significance Tests vs Pure Estimation
5.2 The Null Hypothesis: A Straw Man
5.3 Usual Two-Sided Tests Do Not Tell the Direction
5.4 Determining Sample Size
5.5 Critique of P-Values: A Need to Rethink
5.5.1 Jeffreys' Answer to the Problem of Pure Estimation
5.5.2 The Bayesian Interpretation of the P-Value
5.5.3 Student's Conception
5.5.4 Jaynes' Bayesian Test
5.5.5 The Bayesian Interpretation of the Two-Sided P-Value
5.5.6 Killeen's p_rep
5.6 Decision and Estimation
5.7 The Role of Previous Information and the Sample Size
5.8 The Limited Role of Significance Problems
5.9 Non-inferiority and Equivalence Questions
5.10 Other Issues
5.10.1 The Bayes Factor
5.11 The Reference Prior Approach
5.12 Stopping Rules and the Likelihood Principle
6 Reporting Effect Sizes: The New Star System
6.1 What Is an Effect Size?
6.2 Abuses and Misuses Continue
6.2.1 The Robust Beauty of Simple Effect Sizes
6.2.2 Heuristic Benchmarks: A New Star System
6.2.3 A Good Adaptive Practice Is Not a Good Statistical Practice
6.2.4 The Need for a More Appropriate Sample Size
6.2.5 The Shortcomings of the Phi Coefficient
6.3 When Things Get Worse
6.3.1 A Lot of Choices for a Standardized Difference
6.3.2 A Plethora of ES Indicators
6.3.3 Don't Confuse a Statistic with a Parameter
6.4 Two Lessons
7 Reporting Confidence Intervals: A Paradoxical Situation
7.1 Three Views of Interval Estimates
7.1.1 The Bayesian Approach (Laplace, Jeffreys)
7.1.2 Fisher's Fiducial Inference
7.1.3 Neyman's Frequentist Confidence Interval
7.2 What Is a Good Interval Estimate?
7.2.1 Conventional Frequentist Properties
7.2.2 The Fatal Disadvantage of "Shortest Intervals"
7.2.3 One-Sided Probabilities Are Needed
7.2.4 The Jeffreys Credible Interval Is a Great Frequentist Procedure
7.3 Neyman-Pearson's Criterion Questioned
7.3.1 Noncentral F-Based Confidence Intervals Are Inconsistent
7.3.2 The Official Procedure for Demonstrating Equivalence
7.4 Isn't Everyone a Bayesian?
Part II Retire Statistical Significance? The Fiducial Bayesian Alternative
8 Routine Procedures for Proportions
8.1 One Proportion
8.1.1 Binomial Model: The Clinical Trial Example
8.1.2 The Bayesian Solution
8.1.3 The Use of Predictive Probabilities
8.1.4 Equal-Tailed vs Highest Posterior Density Credible Intervals
8.1.5 Other Sampling Models for One Proportion
8.2 Response-Adaptive Designs
8.2.1 The Play-the-Winner Rule
8.2.2 Generalizations: Other Response-Adaptive Designs
8.3 Multinomial Model and Measures of Association
8.3.1 Psychological Example: The Use of Fractions
8.3.2 Epidemiological Example: Disease and Risk Exposure
8.3.3 The Bayesian Solution
8.4 Poisson Model for Rare Events
8.4.1 A Randomized Clinical Trial Example: Adverse Events
8.4.2 The Bayesian Solution
9 Fuzzy Procedures and Frequentist Inference
9.1 Ignorance Zone and Fuzzy Fiducial Bayesian Inference
9.1.1 The Inclusive and Exclusive Conventions
9.1.2 The Coherence of Noninformative Priors
9.1.3 Ignorance Zone and Frequentist Properties
9.1.4 Fuzzy Fiducial Bayesian Inference
9.1.5 Other Sampling Models for One Proportion
9.2 Generalizations
9.2.1 Poisson Model
9.2.2 Multinomial Model
9.3 Bayesian Predictive and Frequentist Reverse Probabilities
9.3.1 Reverse Stochastic Curtailment
9.3.2 Reverse Tests and Bayesian Predictive Probabilities
9.3.3 Reverse Bayesian Inference
10 Basic Procedures for Means
10.1 Fiducial Bayesian Methods for an Unstandardized Contrast
10.1.1 The Student Pharmaceutical Example
10.1.2 Specific Inference
10.2 Fiducial Bayesian Methods for a Standardized Contrast
10.2.1 A Conceptually Straightforward Generalization
10.2.2 Inference About the Proportion of Population Differences
10.3 Inference About Pearson's Correlation Coefficient
10.4 A Coherent Bayesian Alternative to GHOST
10.4.1 P-Values
10.4.2 Power
10.4.3 Interval Estimates
10.4.4 Effect Sizes
10.4.5 Making Predictions
10.4.6 Data Planning and Monitoring
10.5 Our Guidelines
11 ANOVA Procedures
11.1 From F Tests to Fiducial Bayesian Methods
11.1.1 The Traditional Approach
11.1.2 Fiducial Bayesian Procedures
11.1.3 Numerical Application
11.1.4 Some Conceptual and Methodological Considerations
11.2 The Scheffé Simultaneous Interval Estimate
11.3 Contrast Analysis
11.4 Covariance Analysis
11.5 Illustrative Example: The "0.05 Cliff Effect"
11.5.1 Numerical Results
11.5.2 A Cliff Effect Indicator
11.5.3 An Overall Analysis Is Not Suitable
11.5.4 An Appropriate Analysis
11.6 Avoid "Canned" Effect Sizes
11.6.1 Standardized Effects Are Problematic
11.6.2 The Abuses of Canned Effect Sizes
11.7 Our Guidelines for ANOVA
12 Time-to-Event Data
12.1 Exponential Model with Right Censoring
12.1.1 The Fiducial Bayesian Solution
12.2 The Posterior Predictive Distribution
12.2.1 Numerical Examples
12.2.2 Computational Illustration: The Myelomatosis Data
12.3 Weibull Model with Right Censoring
12.3.1 The Fiducial Bayesian Solution
12.3.2 Computational Illustration: The Myelomatosis Data
12.4 European Myocardial Infarction Amiodarone Trial
12.4.1 Interim Analysis
12.4.2 Other Analyses
12.4.3 Some Interesting Features of the Bayesian Approach
13 Population (Hierarchical) Models
13.1 A Hierarchical Poisson Model
13.2 The Reaction Time Example
13.2.1 Analyses with a Single Fixed Factor
13.2.2 Subjects and Trials Random Factors (Population Model)
13.3 A Mathematical Multicenter Survey
13.3.1 ANOVA Mixed Models
13.3.2 Analysis with a Single Random Factor
13.3.3 Pupils and Classes Random Factors (Population Model)
13.3.4 Fiducial Bayesian Distributions
13.3.5 Moments of the fB Distributions and Approximations
13.4 The Two-Period Two-Treatment Cross-Over Design
13.4.1 A Drug Versus Placebo Trial
13.4.2 A Hierarchical Model
13.5 Methodological Remarks About Hierarchical Models
14 The Contribution of Informative Priors
14.1 Informative Priors
14.1.1 Skeptical and Enthusiastic Priors
14.1.2 Mixtures of Beta Densities
14.1.3 The Bayes Factor
14.2 Examining the Sensitivity of the Conclusions to the Prior
14.2.1 Some Technical Results
14.2.2 Illustration
14.2.3 The Role of Informative Bayesian Techniques
15 Conclusion
References
Index