The American Statistician, Volume 65, Issue 4, 2011, pp. 265-273
As the use of spreadsheet packages for statistical analysis increases, so does the need for assessing the reliability of these packages. This study compares the accuracy of six spreadsheet packages: Excel, Google Docs, Gnumeric, Numbers, OpenOffice Calc, and Quattro Pro. The National Institute of Standards and Technology (NIST) compiled sets of data specifically to test for computational accuracy. Certified statistically accurate computations for standard statistical procedures accompany these datasets. This study analyzes the accuracy of summary statistics such as the mean, standard deviation, and autocorrelation, as well as the F statistics for a one-way ANOVA and the coefficients and R² statistics in regression analysis, using the Statistical Reference Datasets (StRD) provided by NIST. Wilkinson’s Tests are also examined to document each package’s ability to perform rounding, univariate statistics, scatterplots, and regression/correlation with particularly challenging data. The final analysis reports the accuracy of probability and percentile computations involving statistical distributions. The results suggest that Gnumeric is the most reliable both in performing statistical analysis and for calculations involving statistical distributions. Google Docs spreadsheet, while convenient, has deficiencies and should not be used for scientific statistical analysis. This article has supplementary material online.
KEYWORDS: Gnumeric; Microsoft Excel; OpenOffice; Open source; Software accuracy; Spreadsheet; StRD.