Why We Don’t Really Know What "Statistical Significance" Means: A Major Educational Failure
Postprint version. Published in Journal of Marketing Education, Volume 28, Issue 2, August 2006, pages 114-120.
Publisher URL: http://dx.doi.org/10.1177/0273475306288399
The Neyman-Pearson theory of hypothesis testing, with the Type I error rate, α, as the significance level, is widely regarded as statistical testing orthodoxy. Fisher’s model of significance testing, where the evidential p value denotes the level of significance, nevertheless dominates statistical testing practice. This paradox has occurred because these two incompatible theories of classical statistical testing have been anonymously mixed together, creating the false impression of a single, coherent model of statistical inference. We show that this hybrid approach to testing, with its misleading p < α statistical significance criterion, is common in marketing research textbooks, as well as in a large random sample of papers from twelve marketing journals. That is, researchers attempt the impossible by simultaneously interpreting the p value as a Type I error rate and as a measure of evidence against the null hypothesis. The upshot is that many investigators do not know what our most cherished, and ubiquitous, research desideratum - "statistical significance" - really means. This, in turn, signals an educational failure of the first order. We suggest that tests of statistical significance, whether p’s or α’s, be downplayed in statistics and marketing research courses. Classroom instruction should focus instead on teaching students to emphasize the use of confidence intervals around point estimates in individual studies, and the criterion of overlapping confidence intervals when one has estimates from similar studies.
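To make the closing recommendation concrete, here is a minimal sketch of reporting a confidence interval around a point estimate and checking whether the intervals from two similar studies overlap. It is an illustration under stated assumptions, not the authors' procedure: the function names, the normal approximation for the interval, and the two small data sets are all hypothetical and were chosen only for this example.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def confidence_interval(sample, level=0.95):
    """Return (lower, upper) bounds for the mean of `sample`.

    Assumption: a normal approximation is used. For small samples a
    t-based interval would be somewhat wider.
    """
    n = len(sample)
    point = mean(sample)
    std_error = stdev(sample) / sqrt(n)
    # Two-sided critical value, e.g. about 1.96 for level=0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return point - z * std_error, point + z * std_error


def intervals_overlap(a, b):
    """True if the closed intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical ratings from two similar studies (made-up data).
study_a = [4.1, 3.8, 4.5, 4.0, 4.3, 3.9, 4.2, 4.4]
study_b = [4.6, 4.2, 4.9, 4.4, 4.7, 4.1, 4.5, 4.8]

ci_a = confidence_interval(study_a)
ci_b = confidence_interval(study_b)

print(f"Study A: mean={mean(study_a):.2f}, 95% CI=({ci_a[0]:.2f}, {ci_a[1]:.2f})")
print(f"Study B: mean={mean(study_b):.2f}, 95% CI=({ci_b[0]:.2f}, {ci_b[1]:.2f})")
print("Intervals overlap:", intervals_overlap(ci_a, ci_b))
```

Reporting the estimate together with its interval shows both the size of the effect and its precision, which a bare "significant at p < α" statement does not convey.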
Raymond Hubbard and J. Scott Armstrong. "Why We Don’t Really Know What 'Statistical Significance' Means: A Major Educational Failure." Marketing Papers (2006).
Available at: http://works.bepress.com/j_scott_armstrong/47