Glass (1981) and Cohen (1990) believed that effect sizes were integral to statistical interpretation. The APA Task Force on Statistical Inference has long advocated for the inclusion of effect sizes (Wilkinson, 1999). Many statisticians argue that reporting effect sizes is necessary to maintain the integrity of research, a topic that has recently drawn intense debate because of poor replicability and other all too common failures in research. However, the vast majority of peer-reviewed, published academic studies stop short of reporting effect sizes and confidence intervals (Cumming, 2014). An over-reliance on p-values often conceals the fact that a study is under-powered (Halsey, Curran-Everett, Vowler, & Drummond, 2015); a test may be statistically significant yet practically inconsequential (Fritz, Scherndl, & Kühberger, 2013).
The effect size is a measure of the magnitude of an effect. How much did grades improve after an intervention? To what degree were symptoms reduced after a treatment? Effect sizes supplement p-values by providing this critical information. For the novice researcher, selecting an appropriate hypothesis test can be challenging enough, and estimating the corresponding effect size only adds to the difficulty. Providing student researchers with a streamlined tool to calculate effect sizes and interpret test statistics will benefit the field of psychological research over time.
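To make the idea concrete, consider a standardized mean difference (Cohen's d) for two independent groups. The sketch below is a minimal illustration in Python; the names and numbers are hypothetical, and it uses a simple large-sample normal approximation for the confidence interval, whereas MOTE's intervals in R are based on the noncentral t distribution and are more accurate.

```python
import math

def cohens_d(mean1, mean2, sd1, sd2, n1, n2):
    """Cohen's d with an approximate 95% CI (normal approximation).

    Illustrative only: MOTE computes its confidence intervals from the
    noncentral t distribution rather than this large-sample formula.
    """
    # Pooled standard deviation across the two groups
    s_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2)
                         / (n1 + n2 - 2))
    d = (mean1 - mean2) / s_pooled
    # Large-sample standard error of d
    se = math.sqrt((n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2)))
    return d, d - 1.96 * se, d + 1.96 * se

# Hypothetical example: exam scores after vs. before an intervention
d, lo, hi = cohens_d(mean1=82.0, mean2=75.0, sd1=10.0, sd2=10.0,
                     n1=30, n2=30)
print(f"d = {d:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

Here a d of about 0.7 would conventionally be read as a medium-to-large effect, which is exactly the kind of interpretive context a bare p-value omits.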
Our team designed our effect size tool with Shiny (Chang, Cheng, Allaire, Xie, & McPherson, 2017), a package in R (R Core Team, 2013). The application relies on mathematical operations provided by MOTE (Magnitude of the Effect), a versatile package developed by Buchanan, Scofield, and Valentine (2017). To begin, the user simply selects the research design and corresponding effect size from intuitive drop-down menus (as seen below). The output includes a helpful description, a video tutorial, and statistics in APA style, including the effect size and its confidence interval. The application is designed for future implementation in statistics classrooms at the undergraduate and graduate levels. Student and faculty feedback during beta testing has been overwhelmingly positive. We believe this application will aid in both teaching and learning statistics and research methods.
Buchanan, E., Scofield, J., & Valentine, K. D. (2017). MOTE: Effect size and confidence interval calculator. R package version 0.0.0.9100.
Chang, W., Cheng, J., Allaire, J., Xie, Y., & McPherson, J. (2017). Shiny: Web application framework for R. R package version 1.0.5.
Cohen, J. (1990). Things I have learned (so far). American Psychologist, 45(12), 1304, as cited by Sullivan & Feinn (2012).
Cumming, G. (2014). The new statistics: Why and how. Psychological Science, 25(1), 7-29.
Fritz, A., Scherndl, T., & Kühberger, A. (2013). A comprehensive review of reporting practices in psychological journals: Are effect sizes really enough? Theory & Psychology, 23(1), 98-122.
Glass, G. V., McGaw, B., & Smith, M. L. (1981). Meta-analysis in social research. Beverly Hills, CA: Sage, as cited by Sullivan, G. M., & Feinn, R. (2012). Using effect size—or why the P value is not enough. Journal of Graduate Medical Education, 4(3), 279-282.
Halsey, L. G., Curran-Everett, D., Vowler, S. L., & Drummond, G. B. (2015). The fickle P value generates irreproducible results. Nature Methods, 12(3), 179.
R Core Team (2013). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Wilkinson, L., & American Psychological Association Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.