Statistical Report Reform in Second Language Research: A Case Of Experimental Designs

Authors

  • Eka Fadilah, Universitas Widya Kartika Surabaya, Indonesia

DOI:

https://doi.org/10.30762/jeels.v8i2.3415

Keywords:

statistical reforms, power analyses, effect size, confidence interval, second language research

Abstract

This survey reviews statistical reporting procedures in experimental studies published in ten SLA and applied linguistics journals from 2011 to 2017. We focus on how authors report and interpret power analyses, effect sizes, and confidence intervals. Results reveal that, of 217 articles, 70% reported effect sizes, 1.8% and 6.9% reported a priori and post hoc power, respectively, and 18.4% reported confidence intervals. Authors interpreted those statistics in 5.5%, 27.2%, and 6% of the articles, respectively. These findings echo the call for statistical reporting reform recommended and endorsed by scholars, researchers, and editors, a reform intended to shed more light on the trustworthiness and practicality of the data presented.

Published

2021-11-25

How to Cite

Fadilah, E. (2021). Statistical Report Reform in Second Language Research: A Case Of Experimental Designs. JEELS (Journal of English Education and Linguistics Studies), 8(2), 175–201. https://doi.org/10.30762/jeels.v8i2.3415