Abstract:
The reproducibility crisis in psychology has two known roots: the limitations of the traditional statistical system of null hypothesis significance testing, and flaws in the academic tradition of psychology. Drawing on the data reported by the Open Science Collaboration (2015), this article attempts a rough estimate of the respective impacts of these two roots. The methods proposed by Goodman (1992) and Cumming (2008) were used to estimate the limit that the traditional statistical system imposes on reproducibility. This limit, although substantial, falls far short of accounting for the reproducibility rate of only 36% indicated by the report, suggesting that the situation the report reflects also has major non-statistical roots. The model proposed by Ioannidis (2005) was then used to analyze the impact of these non-statistical factors, yielding several sets of estimates of the rate of bias and the probability that the alternative hypothesis (Ha) is true. These estimates indicate that, of the almost uniformly positive results obtained in the original studies, no more than about one third, and possibly fewer, were true positives, and that a considerable portion of the positive results may have been caused by bias. Such an analysis helps describe more concretely how these factors may have contributed to the current reproducibility crisis.
Key words: the reproducibility crisis in psychology, limitations of the traditional system of statistical testing, flaws in the academic tradition of psychology, the probability of true alternative hypotheses, bias, evaluation of the roots of the reproducibility crisis
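The two kinds of calculation named in the abstract can be sketched as follows. This is a minimal illustration, not the article's actual computation. The first function follows the general idea in Goodman (1992) of predicting a replication's chance of reaching significance from the original p value, under the simplifying assumptions that the replication has the same sample size and that the test statistic is approximately normal. The second uses the positive predictive value formula with bias from Ioannidis (2005). The function names, parameter names, and example inputs are hypothetical and are not taken from the Open Science Collaboration data. The p intervals of Cumming (2008) are not covered here.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def replication_probability(p_original, alpha=0.05, flat_prior=False):
    """Chance that a same-size replication is significant in the same direction.

    Assumption (illustrative): the true effect equals the originally observed
    effect. With flat_prior=True, the uncertainty of the original estimate is
    also propagated, which shrinks the z distance by a factor of sqrt(2).
    """
    z_observed = _STD_NORMAL.inv_cdf(1 - p_original / 2)  # two-sided p to z
    z_critical = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    distance = z_observed - z_critical
    if flat_prior:
        distance /= 2 ** 0.5
    return _STD_NORMAL.cdf(distance)


def positive_predictive_value(prior_odds, power, alpha=0.05, bias=0.0):
    """Share of positive findings that are true positives (Ioannidis, 2005).

    prior_odds: R, odds that a tested alternative hypothesis is true.
    bias: u, proportion of analyses that would not have been positive
          but are reported as positive anyway.
    """
    beta = 1 - power
    true_positives = (1 - beta) * prior_odds + bias * beta * prior_odds
    all_positives = (prior_odds + alpha - beta * prior_odds
                     + bias - bias * alpha + bias * beta * prior_odds)
    return true_positives / all_positives


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show how the functions behave.
    for p in (0.05, 0.01, 0.001):
        print(p,
              round(replication_probability(p), 2),
              round(replication_probability(p, flat_prior=True), 2))
    for u in (0.0, 0.1, 0.3):
        print(u, round(positive_predictive_value(0.25, 0.8, bias=u), 2))
```

Under these assumptions, an original result at exactly p = 0.05 has only an even chance of replicating at the same threshold, which is the sense in which the statistical system alone limits reproducibility. In the second function, raising the bias term lowers the share of true positives even when power and prior odds are held fixed.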
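Hu Chuanpeng, Wang Fei, Guo Jichengsi, Song Mengdi, Sui Jie, & Peng Kaiping. (2016). [The reproducibility problem in psychological research: From crisis to opportunity]. Advances in Psychological Science, 24(9), 1504-1518.
Hu Zhujing, Dong Shenghong, & Zhang Kuo. (2013). [A new exploration of the teaching content of "Psychological Statistics"]. Psychological Exploration, 33(5), 402-408.
Jiao Can, & Zhang Minqiang. (2014). [Lost boundaries: An inquiry into null hypothesis testing methods in psychology]. Social Sciences in China, (2), 148-163.
Zhu Ying. (2016). ["Open science, data sharing, software sharing": Are you ready?]. Advances in Psychological Science, 24(6), 995-996.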
Anderson, C. J., Bahník, Š., Barnett-Cowan, M., Bosco, F. A., Chandler, J., Chartier, C. R., … Zuni, K. (2016). Response to comment on “Estimating the reproducibility of psychological science”. Science, 351, 1037.
Baker, M. (2015). Over half of psychology studies fail reproducibility test. Nature. http://dx.doi.org/10.1038/nature.2015.182.
Baker, M. (2016). Psychology's reproducibility problem is exaggerated-say psychologists. Nature. http://dx.doi.org/10.1038/nature.2016.19498.
Carroll, A. E. (2017, May). Science needs a solution for the temptation of positive results. The New York Times. Retrieved from http://www.nytimes.com/.
Cumming, G. (2008). Replication and p intervals: P values predict the future only vaguely, but confidence intervals do much better. Perspectives on Psychological Science, 3(4), 286-300.
Ebersole, C. R., Atherton, O. E., Belanger, A. L., Skulborstad, H. M., Allen, J. M., Banks, J. B., … Nosek, B. A. (2016). Many Labs 3: Evaluating participant pool quality across the academic semester via replication. Journal of Experimental Social Psychology, 67, 68-82.
Gilbert, D. T., King, G., Pettigrew, S., & Wilson, T. D. (2016). Comment on “Estimating the reproducibility of psychological science”. Science, 351, 1037.
Goodman, S. N. (1992). A comment on replication, P-values and evidence. Statistics in Medicine, 11(7), 875-879.
Ioannidis, J. P. A. (2005). Why most published research findings are false. PLoS Medicine, 2, e124.
Ioannidis, J. P. A. (2012). Why science is not necessarily self-correcting. Perspectives on Psychological Science, 7, 645-654.
John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524-532.
Joober, R., Schmitz, N., Annable, L., & Boksa, P. (2012). Publication bias: What are the challenges and can they be overcome? Journal of Psychiatry & Neuroscience, 37(3), 149-152.
Kaiser, J. (2017). Rigorous replication effort succeeds for just two of five cancer papers. Science. doi:10.1126/science.aal0628.
Klein, R. A., Ratliff, K. A., Vianello, M., Adams, R. B., Jr., Bahník, Š., Bernstein, M. J., … Nosek, B. A. (2014). Investigating variation in replicability: A “many labs” replication project. Social Psychology, 45, 142-152.
Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science, 7, 615-631.
Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349, aac4716.
Wagenmakers, E. J., Wetzels, R., Borsboom, D., van der Maas, H. L. J., & Kievit, R. A. (2012). An agenda for purely confirmatory research. Perspectives on Psychological Science, 7(6), 632-638.