Sarah Belia and Fiona Fidler and Jennifer Williams and Geoff Cumming.
Researchers Misunderstand Confidence Intervals and Standard Error Bars.
In Psychological Methods, vol. 10, no. 4, pp. 389--396, 2005.


Abstract:

Little is known about researchers' understanding of confidence intervals (CIs) and standard error (SE) bars. Authors of journal articles in psychology, behavioral neuroscience, and medicine were invited to visit a Web site where they adjusted a figure until they judged 2 means, with error bars, to be just statistically significantly different (p < .05). Results from 473 respondents suggest that many leading researchers have severe misconceptions about how error bars relate to statistical significance, do not adequately distinguish CIs and SE bars, and do not appreciate the importance of whether the 2 means are independent or come from a repeated measures design. Better guidelines for researchers and less ambiguous graphical conventions are needed before the advantages of CIs for research communication can be realized.

Bibtex:

@Article{        belia:2005:RMCI,
  author = 	 {Sarah Belia and Fiona Fidler and Jennifer Williams and Geoff Cumming},
  title = 	 {Researchers Misunderstand Confidence Intervals and Standard Error Bars},
  journal = 	 {Psychological Methods},
  year = 	 {2005},
  volume = 	 {10},
  number = 	 {4},
  pages = 	 {389--396},
  month = 	 {December},
}

References:

Altman, D. G., Machin, D., Bryant, T. N., & Gardner, M. J. (2000). Statistics with confidence: Confidence intervals and statistical guidelines (2nd ed.). London: British Medical Journal Books.
American Psychological Association. (2001). Publication manual of the American Psychological Association (5th ed.). Washington, DC: Author.
Cohen, J. (1990). Things I have learned so far. American Psychologist, 45, 1304-1312.
Cohen, J. (1994). The earth is round (p < .05). American Psychologist, 49, 997-1003.
Cumming, G., & Finch, S. (2001). A primer on the understanding, use, and calculation of confidence intervals that are based on central and noncentral distributions. Educational and Psychological Measurement, 61, 530-572.
Cumming, G., & Finch, S. (2005). Inference by eye: Confidence intervals and how to read pictures of data. American Psychologist, 60, 170-180.
Cumming, G., Williams, J., & Fidler, F. (2004). Replication, and researchers' understanding of confidence intervals and standard error bars. Understanding Statistics, 3, 299-311.
Fidler, F., Thomason, N., Cumming, G., Finch, S., & Leeman, J. (2004). Editors can lead researchers to confidence intervals, but can't make them think: Statistical reform lessons from medicine. Psychological Science, 15, 119-126.
Finch, S., Cumming, G., & Thomason, N. (2001). Reporting of statistical inference in the Journal of Applied Psychology: Little evidence of reform. Educational and Psychological Measurement, 61, 181-210.
Finch, S., Thomason, N., & Cumming, G. (2002). Past and future American Psychological Association guidelines for statistical practice. Theory and Psychology, 12, 825-853.
Grissom, R. J., & Kim, J. J. (2005). Effect sizes for research: A broad practical approach. Mahwah, NJ: Erlbaum.
Harlow, L. L., Mulaik, S. A., & Steiger, J. H. (Eds.). (1997). What if there were no significance tests? Mahwah, NJ: Erlbaum.
Kline, R. B. (2004). Beyond significance testing: Reforming data analysis methods in behavioral research. Washington, DC: American Psychological Association.
Masson, M. E. J., & Loftus, G. R. (2003). Using confidence intervals for graphically based data interpretation. Canadian Journal of Experimental Psychology, 57, 203-220.
Nickerson, R. (2000). Null hypothesis significance testing: A review of an old and continuing controversy. Psychological Methods, 5, 241-301.
Oakes, M. (1986). Statistical inference: A commentary for the social and behavioural sciences. Chichester, England: Wiley.
Payton, M. E., Greenstone, M. H., & Schenker, N. (2003). Overlapping confidence intervals or standard error intervals: What do they mean in terms of statistical significance? Journal of Insect Science, 3, Article 34. Retrieved May 16, 2005, from www.insectscience.org/3.34
Schenker, N., & Gentleman, J. F. (2001). On judging the significance of differences by examining the overlap between confidence intervals. The American Statistician, 55, 182-186.
Schmidt, F. L. (1992). What do data really mean? Research findings, meta-analysis, and cumulative knowledge in psychology. American Psychologist, 47, 1173-1181.
Schmidt, F. L. (1996). Statistical significance testing and cumulative knowledge in psychology: Implications for training of researchers. Psychological Methods, 1, 115-129.
Smithson, M. (2002). Confidence intervals. Thousand Oaks, CA: Sage.
Thompson, B. (1996). AERA editorial policies regarding statistical significance testing: Three suggested reforms. Educational Researcher, 25, 26-30.
Thompson, B. (1999). Why "encouraging" effect size reporting is not working: The etiology of researcher resistance to changing practices. Journal of Psychology, 133, 133-140.
Wilkinson, L., & the Task Force on Statistical Inference. (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.
Wolfe, R., & Hanley, J. (2002). If we're so different why do we keep overlapping? When 1 plus 1 doesn't make 2. Canadian Medical Association Journal, 166, 65-66.