Behav Res Methods. 2016 Dec;48(4):1205-1226.
doi: 10.3758/s13428-015-0664-2.

The prevalence of statistical reporting errors in psychology (1985-2013)

Affiliations

  • 1 Department of Methodology and Statistics, Tilburg School of Social and Behavioral Sciences, Tilburg University, Tilburg, Netherlands. m.b.nuijten@uvt.nl.
  • 2 Department of Methodology and Statistics, Tilburg School of Social and Behavioral Sciences, Tilburg University, Tilburg, Netherlands.
  • 3 Psychological Methods, University of Amsterdam, Amsterdam, Netherlands.

Michèle B Nuijten et al. Behav Res Methods. 2016 Dec.

Abstract

This study documents reporting errors in a sample of over 250,000 p-values reported in eight major psychology journals from 1985 until 2013, using the new R package "statcheck." statcheck retrieved null-hypothesis significance testing (NHST) results from over half of the articles from this period. In line with earlier research, we found that half of all published psychology papers that use NHST contained at least one p-value that was inconsistent with its test statistic and degrees of freedom. One in eight papers contained a grossly inconsistent p-value that may have affected the statistical conclusion. In contrast to earlier findings, we found that the average prevalence of inconsistent p-values has been stable over the years or has declined. The prevalence of gross inconsistencies was higher in p-values reported as significant than in p-values reported as nonsignificant. This could indicate a systematic bias in favor of significant results. Possible solutions for the high prevalence of reporting inconsistencies could be to encourage sharing data, to let co-authors check results in a so-called "co-pilot model," and to use statcheck to flag possible inconsistencies in one's own manuscript or during the review process.

Keywords: False positives; NHST; Publication bias; Questionable research practices; Reporting errors; Significance; p-values.
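The consistency check described in the abstract can be illustrated with a minimal sketch. It is not statcheck's implementation: it handles only a two-tailed z test so that it runs on the Python standard library alone, and the function names, the rounding treatment, and the default alpha of .05 are assumptions made for illustration. A result counts as inconsistent when the reported p-value cannot be reconciled with the p-values implied by the rounded test statistic, and as grossly inconsistent when the recomputed p-value also falls on the other side of alpha.

```python
import math


def two_tailed_p_from_z(z: float) -> float:
    """Two-tailed p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def check_z_result(z, reported_p, stat_decimals=2, p_decimals=2, alpha=0.05):
    """Hypothetical helper: compare a reported p-value with a recomputed one.

    Both the statistic and the p-value are assumed to have been rounded to
    the given number of decimals, so each is treated as an interval.
    """
    half_stat = 0.5 * 10 ** -stat_decimals
    half_p = 0.5 * 10 ** -p_decimals

    # The largest plausible |z| gives the smallest plausible p, and vice versa.
    p_min = two_tailed_p_from_z(abs(z) + half_stat)
    p_max = two_tailed_p_from_z(max(abs(z) - half_stat, 0.0))

    # Consistent if the reported p's rounding interval overlaps [p_min, p_max].
    consistent = (reported_p + half_p >= p_min) and (reported_p - half_p <= p_max)

    recomputed = two_tailed_p_from_z(z)
    # Gross inconsistency: the mismatch also flips the significance decision.
    gross = (not consistent) and ((reported_p < alpha) != (recomputed < alpha))
    return {"recomputed_p": recomputed, "consistent": consistent, "gross": gross}


if __name__ == "__main__":
    print(check_z_result(1.96, 0.05))  # matches after rounding
    print(check_z_result(2.50, 0.03))  # mismatch, conclusion unchanged
    print(check_z_result(1.50, 0.01))  # mismatch that flips the conclusion
```

Treating both numbers as rounding intervals keeps the sketch from flagging results that differ only because of rounding; a real checker would also need to cover t, F, chi-square, and correlation tests and one-tailed reporting.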

Figures

Fig. 1
The percentage of articles with American Psychological Association (APA)-reported null-hypothesis significance testing (NHST) results over the years, averaged over all APA journals (Developmental Psychology (DP), Journal of Consulting and Clinical Psychology (JCCP), Journal of Experimental Psychology: General (JEPG), Journal of Personality and Social Psychology (JPSP), and Journal of Applied Psychology (JAP); dark gray panel), and split up per journal – light gray panels for the APA journals and white panels for the non-APA journals (Psychological Science (PS), Frontiers in Psychology (FP), and Public Library of Science (PLoS)). For each trend we report the unstandardized linear regression coefficient (b) and the coefficient of determination (R²) of the linear trend.


Fig. 2
The average number of American Psychological Association (APA)-reported null-hypothesis significance testing (NHST) results per article that contains NHST results over the years, averaged over all APA journals (Developmental Psychology (DP), Journal of Consulting and Clinical Psychology (JCCP), Journal of Experimental Psychology: General (JEPG), Journal of Personality and Social Psychology (JPSP), and Journal of Applied Psychology (JAP); dark gray panel), and split up per journal (light gray panels for the APA journals and white panels for the non-APA journals – Psychological Science (PS), Frontiers in Psychology (FP), and Public Library of Science (PLoS)). For each trend we report the unstandardized linear regression coefficient (b) and the coefficient of determination (R²) of the linear trend.


Fig. 3
The average percentage of articles within a journal with at least one (gross) inconsistency and the average percentage of (grossly) inconsistent p-values per article, split up by journal. Inconsistencies are depicted in white and gross inconsistencies in grey. For the journals Journal of Personality and Social Psychology (JPSP), Journal of Experimental Psychology: General (JEPG), Developmental Psychology (DP), Frontiers in Psychology (FP), Public Library of Science (PLoS), Journal of Consulting and Clinical Psychology (JCCP), Psychological Science (PS), and Journal of Applied Psychology (JAP), respectively, the number of articles with null-hypothesis significance testing (NHST) results is 4,346, 821, 2,607, 702, 2,487, 2,413, 1,681, and 1,638, and the average number of NHST results in an article is 23.4, 23.0, 14.4, 14.5, 12.7, 11.4, 9.3, and 9.2

Fig. 4
Average percentage of inconsistencies (open circles) and gross inconsistencies (solid circles) in an article over the years averaged over all American Psychological Association (APA) journals (Developmental Psychology (DP), Journal of Consulting and Clinical Psychology (JCCP), Journal of Experimental Psychology: General (JEPG), Journal of Personality and Social Psychology (JPSP), and Journal of Applied Psychology (JAP); dark gray panel) and split up per journal (light gray panels for the APA journals and white panels for non-APA journals – Psychological Science (PS), Frontiers in Psychology (FP), and Public Library of Science (PLoS)). The unstandardized regression coefficient b and the coefficient of determination R² of the linear trend are shown per journal for both inconsistencies (incons) and gross inconsistencies (gross) over the years.


Fig. 5
Percentage of articles with at least one inconsistency (open circles) or at least one gross inconsistency (solid circles), split up by journal. The unstandardized regression coefficient b and the coefficient of determination R² of the linear trend are shown per journal for both inconsistencies (incons) and gross inconsistencies (gross) over the years. APA American Psychological Association, DP Developmental Psychology, JCCP Journal of Consulting and Clinical Psychology, JEPG Journal of Experimental Psychology: General, JPSP Journal of Personality and Social Psychology, JAP Journal of Applied Psychology, PS Psychological Science, FP Frontiers in Psychology, PLoS Public Library of Science.


Fig. 6
The percentage of gross inconsistencies in p-values reported as significant (white bars) and nonsignificant (gray bars), split up by journal. For the journals Journal of Applied Psychology (JAP), Journal of Consulting and Clinical Psychology (JCCP), Developmental Psychology (DP), Public Library of Science (PLoS), Psychological Science (PS), Frontiers in Psychology (FP), Journal of Personality and Social Psychology (JPSP), and Journal of Experimental Psychology: General (JEPG), respectively, the total number of significant p-values was 11,654, 21,120, 29,962, 22,071, 12,482, 7,377, 78,889, and 14,084, and the total number of nonsignificant p-values was 3,119, 5,558, 6,698, 9,134, 2,936, 2,712, 17,868, and 4,407

Fig. 7
The percentage of gross inconsistencies in p-values reported as significant (solid line) and nonsignificant (dotted line), over the years, averaged over journals. The size of the open and solid circles represents the number of significant and nonsignificant p-values in that year, respectively

Fig. 8
The total number of downloaded articles and the number of published articles that contain NHST results over the years, averaged over all American Psychological Association (APA) journals (Developmental Psychology (DP), Journal of Consulting and Clinical Psychology (JCCP), Journal of Experimental Psychology: General (JEPG), Journal of Personality and Social Psychology (JPSP), and Journal of Applied Psychology (JAP); dark gray panel), and split up per journal (light gray panels for the APA journals and white panels for the non-APA journals – Psychological Science (PS), Frontiers in Psychology (FP), and Public Library of Science (PLoS)). Note that the y-axes in the plots for All APA Journals, FP, and PLoS are different from the others and continue until 1,000, 1,050, and 3,750, respectively. The unstandardized regression coefficient ‘b’ and the coefficient of determination ‘R²’ of the linear trend are shown per journal for both the downloaded articles (down) and the articles with null-hypothesis significance testing results (NHST) over the years.


Fig. 9
The average number of exact and inexact null-hypothesis significance testing (NHST) results per article over the years, averaged over all journals (grey panel), and split up by journal (white panels). The unstandardized regression coefficient ‘b’ and the coefficient of determination ‘R²’ of the linear trend are shown per journal for both exact (ex) and inexact (inex) p-values over the years. APA American Psychological Association, DP Developmental Psychology, JCCP Journal of Consulting and Clinical Psychology, JEPG Journal of Experimental Psychology: General, JPSP Journal of Personality and Social Psychology, JAP Journal of Applied Psychology, PS Psychological Science, FP Frontiers in Psychology, PLoS Public Library of Science.
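
Several captions above report an unstandardized linear regression coefficient b and a coefficient of determination for a trend over the years. As a rough sketch of how those two quantities are obtained from a series of yearly values, the following uses ordinary least squares with the standard library only. The function name and the example numbers are hypothetical and are not taken from the article's data.

```python
def linear_trend(years, values):
    """Return (b, r_squared) for an ordinary least squares line through the points.

    b is the unstandardized slope (change in value per year) and r_squared
    is the share of variance in the values explained by the linear trend.
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n

    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))

    b = sxy / sxx
    intercept = mean_y - b * mean_x

    ss_tot = sum((y - mean_y) ** 2 for y in values)
    ss_res = sum((y - (intercept + b * x)) ** 2 for x, y in zip(years, values))
    r_squared = 1 - ss_res / ss_tot
    return b, r_squared


if __name__ == "__main__":
    # Hypothetical yearly percentages, for illustration only.
    b, r2 = linear_trend([0, 1, 2, 3], [1.0, 2.0, 2.0, 3.0])
    print(f"b = {b:.2f}, R^2 = {r2:.2f}")
```

Because b is unstandardized, it is expressed in the units of the plotted quantity per year, so slopes are comparable only between panels that share the same y-axis quantity.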
