Abstract
Background: Clinicians rely on abstracts for quick synopses of research findings that may apply to their practice. Spin within these abstracts can distort or misrepresent the findings. Our goal was to evaluate the level of spin within systematic reviews (SRs) focused on the treatment of cannabis use disorder (CUD). Methods: A systematic search was conducted in May 2020. To meet inclusion criteria, publications had to be either an SR or meta-analysis related to the treatment of cannabis use. Screening and data extraction were performed in a duplicate and masked fashion. Study quality was assessed using AMSTAR-2. Results: 16/24 SRs (66.7%) contained at least one form of spin in the abstract. The most common forms of spin identified were type 3 (selective reporting of or overemphasis on efficacy outcomes or analysis favoring the beneficial effect of the experimental intervention; 45.8%) and type 8 (conclusion extrapolates the review's findings from a surrogate marker or a specific outcome to the global improvement of the disease; 37.5%). No significant association between spin and intervention type, PRISMA requirements, or funding source was identified. Weak positive correlations were found between the presence of spin and abstract word count (r = 0.217) and between spin and AMSTAR-2 rating (r = 0.143). “Moderate” was the most common AMSTAR-2 rating (9/24, 37.5%), followed by “low” (7/24, 29.2%) and “critically low” (7/24, 29.2%). One systematic review received an AMSTAR-2 rating of “high” (1/24, 4.2%). Conclusions: Spin was common among abstracts of SRs focused on treatments for CUD. Higher quality studies and standardized treatment outcomes may help reduce the overall rate of spin. To facilitate this, we encourage all authors, peer reviewers, and editors to become more aware of the various types of spin, as they can help reduce the overall amount of spin seen within the literature.
Introduction
Cannabis is the most commonly misused substance within developed nations. 1 In 2017, statistics from the United Nations Office on Drugs and Crime showed that 188 million people globally had used some form of cannabis. 2 Approximately nine percent of cannabis users ultimately become dependent, and these estimates increase among daily users and those who began consuming it during their teenage years. 3 , 4
Cannabis use disorder (CUD) is defined in the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) by nine pathological patterns that are classified under the following categories: impaired control, social impairment, risky behavior, or physiological adaptation. 5 Leading evidence suggests a strong association between CUD and other psychiatric comorbidities, such as other substance use disorders (alcohol, drugs, and nicotine), mood, anxiety, personality disorders, and post-traumatic stress disorder. 6 Because of the high prevalence and toxic comorbidities of CUD, 7 conducting research on this disorder and reporting in a transparent and accurate manner is imperative—namely through the use and evaluation of systematic reviews (SRs).
Staying up to date on evidence enables providers involved in treating substance abuse to impart the highest quality of care to their patients and ultimately yields the best outcomes. Accordingly, providers must sometimes analyze and synthesize large amounts of medical literature to investigate and weigh the rationale for differences among studies on the same topic. SRs help ease this burden by summarizing the results of several primary studies into one and making recommendations for interventions regarding specific disease processes in a concise and specific manner. 8 Often, providers are pressed for time or have limited resources and must settle for reading the abstracts of articles for guidance. 9 , 10 If present, biased reporting in the abstracts of SRs on CUD could skew how providers interpret the findings of these studies and ultimately alter their treatment plan and patient outcomes. 11
The aim of an abstract is to provide an objective and precise representation of research findings that providers may use to “quickly [find] articles that are both scientifically sound and applicable to their practice.” 12 Yet, authors may present their findings in a biased manner to suggest that their experimental treatment was more beneficial than their results actually show. 11 This misrepresentation, known as “spin,” may lead readers to form faulty conclusions about a study and result in negative outcomes in patient care. 13 Yavchitz et al., 11 leading experts on the subject, define spin as a “specific way of reporting, intentional or not, to highlight that the beneficial effect of the experimental treatment in terms of efficacy or safety is greater than that shown by the results (i.e., overstate efficacy and/or understate harm).”
While early spin studies focused on randomized controlled trials, 14 the study design has since been broadened and applied to multiple study types across multiple scientific fields 15 – 18 including SRs. 11 , 19 , 20 To manage the subjective nature of the topic, Yavchitz et al. 11 developed a classification system to define the different types of spin found within SR abstracts and ranked them in order of severity. That is to say, the more severe forms of spin have a higher perceived likelihood of distorting readers’ interpretations. 11 For example, spin type 3, selective reporting of or overemphasis on efficacy outcomes or analysis favoring the beneficial effect of the experimental intervention, 11 is one of the most severe. Other examples of severe types of spin in SR abstracts include, but are not limited to, recommendations for clinical practice that are not supported by the findings, misleading titles, and de-emphasizing harms.
In a 2019 editorial, Fihn 21 recognized the increasing prevalence of spin in scientific papers and drew attention to the major implications that spin may have on clinical practices. Despite the increasing number of cannabis-related publications, spin has yet to be studied in the abstracts of CUD SRs. Therefore, the primary purpose of our paper is to explore spin in SRs related to the treatment of cannabis use, as its discovery may be invaluable to medical literature. Further, we will assess if characteristics of studies, including the quality of the SRs, type of intervention, or funding sources play a role in the presence of spin in these abstracts.
Methods
Search string
Based on the search strategy outlined in Figure 1, a systematic review librarian (DW) conducted a systematic search of electronic databases in May 2020. Results of this cross-sectional search were uploaded into an RCT screening platform, Rayyan (https://rayyan.qcri.org/), to determine eligibility for inclusion and eliminate duplicates.

Figure 1. Search strategies to obtain systematic reviews and meta-analyses.
Selection of studies
Prior to independently screening the titles and abstracts for eligibility, two investigators (AC and MN) attended training sessions on study designs in clinical research and on deduplication and screening in Rayyan, both of which were organized and delivered by the Vassar Research Team. The investigators also completed the online course “Systematic Review and Meta-Analysis,” which was administered by Johns Hopkins University on the Coursera platform. 22 Discrepancies in screening were resolved by identifying the sources of the investigators’ disagreement, and a consensus was reached in all instances without any need for arbitration from MH or MV.
Inclusion/exclusion criteria
To meet inclusion criteria, publications had to be an SR and/or meta-analysis related to the treatment of cannabis use. Given that CUD recently acquired its own diagnostic code within DSM-5, 5 we made efforts to capture a comprehensive range of treatment modalities by considering all publications related to the treatment of cannabis use, regardless of whether the cannabis use met the DSM-5 criteria. SRs of pharmacological, psychosocial, and combination treatments were eligible for inclusion when treatment groups were compared to either active (e.g., medication vs. brief interventions) or inactive controls (e.g., waitlist or placebo) in the primary studies. SRs with a limited number of studies, or in which scales could not be standardized, ultimately resulting in a narrative summation, were also included. All publications had to be in English and include only human subjects. Publications that were not SRs were excluded. SRs of other substance use that were not specific to cannabis use and any remaining studies that did not meet the inclusion criteria were also excluded.
Training
Before examining full-length texts, the same two investigators (AC and MN) attended workshops over two days on the nine most severe types of spin that occur in abstracts of SRs 11 and were trained in the appropriate use of A Measurement Tool to Assess Systematic Reviews (AMSTAR-2; https://amstar.ca/), the instrument used to evaluate the methodological quality of the included SRs. 23
Data extraction
Data extraction was completed in a masked, duplicate fashion utilizing pilot-tested Google Forms, with discrepancies resolved in a reconciliation meeting with no need for arbitration.
We extracted general characteristics of the included reviews as follows: the intervention type (pharmacologic, psychosocial, combination, or other); the date the review was completed; the funding source (industry, private, public, none, not mentioned, hospital, combination of funding including industry, or combination of funding not including industry); whether or not the review discussed adherence to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) for abstracts; whether the journal recommended adherence to PRISMA; the journal's five-year impact factor; and abstract word count, in addition to spin and the AMSTAR-2 rating.
Spin types
Yavchitz et al.'s classification system of spin in SRs was developed using a four-phase methodology consisting of a review of the literature on “spin” and consensus among all authors on the various categories and their location within a manuscript. With the help of members from the Cochrane Collaboration, they then ranked the categories in order of severity using a Q-sort survey. Out of the 21 types of spin for abstracts, we used the 9 most severe and applied them to SRs on CUD. They are as follows: (1) conclusion contains recommendations for clinical practice not supported by the findings; (2) title claims or suggests a beneficial effect of the intervention not supported by findings; (3) selective reporting of or overemphasis on efficacy outcomes or analysis favoring the beneficial effect of experimental intervention; (4) conclusion claims safety based on non-statistically significant results with a wide confidence interval; (5) conclusion claims the beneficial effect of the experimental treatment despite the high risk of bias in primary studies; (6) selective reporting of or overemphasis on harm outcomes or analysis favoring the safety of the experimental intervention; (7) conclusion extrapolates the review's findings to a different intervention; (8) conclusion extrapolates the review's findings from a surrogate marker or a specific outcome to the global improvement of the disease; and (9) conclusion claims the beneficial effect of the experimental treatment despite reporting bias.
AMSTAR-2
AMSTAR-2 is a critical appraisal tool used to rate the quality of SRs, and it has been used across multiple medical disciplines, including substance abuse. 24 – 29 Several studies have validated its use in terms of both its inter-rater reliability 24 and its construct validity. 30 The appraisal tool rates SRs based on 16 categories and provides users with overall appraisal ratings ranging from high (no or only one non-critical weakness) to critically low (more than one critical flaw). 24 The categories used in the assessment include study design, literature search strategies, extraction strategies, and risk of bias assessments considered by the authors. The full list of the 16 categories can be found in our supplementals.
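The overall rating logic can be sketched in a few lines of Python. This is only an illustration: the "high" and "critically low" thresholds are those stated above, while the "moderate" and "low" thresholds follow our reading of the published AMSTAR-2 guidance and are an assumption here. The function name and its inputs are hypothetical and are not part of the AMSTAR-2 tool itself.

```python
def amstar2_overall_rating(critical_flaws: int, noncritical_weaknesses: int) -> str:
    """Map counts of flaws to an overall AMSTAR-2 confidence rating.

    "high" and "critically low" follow the thresholds given in the text.
    "moderate" and "low" are assumed from the published AMSTAR-2 guidance.
    """
    if critical_flaws < 0 or noncritical_weaknesses < 0:
        raise ValueError("counts must be non-negative")
    if critical_flaws > 1:
        return "critically low"   # more than one critical flaw
    if critical_flaws == 1:
        return "low"              # assumed: exactly one critical flaw
    if noncritical_weaknesses > 1:
        return "moderate"         # assumed: several non-critical weaknesses only
    return "high"                 # no or only one non-critical weakness


# Hypothetical review with one critical flaw and two non-critical weaknesses.
example_rating = amstar2_overall_rating(critical_flaws=1, noncritical_weaknesses=2)
```

In practice, appraisers judge which items count as critical for a given review, so a function like this only captures the final tallying step.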
Statistical analysis
Results are presented as frequency counts and percentages that describe the overall frequencies of spin and its subtypes. To assess the relationships between the study characteristics (journal adherence to PRISMA, funding source, AMSTAR-2 rating, and journal impact factor) and the presence of spin within the abstracts, we used Cramér's V as a measure of effect size. In our protocol, we prespecified the possibility of using binary logistic regression and performed a power analysis to determine the required sample size. After screening, our sample of 24 SRs fell short of that size; therefore, we did not include binary logistic regressions in our study. To assess the relationship between abstract length and spin, we used point-biserial correlation. Analyses were predetermined by our protocol and were performed using Stata 16.1 (StataCorp, LLC, College Station, TX).
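For readers unfamiliar with these two statistics, the following Python sketch shows how each can be computed from first principles. It is not the Stata code used in this study; the function names and the small inputs are hypothetical and serve only to illustrate the formulas. Cramér's V is derived from the chi-square statistic of a contingency table, and the point-biserial correlation is the Pearson correlation between a binary variable (spin present or absent) and a continuous one (such as abstract word count).

```python
import math


def cramers_v(table):
    """Cramér's V for an r x c contingency table given as a list of rows.

    Assumes every row and column total is non-zero.
    """
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))


def point_biserial(binary, continuous):
    """Pearson correlation between a 0/1 variable and a continuous variable."""
    n = len(binary)
    mean_b = sum(binary) / n
    mean_c = sum(continuous) / n
    cov = sum((b - mean_b) * (c - mean_c) for b, c in zip(binary, continuous))
    var_b = sum((b - mean_b) ** 2 for b in binary)
    var_c = sum((c - mean_c) ** 2 for c in continuous)
    return cov / math.sqrt(var_b * var_c)


# Hypothetical inputs for illustration only (not data from this study).
perfect_association = cramers_v([[10, 0], [0, 10]])
no_association = cramers_v([[5, 5], [5, 5]])
toy_correlation = point_biserial([0, 0, 1, 1], [1, 2, 3, 4])
```

Both statistics describe the strength of an association rather than test it, which suits a small sample such as the 24 SRs analyzed here.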
Oversight, transparency, reproducibility, and reporting
As this study did not include any human participants, it did not meet the regulatory definition of human subject research per the U.S. Code of Federal Regulations. Therefore, it was not subject to institutional review board oversight. To ensure that our study is transparent and reproducible, we have made the protocol, extraction forms, data analysis scripts, and other study artifacts available via OSF (osf.io/swqz7/). To further support reproducibility, an independent investigation team re-analyzed our data and analysis scripts in a masked fashion. This study was conducted in tandem with other studies across a multitude of medical fields. Since these studies adhered to a common methodology, these methods have also been described elsewhere. 31 While drafting this manuscript, we followed the relevant reporting guidelines from PRISMA 32 and Murad and Wang's 33 guidelines for meta-epidemiological studies.
Results
General characteristics
Our database search obtained 559 manuscripts. Of these, 137 records were duplicates and removed, leaving us with a total of 422 records. After deduplication, AC and MN screened the titles and abstracts of the remaining records for inclusion/exclusion criteria. Of these, 38 were considered for inclusion. Upon requesting full-text copies of these manuscripts from our university librarians, we were provided one additional publication that met our inclusion criteria and was therefore added to our dataset for a total of 39. Fifteen of these were excluded during full-text evaluation and data extraction, leaving us with 24 SRs and meta-analyses included in our final sample. The rationale for excluded studies can be found in Figure 2.
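The screening flow described above can be checked with simple arithmetic. The short Python tally below uses only the counts reported in this paragraph; the variable names are ours.

```python
# Counts reported in the text for each stage of the screening flow.
records_identified = 559
duplicates_removed = 137
records_screened = records_identified - duplicates_removed

considered_after_screening = 38
added_by_librarians = 1
full_text_assessed = considered_after_screening + added_by_librarians

excluded_at_full_text = 15
included_reviews = full_text_assessed - excluded_at_full_text

print(records_screened, full_text_assessed, included_reviews)
```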

Figure 2. PRISMA flow chart.

Figure 3. Relative search interest from Google Trends for ‘psychosis’ after the publication of a cross-sectional analysis linking it with cannabis use (March 19) and the subsequent correction (June 1).
Non-pharmacological interventions (12/24, 50%) were most commonly evaluated in the SRs, followed by pharmacological interventions (5/24, 20.8%) and a combination of the two (5/24, 20.8%). Two (2/24, 8.3%) of the SRs used either CBD or synthetic cannabinoids as treatment. A total of 16 (16/24, 66.7%) SRs were published in PRISMA-endorsing journals. Eight of the SRs were not funded (33.3%), four SRs received public funding (4/24, 16.7%), and four (4/24, 16.7%) did not mention the funding source. All SR characteristics may be found in Table 1.
Spin types and frequencies (%) in abstracts (n = 24).
Spin in abstracts of systematic reviews and meta-analysis
Of the 24 SRs included in our investigation, 16 (66.7%) contained at least one form of spin in the abstract. Spin type 3 was the most common, appearing in 45.8% (11/24) of the articles, followed by spin type 8, found in 37.5% (9/24; Table 1). It should be noted that the use of surrogate measurements such as self-reporting surveys or questionnaires, as defined in spin type 8, is common practice in psychiatric research, which we address in our discussion. Six SRs (25%) were found to contain spin type 5 (conclusion claims the beneficial effect of the experimental treatment despite high risk of bias in primary studies). Each of the remaining spin types was found in four or fewer SRs. A weak positive correlation was found between the presence of spin and abstract word count (r = .20; Table 2).
Extracted characteristics from the included systematic reviews and meta-analyses.
5 journals did not have an impact factor (n = 15).
Appraisal of systematic reviews and meta-analysis
After appraising the SRs with AMSTAR-2, one (1/24, 4.2%) rated “high,” 9 (37.5%) rated “moderate,” 7 (29.2%) rated “low,” and 7 (29.2%) rated “critically low.” A weak positive correlation was found between spin and the AMSTAR-2 rating (r = 0.14; Table 2). There was no association between spin and intervention type, journal endorsement of PRISMA adherence, or funding source (Table 2). All SRs that performed a meta-analysis for their research questions (12/24, 50%) used the appropriate methods for the statistical combination of results. Only two SRs reported the funding of the primary studies they reviewed. The AMSTAR-2 quality requirements along with the percentage of responses can be found in our supplementals.
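As a quick consistency check, the reported percentages can be reproduced from the rating counts. The Python sketch below uses only the counts stated in this paragraph; the variable names are illustrative.

```python
# AMSTAR-2 rating counts as reported in the text (n = 24).
rating_counts = {"high": 1, "moderate": 9, "low": 7, "critically low": 7}
total_reviews = sum(rating_counts.values())

# Percentage of reviews in each rating category, rounded to one decimal place.
rating_percentages = {
    rating: round(100 * count / total_reviews, 1)
    for rating, count in rating_counts.items()
}

# Combined share of reviews rated "low" or "critically low".
low_quality_count = rating_counts["low"] + rating_counts["critically low"]
low_quality_percentage = round(100 * low_quality_count / total_reviews, 1)
```

Because each percentage is rounded independently, the four values need not sum to exactly 100.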
Discussion
Our finding of spin in two-thirds (66.7%) of the abstracts from systematic reviews and meta-analyses is concerning for the field of treatments for cannabis use disorder. Spin has been studied across multiple medical specialties, including anesthesiology, 34 cardiology, 35 , 36 obesity, 37 oncology, 38 – 40 pain medicine, 20 and psychotherapy, 41 among others, with spin prevalences ranging between 23% 34 and 80%. 20 Additionally, over half (14/24, 58.3%) of the reviews were rated as either “low” or “critically low,” illuminating an evident lack of high-quality SRs in this field, a factor often associated with spin. 37 , 42 – 46 Our findings corroborate the work of others and suggest that higher quality studies focused on the treatment of CUD are needed, as this may reduce the level of spin found within the literature.
It is important to note that spin may or may not be intentional: it may be a byproduct of authors having to omit words in order to meet word limits set by targeted journals. However, spin has important, residual consequences in the treatment of CUD and may adversely impact patient care, as caregivers are more likely to view a treatment option as favorable from an abstract containing spin. 13 An example of spin type 3, a category of high severity, was found where an abstract claimed that “several studies provided evidence of the effectiveness of CBD and CBD-containing compounds in the treatment of cannabis withdrawal symptoms and moderate to severe CUD with Grade B recommendation.” 47 However, only four of the 23 primary studies focused on moderate to severe cannabis use. Additionally, only one of these four studies used a reduction in cannabis intake as its outcome measure, and it showed no statistically significant reduction in cannabis use. This is similar to another high-profile cross-sectional study that led abstract readers to believe that the frequent use of high-potency cannabis increased the risk of developing a psychotic disorder. 48 The body of the article revealed that the findings were made by “assuming causality,” and follow-up articles, including an editorial from the original author, have been published in acknowledgment of the lack of sufficient evidence in this particular study to support its claim (Figure 3). 49 – 51
Spin type 8, the second most common form of spin identified in our study, involves the use of surrogate measures, which is problematic in the discussion of spin in the context of CUD and other substance use disorders. Although surrogate markers, including self-reports, are frequently used in CUD studies and clinical practice and can be used to measure the effectiveness of an intervention, they meet Yavchitz et al.'s criteria for spin type 8. Surrogate markers may introduce additional uncertainty into the interpretation of findings and may not be valid indicators of a disease process, because it is possible to see an improvement in the surrogate marker without true improvement in the disease process. Previous studies have shown that being in a clinical study may improve patient-reported outcomes with no clinical evidence of improvement. 52 This is especially true in the case of psychosocial interventions, where secondary outcome measures (e.g., improved coping and refusal skills) that do not measure cannabis use directly are often used to gauge progress and evaluate treatment effectiveness. 53 In addition, we acknowledge that biomarker tests for quantifying the amount and frequency of cannabis use in blood, breath, and urine analyses are still in development or are not widely available. 54 , 55 However, as these become more widely available, they should be included as a primary measure in clinical studies when it is appropriate to do so without introducing additional stigma for individuals with CUD.
While the use of surrogate markers was common, it underlines an important issue in SRs focused on treatments for CUD: the need for standardized criteria for treatment effectiveness. Despite a majority of studies equating the reduction of cannabis use to treatment effectiveness, few of the primary studies measured this consistently. Studies oscillated between the frequency of cannabis use, the quantity of cannabis use, and point-prevalence abstinence as their measure of reduction. Only a few included symptoms of CUD in their analysis. Consequently, the reviews failed to discuss whether a reduction in cannabis use was sufficient to mitigate the concerns present at the time people sought treatment, or sufficient that they no longer met diagnostic criteria for CUD. Currently, there is no consensus among medical professionals on a “safe” level of consumption, which only further complicates the issue. 53 As sustained abstinence is often difficult for many cannabis users to achieve, 53 determining this level of safe use, and the means by which to measure it, will allow researchers to utilize a more direct measure and help diminish the amount of spin found within these SRs.
Given its potential to influence patient care, the goal should be to minimize the amount of spin found within abstracts as much as possible, and identifying spin requires a collaborative effort among researchers, peer reviewers, and journal editors alike. Some researchers suggest that publishing more work with both positive and negative outcomes may help reduce the incentives for authors to spin. 35 To reduce spin in the abstracts of SRs on the treatment of cannabis use, as well as other topics, we suggest journals adopt a specific structure for SR abstracts in which primary and secondary objectives are paired with adjacent results to mitigate the selective reporting of outcomes, which may require a less firm stance on word limits.
Further, providers and researchers focused on treatments of CUD should endeavor to familiarize themselves with appropriate reporting standards in abstracts, such as the PRISMA-A guidelines. 56 Researchers should also operationalize definitions of efficacy and success of CUD treatment. Providing caregivers with clear criteria for success, and the degree to which a treatment has been demonstrated to meet those criteria, will reduce the prevalence of spin type 8 within this expanding field.
Strengths and limitations
Regarding the strengths of our study, investigators followed a rigorous methodology and published a protocol a priori to foster reproducibility and transparency. The investigators performed screening and data extraction in duplicate while the responses were masked until data collection was complete—a methodological approach that is considered to be the gold standard in SRs. The results of the current study were verified by an independent group to ensure statistical reproducibility.
Possible limitations of our study include: (1) The classification of spin is subjective by nature. In an effort to reduce the subjectivity associated with classification, systematic training on spin types, and ongoing coaching and discussion opportunities were provided to investigators. (2) There were relatively few papers meeting inclusion criteria for our analysis, which is a plausible explanation for the lack of statistically significant findings. (3) The classification of self-reports and other rating scales as surrogate markers may reflect an overidentification of spin type 8. (4) Studies published prior to the 2017 inception of AMSTAR 2, which provided a more comprehensive evaluation checklist for SRs than previous tools, may have inherently achieved lower ratings. (5) The current paper is cross-sectional, and our results should be interpreted as such. (6) Finally, our spin analysis was limited to the top nine most severe forms of spin, as defined by Yavchitz et al. 11 Only including the nine most severe forms of spin provides an insight into the amount of spin in cannabis use literature, but may underrepresent spin as a whole.
Conclusion
Overall—despite its pervasiveness—the amount of spin within the literature regarding the treatment of cannabis use disorder is correctable. We encourage all readers of this article to become acquainted with spin so that they make the most informed decisions. Higher quality studies, adherence to PRISMA-A guidelines, and establishing clear treatment efficacies of CUD may help reduce the overall amount of spin found within the literature.
Footnotes
Author contributions
WA, OT, MH, and MV conceptualized and designed the research study. DW conducted systematic searches and wrote the introduction. MH performed the statistical analysis. AC and MN screened/extracted data and wrote the first draft. All authors reviewed the results and contributed to the manuscript.
References
