Abstract
Newspapers preferentially cover initial biomedical findings although they are often disconfirmed by subsequent studies. We analyzed 426 newspaper articles covering 40 initial biomedical studies associating a risk factor with 12 pathologies and published between 1988 and 2009. Most articles presented the study as initial but only 21% mentioned that it must be confirmed by replication. Headlines of articles with a replication statement were hyped less often than those without. Replication statements have tended to disappear after 2000, whereas hyped headlines have become more frequent. Thus, the public is increasingly poorly informed about the uncertainty inherent in initial biomedical findings.
Introduction
The production of scientific knowledge is an incremental process in which early, promising, but still tentative findings are validated through replication. Thus, initial scientific results are uncertain per se. Over the past two decades, numerous scientific studies and editorials have highlighted the “reproducibility crisis” in biomedical sciences, in which many initial findings failed to be replicated (Baker, 2016; Ioannidis, 2005b; Yong, 2012). The highly competitive system for funding research and getting an academic position is often blamed: scientists are increasingly evaluated on the number of their publications (Lawrence, 2003; Reich, 2013). Scientists are therefore rewarded when they publish new, positive findings in high–impact factor journals. This evolution is visible in the way scientific publications are written: there is a marked increase in the use of positive words such as “robust,” “novel,” or “unprecedented” in the summaries (Vinkers, Tijdink, & Otte, 2015). Moreover, editors of prestigious scientific journals favor newsworthy studies by preferentially publishing new, positive, and exciting results (Bucchi, 2015; Franzen, 2012; Smith, 2006). They also issue press releases highlighting these studies. Scientific institutions are also increasing their public relations activities, in particular by issuing press releases (Peters et al., 2008; Schafer, 2011). This trend has been described by scholars in science communication as the medialization of science (Franzen, 2012; Peters, 2012; Weingart, 1998), in which scientific communication is adapted to journalistic values and media routines.
Observational studies tend to confirm this inclination toward the media: the interest of the findings is often exaggerated in press releases through various types of omission, simplification, and overgeneralization (Bartlett, Sterne, & Egger, 2002; Sumner et al., 2016) and even in the abstract and discussion of scientific papers (Boutron, Dutton, Ravaud, & Altman, 2010). Exaggerations and distortions present in press releases often spread to newspaper articles (Brechman, Lee, & Cappella, 2009; Schwartz, Woloshin, Andrews, & Stukel, 2012; Sumner et al., 2016). Furthermore, scientific papers described in press releases are more likely to be covered by newspapers (Stryker, 2002). Indeed, most journalists consider peer-reviewed journals to be trustworthy sources of news stories (Hansen, 1994; Stryker, 2002) and tend to copy and paste press releases when covering single scientific papers (Autzen, 2014; Taylor et al., 2015). They also favor the results of initial studies published in high–impact factor journals (Dumas-Mallet, Smith, Boraud, & Gonon, 2017). Yet initial results are poorly replicated (Dumas-Mallet, Button, Boraud, Munafo, & Gonon, 2016; Ioannidis, 2005a, 2008), and journalists scarcely mention these invalidations (Dumas-Mallet et al., 2017). Consequently, because many biomedical findings covered by newspapers fail to be reproduced, the general public mainly receives flawed information about biomedical discoveries (Gonon, Konsman, Cohen, & Boraud, 2012).
Scientific uncertainty in the media has been the subject of many publications (Friedman, Dunwoody, & Rogers, 1999; Lehmkuhl & Peters, 2016; Maier et al., 2016; Peters & Dunwoody, 2016). These analyses have focused on the uncertainty related to controversial studies (Friedman et al., 1999; Nelkin, 1995), conflicting results (Stocking & Holstein, 2009), and the risks associated with new technologies (Dudo, Dunwoody, & Scheufele, 2011; Kitzinger & Reilly, 1997; Lehmkuhl & Peters, 2016; Ruhrmann, Guenther, Kessler, & Milde, 2015). Other authors observed that news stories often omit important information about the scientific context of the finding they cover (Holtzman et al., 2005; Lai & Lane, 2009; Pellechia, 1997; Singer, 1990). More specifically, some scholars have acknowledged the uncertainty inherent to initial scientific findings (Friedman et al., 1999; Peters & Dunwoody, 2016), but we are not aware of any study investigating its media coverage. This gap is especially relevant because the coverage of single scientific papers is considered a media routine and is “mostly linear, unproblematic and solidly backed by scientific sources”
As the public is still mainly receiving information about scientific discoveries through the media (Weitkamp, 2014), it is timely to study how journalists present the results of biomedical findings. Here we investigated how journalists presented the results of 40 initial biomedical studies covered by 426 newspaper articles. These studies were selected during a research project designed to explore the validity of a large number of initial biomedical studies covered by the press. In a first step, we selected, through a PubMed search, 663 meta-analyses associating a risk factor (e.g., smoking) with a pathology. We considered all robust meta-analyses published between 2008 and 2012 and related to 12 different pathologies in three biomedical domains: psychiatry (attention deficit hyperactivity disorder [ADHD], schizophrenia, major depression disorder, and autism), neurology (Parkinson’s and Alzheimer’s diseases, multiple sclerosis, and epilepsy), and a set of four somatic diseases (breast cancer, psoriasis, glaucoma, and rheumatoid arthritis). Meta-analyses were considered robust if they included at least seven independent primary studies. Less extensive meta-analyses were discarded. We then identified the corresponding initial studies and observed that only 45.2% of them were confirmed by the corresponding meta-analyses. This low reproducibility rate was independent of the biomedical domain and of the journal impact factor (Dumas-Mallet et al., 2016). Thus, initial studies published in high–impact factor journals were not more reliable.
In a second step, we investigated the press coverage of scientific studies included in our database (initial studies, subsequent studies, and meta-analyses). We used the Dow Jones Factiva database to identify which scientific articles were covered by English-language newspapers. Our database included 5,029 studies, of which 161 were covered by newspapers (156 primary studies and 5 meta-analysis articles). We observed that newspapers preferentially covered initial positive studies: 13.1% of initial studies were covered, whereas only 2.4% of subsequent studies were. They also favored studies related to lifestyle risk factors (e.g., smoking). Moreover, newspapers preferentially echoed studies published in high–impact factor journals. Finally, 51.3% of these 156 primary studies were disconfirmed by the corresponding meta-analyses, and newspapers rarely mentioned these invalidations. The third step of our research project, described here, is aimed at investigating how the results of initial studies are presented in newspapers. The study was designed to answer three research questions:
In each press article, we searched for wordings describing the study as initial and for statements indicating the need for the results to be confirmed by replication. We used the latter as representative of how journalists conveyed the uncertainty that is inherent in initial studies.
We analyzed the influence of six different factors: the length (in words) of the newspaper article, the tone of the title (exaggerated or neutral), the publication year (1988-1999 vs. 2000-2009), the presence of a quotation of an author of the study, the presence of a claim emphasizing the robustness of the scientific results, and the description of the study as initial.
Four countries are highly represented in the newspapers of our database: Australia, Canada, the United Kingdom, and the United States. We investigated if the acknowledgment of the uncertainty of initial biomedical results varies between these countries.
Method
Selection of Newspaper Articles
We started from a database of 5,029 biomedical studies associating a risk factor with 12 pathologies (Dumas-Mallet et al., 2016). Then, we used the Dow Jones Factiva database to find newspaper articles covering the scientific studies of our database (Dumas-Mallet et al., 2017). Briefly, each search began with unspecific keywords (scientist* OR research*) combined with specific ones (e.g., gene, smoking) and the name of the pathology. Each search was restricted to 1 month after the publication date of the study. We only considered newspaper articles written in English and published in the general press. Articles published in the specialized press (e.g., Pharma Business Week) or by any press agency were not taken into account. This search retrieved 1,561 newspaper articles (Dumas-Mallet et al., 2017). The present study focused on a subpopulation of them. Here, we have only analyzed the newspaper coverage of all initial scientific studies covered by 2 or more newspaper articles (40 studies yielding 426 newspaper articles) and the coverage of subsequent studies on the same topics (10 studies yielding 111 newspaper articles).
Content Analysis
We searched for the following elements in each newspaper article: wordings describing the study as initial, replication statements mentioning the need for confirmatory experiments, and claims overstating the strength of the results. We also classified every headline as exaggerated or neutral. Headlines overstating the study by using words such as “breakthrough” or “key finding” were classified as exaggerated, as were those likely to mislead the reader by mentioning a possible cure or a diagnostic tool when scientists had only identified a risk factor. When headlines were factual statements such as “Genetic variations found in depression,” they were classified as neutral. When headlines met neither of the above-mentioned criteria for exaggeration and used the word “study,” “researchers,” or “research,” they were also classified as neutral. Some examples are given in Table 1.
Typical Examples of Hyped Headlines Versus Factual or Prudent Headlines.
Two authors (EDM and FG) independently classified all headlines as hyped or neutral. After comparing their classification and resolving their disagreements by discussion, they obtained a first classification. Because this classification was partly subjective, its reliability was verified independently by a third author (AS), who was not involved in the first coding. His classification was compared to the first one, and the Cohen’s kappa coefficient was calculated without weighting (Sim & Wright, 2005). Disagreements were resolved by discussion to establish the final headlines classification.
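The unweighted Cohen's kappa used here compares the observed agreement between two codings with the agreement expected by chance from each coder's marginal frequencies. A minimal sketch in Python follows; the function name `cohen_kappa` and the ten toy labels are illustrative assumptions, not the study's data or the authors' code.

```python
from collections import Counter

def cohen_kappa(coder_a, coder_b):
    """Unweighted Cohen's kappa for two equal-length lists of labels."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must rate the same, non-empty set of items")
    n = len(coder_a)
    # Proportion of items on which the two codings agree.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Chance agreement from the product of the marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        # Both codings use one identical label; kappa is undefined,
        # and returning 1.0 is a convention chosen for this sketch.
        return 1.0
    return (observed - expected) / (1.0 - expected)

# Hypothetical toy coding of ten headlines (not taken from the study).
first = ["hyped", "hyped", "neutral", "neutral", "neutral",
         "hyped", "neutral", "neutral", "hyped", "neutral"]
third = ["hyped", "neutral", "neutral", "neutral", "neutral",
         "hyped", "neutral", "neutral", "hyped", "hyped"]
kappa = cohen_kappa(first, third)
print(f"kappa = {kappa:.2f}")
```

Because kappa subtracts chance agreement, it is lower than the raw proportion of matching labels whenever the two codings have any chance overlap.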
The content of each newspaper article was also screened for wordings indicating that the journalist adequately presented the study as initial. To select specific wordings, two authors (EDM and FG) read the 45 news stories covering 4 initial studies related to ADHD and the 111 news stories covering 10 subsequent (i.e., noninitial) studies. Accordingly, we considered wordings such as “for the first time” or “in a preliminary survey” as presenting the study as an initial one but wordings such as “the new finding” as nondiscriminating. Some examples are given in Table 2. Then, both authors independently classified the 381 remaining newspaper articles covering initial studies related to other pathologies. Both classifications were compared to calculate the Cohen’s kappa coefficient, and disagreements were resolved by discussion.
Typical Wordings Suggesting That a Scientific Study Is Perceived by the Journalist to Be an Initial Study.
a When the word “discovery” was not associated with a description, it was considered insufficient evidence that the study was reported as initial.
b When this wording was associated in the same article with a statement that the reported study replicated previous studies, we assumed that “first” qualified “convincing” rather than “evidence,” and this wording was no longer considered as identifying an initial study.
The content of each newspaper article was screened for statements about the robustness or uncertainty of the results. Typical examples of replication statements and robustness claims are given in Table 3. We also identified who made these statements: an author of the study or an independent expert. When these statements were not inserted in a quotation, their authorship was attributed by default to journalists. Finally, we searched each article for any authors’ quotations. To avoid errors or omissions regarding the presence or absence of replication statements, robustness claims, and authors’ quotations, two authors (EDM and FG) independently performed these searches in all 426 newspaper articles. Classifications were compared and Cohen’s kappa coefficients were calculated without weighting. Disagreements were resolved by discussion to establish the final classification. Raw data are available on request to the corresponding author (EDM).
Typical Statements That Replication Is Needed or That the Data Are Robust.
Statistical Methods
Binary logistic regressions were performed using a standard iterative method and calculated using R-CRAN software.
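The text does not specify the iterative method beyond calling it standard. Logistic models are commonly fitted by Newton-Raphson iterations (equivalently, iteratively reweighted least squares), and the sketch below illustrates that idea in plain Python for a model with an intercept and a single binary predictor. The function name, the grouped counts, and the fixed iteration budget are all assumptions made for illustration; the example does not reproduce the six-predictor models reported in this study.

```python
import math

def fit_logistic_binary(groups, iterations=25):
    """Newton-Raphson fit of logit(p) = b0 + b1 * x on grouped data.

    groups: list of (x, n_items, n_events) tuples with x in {0, 1}.
    Returns the fitted coefficients (b0, b1).
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix (negative Hessian)
        for x, n, y in groups:
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = y - n * p
            w = n * p * (1.0 - p)
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        # Solve the 2x2 linear system (information matrix) * step = gradient.
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical grouped counts (not from the study):
# x = 0: 40 events among 100 items; x = 1: 20 events among 100 items.
toy_groups = [(0, 100, 40), (1, 100, 20)]
b0, b1 = fit_logistic_binary(toy_groups)
print(f"odds ratio for x = {math.exp(b1):.3f}")
```

With a single binary predictor, the exponentiated slope coincides with the odds ratio of the underlying 2x2 table, which offers a quick sanity check on the iteration.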
Results
Validation of the Coding Methods
The 426 newspaper headlines were classified as hyped or neutral by two authors (EDM and FG) and then compared to the classification performed by a third author (AS). Cohen’s kappa coefficient obtained for this comparison was 0.71. This value is within the range of a “substantial” strength of agreement (0.61-0.8; Sim & Wright, 2005). In order to determine whether newspaper articles presented each study as initial, two authors jointly screened 45 news stories covering four initial studies about ADHD to select specific wordings (Table 2). Then, both authors independently applied these selection criteria to the remaining 381 news stories covering 36 other initial studies. Cohen’s kappa coefficient obtained by comparing both classifications was 0.896. This value is within the range of an “almost perfect” strength of agreement (0.81-1.0; Sim & Wright, 2005). Finally, the presence or absence of an author’s quotation, a replication statement, and a robustness claim in the 426 newspaper articles were checked independently by two authors (EDM and FG). Cohen’s kappa coefficients obtained by comparing classifications were 0.977, 0.856, and 0.994, respectively.
Overview of Newspaper Articles Covering Initial Studies
The 40 initial studies included in our analysis were each covered, on average, by 11 newspaper articles (range: 2-50). The newspaper articles averaged 417 words in length (range: 14-1,111). Thirteen of these 40 initial studies were validated by the corresponding meta-analysis, whereas 27 were invalidated (67.5%). This poor replication validity is similar to the one more generally observed for initial studies published in peer-reviewed biomedical journals (Dumas-Mallet et al., 2016).
The 426 newspaper articles of our database were published between 1988 and 2009. Among them, 120 covered the 19 initial studies published between 1988 and 1999 and 306 echoed the 21 studies published from 2000 to 2009. Among these 426 newspaper articles, 400 were published by general newspapers printed in 4 countries (39 in Australia, 95 in Canada, 99 in the United Kingdom, and 167 in the United States) and 26 in several other countries (Hong-Kong, India, Ireland, Korea, New Zealand, Pakistan, and Singapore).
We rated 150 headlines as hyped (35.2%). We found wordings suggesting that the journalist presented the study as initial in 243 newspaper articles (57%). The authors of the study were quoted in 302 newspaper articles (70.9%). Finally, we found 91 replication statements (21.4%) and 105 robustness claims (24.6%) overstating the strength of the findings. Most replication statements (81.3%) and most robustness claims (78%) appeared in quotations either from the authors of the study or from experts in the field. The list of the 426 newspaper articles and their characteristics are available on request to the corresponding author.
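The percentages in this overview follow directly from the reported counts over the 426 newspaper articles. As a quick arithmetic check, the short sketch below recomputes them; the dictionary keys are descriptive labels chosen for this example.

```python
TOTAL_ARTICLES = 426

# Counts as reported in the paragraph above.
counts = {
    "hyped headline": 150,
    "study presented as initial": 243,
    "author quoted": 302,
    "replication statement": 91,
    "robustness claim": 105,
}

percentages = {
    name: round(100 * n / TOTAL_ARTICLES, 1) for name, n in counts.items()
}
for name, pct in percentages.items():
    print(f"{name}: {pct}%")
```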
Factors Associated With the Presence of a Replication Statement
Using binary logistic regression, we tested whether the presence of a replication statement was associated with six predictor variables: the presence of a robustness claim, the presence of an author’s quotation, the presence of wordings presenting the study as initial, the presence of an exaggerated headline, whether the article length exceeded 200 words, and whether it was published after 1999 (Table 4). Three predictor variables were strongly associated with the presence of a replication statement: the presence of an author’s quotation, the presence of an exaggerated headline, and the publication period. Indeed, newspaper articles including an author’s quotation were almost 3 times more likely to mention a replication statement (Table 4). This is consistent with the fact that 44/91 replication statements appeared within authors’ quotations. The negative association between the presence of a replication statement and of an exaggerated headline is statistically significant (odds ratio [OR] = 0.38, p = 0.0025). It is also observed if the exaggerated headline is chosen as the dependent variable and the presence of a replication statement as a predictor variable (OR = 0.37, p = 0.0015). The publication period inversely affected the presence of a replication statement (Table 4) and of an exaggerated headline. Indeed, the percentage of hyped headlines was much lower before 2000 (15%) than during the 2000s (43.1%), whereas the percentage of replication statements was much higher before (35%) than after 2000 (16%). Finally, wordings suggesting that the covered study is an initial one moderately predict the presence of a replication statement (OR = 1.77, p = 0.043).
Factors Associated With the Presence of a Replication Statement.
Note. Logistic regression, overall model fit: χ2 = 63.16, p < 0.0001.
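To give a feel for the size of the period effect, unadjusted (crude) odds ratios can be computed from a 2x2 table. The sketch below is an illustration under an explicit assumption: the counts are reconstructed by rounding the percentages reported in the text (35% of 120 and 16% of 306 for replication statements; 15% of 120 and 43.1% of 306 for hyped headlines). Because these ratios are not adjusted for the other predictors, they are not expected to match the values in Table 4.

```python
def odds_ratio(events_exposed, n_exposed, events_reference, n_reference):
    """Crude odds ratio of an outcome, exposed group versus reference group."""
    odds_exposed = events_exposed / (n_exposed - events_exposed)
    odds_reference = events_reference / (n_reference - events_reference)
    return odds_exposed / odds_reference

# Reconstructed counts (assumption: rounded from the reported percentages).
# Reference group: 120 articles from 1988-1999; exposed group: 306 from 2000-2009.
# Replication statements: 42 of 120 before 2000, 49 of 306 after (total 91).
# Hyped headlines: 18 of 120 before 2000, 132 of 306 after (total 150).
or_replication = odds_ratio(49, 306, 42, 120)
or_hyped = odds_ratio(132, 306, 18, 120)

print(f"crude OR, replication statement (2000s vs. before): {or_replication:.2f}")
print(f"crude OR, hyped headline (2000s vs. before): {or_hyped:.2f}")
```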
The logistic regression indicates that the length of the newspaper articles does not affect the presence of a replication statement (Table 4). However, the 97 short articles (<200 words) less often quoted the authors (38%) and less often mentioned a replication statement (9.3%) than longer ones (80.2% and 24.9%, respectively). This is expected because replication statements often appeared within authors’ quotations. This explains why the logistic regression with six predictor variables does not reveal the association of a replication statement with the story length. Indeed, when author’s quotation is excluded from the predictor variables, the logistic regression shows that replication statements are almost 3 times more frequent in longer news stories (>200 words; OR = 2.70, p = 0.012). Finally, the presence of a robustness claim is not associated with the presence of a replication statement.
Subanalysis of Newspaper Articles According to Their Validation Status
Because subsequent studies and meta-analyses were yet to be performed when newspapers reported initial studies, it might seem rather illogical, at first glance, to test an association between the validation status and the presence of a replication statement and/or a robustness claim. However, this test uncovered an unexpected observation: all but 4 of the 91 replication statements were found in the 257 newspaper articles covering the 27 initial studies that were subsequently invalidated (chi-square test: χ2 = 60.2, p < 0.0001). In contrast, we found no difference regarding the presence of robustness claims whether the study was subsequently validated or not.
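The reported chi-square value can be recovered from the counts given above: 87 of the 257 articles covering later-invalidated studies contained a replication statement, against 4 of the remaining 169 articles (426 minus 257). The sketch below computes a Pearson chi-square for this 2x2 table in plain Python. Omitting the continuity correction is an assumption, chosen because it matches the reported value; the function name is illustrative.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    observed = ((a, b), (c, d))
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2

# Rows: study later invalidated / later validated.
# Columns: replication statement present / absent.
chi2 = chi_square_2x2(87, 257 - 87, 4, 169 - 4)

# With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.1f}, p < 0.0001: {p_value < 0.0001}")
```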
Hyped Headlines and Replication Statement: Geographical Differences
We tested whether the percentage of newspaper articles mentioning a replication statement and the percentage of hyped headlines varied between the four countries widely represented in our database: Australia, Canada, the United Kingdom, and the United States. We observed that the distribution of these two parameters varied greatly among these four countries (Figure 1). Indeed, while 59/164 U.S. news stories mentioned a replication statement, only 3/97 did so in U.K. newspapers. Conversely, 57/97 headlines were hyped in U.K. newspapers but only 28/164 in U.S. ones. Articles published by Australian and Canadian newspapers fall in between these extreme cases (Figure 1). The fact that U.K. news almost never mentioned a replication statement is not related to their length (412 words on average versus 425 words in U.S. news and 417 in the whole sample) or to a preponderance of tabloid newspapers in our sample. Indeed, 44/97 U.K. news stories were published by quality newspapers (The Daily Telegraph, The Guardian, The Independent, and The Times). Also, authors’ quotations were almost as frequent in U.K. newspaper articles (63/97, 65%) as in U.S. ones (72%) and in all newspaper articles (71%). This subanalysis by country of origin further documents that hyped headlines are negatively associated with the presence of a replication statement.

Percentage of news with an exaggerated headline as a function of the percentage of news mentioning a replication statement in four countries.
Discussion
Because the majority of early biomedical findings are invalidated by subsequent studies (Dumas-Mallet et al., 2016; Ioannidis, 2005b; Prinz, Schlange, & Asadullah, 2011), newspapers covering them should acknowledge the uncertainty inherent to initial findings. According to a survey of 16 science journalists, 13 actually considered that it is essential to mention that the study they cover has been replicated or needs replication (Holtzman et al., 2005; Mountcastle-Shah et al., 2003). However, the content analysis of 228 news stories reporting genetic findings revealed that this replication status was specified in only a third of them (Holtzman et al., 2005). In agreement with this study, we found only 91 replication statements (21%) in the 426 newspaper articles covering initial biomedical association studies.
Half of these statements were presented as quotations from the authors of the study and a third from scientific experts in the field. Moreover, these statements were infrequent in newspaper articles that did not quote scientific authors and in articles covering initial findings that have later been confirmed by subsequent studies. These observations suggest that scientists play a major role regarding the presence of a replication statement. They might have warned journalists or press release writers that their findings needed to be confirmed, especially when they felt that the data were uncertain, or they might have done it in response to journalists’ questions. In any case, the fact that only 21% of the newspaper articles mentioned a replication statement is consistent with studies showing that journalists consider peer-reviewed journals to be trustworthy sources of news stories (Bauer et al., 1995; Bucchi & Mazzolini, 2003; Hansen, 1994). Unfortunately, the common belief that findings published in influential journals are more trustworthy than those reported in less prestigious peer-reviewed journals has been contradicted by observational studies (Dumas-Mallet et al., 2016; Prinz et al., 2011).
We observed a strong inverse relationship between the presence of a replication statement and the fact that the headline was exaggerated. Because readers get their first or only impressions from headlines, hyped headlines are especially misleading. Previous reception studies showed that “the public who read headlines about genetics most often indicated an expectation that the article would talk about a ‘cure’” (Caulfield & Condit, 2012, p. 214). Newspaper editors, who write the headlines in most cases (Nelkin, 1995), might be tempted to hype them to attract readers’ interest by satisfying this expectation. Our observations suggest that the presence of a replication statement might discourage newspaper editors from hyping their headlines.
We used the presence of a replication statement as an indicator that journalists actually informed the public about the uncertainty of the finding they cover. However, in this respect the press coverage of each of the 40 studies was rarely homogeneous. Indeed, the systematic mention of a replication statement in all press articles covering a single study was observed only for two of them. Regarding the media coverage of the 38 other findings, either none or only some press articles mentioned the need for replication. We cannot tell why many press articles, although they quoted the authors, did not mention a replication statement whereas some covering the same study did. Several explanations can be suggested. First, scientists might have mentioned the need for replication to some journalists only. Second, the press releases covering the study might not have mentioned it. Third, the press release might have mentioned the need for replication, but only some journalists included it in their articles. Fourth, journalists might have written the replication statement, but their editors might have subsequently cut it for lack of space. These omissions are in line with Kitzinger and Reilly’s (1997) opinion: “scientific uncertainty per se is not attractive to journalists—it is new and definitive findings and controversy that draw media attention” (p. 344). However, cultural habits and media structuring also seem to be involved in this issue, as evidenced by the marked differences we observed between the United Kingdom and the United States.
Limitations
We analyzed the newspaper coverage of 40 initial findings associating a risk factor with 12 pathologies. The generalization of our observations to other pathologies and to other biomedical domains, such as clinical trials of treatment effectiveness, deserves further investigation. Moreover, we selected these 40 initial studies because they were included in robust meta-analyses. Indeed, meta-analyses represent the best estimate of a real effect, or of its absence, and we aimed to correlate the replication validity of initial studies with their media coverage. However, initial studies included in meta-analyses represent a tiny fraction of all initial association studies. Whether our observations also apply to the possible media coverage of all initial studies remains an open question.
Conclusion
Initial biomedical findings are preferentially covered by newspapers although they are often disconfirmed by subsequent studies (Dumas-Mallet et al., 2017). Therefore, their media coverage constitutes a particularly relevant material for investigating how newspapers deal with research uncertainty. Sadly, our study confirms that most newspaper articles do not inform the public about the uncertainty inherent in initial studies. This is not in the long-term interest of science, however. Indeed, contrary to common beliefs, reception studies show that the public views science as more trustworthy when the newspaper coverage of health research acknowledges its uncertainty, especially if this is attributed to scientific authors (J. D. Jensen et al., 2017; J. D. Jensen, Krakow, John, & Liu, 2013).
We observed an inverse relationship between the presence of a replication statement and that of an exaggerated headline. However, the media are not solely to blame for this hype phenomenon (Caulfield & Condit, 2012). In particular, because scientists are behind most replication statements, those who omit to mention the need for replication are indirectly responsible for hyped headlines. With this in mind, the recent disappearance of replication statements and the concomitant surge in hyped headlines further support the view that the medialization of scientific research is increasing (Peters, 2012) and emphasize the relevance of this concept regarding science-media interactions. Finally, the marked differences observed between countries regarding the acknowledgement of uncertainty inherent to initial biomedical findings in newspapers are intriguing and deserve further investigation.
Footnotes
Acknowledgements
We thank André Garenne for his advice about statistical methods.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
