Abstract
A survey of newspaper workers, librarians, archivists and researchers in Maine reveals that respondents consider preserving digitized newspapers most important, followed by born-digital news articles; however, they differ about who should be responsible for preserving this digital news content.
The 2014 “State of the News” report by the Pew Research Center’s Journalism Project describes a renewed energy in the news industry, despite persisting economic challenges. Among several positive developments, the report mentions investments in the news business by philanthropists, venture capitalists and entrepreneurs. Digital-only news publications are on the rise, and many legacy media outlets are shifting to digital-first publishing practices. Considering the rapid expansion of digital-first practices, one key issue on the minds of many is absent from the report: How is this growing catalog of “born digital” news content being preserved for posterity? 1
In this special issue of the Newspaper Research Journal concerned with publications in the digital media environment, this article answers the call for better understanding of the state of digital news preservation. Sometimes called “born-digital” news, these ephemeral publications are susceptible to being lost to history—being judged as “at-risk” by the National Digital Stewardship Alliance in 2013. 2 To begin to fill this gap in scholarship, this research explores how preserving born-digital news content is perceived among newspaper workers, librarians, archivists and researchers working in Maine. Because little is known about how these stakeholder groups regard preservation in the born-digital age, this research set out to illuminate how the preservation of this so-called “first draft of history” 3 is understood by those who produce news, preserve and provide access to its content and use digital news content for research. Specifically, how are newspaper workers, librarians, archivists and researchers approaching the preservation and accessibility of born-digital news content? What is the level of concern among these stakeholder groups that ostensibly have a vested interest in the preservation of digital news? How might knowledge gained from surveying these stakeholder groups be used to address preservation concerns that could be replicated in other communities?
An online survey was distributed to stakeholder group members who produce, archive, provide access to or retrieve born-digital news content. The descriptive survey 4 included questions regarding how digital news content is produced, stored, accessed and made accessible to scholars who use this material for research purposes. Stakeholders were also asked questions related to the challenges associated with preserving, archiving and accessing news content in the digital age.
Literature Review
It should be no surprise that institutions concerned about the future of journalism, such as the Pew Research Center and the Knight Foundation, are not making the study of news preservation a top priority. Arguably, there are more pressing issues facing the news business. Popular articles and scholarly research published in spaces ranging from Harvard’s Nieman Journalism Lab to PBS’s MediaShift reflect many concerns, including the economic collapse of advertising and subscription revenue streams; shifting job responsibilities that require doing “more with less;” the development of social and mobile media routines; and expanding audience engagement and reach. 5 Advertising revenue has dropped more than 50 percent in a little over a decade, with the largest decline in print and classified advertising. 6 At the legacy newspapers that successfully negotiated revenue declines, newsroom staffs are operating at a fraction of their previous numbers. 7 However, the news about the news is not all doom and gloom. “Digital players,” such as BuzzFeed and Mashable, are hiring respected and award-winning journalists to join their ranks. 8 The promise of social and mobile media platforms increases opportunities to reach larger audiences than ever before. 9
Changes in newspaper publishing present an equally ominous challenge for those who preserve and provide access to information. The preservation of digital news content poses numerous challenges to institutions that are working to ensure access to and the long-term safekeeping of digital records that increasingly make up the historical record. 10 Libraries and archives struggle to find ways to capture and archive digital-first publishing. 11 Training the next generation of librarians and archivists also raises a unique set of challenges. 12 This has spurred preservation partnerships focused on understanding the cycle of news production in a hybrid print-digital age 13 as well as designing best practices and guidelines for preparing such documents for long-term care. 14 However, institutions of cultural memory face an immense undertaking, and solutions for capturing the comprehensive journalistic record in an era of digital-first news have not yet been identified.
Today’s challenge to preserve news has a past that provides useful insights. The history of news preservation efforts in scholarship reveals an ongoing conversation about the challenges to preserving news content for historical inquiry and posterity. For example, the following passage from a 1981 article by T.F. Mills refers to the struggle at the time to preserve news for future generations:
Producer and consumer alike traditionally regarded yesterday’s newspaper as having little value except as a fish wrapper or kindling or, more recently, as waste to be recycled into tomorrow’s paper. Those who have recognized the enduring value of newspapers have not only had to promote the collection of the resource, but have also had to consider how to preserve them and make them readily available to researchers.
15
It is instructive to consider how today’s old preservation technologies were interpreted when they were introduced. For example, when microfilm technology emerged in the latter half of the 20th century, Eugene B. Power, in an article for American Documentation, advocated for the adoption of the then-emerging technology. Citing ongoing space issues in libraries and archives, Power offered empirical evidence on the cost effectiveness and storage saving opportunities that microfilm offered. 16 Concerns about discarding bound copies of newspapers and other periodical materials included how film reproductions of newspapers would impact the quality and preservation of these historical records.
At the dawn of the born-digital age, scholarship reflected the tensions that often accompany change. Research investigated the efficacy of new, online search engines over traditional printed indexes; new standards for best practices to search online news sources in an ephemeral, online environment; and access to news photos for historical research in this new era of digital archiving. 17 Today’s scholars concerned with journalism in the digital age are researching shifting newsroom practices from print-first to digital-first publishing environments; new opportunities and methodologies to study the past through digitized newspaper archives and audience engagement studies; and archiving conundrums when preserving cross-sited digital works. 18 In related research, a growing body of scholarship about the challenges associated with preserving digital materials for the historical record raises concerns over the future of knowledge. 19 In fact, the journal Media History dedicated a 2014 volume to research about the opportunities and challenges associated with the use of digital archives for 19th and 20th century journalism research. 20
Most scholarship concerned with news as a historical record, not surprisingly, is in the subfield of journalism history. Much of the research concludes that access to and availability of digital archives potentially widen avenues for historical inquiry. In a recent essay, journalism historian John Nerone called on his colleagues to conduct critical histories of the news industry, citing digital archives as an avenue for renewed scholarship. Nerone notes that digital technologies yield new resources for historians, for they “have transformed archives and produced new publication formats” 21 for scholars to mine. For future historians, the current era of born digital news will offer unique insights into changes in journalism practices, especially considering its transformation from
a utopian stage of privileged and supposedly rational discussion among digerati to a partisan blogosphere to a highly commercialized marketplace for scavenged information.
22
Research Questions
In order to discern attitudes and perceptions about news archiving in the digital age, newspaper workers, librarians, archivists and researchers working in Maine were surveyed. For this case study, research questions were designed to gain understanding of current perspectives and practices around the preservation of born digital news content.
What can be learned about institutional procedures regarding the production, preservation and use of born-digital news?
How do these seemingly disparate stakeholder groups understand and value the importance and responsibility of preserving born digital news?
Method
Soliciting Participants and Collecting Data
Using Qualtrics, Web-based survey software available through the University of Maine, the authors administered three separate surveys to news workers, researchers and library and archival staff between May 19 and June 28, 2014. The surveys asked a set of similar questions across all stakeholder groups. Separate surveys were designed and administered in order to tailor questions to each group’s unique roles. 23 As an incentive, participants were invited to enter a raffle to receive one of four $25 gift cards upon completion of the survey.
Participants were preselected based on their roles as newspaper workers, scholarly researchers, librarians or archivists working in the state of Maine. Methods to recruit participants included emails sent to as many individuals in the stakeholder groups as could be identified. Multiple appeals were made on social media platforms that connect with the state’s newspaper workers, researchers, librarians and archivists. Academic librarian and archival listservs specific to the state of Maine posted the survey, and a presentation at the state’s library conference was made to request participation. A Maine newspaper association promoted the survey throughout its organization and included multiple appeals in its newsletter. Additional reminders were sent by email. In an attempt to overcome low participation among newspaper workers, personal emails were sent to editors and to the state newspaper association’s board members to solicit their help in recruiting their colleagues’ participation.
In an effort to minimize confusion or uncertainty about the meaning of the phrase “born digital,” this definition was included at the beginning of the survey: “For the purposes of this research, born-digital news [emphasis original] refers to news content that originates in digital form.” 24
Findings
The three surveys yielded usable data from 34 newspaper employees, 58 researchers and 113 library and archival staff. Response rates were approximately 22 percent, 36 percent, and 58 percent, respectively, with unknown numbers of participants gained through social media and listserv outreach.
Respondents in the newspaper workers group comprised mostly editors and reporters. Other respondents self-identified as photographers, multimedia producers, publishers, columnists and UX managers or a combination of one or more of these roles. Among the researchers who took the survey, most identified themselves as historians. However, multiple responses came from scholars who identified with other disciplines, such as anthropology, psychology, communication studies, political science and economics. A handful of additional fields were represented, including Native American studies, public policy, ecology, sociology and literature. Responses by librarians outnumbered those by archivists by a large margin. 25 The “other” category includes persons who self-identified as circulation staff, support staff and managerial staff.
What can be learned about institutional procedures regarding the production, preservation and use of born digital news?
As content creators, news organizations were asked about the types of born digital news content they produce. Posts to Twitter (79 percent) and Facebook (74 percent), digital-first news articles (74 percent) and online video (68 percent) were the most prominent responses. Notable open-ended responses included photographs, slideshows and multimedia projects. Two percent responded that their newspaper does not create born-digital news content. [See Figure 1]

Born-Digital Content Created by News Organizations
Figure 2 compares responses from newspaper workers and from library and archival staff when asked about the born-digital news content their organizations currently archive. Among all of the born-digital content options presented to these respondents, many more news workers responded that they preserve the majority of the content listed. The archived content reported most often by newspaper workers was born-digital news articles (90 percent of respondents), with the second highest number of responses pointing to news blogs and information graphics (62 percent of respondents). 26 Library and archival staff returned low numbers for most of their born digital preservation efforts, and in several cases they did not indicate that any archiving was taking place (e.g., born-digital news articles or polls from online news outlets). Larger numbers of library and archival staff were unsure of the types of content their institutions archived.

Archived Digital News Content
In an effort to ascertain how scholars in the state of Maine use born digital news, researchers at institutions of higher education were asked about the types of born-digital news content they use in their work. [See Figure 3] Responses revealed that news blogs (38 percent), born-digital articles (59 percent) and information graphics (28 percent) were most common among the listed options. Incidentally, 29 percent of researchers who completed the survey reported that they do not use born-digital news in their research.

Types of Born-Digital News Content Researchers Use
Figure 4 represents responses by newspaper workers and library and archival staff about their organizations’ procedures for storing born-digital content. Newspaper personnel most often reported files on servers (45 percent), in the cloud (35 percent) and through third-party services (35 percent) as modes of storage. However, more than one-third of newspaper workers selected “unsure” or “other” for this question. The small proportion of library and archival staff who responded to this question was much less likely to know their institutions’ procedures. 27

Procedures for Preserving Born-Digital Content
The survey also asked newspaper employees and researchers about the file formats they use to preserve or store born-digital content, shown in Figure 5. Researchers favored the Portable Document Format (PDF), followed by text files (e.g. Microsoft Word) and printing and filing paper copies. Newspaper workers, conversely, tended to save content as image files (e.g. JPEG or GIF), then PDFs, text files and video files, but they did not indicate a preference for creating hard copies of born-digital files. The second highest response from newspaper workers was notably “other,” with such responses as “.html versions of articles,” “Preserve weblink” and “archived via website.” 28

Formats Used to Save Born-Digital Data
How do these seemingly disparate stakeholder groups understand and value the importance and responsibility of preserving born digital news?
Participants were asked about the importance of preserving born-digital news content, who should be responsible for these preservation efforts and what institutions might need to carry these efforts forward successfully.
Figure 6 illustrates responses regarding what each stakeholder group considered key content for preservation. The sample sizes are between 103 and 109 for library and archival staff, between 33 and 34 for news workers and between 51 and 54 for researchers. Responses reflect average scores by group on a 5-point Likert scale (1=unimportant, 3=somewhat important, 5=very important). Of the choices provided, the preservation of digitized newspapers topped every group’s list as the most important news source to preserve (newspaper workers = 4.2; library/archival staff = 4.5; researchers = 4.1). Following digitized papers, born-digital news articles ranked second across all groups (newspaper workers = 4.1; library/archival staff = 4.1; researchers = 3.9). Social media content ranked lower than most other news forms, with the following numbers from each group regarding posts to Facebook and Twitter: newspaper workers = 2.7 (both); library/archival staff = 2.3 (Tweets) and 2.4 (Facebook); researchers = 2.2 (both). The least valued content among all stakeholder groups was online advertisements (newspaper workers = 2.5; library/archival staff = 2.1; researchers = 1.9).

Importance of Preserving Types of Digital News Content
Each stakeholder group was asked, “What organization(s) should be responsible for preserving born-digital news content?” 29 [See Figure 7] The majority of participants chose news organizations over the other institutional options (libraries and archives). 30 News workers and researchers more often selected libraries rather than archives; the opposite was true for library and archival staff. Notable “other” responses included the Library of Congress, the National Archives, database publishers (e.g. Lexis/Nexis) and third-party vendors. Several comments reflected a need to establish a new organization (e.g., an NGO) that would specialize in digital news preservation. More than one respondent suggested the Wayback Machine website. 31

Who Should Preserve Born-Digital News?
Newspaper, library and archival staffs were asked what their organizations would need to implement the preservation of born-digital news at their respective institutions. In Figure 8, all sample sizes for the library and archival staff responses were between 106 and 109, and for news workers this number was between 31 and 32. On a 5-point Likert scale, news workers rated an interest in preservation efforts (4.5), education about best practices (4.5) and time (4.4) highest as key considerations for preservation, while library and archival staff indicated that financial support (4.4), time (4.4) and qualified personnel (4.2) would be priorities.

Organizational Needs for Preserving Born-Digital News
Conclusion and Future Research
The goal of this research was to determine how newspaper workers, researchers and library and archival staff in Maine understand the pressing issue of preserving news in the digital age. The first research question asked about practices and procedures regarding the production, preservation and use of born-digital news. Respondents working for newspapers in Maine reported a great deal of born-digital news content being produced, reporting that they create a wide range of social media content as well as born digital news articles and news video. Newspaper workers reported the largest numbers when asked about what kind of news content was archived. [See Figure 2] Only 10 percent of library and archival staff responded affirmatively to a question asking if they knew their organization’s procedures for preserving born-digital content, while 61 percent responded “not sure.” When asked for details on what was preserved, respondents appeared to know little about current archival practices at their institutions. This phenomenon could reflect the paucity of respondents who self-identified as archivists, who tend to be the preservation specialists in libraries and archives. Also, respondents to the library/archival survey were overwhelmingly librarians. Another potential explanation lies in the inchoate knowledge and current practices around preserving born-digital records, a point noted in the OCLC Research Survey of Special Collections and Archives. 32 This presents opportunities for raising awareness of key issues and concerns among key stakeholders as well as seeking shared solutions to future preservation efforts among news organizations, libraries and archives.
One finding that suggests the importance of born-digital data is reflected in responses from scholarly researchers. Nearly two-thirds of researchers who responded reported they use born-digital news content in their research, with the remaining responding “unsure” or “other.” The amount of born-digital news content used in research is promising. However, this survey did not ask about specific research agendas. Furthermore, based on survey responses alone, it is difficult to ascertain whether researchers made clear distinctions between digital news from material sources (e.g. scanning paper sources) and born-digital news.
To answer the second research question, participants were asked to identify what born-digital news content should be preserved and who should preserve it. Differences appeared among the groups regarding what should be preserved. On average, newspaper workers valued social media at a higher rate than did other groups. This finding may reflect shifting news practices, including efforts to reach larger audiences through social media platforms (e.g. Facebook and Twitter) and driving traffic to Web-based news content. 33 It seems evident that some news content is valued more than other content, with emphasis on actual news stories rather than their derivatives that come in the form of social media and polling data. This descriptive, qualitative interpretation of the survey indicates a view that not all born-digital news content is created equal. Moreover, as previously noted, respondents across all stakeholder groups may not have discriminated between born-digital and other forms of digital content (e.g., digitized newspapers were rated the most important content to preserve across all groups).
When asked who should be responsible for preserving digital news, all stakeholder groups most frequently selected “news organizations” among the choices provided. It seems logical that those in control of and responsible for producing digital news content would also be understood as equally responsible for preserving it. It is interesting that more newspaper workers and researchers, versus library and archival staff, perceived the responsibility of preserving born-digital news as a task for libraries rather than archives. This finding may reflect a fundamental misunderstanding of the role of archivists as specialists in preservation. Other possibilities include the conflation of heritage institutions (i.e., libraries with archives), given that many libraries include special collections departments or archival specialists on staff. This finding may also correspond to a preference for ready access to this material, often associated with libraries, versus preservation or storage in an archive, which may be perceived as more exclusive and less readily available. When presented with a set of choices to rate what an organization might need to be better prepared to preserve born-digital news content (e.g. time, financial support, training), each stakeholder group indicated that all choices were at least somewhat important, if not very important, as rated on a 5-point Likert scale.
Participants were given an opportunity to answer a final, open-ended question to share any additional thoughts and comments to help illuminate issues related to preserving, accessing or using born-digital news. Responses did not yield any significant or identifiable themes. However, some comments offered useful anecdotal evidence and unique insights for further research. For example, among the newspaper workers, more than one respondent mentioned Google as a tool for research, with another expressing concern about Google’s hegemony. These findings also suggest that some newspaper workers believe that some sort of “automatic” preservation occurs when stories are uploaded online. Results also suggest that some newspaper workers are not aware of any secondary back-up procedure for born-digital data.
Among the researchers who responded to this question, historians tended not to express concern about the form of the primary sources. One historian wrote, “As a historian, I use born-digital news content very little,” 34 and another responded, “I honestly don’t pay that much attention to whether something is born digital or whether it originally appeared in print.” 35 Such responses reflect a concern raised by this research that many might not differentiate between digital news that is born digital and news that is published online.
Common issues in the open-ended responses in all stakeholder groups ranged from a possible misunderstanding of the survey questions to somewhat impertinent answers. However, anecdotal insights were gleaned that will help develop further study about the struggles associated with the preservation of born-digital data. The inconsistency and confusion in responses could be interpreted as indicating a great deal of ambiguity about what born-digital media are, why it is important to preserve them and who should preserve them.
The limitations associated with survey-driven research are clear: The data gathered through voluntary participation limit what claims can be made. For example, it cannot be known if a representative number of newspaper workers responded to this survey. With a response rate of 22 percent, one can only speculate that the low response rate reflects attitudes among newspaper workers, who may prioritize reporting the news and have little to say about preserving it. In other words, this may simply reflect job role perceptions among Maine newspaper workers. News preservation is not in a typical news worker’s job description, so perhaps many felt unqualified or uninterested in completing the survey.
To help clarify Maine’s born-digital news preservation practices, follow-up interviews are planned. Roughly one-fourth (60) of the respondents agreed they could be contacted for post-survey interviews. This method allows for more complex and detailed answers to questions and can alleviate confusion associated with certain terms and concepts, such as “born digital.” Furthermore, unstructured interview design allows for asking broad questions in order to formulate additional questions to solicit more detailed and nuanced responses. 36
Despite limitations, this survey produced a useful snapshot of stakeholder attitudes and opinions about born-digital news in Maine, and this work will inform future research. Like every state, Maine possesses a unique media and institutional landscape, and this research only yields findings from a primarily rural, economically challenged northeastern U.S. state. Ultimately, more research is needed on many fronts to understand perceptions about born digital news content better. Future research must include a larger sample size to understand the state of preserving and archiving born-digital news. Furthermore, institutions that employ media workers, librarians, archivists and research scholars should also study this issue. The cost of ignoring problems associated with preserving born-digital news is great; neglecting this issue threatens future generations who will be left with an incomplete record of news from this historical moment in the digital age.
