Abstract
This commentary draws on a decade’s experience of teaching data journalism within a variety of contexts to describe the lessons learned regarding different pedagogical techniques and choices about the aspects of data journalism to teach. What emerges is a difference between classes aimed at a general audience, who might be sceptical and/or ignorant of the diversity of data journalism practice, and those aimed at a more specialist audience hoping to go into the increasing number of roles dedicated to data-driven techniques.
Introduction
Like a gas, data journalism teaching will expand to fill whatever space is allocated to it. Educators can choose to focus on data journalism as a set of practices (De Maeyer, Libert, Domingo, Heinderyckx & Le Cam, 2014), a form of journalistic output (Veglis & Bratsas, 2017), a collection of infrastructure or inputs (Tabary, Provost & Trottier, 2015) or a culture (Boyles & Meyer, 2016; Karlsen & Stavelin, 2014; Lewis & Usher, 2014; Parasie & Dagiral, 2013). Or, they might choose to spend all their time arguing over what we mean by ‘data journalism’ in the first place (Cushion, Lewis & Callaghan, 2017; Royal & Blasingame, 2015). We can choose to look to the past of computer-assisted reporting (CAR) and precision journalism (Meyer, 2002), emerging developments around computational and augmented journalism (Diakopoulos, 2012), and everything that has happened in between.
In this commentary, I outline the different pedagogical approaches I have adopted in teaching data journalism within different contexts over the last decade, from one-off guest classes at universities with no internal data journalism expertise, to entire courses dedicated to the field. In each case, there was more than enough data journalism to fill the space—the question was how to decide which bits to leave out, and how to engage students in the process.
Teaching Data Journalism—Fast
In 2010 at the start of a class at City University London, I asked the room—over 100 postgraduate students from a dozen courses—how many had studied a humanities subject such as English or History in their undergraduate studies. Most of the hands shot up. I then asked how many had chosen to study maths, a science or computing. This time the number of arms going up was in single figures—a significant minority in the room.
It is widely recognized that journalism students do not typically begin their education with an innate interest in numbers. This should come as no surprise: journalism after all has an image of a craft that revolves around words and storytelling; journalism students stereotypically aspire to be in front of the camera, not under the hood of a webpage. Teachers of data journalism will be familiar with the complaint from students that ‘I’m not good at maths’ or, perhaps more worryingly, ‘I’m not good with technology’.
Data journalism, then, starts at a disadvantage: its biggest names—the likes of Philip Meyer, Adrian Holovaty and Nate Silver (Bradshaw, 2017)—are not names most students would recognize. And big data stories such as the WikiLeaks revelations, Panama Papers and the MPs’ expenses scandal in the UK seem impenetrable and unreachable for the average 19-year-old, something for ‘later’.
Challenging Preconceptions
The first task in any data journalism class, then, is to challenge any preconceptions that students might have about the discipline, and lower the barriers that make it seem otherworldly, to show that it is not something for ‘someone else’ or ‘another time’.
Because data journalism itself is a broad church—much broader than its predecessor computer-assisted reporting—it is important to show a wide range of examples of data journalism that students can engage with. Yes, there are the seminal examples mentioned above, but also articles that every journalist needs to write about the latest round of school performance data, or crime statistics. There are articles from newspapers, but also radio stories and TV stories where the data work might not be so immediately obvious. And there are entertaining and engaging data features from fashion, music, film and sport, as well as hard news.
Introducing data journalism in this way—and connecting it with students’ interests—helps establish an opportunity for students to make data journalism relevant to their own interests, rather than make them feel that they are going through the motions for our sake as academics. Ultimately, we are hoping to pique their curiosity.
Start at the End
For many years, I began my introductory data journalism classes with basic spreadsheet techniques, followed by visualization sessions to show them how to bring some of the results to life.
In 2016, however, I decided to try something different: what if, instead of taking students through the process chronologically, we started at the end—and worked backwards from there?
The class worked like this: students were given a spreadsheet containing several tables that were ready to be turned into charts (I chose data on the Oscars, because it was around the time of the awards ceremony and because it was not hard news). They were then shown how to use an online visualization tool (such as Datawrapper or Infogram) to turn one of those tables into something that could illustrate a news story or broadcast about the forthcoming awards.
Rather than worrying about numbers, they could focus on storytelling: would they tell a story about gender, or ethnicity, religion or money? What type of chart would they choose? How would they use colour?
The process only took 30–60 minutes, but it served to establish something important: motivation. Once students could see the end result, and got excited about the effect of different colour combinations or animations, they would be in a much more receptive mindset to begin learning the techniques that could help them get the data in a convenient format in the first place.
In the second part of the class, then, we began to learn spreadsheet techniques, such as aggregating figures using pivot tables and calculating percentages.
The point was made: this was not maths—this was finding stories to tell.
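As a rough illustration of the two spreadsheet techniques mentioned above (a pivot-table style count and a percentage calculation), the following sketch uses Python rather than a spreadsheet. It is not taken from the class materials: the rows are invented for illustration and are not real awards data, and the function names `pivot_counts` and `as_percentages` are hypothetical.

```python
from collections import Counter

# Hypothetical rows, invented for illustration only (not real awards data).
nominations = [
    {"year": 2015, "category": "Directing", "gender": "male"},
    {"year": 2015, "category": "Directing", "gender": "male"},
    {"year": 2015, "category": "Directing", "gender": "female"},
    {"year": 2016, "category": "Directing", "gender": "male"},
    {"year": 2016, "category": "Directing", "gender": "male"},
]


def pivot_counts(rows, field):
    """Count rows for each value of `field`, like a one-column pivot table."""
    return Counter(row[field] for row in rows)


def as_percentages(counts):
    """Convert counts into percentages of the total, to one decimal place."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


counts = pivot_counts(nominations, "gender")
shares = as_percentages(counts)

for gender, share in shares.items():
    print(f"{gender}: {counts[gender]} nominations ({share}%)")
```

The same two steps (count by category, then divide each count by the total) are what a pivot table followed by a percentage formula would do in a spreadsheet.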
To support students’ learning, I also wrote the short e-book Data Journalism Heist (Bradshaw, 2015) and gave them access to the full text. As the title suggests, the intention was to provide skills which could be learned in 3 hours and would allow beginners to ‘get in, get the data, and get the story out—and make sure nobody gets hurt’.
Data Journalism in Context
While there is a growing need for journalists with specialist data skills, a significant proportion of other journalists use data journalism techniques as part of routine work (Rogers, Schwabish & Bowers, 2017). A reporter on a fashion magazine, for example, may need data skills to understand some trend reports; a newspaper journalist may need to be able to show which local schools have dropped furthest in the latest rankings, or to fact-check the claims of a local politician; a researcher on a TV programme may need them to provide a background briefing, or to analyse freedom of information (FOI) responses.
Similarly, with many journalism students aspiring to work in magazines or present broadcast bulletins, it makes most sense to teach data journalism within those contexts (the exception, specialist courses, is covered below).
For this reason I have always taught data journalism at undergraduate level within a broader context: for many years within a module on ‘online journalism’, then within a module on ‘specialist reporting’ (future plans involve it being taught in a new module on ‘investigations and campaigning journalism’).
The principle is similar to that outlined above: establish a motivation for telling a particular story, present a problem (‘I need to find out whether this issue is getting better or worse’), then give them a solution to that problem, that is, data journalism techniques.
Freedom of information is one obvious way to provide an urgency to learn data skills: in that specialist reporting module, for example, students are supported in sending an FOI request in their first week. When, later in the module, they have all received the results of that request, it provides a pedagogical opportunity to introduce those data skills—not as an abstract exercise, but as something which comes about naturally through student enquiry (‘How do I turn this response into a story I can tell?’).
A similar context can be provided by hackdays, especially when done in partnership with media organizations. Hackdays—events whereby participants collaborate on projects over a short period—are often designed to bring together people from different disciplines to create an editorial product, and they are increasingly regularly used by the news industry.
In 2015, for example, as a general election approached, I organized an election hackday to bring together BBC journalists, journalism students and members of a local Hacks/Hackers meetup group. The condensed time period helps give focus to students’ efforts, while the loose structure of hackday events removes some of the pressures towards overambition: the point of the ‘hack’ in hackday is that results are expected to be rough and ready and not perfect (participants can then decide whether to pursue them further), and participants are expected to bring a diversity of skills and abilities: some will have technical skills but others will bring editorial focus or expertise.
The professional context adds further motivation, but there is another important result of collaborating with media organizations too: although students often arrive fearing that their skills fall short of those in the industry, they leave learning that the opposite is true: journalists working in the news industry invariably learn more from the students at these events than vice versa. The students realize that even their rudimentary data skills actually have value, and what’s more: they set them apart in their interactions with professionals.
All of this involves technical challenges—but so does broadcast journalism, or writing news articles. If students are prepared to learn about white balance, audio levels and focus, or the demands of the inverted pyramid and the nut graf, then asking them to learn about pivot tables or how to calculate a percentage drop is entirely reasonable, and we should not apologize for doing so.
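For readers unfamiliar with the calculation, a percentage drop is the change between two values expressed as a share of the starting value. The sketch below is a minimal illustration in Python; the function name `percentage_change` and the figures are hypothetical and not drawn from any real dataset.

```python
def percentage_change(old, new):
    """Return the change from `old` to `new` as a percentage of `old`.

    A negative result is a drop; a positive result is a rise.
    """
    if old == 0:
        raise ValueError("percentage change is undefined when the starting value is zero")
    return (new - old) * 100 / old


# Hypothetical figures: 250 incidents last year, 200 this year.
change = percentage_change(250, 200)
print(f"{change:.1f}%")  # prints "-20.0%"
```

The detail that most often trips students up is the denominator: the change is divided by the earlier value, not the later one.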
Teaching Data Journalism—Slow
In contrast to the one-off classes and undergraduate modules involving data journalism, there is also a growing number of postgraduate courses which specifically focus on data or interactive/online journalism skills, alongside an increasing demand in the industry for journalism graduates with advanced data journalism skills. These courses typically attract a different type of student and afford more time and space to work with pedagogically.
My own experience of teaching on such courses comes from three contexts: in 2009, I launched an MA in Online Journalism at Birmingham City University with an explicit focus on data-driven techniques (the term ‘data journalism’ was yet to be popularized), and a year later I acted as an advisor to the MA in Interactive Journalism offered at City University London (I later delivered classes as a visiting professor). Finally, in 2017, I decided to launch a dedicated MA in Data Journalism at Birmingham City University, reflecting the expansion of the field in recent years.
In this section, I will talk about the factors that shaped course design, and how student output compared to the objectives of the course.
Course Design: Fitting It all In
If the challenge in teaching ‘fast’ data journalism is how to boil it down to the essentials and motivate word-oriented students, teaching ‘slow’ data journalism brings a very different challenge: how to do justice to the vast diversity of the field. This diversity is what makes data journalism qualitatively different from its antecedent computer-assisted reporting: to spreadsheets and databases, data journalism has added visualization and mapping, interactivity and coding, networks and application programming interfaces (APIs).
There are broader forces at work, too: changes in journalism as a whole mean data journalists are not exempt from the requirement to learn how to tell stories with video, audio, text and images; they must be able to work within a range of newsrooms and routines that are increasingly networked, collaborative and informed by analytics.
Other journalism courses struggle too with such a proliferation of change in the industry—but data journalism courses in particular must manage the same competing tensions while attempting to remain at the forefront of the next wave of change. At the time of writing, there is a wave of innovation driven by ‘big data’, automation, algorithms and artificial intelligence (AI), alongside a new and expanding body of literature attempting to address the ethical, legal and broader critical issues raised by data-driven practice.
Within this context, I believe it is important to design a syllabus that is flexible enough both to respond to the pace of change and accommodate the different professional backgrounds and objectives of those participating.
Learning outcomes tied to universal skills (that can be adapted to different technologies) help in this respect: two that have underpinned my teaching throughout the past decade are, first, gathering information and, second, communicating it to an identified audience within a particular professional context. These outcomes apply regardless of whether the students are being taught to gather their information with FOI laws or scraping, whether they communicate the results through a video package or a mobile app. What’s more, they emphasize that students must explicitly consider what are the right tools for their particular editorial challenge: not every data journalism story can be found using an Excel spreadsheet.
Alongside these fundamentals, there is a third, somewhat catch-all, outcome: demonstrating an understanding of strategy, distribution, law, ethics and other critical issues.
In terms of strategy, for example, data journalists increasingly operate in an environment as networked as the data they deal with: from international inter-organizational collaborations such as the Panama Papers investigations (described as ‘An evolving division of labour that prioritizes inter-organizational networked journalism relationships’ (Hermida & Young, 2016)), to crowdsourcing-driven data projects and open source investigations by organizations as diverse as the UK’s Bureau Local and ProPublica in the USA. Within these contexts, an understanding of community management and practice is a skill in short supply, and students are asked to think about how those techniques can be used to better inform their work from pre-production onwards. Thankfully there is an increasing body of work in this field, from the Reuters Institute’s edited collection on the rise of collaboration in investigative journalism (Sambrook, 2018) to Seth Lewis and Nikki Usher’s work on the Hacks/Hackers network (Lewis & Usher, 2014).
Communities of Practice and Lifelong Learning
A common anxiety experienced by those starting out in data journalism (and indeed modern journalism more generally) is the worry of having so many things that you feel you should be learning about.
It is wonderful to have access to almost infinite knowledge—and yet it is also oppressive. Learning how to manage what Kierkegaard described over a century ago as the ‘dizziness of freedom’ has taken on a new and urgent importance in education. Thankfully, within the field we have a useful parable to illustrate this: we use the term ‘unicorn’ to refer to the person who can do everything. When anyone asks, ‘Why do you call them unicorns?’ we reply: ‘Because they don’t exist’.
I talk about this paradox with students at the very start of the course, and return to it throughout: the student should never expect to know or learn everything. They will spend their lives learning new skills, and that is part of the fun. Just as the traditional journalist might one week be expected to interview a lottery winner, and the next attend a crime scene, so the data journalist can be learning spreadsheet skills one week, and text analysis the next. The real data journalism skill is to get better and better at learning new things—and to always be curious.
To this end, it is important to draw on networks of support both for guidance and the sorts of collaboration outlined above. To encourage this, within my courses I have for some years now adopted an explicit emphasis on identifying and engaging with such ‘communities of practice’ (Eldridge II & Steel, 2016).
Perhaps the best-known example of a community of practice within data journalism is the National Institute for Computer-Assisted Reporting’s NICAR-L mailing list, on which data journalists and CAR practitioners around the world share questions, answers, tips, experiences and opportunities (including jobs and internships). But beyond that mailing list, there are hundreds of other networks which journalists—and aspiring journalists—can benefit from participating in, from Slack channels (such as those of the Bureau Local and Stories With Data) and ‘civic coding’ mailing lists (coders who want to create tools for public good), to language-specific resources such as the R-Journalists mailing list, subject-based communities and real-life events found through platforms such as Meetup and Eventbrite, hackdays and conferences.
Where events do not exist, there is an opportunity to create them: we established a Hacks/Hackers meetup group in Birmingham and worked with the BBC data unit in the city to hold an annual Data Journalism UK conference in the city, with a pre-conference hackday to get students engaging with that wider industry and practice.
Students are encouraged not only to attend these events—taking part in the ‘trading zones’ (Lewis & Usher, 2014) of data journalism—but also to organize and speak at them, too: every September, a Hacks/Hackers meetup is held where incoming students can hear what the graduating class made for their final projects, while it provides a perfect excuse for students to invite their favourite speakers from the industry.
Coding and Computational Thinking
There are two common complaints that you will hear from employers in the industry looking to hire data journalists: applicants either have impressive technical skills, but few ideas around how to spot and tell stories—or the opposite problem: a sound news sense, but a lack of ability to realize those ideas technically, masked through a frustrating tendency to ‘bluff’ about their technical prowess.
The challenge for a university seeking to offer data journalism education is this: how do you develop both editorial and technical skills in data journalism graduates when there is a lack of skilled educators (Constaras, 2017; Davies & Cullen, 2016)?
Sending journalism students to computing classes is not, I believe, the answer: we do not outsource teaching of media law to the law faculty, or ask broadcast engineering staff to teach video journalism, after all. From having to operate within a content management system and coding within newsroom timescales, to the ethics of visualization or legal issues relating to scraping, data journalism has its own set of priorities and constraints. And then there is the language problem: in data journalism, we are not training programmers in a particular language, but rather equipping students with the ability to adapt to any number of languages that might be used in the industry.
This was a particularly tough problem to tackle when designing the MA in Data Journalism: should I teach one language or many? Which should I teach where?
Ultimately, I came back to the idea that data journalists should have the confidence to be able to learn a range of new skills relevant to editorial problems, rather than be given a limited range of skills frozen in the time that their education took place.
The key to this approach came with the concept of computational thinking. This provides a framework for breaking down challenges into manageable chunks (decomposition), abstracting those problems, recognizing patterns and compiling some sort of workflow—an algorithm—to solve them (Wing, 2006).
I decided to introduce this concept in the second week of the course—as part of a second and final session on spreadsheets. I had already chosen to spend only two classes on spreadsheet techniques before moving on to other data journalism techniques, and my reasoning was this: once you understand fundamental spreadsheet techniques, most data journalism work consists of breaking down an editorial problem into separate steps, and searching for the functions that will accomplish that.
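To make the idea of decomposition concrete, the sketch below breaks one editorial question (which schools have dropped furthest in the latest rankings?) into separate steps, each handled by a small function. It is an illustration only, assuming Python in place of spreadsheet functions; the school names, rankings and function names are all hypothetical.

```python
# Hypothetical rankings, invented for illustration (1 = best).
previous_rank = {"Ashfield": 3, "Brookside": 10, "Canal Street": 7, "Dale Park": 15}
latest_rank = {"Ashfield": 9, "Brookside": 8, "Canal Street": 21, "Dale Park": 15}


def match_schools(previous, latest):
    """Step 1: keep only the schools that appear in both years."""
    return sorted(set(previous) & set(latest))


def places_dropped(school, previous, latest):
    """Step 2: a positive number means the school fell down the table."""
    return latest[school] - previous[school]


def biggest_fallers(previous, latest, top_n=2):
    """Steps 3 and 4: order schools by places dropped and keep the top few."""
    drops = [
        (school, places_dropped(school, previous, latest))
        for school in match_schools(previous, latest)
    ]
    fallers = [(school, drop) for school, drop in drops if drop > 0]
    return sorted(fallers, key=lambda pair: pair[1], reverse=True)[:top_n]


result = biggest_fallers(previous_rank, latest_rank)
for school, drop in result:
    print(f"{school} fell {drop} places")
```

Each function corresponds to one step a student might otherwise search for as a spreadsheet function (a lookup, a subtraction, a sort, a filter), which is the sense in which the problem, rather than the tool, drives the learning.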
This then laid the foundations for the remaining classes, where a range of coding techniques (R, JavaScript, command line and SQL) were introduced as ways of exploring key concepts in the field, from responsive design and interactivity to APIs and dealing with large datasets. (In a second semester module, the pattern continued, as students added a further language—Python—as they explored scraping within a more investigative module that provided an important space for students to further apply and develop the technical skills developed in their first semester.)
My objective was not that students become masters of all these languages and techniques, but rather that they not feel intimidated by any of them. The analogy that I presented every week was this: I was hoping to open a series of doors for them, one door each week. In their independent study, project work and newsroom activity, they would need to choose which doors to walk through and explore further, in order to solve their own editorial challenges—in the process developing important problem-solving techniques.
And in future, when they encounter different editorial challenges, they could return to the other open doors and feel confident that they could go through those too, just as they had learned new skills before. This was data journalism as a set of practices, a collection of habits, a toolkit of problem-solving techniques, which are adapted to each new problem.
To my delight, it worked. At the end of the module, the students submitted their work. It covered a wide range of techniques and skills: one student used the JavaScript library D3 to create distinctive cartograms; others used third party tools. Some used spreadsheets to find their stories; others used R. In other words, each had chosen a door to walk through and explore further in relation to their chosen editorial problems (ranging from reactive news stories and election coverage to investigations and explainers)—and they had done well: the average mark was around 10 percentage points above the typical average for a Masters module; none scored lower than a Merit (the middle of three bands used to score work at that level in the UK). The feedback for the module was equally positive: students felt well prepared for the challenges they would face and, importantly, not overwhelmed.
The Data Newsroom
The launch of the new MA in Data Journalism alongside another new MA in Multiplatform and Mobile Journalism at Birmingham City University—with a combined live newsroom—provided an opportunity for data journalism students to reflect on the organization of data teams and how those fit into the wider newsroom in which they operate.
Data journalism teams can take a number of shapes, from the one-person operation to the dedicated investigations team (Usher, 2016; Uskali & Kuutti, 2015): not only do we need to prepare our students for those, but we also need to prepare them for the inevitable reorganizations and tensions that data journalists face as they negotiate relationships with their colleagues in the wider news organization.
Students from the MA in Data Journalism, then, are asked to form a data team alongside the Multiplatform and Mobile Journalism students. Decisions must be made about allocating time and expertise across platforms and stories: do the data journalists operate entirely separately on longer-term projects? Or do they operate as a service department, providing complementary material such as visualization and interactivity for the day’s news? This question provides an opportunity to explore literature on the organization of both multiplatform newsrooms and interactive teams—Uskali and Kuutti’s framework provides a useful overview of the options (Uskali & Kuutti, 2015), while Nikki Usher’s book Interactive Journalism provides further ethnographic detail (Usher, 2016). During the period of the newsroom, students might be coached to vary their approach, increasing or reducing their independence from the rest of the team, in order to get a varied experience of the different dynamics of different arrangements.
Conclusions
This commentary has outlined two different approaches to data journalism teaching based on the time and dedication being provided by the students involved: ‘fast’ and ‘slow’.
In delivering ‘fast’ data journalism teaching, it is important to begin with the contexts that the students situate themselves within, challenging preconceptions and focusing on the practice’s relevance to the students’ professional aspirations. A pedagogical process that incorporates data journalism as a problem-solving activity rather than a self-contained practice can also help break down barriers for students who see it as outside of their sphere.
In teaching data journalism across a range of contexts and with students who arrive with different experiences, interests and motivations, I have found it is important to begin first with those motivations.
In delivering ‘slow’ data journalism, however, it is important to acknowledge that the scale and speed of development of the field surpass any space or time that can be devoted to it within a classroom setting. Perhaps more than any other form of journalism, data journalism encapsulates a networked and dynamic mode of production and learning which requires both a lifelong learning approach, and developing strategies of problem-solving and networks of support that foster that ongoing professional development. Conceptual frameworks, such as computational thinking and communities of practice, can be useful in this regard, while course and assessment design which is flexible enough to accommodate different editorial challenges can ensure that students are given the freedom to develop different technical skills that fit relevant editorial demands, rather than the other way around.
