Abstract
Understood as the practices that seek to centre and empower individuals and communities in the collection, usage and sharing of data, the concept of participatory data stewardship (PDS) is often praised for its potential to challenge datafication and facilitate public involvement in decision-making processes mediated by and about data. However, little is known about the role of data literacies – the skills, knowledge and practices involved in accessing and using data both practically and, more critically, with a view to civic action and social change – within PDS. Based on semi-structured interviews with community researchers (CRs) taking part in a PDS project conducted in Widnes, a town in the UK, this article examines the importance of developing CRs’ data literacies in the context of their involvement in the project. Key findings suggest that, whilst CRs recognised significant gaps in their data literacies, with data often being referred to as an abstract and obscure concept, they had both strong motivations to better understand data and expectations for how this may be used to improve their community. Bridging media literacy research on critical data literacies with PDS research, this paper argues that, if we are to expect PDS to potentially empower communities in a datafied society, then members of these communities need to be supported to develop their data literacies. Implications for research, practice and policy are discussed.
Introduction
This article reflects on the role of data literacies – the skills, knowledge and practices required to access, interpret and manage data in a datafied society (Polizzi, 2025; Selwyn and Pangrazio, 2018) – as a fundamental requirement for taking part in participatory data stewardship (PDS). Understood as the practices that seek to ‘empower people to help, inform and, in some instances, govern their own data’ (Patel, 2021: 7), PDS is a concept that encompasses activities involving non-experts in data systems and decision-making. Grounded in interdisciplinary scholarship on data governance, it describes a substantive ideal whereby communities are empowered to make decisions through and about data (Patel, 2021).
As part of a PDS project designed to develop a data hub for a marginalised community in Widnes, UK, members of this community – referred to as community researchers (CRs) – were trained in research methods and notions of data and datafication, relevant to their data literacies, to design and conduct their own research, which involved interviewing other residents about wellbeing and the role of data in making decisions about their community. The broader project employed a participatory action research (PAR) approach to involve CRs in the collection and analysis of data gathered from other residents. This article draws on the semi-structured interviews conducted by the research team with CRs, before and after their training, to explore the importance of developing their data literacies within the PDS project. This paper, therefore, does not employ a PAR methodology, nor does it seek to evaluate the effectiveness of the training. Rather, it examines the role of data literacies within PDS.
Key findings suggest that CRs recognised significant gaps in their data literacies, with data often referred to by CRs as an abstract and obscure concept. At the same time, CRs had both strong motivations to better understand data and expectations for how this may be used to improve their community, both of which, we argue, are fundamental preconditions for developing data literacies with a view to civic action. Relatedly, if we are to expect PDS to potentially empower communities in a datafied society, then it is essential to provide these communities with opportunities to develop their data literacies.
After a section on datafication, this article reviews literature on data literacies and PDS, highlighting gaps and limitations. We then present the methods employed in this article and, after presenting key findings from our interviews with CRs, we discuss their relevance and implications for research, practice and policy.
Datafication
We live in an age of datafication, with social systems and institutions operating, and decisions being made, through the collection, use and sale of vast amounts of data (Mejias and Couldry, 2019). Think, for example, about the use of data and predictive analytics, based on algorithms, by healthcare professionals to prioritise in-patient care and improve efficiency (Lein, 2024). At the same time, children’s lives are recorded via different means ranging from social media posts from their parents to schools’ databases used to monitor pupils’ academic development (Livingstone et al., 2020). Recent regulatory developments – including legislation like the UK Government’s (2023) Online Safety Act – attempt to limit the profiling and repurposing of children’s data, especially for commercial uses. However, the datafication of children's lives remains pervasive, exceeding the reach of current regulatory interventions. Meanwhile, vast datasets about groups like refugees are collected by governments and other organisations to facilitate their access to support, and the same applies to other populations, including individuals in social welfare systems (Wille and Jacobsen, 2023).
Datafication is increasingly made more complex by artificial intelligence (AI), including generative AI (GenAI) based on large language models (LLMs), their multimodal equivalents and other neural-network-based AI systems trained on vast datasets (Chen et al., 2024). Unlike earlier probabilistic models, these systems introduce new forms of opacity and algorithmic bias – the lack of fairness in outputs generated by algorithms trained on data that ‘reflect the negative biases that exist in society and the people who create them’ (Noble, 2018: 1). In fields like healthcare, education and social services, these technologies raise considerable concerns, since they can amplify systemic inequalities and influence decisions affecting people’s access to support and opportunities (Alowais et al., 2023; Gökçearslan et al., 2024). Furthermore, these issues are not confined to public institutions’ use of data but are exacerbated by how tech companies collect and monetise data at scale, making datafication more pervasive.
Digital devices, the internet, online platforms and AI tools are designed in ways that allow tech giants like Google and Facebook to collect and sell data, including user-generated content, for advertising purposes with a view to generating profits (Verdegem, 2022; Zuboff, 2019). As a result, digital technologies have contributed to a digital environment in which citizens’ engagement with information and social interactions are commodified and exploited, contributing to surveillance capitalism – the widespread collection and commodification of personal data by corporations (e.g., Zuboff, 2019). This form of economic surveillance is intertwined with government surveillance, since it generates data sought by government agencies (e.g., for law enforcement) and consulting firms assisting political campaigns (Lyon, 2022). Meanwhile, the use of data and AI is discursively championed by institutions and corporations for facilitating efficiency and socio-economic development (Carrière-Swallow and Haksar, 2019). These discourses, however, are oblivious to the risks presented by datafication.
Against the backdrop of surveillance capitalism, the aforementioned risk of algorithmic bias manifests across multiple domains of the digital environment, including social media platforms’ personalised recommendations, predictive analytics in public services and the outputs of GenAI tools (Verdegem, 2022). Problematically, this bias can be especially harmful when informing decisions that discriminate against the most vulnerable groups. A 2019 study found, for example, that a commercial risk prediction tool in US hospitals systematically de-prioritised Black patients, reducing the proportion of those receiving additional emergency care from 46.5% to 17.6% (Obermeyer et al., 2019). Meanwhile, scandals like Cambridge Analytica, and allegations about platforms like TikTok potentially sharing user data with foreign governments, exemplify concerns about data misuse leading to violations of privacy and voter manipulation (Confessore, 2018; Tidy, 2024).
This is why increasing attention is being paid to the importance of (a) regulating data and the digital environment – as reflected in the EU’s (2016) General Data Protection Regulation and the UK Online Safety Act (UK Government, 2023) – and, as unpacked below, (b) developing the digital and data literacies that individuals and communities need to navigate such an environment (UK Government, 2023).
Data literacies
Broadly defined as the ability to access, evaluate and create media, media literacy is often understood as an umbrella term that incorporates, and overlaps with, multiple types of literacy including information, news, digital and data literacies (Livingstone et al., 2008). While media literacy refers to both traditional media and digital technologies, notions of digital literacy relate to digital technologies, incorporating functional and critical dimensions (Polizzi, 2025). Functional digital literacies rely on the skills and knowledge required to use digital technologies practically. Critical digital literacies relate to the ability to evaluate online content and the digital environment, including the business models of internet corporations and their implications for people’s privacy (Fry, 2014; Polizzi, 2025).
Notions of data literacy (or critical data literacy) can be understood as types of digital literacies that incorporate context-dependent functional data skills and knowledge about the affordances of digital technologies alongside a critical understanding of datafication and how data is used (Polizzi, 2025; Pangrazio and Selwyn, 2019). There are different research traditions, with some of these, as in the case of work informed by computer science, framing data literacy as accessing and managing data in primarily functional terms, and others, as with research informed by psychology, framing it in cognitive terms (Yang and Li, 2020). By contrast, within education scholarship and the broader media literacy field, a strand of work referred to as the New Literacy Studies has approached media, digital and data literacies as inherently contextual, rather than universal skills (Kress, 2003; Mills, 2010). These literacies are understood as complex social practices, with work in this space stressing their plurality and diversity among different groups and settings (Gee, 2023). As such, the possession and deployment of literacy skills and knowledge is not just bound to the social context but intertwined with questions of power and ideology, including who has access to specific literacies and which of these are more dominant than others (Gee, 2023; Mills, 2010).
These questions are even more prominent within media literacy work informed by critical theory and critical pedagogy, which frames critical literacy, understood as the ability to question power and authority, as inherently linked to civic action challenging dominant ideologies (Kellner and Share, 2019). As found by research lying at the intersection of the New Literacy Studies and critical pedagogy, this can take the form of young activists deploying media literacy skills to create and share videos and other multimodal content to challenge forms of discrimination like racism or Islamophobia (Jenkins et al., 2016). Equally, it can take the form of acts of resistance against the data collection practices of governments and corporations, while minimising one’s own exposure to forms of surveillance (Polizzi, 2025; Selwyn and Pangrazio, 2018; Shresthova, 2016). From this perspective, critical data literacies incorporate functional and critical skills and knowledge that enable citizens to undertake civic action underpinned by tactics to protect their privacy and resist the power of corporations and institutions (Polizzi, 2025; Selwyn and Pangrazio, 2018).
Forms of civic action can range from activism – including data activism, committed to rethinking the use of data to promote social justice (Kennedy, 2018) – to privacy advocacy and engagement in debates about, or forms of, participatory data governance (Pangrazio and Selwyn, 2019). Meanwhile, examples of data tactics may include managing privacy settings, deliberately obfuscating personal information, using messaging platforms that offer stronger encryption, and sharing knowledge about data with others (Polizzi, 2025; Selwyn and Pangrazio, 2018). However, there are inherent limits to how much knowledge citizens can develop about datafication, given its opacity, since elements of institutions’ and corporations’ data collection processes remain undisclosed to the public (Reischauer et al., 2024). What is more, the functional and critical skills and knowledge crucial to citizens’ data literacy practices can only be mobilised if they are underpinned by civic imagination and intentionality, including the motivations and expectations necessary to take action. That is, while functional skills can enable people to use data alongside a critical awareness of the power relations that shape datafication, it is their aspirations for social change that ultimately orient their functional and critical data literacies towards collective or democratic ends (Mihailidis, 2018; Polizzi, 2023).
Within media literacy research informed by the New Literacy Studies and critical pedagogy, these aspirations often take the form of advocacy for social justice against dominant ideologies and power structures (e.g., Jenkins et al., 2016; Selwyn and Pangrazio, 2018). From this perspective, protection of privacy – particularly in the context of institutional and corporate data collection – is crucial to critical data literacies and their potential for civic action. However, we live in societies with limited viable alternatives for citizens to opt out from sharing their data when using public services or commercial products (Van Dijck, 2020). Furthermore, large portions of the population lack the foundational skills and knowledge to undertake data literacy practices that could enable them to exercise some agency over their data within the confines of what is possible to know and manage. For example, research shows that many adults in the UK do not know how to manage their privacy settings or understand how their data is collected and used by social media platforms (Carmi and Yates, 2023; Ofcom, 2025; Yates et al., 2021).
Promisingly, there is a growing awareness, both in research and practice, that more opportunities are needed for citizens to engage in participatory processes of data governance if we are to create more just and fair societies – an issue discussed below in relation to PDS and data literacies.
PDS and data literacies
PDS refers to a set of governance practices in which communities participate in decisions about how data about them is collected, accessed and used. Rather than treating data subjects as passive, PDS emphasises the importance of collaborative governance – including co-design, deliberation and shared decision-making – to distribute authority between data holders and affected communities (Taylor and Purtova, 2019; OECD, 2021). Drawing on literatures and histories of participatory democracy, data governance and Public Engagement with Science and Technology (PEST), the concept reflects longer-standing commitments to social empowerment, accountability and democratic inclusion in sociotechnical decision-making (Arnstein, 1969; Bell and Reed, 2022; Fung, 2006; Patel, 2021).
PEST emerged from mid-twentieth-century efforts to increase science literacy, in line with the public deficit model (Bauer, 2009). This model places primary responsibility on what individuals do (or do not) know, with limited scientific knowledge among the public imagined to lead to public rejection and failure of scientific advancements. Since then, the PEST field has shifted to a ‘science in society’ model, where science literacy plays a primary role in enabling participation in decision-making rather than in ensuring success (Bauer, 2009; Stirling, 2008). PDS applies similar participatory principles to the governance of data infrastructures, drawing on notions of participatory democracy to challenge existing methods of decision-making (Barlett et al., 2024; Kelly et al., 2023; Patel, 2021).
Models such as mini-publics and citizens’ juries, in which citizens are convened to deliberate on policy issues, constitute useful precedents for PDS, illustrating how structured processes can enable inclusive debate and collective decision-making (Curato et al., 2021; Escobar and Elstub, 2017). Even though these participatory methods are often critiqued for their idealism or limited impact (Parvin, 2021), PDS has shown promise at the community level. For example, community-developed data charters – like Camden Council’s (2023) co-created Data Charter – invite residents to come together to agree on norms for responsible data access and sharing. Similarly, initiatives such as data trusts, commons and cooperatives like MIDATA – a Swiss citizen-led platform for managing health data – adopt participatory decision-making structures to facilitate deliberation on how data is collected and used to generate value (Delacroix and Lawrence, 2019; Pentland et al., 2021).
These cases illustrate how PDS translates theoretical commitments from democratic governance into concrete institutional forms. PDS involves citizens in deliberation on sociotechnical power and datafication, paying greater attention than other methods of participatory democracy to the interaction between individuals and the data systems shaping their lives (Kelly et al., 2023). It is therefore fair to assume that data literacies must be integral to the success of PDS. However, this empirical question has remained under-explored. Arguably, we live in an age in which data literacies are essential for participation in decision-making processes mediated by data. Perhaps unsurprisingly, there is some literature on the importance of developing the functional data literacy skills required within the public sector to access, locate and manage data (e.g., Fattah, 2024; Ongena et al., 2022). Less is known, however, about the role of functional and critical data literacies within PDS and other participatory forms of democratic governance.
What is expected of citizens’ participation in the political process can vary, with Arnstein’s (1969) foundational ‘ladder of citizen participation’ often serving as the basis for modern descriptions of public participation. Arnstein (1969) argues that public participation falls along a ladder ranging from informing the public about a topic to fully empowering them to make decisions on it. This model has been critiqued for oversimplifying the dynamics of participation and implying a linear progression that does not reflect the complexity of real-world initiatives (Cooke and Kothari, 2001; Cornwall, 2008). Furthermore, impact on decision-making and empowerment can vary among participants, which makes it difficult to place a given initiative within a single stage of participation. Drawing on Arnstein’s work, Patel (2021) positions the public in data systems within a continuum of involvement ranging from informing and consulting to involving, collaborating and empowering communities in making decisions through and about data. While an example of empowerment includes the proliferation of data cooperatives, limiting the role of citizens to that of being informed about decisions informed by data provides the least opportunity for public empowerment (Aapti Institute, 2024).
Patel (2021) does not discuss data literacies as part of her model. However, she champions transparency and explainability, which refers to how the development and results of algorithms can be made understandable to non-experts. Promisingly, in Patel’s (2021) vision the task of informing the public is vital if the purpose aligns with the objective of fostering transparent governance of data. It follows therefore that, whether the function of PDS is to boost transparency or empowerment, data literacies are bound to play a key role in the design, implementation and evaluation of PDS initiatives, especially if these are to challenge datafication and promote data justice. However, given the dearth of research in this space, empirical evidence is first needed of the importance of data literacies within PDS.
Research questions
Considering the potential of PDS in challenging datafication and its under-explored intersections with data literacies, this study draws on a PDS project called Round ‘Ere (described below) to address the following question, based on the experiences of CRs involved in the project: RQ: How important is the development of data literacies in the context of PDS?
Methods
Research design
This study was part of a larger PDS project called Round ‘Ere and funded by Liverpool City Region Combined Authority. Conducted in partnership with Liverpool City Region Civic Data Cooperative (CDC) and Capacity, the project adopted a PAR methodology, which is based on the direct involvement of participants in conducting one or more aspects of a research project (Cornish et al., 2023). The study reported here presents empirical evidence, collected as part of the same project, that is not based on the use of PAR. In spring 2023, 16 residents from Widnes – a UK industrial town near Liverpool often referred to as marginalised or left behind (Smith, 2023) – were recruited and trained, both in research methods and concepts relevant to their data literacies, to co-design and run a piece of research exploring perceptions of wellbeing and the role of data in making decisions about wellbeing within their community. The larger project employed a PAR approach to ask these residents – referred to as CRs – to collect their own primary data from fellow residents. To examine the importance of data literacies for PDS, the study presented here draws on the semi-structured interviews conducted by the research team with the CRs involved in the training and PDS project. By contrast, key findings from CRs’ data collected from fellow residents can be found in the project's final report, focusing on the intersection of wellbeing and data (Capacity and LCR Civic Data Cooperative, 2024).
After completing the training delivered by the research team, CRs voted on a research method, choosing semi-structured interviews, and designed an interview guide to collect data from their community. Over the summer of 2023, they interviewed over 200 of their fellow residents, unobserved but with regular support from the research team. Three workshops were then run to analyse their findings with them and co-design a wellbeing data hub with local policymakers and voluntary sector representatives. Meanwhile, the research team conducted repeated semi-structured interviews with CRs before and after the training, and after their collection of data from other residents, to track their experiences of taking part in the project.
To clarify, the Round ‘Ere project did not consist of the evaluation of a PDS intervention, nor was the training provided to CRs formally evaluated. Rather, the project was conducted to gather evidence to inform the development of an intervention – a data hub designed to allow residents in Widnes to access data about wellbeing support and their own community. Based on the interviews that we conducted with CRs, this article is not about the effectiveness of adopting PAR or of designing a data hub as a potential tool for the realisation of PDS. Rather, it deals with the question of how important it was to develop CRs’ data literacies in the context of their involvement in the project.
Recruitment and training
In spring 2023, the research team, consisting of academics from the University of Liverpool, recruited and trained 16 CRs in listening skills, research methods and notions of data and datafication relevant to their data literacies, with a total of 14 CRs finishing the project. CRs were purposely recruited to primarily maximise diversity in gender, age and socio-economic status. The final sample included 14 participants from Widnes, including 10 women and four men, with two participants under 25 years of age, seven between the ages of 25 and 50, and five over the age of 50. To preserve CRs’ full anonymity, no further demographic details are reported here. We intended to recruit a minimum of 10 CRs and purposely over-recruited to account for potential attrition. Recruitment included hosting stalls in local shopping centres and libraries, distributing physical posters in community locations, sponsoring a local rugby game, handing out flyers and publishing online posts in local Facebook groups. Written informed consent was sought from participants, and the project was approved by the University of Liverpool Research Ethics Committee – reference numbers: 11798 and 12124.
CRs were trained over three day-long sessions and one evening session, with the training drawing on the Clubmoor Toolkit – a community-led research toolkit produced as part of another participatory project (Heseltine Institute for Public Policy, Practice and Play, 2021). The training included introductions to quantitative and qualitative research methods alongside practical information on how to collect informed consent, conduct an interview and design a recruitment plan. In addition to the research training, CRs undertook exercises to reflect on personal bias, what wellbeing meant to them and, more importantly for the purposes of this article, notions of data and datafication. This component of the training was designed to provide a space for CRs to (a) reflect on the state of their own data literacies, including gaps in their skills and knowledge, and (b) raise their awareness of datafication – thus primarily developing their critical data literacies rather than their functional data skills or knowledge. To do so, we drew on exercises publicly available through the Our Data Bodies (2020) project, designed to help participants explore the types of personal and demographic data that is systematically collected by institutions and corporations.
As part of these exercises, we used cards, worksheets and post-its to encourage CRs to reflect on how this data is generated, gathered and potentially used or shared, with a focus on its ethical and societal implications. In addition, they were asked to complete another activity, designed by the research team, to think about how their day-to-day activities – from when they wake up to when they go to bed – are, or may be, translated into digital information. Through this activity, they were asked to reflect on how this information could be used to improve their individual wellbeing – their physical, mental and emotional health – and the collective wellbeing of their community (Akhter et al., 2023). This focus on wellbeing provided a complementary connection to their development of data literacies through the training, with CRs considering the broader implications of datafication alongside the transformative potential of data to support healthier and more empowered communities.
After the training, CRs voted on the research method they thought would be most appropriate to explore wellbeing and data with their fellow residents, choosing semi-structured interviews. After designing the interview guide together with the research team, over the summer of 2023 CRs interviewed over 200 of their fellow residents. We then ran three workshops to analyse, together with CRs, the findings from their collected data – reported in Capacity and LCR Civic Data Cooperative (2024) – and co-design a wellbeing data hub with local policymakers and voluntary sector representatives. The first workshop focused on initial findings from the interviews conducted by CRs. The second and third workshops included all stakeholders and CRs to discuss the findings and the potential usage of a data hub (Table 1).
Table 1. Round ‘Ere process timeline.
Data collection and analysis
Sixteen CRs took part in a first interview with two members of the research team and authors of this article (GP and ER), 13 in a second interview, and 13 in the final interview. Two CRs did not complete the project due to either time conflicts or a lack of interest. In total, 14 CRs took part in the project from start to finish. One CR missed their second interview, and a different CR’s interview was not successfully recorded. In total, 42 interviews were analysed. GP and ER split the interviews between them, with frequent discussion and refinement of the interview guides throughout the three phases of interviews. Interviews were held in person or online, according to interviewees’ preferences. The first interview guide included questions around understanding and perceptions of wellbeing, data and the community of Widnes, as well as expectations for the project. The second interview guide also included questions about data and wellbeing perceptions alongside perceptions of the training and planning for the research project. The third interview guide included questions about findings from the CR project and perceptions of the overall research process.
To answer the research question above, this article draws on the first two rounds of our interviews with CRs (before and after the training), since no questions about CRs’ perceptions of data or data literacies were asked during the third interviews. Once collected, transcribed and anonymised (with acronyms used below such as ‘P1 I1’, referring to ‘participant 1, interview 1’), the data was thematically analysed by the research team through a collaborative and inductive approach to coding and re-coding themes exploratively (Braun and Clarke, 2006). GP and ER analysed all the interviews in tandem, through multiple stages of coding, synthesis and re-coding, until agreement was reached. Through this process, initial descriptive codes were iteratively refined, compared and grouped into broader themes capturing patterned meanings across the dataset. All analyses were conducted using NVivo.
Findings
This study explores the importance of developing CRs’ data literacies in the context of taking part in the PDS project Round ‘Ere. Three main themes emerged from the thematic analysis of our interviews with CRs. As presented below, these themes include (a) CRs’ perceptions of data as a vague and obscure concept, (b) gaps in their data literacies, and (c) their expectations of how data could be used.
Data as a vague and obscure concept
CRs’ perceptions of data were mainly discussed during their first interviews and, to some extent, after the training – that is, as part of their second interviews. During their first interviews, most participants struggled to define the word data, remarking on the abstract nature of the concept. More specifically, many found it challenging to provide even a basic definition. As explained by P2 I1: ‘I can’t even think of a word when you say data’. Meanwhile, others limited themselves to describing it in primarily statistical or computational terms. For example, the first word that came to P4 I1’s mind was ‘computers’, while P15 I1 described it as ‘information that is stored and used to produce statistics’.
Like P15 I1, some participants talked about data in relation to the ways in which it can be used to track patterns and trends among a given population. When asked what data makes them think about, P10 I1 said: ‘Figures, information, and information gathering, and names, addresses, what you like doing, what you don’t like. Just data about people, information collected about people that is somehow used or analysed. Research data.’
Similarly, other examples of data provided by CRs included ‘age of people [… and] ethnicity’ (P3 I1). At the same time, some of those who talked about it in computational terms emphasised, from a security perspective, the potential of losing data or of this being stolen, describing it as information that is intangible and, therefore, hard to store securely. As explained by P5 I1: Data, for me, is writing things down and storing it in a file… I don’t think too much about it. But I do know [that data …] can get lost [or end up …] in the wrong hands… [Once, I had] two computers and [… was] migrating [data] from one to the other… Then I try to update it and I just completely lost everything on that computer.
This is not to say that all participants thought of data only in relation to computers or statistics. Despite finding it challenging to provide clear and concrete definitions of data, some CRs acknowledged, more holistically, that it serves the social purpose of providing evidence to make decisions, including about specific groups and communities. As remarked by P12 I1, data relates to the process of ‘gathering information in order to make informed decisions’. As such, as described by P11 I1, it includes ‘information that pertains to [people] and can ha[ve] an impact on them’.
The social and more holistic purposes of collecting data raise questions about power, including who has access to data and how this is used or may be misused. The next section deals with these questions by highlighting some of the data literacy gaps that were found among CRs and how the training helped them develop a critical understanding of datafication. Interestingly, participants remarked during their second interviews that their perceptions of data as a vague concept had not significantly changed after the training. As P1 I2 put it: ‘I still can’t really get past the idea of just thinking of it as information’, thus emphasising the inherently abstract nature of the concept. The next section, however, shows that a critical understanding of datafication is also limited by the largely opaque nature of data collection practices.
Gaps in CRs’ data literacies
During the interviews, most CRs acknowledged gaps in their own functional and critical data literacy skills and knowledge and those of other residents in their community. Asked whether they had the skills to access and use data, P1 I1 responded: ‘by large, perhaps no’. Similarly, as remarked by P15 I1, people may ‘understand [the word] data, but whether they understand the legalities of GDPR will be a different matter’. Indeed, besides perceiving the concept of data as vague and obscure, most participants recognised considerable limits in their own and other residents’ knowledge of how data is collected about them, and how difficult they find it to access and manage their own data, when this is made available to them. As acknowledged by P16 I1, this is not to place the onus on citizens. Rather, it raises questions of education and the extent to which people receive the support they need to develop their data literacies. As explained by P16 I1: You don’t realize where [your data] goes or what happens with it because I think a lot of people are probably maybe not naive, but uneducated about what happens with that information, where it goes and what it is used for.
P16 I1 went on to describe the struggle they had experienced when requesting, and then having to access by themselves, their own medical records, which could only be accessed via an online platform: It’s called PATCHS, their medical platform, and you go on to create an account, and then, you know, security and all those kinds of things… I struggled a little bit and I thought “if I struggle even a little bit, how do some people act [… including] elderly people that need to access these things?” I don’t think that everybody is tech savvy.
This is not to say that there may not be exceptions. P13 I1, for example, was one of the few CRs with more knowledge about datafication, critically reflecting on its social implications during their first interview – that is, prior to the training: In this sort of cultural moment, when we think about data perhaps the first thing we think about is big data. We think about corporations collecting our data. And I think arguably this idea of distrust is very central to the idea of data in the modern age in many different ways. You know, just look at … like Cambridge Analytica, and … the fact that we have such big online presences… So, I think there's a lot of anxiety… For a lot of people, data can be a scary thing to think about – like data collected about me, where is it going? What can they do with it?
P13 I1’s remarks suggest that the power asymmetries in how corporations collect data go hand in hand with concerns around trust and security. The quotation above indicates that data security, in particular, does not just relate, as reported in the previous section, to how data may be stored or lost from a user’s, more functional perspective. More critically, it also relates to the potential misuse of data by those in positions of power. This is why the often opaque nature of data collection practices, including those of institutions and corporations, casts doubt on the extent to which it is possible to be fully knowledgeable about datafication. Such opacity, in turn, can generate anxiety and uncertainty among individuals, regardless of whether they may be relatively more knowledgeable, like P13 I1, or less aware of what happens to their data.
What is more, while most participants discussed the need to develop people’s functional and critical data literacies, P13 I1 reflected on the extent to which these literacies may not always be useful to most people. As they put it: Equally, you could argue someone’s blood count, or a blood test result, is not going to be useful to a layperson. That’s only going to be useful to medical professionals. So arguably, your neighbour who’s not a doctor, why would they need to have that data if they can’t interpret it?
It follows that not all data may be helpful to most people, and that access to data depends in part on the distinction between expertise and lay or non-expertise. This distinction makes it even more likely for the ordinary citizen to be left in a dependent relationship with practitioners, such as those in medical professions, who are in a position and have the skills to gather, or grant them access to, their data. As such, while there was a recognition among participants that more needs to be done to develop their own data literacy skills, it was also acknowledged that qualities or soft skills such as empathy and kindness should be at the forefront of what is expected of these professionals, particularly in the context of requesting and managing others’ data. As argued by P2 I1: They need to kind of understand that they’re a person, it's not just like, data… So, empathy would be important… how would they feel if their data was being shared like this? And then maybe they can handle it more appropriately.
CRs’ thoughts around their own and others’ limited knowledge and ability to access and use data were explored primarily during their first interviews – i.e., prior to their training. Promisingly, discussing data as part of the training enabled CRs to improve their critical understanding of the pervasiveness of datafication. When asked whether their knowledge of data had changed since their first interview, P10 I2 responded: ‘what I found helpful was realising how much data we use in a day, like me driving the car, my miles are being clogged, and what music I’m listening to. So, yeah, it’s changed’. Similarly, it was thanks to the training that other CRs also became more conscious and reflective about the social implications of datafication and the extent to which data collection processes may be opaque. As emphasised by P6 I2, ‘sometimes [… data] is not accurate and you’re not aware of what’s going on behind the scenes’. Asked whether anything had changed in terms of their understanding of data, P8 I2 said: ‘no …, it’s gathering information… [What] has changed massively [is] realising how much data we use every day’. This is why, as remarked by P4 I2, ‘you just got to be careful that you protect the data that you’ve got’.
Expectations of how data could be used
The fact that many CRs struggled to define the concept of data, or that many recognised gaps in their own and other residents’ data literacy skills and knowledge, does not mean that they had no or low expectations of how data could be used to improve the collective wellbeing of their community. On the contrary, throughout the project most CRs showed strong motivations for transforming the processes through which local decisions are made about their community through data, hoping that people in Widnes could potentially access and use that data to inform those decisions. As remarked by P13 I1, ‘people should … be empowered to have ownership over data regarding themselves’. Similarly, reflecting on the usefulness of the training, P10 I2 said: ‘I think … what came out of [… it] for me … was really around thinking about all of this data and then how does it actually improve the community [… and] what's anyone going to do with it’.
Relatedly, many CRs expressed hope about the Round ‘Ere project and the positive impact that it could have on their community through the development of a data hub designed to enable residents to access data about Widnes. P15 I1 said: ‘hopefully, it will influence the city region [and] the town local council … [and] will improve people’s wellbeing’, including, among different aspects, the physical environment and infrastructure as well as access to social benefits and medical support. This is why, as emphasised by P13 I1, it is important that people are supported to develop ‘a clear[er…] understanding of what the[ir] data [may be used for]’. Relatedly, it is essential that people learn not just about the constraints of datafication but also about the opportunities involved in gathering and using data. As explained by P10 I1: People don’t realize that … data … can help us improve services, help us improve the community… It can be harmful, but it can [also] be useful… A lot of people nowadays are very … closed on what they give out, and rightly so. But if you’re closed off, and you’re not going to give any information out, how are we going to improve?
It follows that processes of data collection and sharing have the potential to serve purposes of social good and the collective wellbeing of communities, provided that people’s data is gathered and used in line with ethical standards and principles of confidentiality. Irrespective of how much they know about data, as emphasised by P12 I1, ‘people can be hesitant to give out information, but as long as they’re reassured that it’s protected and anonymous, then [… they may be] more willing to [do so]’. Equally, as P12 I1 went on to explain, if we are to expect people to gain an understanding of both the risks and benefits of how their data is collected and shared, then more support and learning opportunities are needed at the community level to develop their data literacies: I think it’d be good if [… more support] could be on offer… It could be … a collaboration between, like, GP surgeries or health centres or libraries or, you know, colleges [and] universities … [where] there is access to … learning if people need a bit of support.
Discussion and conclusions
This study draws on a larger PDS project called Round ‘Ere designed to develop a data hub for residents of a marginalised town called Widnes in the UK. As part of this project, which adopted a PAR methodology, CRs were recruited and trained in research methods and notions of data relevant to their data literacies before collecting interview data from their fellow residents about wellbeing and the role of data in making decisions about their community. This article does not draw on the PAR approach employed by the project, nor does it present findings from the research conducted by CRs, which are published in the project’s final report focusing on the intersection of wellbeing and data (Capacity and LCR Civic Data Cooperative, 2024). By contrast, based on semi-structured interviews conducted by the research team with CRs before and after the training, this article explores the importance of developing data literacies in the context of PDS.
As reviewed earlier in this article, PDS is a useful concept that has the potential to challenge datafication and facilitate public involvement in decision-making processes mediated by and about data (Patel, 2021). However, there is a dearth of research on the role of data literacies within PDS. Drawing on the Round ‘Ere project to start to fill this gap, this study found that, whilst CRs recognised significant gaps in their own functional and critical data literacies, with data often being referred to as an abstract and obscure concept, they had both strong motivations to better understand data and expectations for how this may be used to improve their community. Not only did most CRs struggle to define the word data during the project but, especially before the training, they also described the concept in primarily statistical and computational terms, with some acknowledging, more holistically, that it serves the social purpose of providing evidence to inform decisions. Promisingly, even though no formal evaluation of the training was conducted, insights from the interviews with CRs suggest that the training enabled CRs to think more critically about the power asymmetries inherent in the collection and use of data by corporations and institutions.
As explained above, data literacies incorporate functional and critical skills, knowledge and practices essential to access, interpret and manage data. Not only do these literacies relate to privacy, safety and security on personal, institutional and commercial levels, but also to civic action and empowerment, as examined within media literacy research inspired by critical pedagogy and the New Literacy Studies (e.g., Jenkins et al., 2016; Selwyn and Pangrazio, 2018). From this perspective, media, digital and data literacies are understood as crucial to challenging datafication and both institutional and corporate power (Selwyn and Pangrazio, 2018; Shresthova, 2016). Promisingly, the Round ‘Ere project was an opportunity for CRs to learn and think more deeply about the critical and social aspects that relate to datafication, with a focus not just on the risks that the collection and use of data present but also on their potential benefits. Thanks to the training and their involvement in the project, CRs became more aware of the pervasiveness of datafication and opacity of institutions’ and corporations’ data collection practices, realising how much data they generate daily and remarking on its potential misuse. At the same time, they had aspirations for how data could be better used to make decisions about and improve services in Widnes, including the physical environment and access to medical support, thus contributing to the wellbeing of their community.
On the one hand, the literature on PDS has praised its potential to facilitate participatory forms of data governance but has under-explored the role of data literacies within PDS (e.g., Barlett et al., 2024; Kelly et al., 2023; Patel, 2021). On the other hand, despite the limits to how much citizens can learn about datafication due to the often limited transparency of institutional and corporate data practices, media literacy research informed by the New Literacy Studies and critical pedagogy shows that (critical) digital and data literacies rely on functional and critical skills and knowledge, including an understanding of the broader digital environment and the ability to manage one’s own data and privacy to resist these practices (Jenkins et al., 2016; Selwyn and Pangrazio, 2018). Furthermore, we know from this body of work that the motivations and expectations necessary to take action are a precondition for deploying critical data literacies and tactics, including the aspirations for social change that orient people’s functional and critical engagement with data towards collective and democratic ends (Mihailidis, 2018; Pangrazio and Selwyn, 2019; Polizzi, 2023; Shresthova, 2016).
This article’s findings show that, while CRs recognised gaps in their own and other residents’ data literacy skills and knowledge, they showed motivations to better understand data. Relatedly, they expressed expectations for how data could be used to improve their community – a central element of the Round ‘Ere project and PDS more broadly. At the same time, they emphasised the inherent obscurity of the concept of data, which is exacerbated by the opacity of data collection practices. As such, this study bridges PDS work with media literacy research on critical data literacies, thus addressing a gap in the literature. CRs discussed the need for more support and learning opportunities to be provided in Widnes – for example, through partnerships between GP surgeries and libraries – to equip them and their fellow residents with key functional and critical data literacy skills and knowledge, including a balanced understanding of the benefits and downsides of data.
CRs did not, nor were they expected to, join the project as data activists, and their data literacy gaps are perhaps not surprising, given the extent to which large portions of the UK population lack fundamental data literacy skills and knowledge, including how to manage their privacy settings or understand how their data is collected and used by social media platforms (Carmi and Yates, 2023; Ofcom, 2025; Yates et al., 2021). Promisingly, CRs’ motivations and expectations to improve their community provided them with a strong foundation for appreciating the value of the training and their involvement in the project as opportunities to become more critical and vocal about datafication. Kennedy (2018) highlights the importance of attending to the everyday lived realities of non-experts in understanding the impacts of datafication and promoting data activism. While assessing whether any of the CRs became data activists by the end of or after the project was outside its scope, it seems fair to suggest that developing the (critical) data literacies of non-experts is essential for such an endeavour.
The findings presented in this article invite researchers, practitioners and policymakers to think more deeply about the importance of developing citizens’ data literacies within localities that, similarly to the industrial town of Widnes, may be considered marginalised and affected by similar data literacy gaps. Relatedly, if we are to expect PDS to potentially empower communities in a datafied society, this can only happen if these communities have the data literacies they need to understand and engage, both practically and critically, with data.
Within the PEST field, the public deficit model, which posits that limited scientific knowledge among the public leads to the failure of scientific advancements, has shifted over the decades to a model where science literacy is seen as crucial for enabling citizens’ participation in decision-making processes (Bauer, 2009; Rempel et al., 2018; Stirling, 2008). In the age of datafication, citizens’ involvement in decision-making mediated by and about data relies on the adoption of participatory democratic structures and the development of their data literacies. Participatory forms of data governance, just like other forms of participatory democracy, suffer from issues of large-scale feasibility, remaining at the discretion of isolated local governments and initiatives (Parvin, 2021). This means that PDS and citizens’ data literacies are not sufficient in ensuring that communities are given opportunities to impact data systems. However, they are mutually dependent since, together, they have the potential to shift power dynamics relating to how decisions are made through and about data. Irrespective of whether we understand the role of the public in data systems in ways that range from informing or consulting to involving, collaborating or empowering communities in making decisions (Patel, 2021), the very existence of PDS cannot be disentangled from notions of data literacy.
Limitations
This study's small-scale nature and sample mean the findings above cannot be generalised to different contexts within or outside the UK. However, this article offers context-specific insights into the importance of developing citizens’ data literacies for PDS that may apply to settings similar to the industrial town of Widnes, thus paving the way for further research in this area. Another limitation relates to the risk of potential self-selection bias, since the CRs who took part in the project may have had existing motivations to develop their data literacies. Finally, while the research team consisted of three academics from different age groups and from the UK, Europe and North America, our own biases, as progressive middle-class individuals, may be reflected in our analysis of the data.
Future directions and implications
Future work should build on this study to further explore the importance of developing data literacies among different communities involved in PDS initiatives. More research is needed on the ways in which digital literacy training can be designed and delivered as part of PDS interventions, and on the role that both formal and informal learning play in developing citizens’ data literacies as part of PDS. Furthermore, this article has practical and policy implications, with three main implications outlined below.
Implication 1. This study is grounded in the conviction that more participatory and deliberative initiatives are needed to ensure that PDS is not just a useful exercise to explore the potential of challenging power structures and datafication but a useful framework for reconfiguring those structures in ways that gravitate around the lived experiences of different communities. Provision of data literacy interventions and training is paramount for undertaking this task.
Implication 2. The UK, where there are considerable data literacy gaps among the population, has a vibrant media literacy landscape. As in the case of other countries in Europe and beyond, such a landscape relies, to some extent, on formal education structures alongside initiatives delivered by third-sector organisations outside these structures. Their delivery of such initiatives, however, is affected by significant challenges such as limited funding and the absence of a comprehensive framework for coordinating their work (Polizzi et al., 2025; UK Government, 2026). More efforts are needed to ensure that data literacy provision is delivered through broader media literacy provision across the UK and that such provision is better funded and supported.
Implication 3. To reach more citizens and communities from different backgrounds, it is important to maximise the delivery of data literacy provision both within and outside the scope of PDS. The Round ‘Ere project, which was designed with this provision in mind, focused on equipping CRs with the knowledge necessary to critically reflect on and challenge datafication. Future work supporting the potential of PDS in reconfiguring decision-making in a datafied society should build on this study to design, roll out and evaluate PDS interventions that incorporate both functional and critical data literacies.
Footnotes
Acknowledgments
The authors would like to thank the two organisations that were partners on the project: the Liverpool City Region Civic Data Cooperative (CDC) and Capacity. Without their valuable support, this project would not have been possible. We also wish to thank the CRs who participated in the study.
Ethical approval and informed consent
The project was approved by the University of Liverpool Research Ethics Committee (reference numbers: 11798 and 12124), and written informed consent was obtained from participants.
Author contributions
This article's conceptualisation and literature review were co-led by Gianfranco Polizzi (GP) and Emily Rempel (ER), with support from Simeon Yates (SY). All three authors contributed to the study's research design and delivery of the training, with GP and ER undertaking the data collection and thematic analysis of the data. All authors drafted the manuscript, with GP leading the writing.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was funded by Liverpool City Region Combined Authority.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
