Abstract

Despite differences in focus and purpose, the articles in this issue address common themes. They value theory building alongside theory testing; explore links between evaluation and social-science theory; try in different ways to move beyond the oft-criticised ‘linearity’ of traditional evaluation practice; encourage practitioners to adopt a more expansive framing of their ‘objects of evaluation’ that situates initiatives, policies and programmes in a wider socio-political context, often over extended timescales; and assume that interactivity, reflexivity and ‘learning-by-doing’ are as much a part of evaluation practice as measurement or description of causes and effects.
Gabriela Camacho Garland and Derek Beach are interested in the ‘inner workings of an intervention’ in order to answer ‘how does it work?’ type questions. For such questions, the authors argue that Process Tracing, or more precisely ‘process theories of change’ (pToC), is especially relevant. This is because of its ‘relational character’ insofar as ‘real-world interventions [. . .] are a series of interactions with other actors [. . .] over a period of time during which a program is implemented’. If such relational qualities are ignored, the risk is that a theory-based evaluation will focus mainly on ‘the delivery of planned activities’. Garland and Beach consider a pToC as necessary both to establish what constitutes evidence and to produce ‘actionable knowledge’. The challenge when relying on theory-based approaches, as many evaluators know, is to decide how much detail is needed to theorise the links between interventions and results. The authors’ discussion of ‘sequences of actions’ in ‘key episodes’ seeks to steer a middle course between ‘drowning in detail’ and oversimplification. Most of this article is devoted to ‘a practical step-by-step guide to theorizing a pToC’ – to defining interventions and their contribution; ‘identifying potential contribution pathways’; and ‘unpacking’ pathways into ‘key episodes’.
Methodological integration in evaluation – combining aspects of Realist approaches, QCA (Qualitative Comparative Analysis), Process Tracing and Contribution Analysis – has been evident for some time in articles published in this journal. Such integration draws selectively not only on different ‘theory-based’ brands but sometimes on very different evaluation ‘schools’ including statistical models and even experimental trials, depending on evaluation purposes. Process Tracing shares much with other theory-based approaches such as Realistic Evaluation or Contribution Analysis, as is clear, for example, when Gabriela Camacho Garland and Derek Beach discuss ‘context’ or what constitutes ‘causal links’. The attention that the authors pay to the importance of focusing on ‘key episodes’ has resonance across evaluation approaches and methods – not only across different theory-based approaches.
While most evaluators now see ‘theory’ as a necessary part of their practice, this is mainly from a methodological standpoint in relation to particular policy, programme or project interventions. Theory encapsulates what we assume, deduce or hypothesise about the intended or maybe unintended consequences of these interventions. Drawing on their own evaluation and research with indigenous communities in Australia, John Guenther, Ian Falk and Michael Cole adopt a broader understanding of theory, which they elaborate by way of ‘three illustrative examples of social programme evaluation in Australian contexts’.
Rather than exclusively regarding theory as an aid to decision-making or programme design, the authors are interested in theory building that occurs as part of an evaluation, as ‘a form of generalisation’. Specifically, they are interested in how qualitative (or mixed-method) evaluation, particularly in the social sphere, can, by positing new ‘propositions’ or ‘principles’, contribute to useful social theory in the social sciences. Notions of evaluation use are key to the authors’ arguments. Sound theory should make a generalisable contribution to a broader understanding of future as well as current programmes, and use necessarily extends beyond current users to also encompass potential future users. This stance, as Guenther, Falk and Cole suggest, has implications for the form of evaluation questions posed: ‘Open, future oriented, general and issues-focused questions can lead to new knowledge beyond the constraints of the programme’. The evaluation process is also likely to be different. Building social theory, which may include testing a theory of change, is likely to be an ‘iterative’ and ‘reflexive’ process that spans more than one programme, focusing not only on programmes but also on ‘the issues they [these programmes] are trying to address’.
Tim Strasser and Joop de Kraker focus on the challenges associated with evaluating ‘local grass-roots initiatives’ – often part of ‘translocal networks’ – that pursue ‘social innovation’ in the face of today’s climate emergency and related crises. The authors sum up their stance early on: ‘conventional evaluation approaches employ linear models of causality, focus on individual projects or programmes and primarily serve accountability to funders, which can limit or even corrupt the transformative potential of social innovation’. To an extent, the authors are revisiting the familiar ‘accountability versus learning’ debate, seeking to reconcile this dichotomy in the face of the uncertainties of social innovation by shifting ‘the focus of accountability from delivery of pre-defined plans to the learning process itself’. This article outlines a ‘practice tool’ that builds on a previously developed and empirically tested ‘conceptual model’ of ‘transformative social innovation’ – the 3D Framework. This framework suggests how ‘network leadership’ contributes to ‘transformative capacities’ that can have ‘transformative impacts’. The tool – SCALE 3D – poses questions for each of the three elements of the 3D Framework, in order to support evaluation and strategy development by those engaged in ‘translocal initiatives’. It is noteworthy that the framework and the tool simultaneously address design and strategy formulation as well as evaluation and monitoring. The authors describe how the prototype tool was developed and tested, taking on board experience and feedback along the way.
As Seweryn Krupnik, Anna Szczucka, Monika Woźniak and Valérie Pattyn, the authors of the next article, note, QCA will already be familiar to readers of this journal. (For those who are not familiar with QCA, this article, and in particular the discussion in the section titled ‘Configurational theorizing . . .’, provides an accessible introduction.) According to the authors, QCA is especially suited to theorising about the ‘configurations’ that help ‘understanding why programmes work’. The authors argue that ‘drawing lessons’ in public policy is often ‘updated after multiple feedback loops’. Hence the logic of ‘consecutive’ rounds of QCA that ‘can serve cumulative knowledge-building about the evaluand, more so than when engaging in a single “standard” QCA cycle’. The potential of ‘consecutive rounds of QCA’ to iteratively develop ‘configurational theory’ is illustrated with two evaluation cases of R&D (Research and Development) subsidies in Poland. The authors note that ‘configurational theorising’ is not necessarily tied to QCA as a method. An awareness of multiple causal factors working through different configurations has far wider applicability. Krupnik, Szczucka, Woźniak and Pattyn conclude that consecutive QCA is especially useful when a ‘series of project or program evaluations [. . .] are conducted at different moments in time’. Implicitly, consecutive QCA is also useful when the projects and programmes that make policies operational are implemented over an extended timescale. It is not only that we can understand ‘why programmes work’ by iterative evaluation of a particular programme. Often it is only across different programmes that plausible theories and impactful configurations can be theorised and verified.
Daniel Esser and Heiner Janus examine the accountability/learning relationship in the German foreign aid setting, which, as they point out, is now the world’s second largest international development aid system. The argument here contributes to ‘an ongoing debate among evaluation scholars’ about evaluating the substantive phenomena of organisational learning and accountability – both important in international development – through a new lens. The authors begin by discussing the accountability/learning debate as it has appeared over the years, much of it in the pages of this journal. They note that some protagonists have emphasised trade-offs, some the dominance of either learning or accountability, and some see these two important institutional processes as of equal value and reconcilable.
Esser and Janus understand organisational learning as ‘situated and social processes of transforming individual experience into organisational knowledge on development practice’. Accountability, on the other hand, is seen as shaped by the institutional specifics of German foreign aid implementation, where there are ‘two major implementing organisations [GIZ and KfW] functioning as “accountability consolidators”’. The authors discuss the accountability/learning debate from a New Public Management and a Development Studies perspective before settling instead on Organisational Sociology and in particular the ideas of Erving Goffman. This privileges ‘socialisation’ and ‘sanctions’ in organisations within a setting of ‘organisational norms both explicit and tacit, which attain causal power’. Esser and Janus use Goffman’s ‘frontstage/backstage heuristic’ to frame how actors ‘perform’ both backstage, that is, within implementing agencies, and frontstage, that is, inter-organisationally between agencies and with the relevant government ministry, BMZ.
Harking back to earlier articles in this issue, where evaluation sometimes ‘builds’ or ‘tests’ policy-specific theory and sometimes aspires to contribute to new social theories, here we have a well-developed body of existing sociological ‘theory’ adapted to provide a new and challenging evaluative lens. Perhaps these distinct routes by which ‘theory-in-evaluation’ features in evaluation practice begin to disentangle the multiple and undifferentiated meanings sometimes loaded onto ‘theories of change’ and ‘theory-based-evaluation’.
Concerns for accountability in the field of research have fuelled a growing interest in research impact. As Daria-Maria Gerke, Katrin Uude and Thorsten Kliewe observe, scientific impact is much easier to measure – and indeed conceptualise – than societal impact. However, it is societal impact that is demanded by those seeking to demonstrate a ‘return on investment’ from public investments in scientific research. While scientific impact involves mainly academic interactions, societal impact involves ‘multidimensional interactions between researchers and societal actors’ over extended timescales. The authors note that the multiparty character of societal impact has sometimes been labelled as ‘co-production’, a label that can encompass terms such as ‘participatory research’, ‘research-practice partnerships’ and ‘citizen science’ as well as interdisciplinary and transdisciplinary research. Gerke, Uude and Kliewe draw on existing concepts of ‘value co-creation’ (VCC) that already incorporate ‘co-production’ and ‘value in use’ to better conceptualise and capture the ‘societal impacts’ of research. Starting from a ‘state-of-the-art’ literature review of societal impact and co-creation, Gerke and colleagues undertake ‘theory synthesis’ as a stepping-stone to constructing a ‘model’ that aims to ‘explain why and how outcomes are achieved’. This synthesis is framed in terms of six ‘propositions’ that aim to conceptualise ‘the interplay of societal impact and co-creation with regard to impact creation’ and a further five propositions that specify requirements for a ‘generic research impact assessment framework’. Rather than ‘defining measurable indicators’ in advance, the authors favour a participatory evaluation process that involves implicated stakeholders, able to create indicators adapted to ‘each project’s specificity’.
In this article, Gerke, Uude and Kliewe, like Esser and Janus previously, import their theoretical building blocks from elsewhere, in this case from ‘marketing’ theory – on reflection understandable, given that the concern here is with ‘value creation’ in a market economy. This and other articles in this issue demonstrate the eclectic nature of theory building in evaluation practice more generally. This is surely an argument for an interdisciplinary capability in many evaluation teams. It is also an argument for more evaluation teams having the capacity to engage with – or even co-produce evaluations with – stakeholders and citizens in participatory ways. After all, ‘societal impact’ is not the sole preserve of scientific research.
Bente van Oort, Hilda van ’t Riet, Adriana Parejo Pagador, Rosana Lescrauwaet Noboa and Carolien Aantjes are concerned with evaluating advocacy work, important ‘to strengthen strategies, extract lessons learned, and determine the next steps’, and for actors such as nongovernmental organizations (NGOs) to demonstrate accountability to funders and stakeholders. This is a sector where there is a ‘low level’ of evaluation activity, partly because of the constantly changing nature of advocacy itself. The authors propose a type of ‘learning-oriented approach’ that they describe with the acronym PPE – or ‘participatory process evaluation’. This ‘emphasizes stakeholder engagement and collaboration in the evaluation process, with a focus on understanding the process of program implementation and the perspectives and experiences of those involved [. . .]’. In the view of the authors, existing evaluations that rely, for example, on goal-orientated ‘theories of change’, even when adapted to emphasise processes, do not go far enough. Instead, they suggest that ‘evaluation should provide reflection into the complex process of change, personal decision-making mechanisms, and capacity development trajectories during the program life span’ as well as capture ‘additional outcomes’ not anticipated when ToCs are prepared. The authors regard project personnel as the ‘end users’ of a more participatory approach that analyses ‘project learning processes’.
This approach was field-tested with Cordaid, a large Dutch development NGO, and in particular with their Global Health Global Access programme. The Cordaid case study describes how this ‘participatory’ and ‘learning’ approach was designed and implemented. Reportedly, this learning-oriented evaluation approach supported practice improvements and identified ‘gaps and opportunities for improvement but not necessarily [. . .] the effectiveness and impact of the program’.
This article raises many of the classic dilemmas of participatory evaluations across many sectors and settings. For example, how acceptable to funders is a ‘goal-free’ evaluation approach that emphasises processes while downplaying outcomes? Are ‘theories of change’ necessarily predetermined, or is there not in many evaluations a need for ToCs to be subject to rounds of revision and re-design as an evaluation unfolds? What are the risks – as well as the strengths – of ‘internal evaluations’, for example in terms of bias or openness to multiple perspectives?
The authors recognise that their approach could well be ‘combined’ with more ‘outcome-oriented’ approaches. This underlines the importance of evaluation portfolios rather than single all-purpose evaluations. The consequence of the ‘impact turn’ in evaluation has been to prioritise the accountability requirements of funders, often at the expense of opportunities to improve practice. This will be especially the case if only one evaluation is implemented. The authors have demonstrated the potential benefits, in terms of learning and practice improvement, of a strongly participatory, learning-oriented evaluation. In some circumstances this may suffice, but not always. It is worth being reminded, however, that evaluations that aim to improve practice also matter.
