Abstract

Like many readers of this journal, I often ask myself why we still seem to know so little about the effectiveness of public management reforms labelled as the ‘New’ Public Management over 30 years ago. Christopher Pollitt goes some way towards answering this question in his thoughtful review of the ‘logic’ of performance management. In particular, Pollitt focuses on ‘alternative logics’ to the rational assumption that there is an unproblematic relationship between target setting, incentives and performance improvement. He notes that it is well understood in organisational research that decisions can be influenced by values and emotions as well as rational calculation, and by a sense of ‘appropriateness’ as well as of ‘consequences’. Many factors are likely to influence how far different actors such as ministers, top officials, operational staff, legislators and indeed citizens understand and engage in any performance management system. Tactics such as gaming, cheating, and the symbolic use of performance data are common, underpinned by a range of alternative logics ‘that dilute or distort the simple, instrumental rationality of “hit your target and get rewarded”.’ Evaluations are needed to understand what is going on behind the stated logic of incentivised performance. However, this is not easy given the multitude of factors that play out in different contexts. Pollitt suggests that a ‘realist’ approach is particularly suited to this evaluation challenge because of its ‘focus on the logic(s) of those responding to the intervention’. He also recognises the difficulty of evaluating ‘cheating’ and other forms of performance misrepresentation. In order to see beyond the various alternative logics he discusses, Pollitt poses the realist questions: ‘What are the generative mechanisms of this particular design of PMS?’ and ‘How do they interact with the particular context?’ As he concludes: ‘A thorough study of alternative logics has been a long time coming.’
Gill Westhorp introduced her notions of ‘complexity consistent’ and multi-layered theory in a previous article published in issue 18(4) of Evaluation. That article advocated a combination of complexity theory and realist evaluation approaches insofar as both share certain assumptions about causation. Westhorp also advocates using substantive theory from various disciplines. In so doing she addresses criticisms of much ‘theory informed’ evaluation that risks over-dependence on the theories of stakeholders whilst ignoring substantive theory based on research. In this article Westhorp follows both tracks. She takes her ideas forward by applying them to the case of a small-scale early-childhood intervention programme and iteratively develops a ‘theory of negative impacts’ based initially on stakeholder hypotheses and a narrative literature review. Various theories - attachment theory, social judgements theory, social capital and social inclusion/exclusion theory - are identified, each at different ‘systems’ levels from the parent/child through to the societal. Theories are layered such that ‘lower’ levels of theory describe causal processes that account for outcomes at ‘higher’ systems levels. These theories are refined in turn by empirical investigation and a subsequent synthesis of broader research literatures which, following realist principles, focuses on outcomes, mechanisms and contexts.
It is worth standing back and reflecting on Gill Westhorp’s analysis from other standpoints. There is now a widespread consensus that complex analyses are needed to evaluate and explain programmes which are themselves complex. What we often lack are methods to take this project forward. Gill Westhorp’s two articles are therefore a welcome step in advocating a suite of complexity-appropriate methods. David Byrne also reminded us in the last issue of this journal (the Special Issue on ‘case studies’) that most social systems are by their nature complex. In considering the evaluation of such complex systems, Byrne also pointed to the opportunities to combine complexity-informed thinking with realist methodology. This is perhaps not surprising given the increasing prominence of ‘realist’ ideas in theories of causation across the sciences.
The fusion of complexity thinking with ‘realist’ evaluation approaches is also a feature of Ray Pawson’s new book The Science of Evaluation: A realist manifesto. Pawson not only deepens his exploration of realist methodology but also devotes three chapters to evaluating complexity. ‘Realist’ ideas and methods have featured prominently in the pages of this journal throughout most of its existence. We have been proud to publish key articles by Pawson and his collaborators – from the fascinating early ‘debate’ between Pawson and Tilley and a vociferous critic of ‘realist’ methodologies, David Farrington (see Volume 4(2), 1998), through to the discussion of ‘realist diagnostic workshops’ in Volume 18(2), 2012. We have also published many other articles by those applying and building on ‘realist’ evaluation thinking. We thought it right therefore to mark the publication of an ambitious new book by Ray Pawson by commissioning an extended review essay from Brad Astbury. In his review Astbury looks beyond the most recent Pawson text and confronts Pawson’s canon with the arguments of other methodologists and thinkers. Building on a short article published in Evaluation in 2011, Pawson also devotes his opening chapter to earlier realist and sociological thinkers who stand as his precursors. Astbury’s review, whilst not uncritical, highlights the importance of the territory that Pawson has mapped out over the last 20 years or more: ‘Pawson’s manifesto deserves to be read widely and vigorously debated …’.
Dagmar Simon and Andreas Knie from Germany’s Wissenschaftszentrum present findings from a major comparative study of Research Evaluation Systems (RES) in the Netherlands, the UK and Germany. The article considers the evaluation of scientific disciplines and institutes as one instrument of policies that aim to improve the performance of national science systems and thereby national capacities to compete in global markets. According to the authors, ‘Evaluations have a ‘steering’ role in governance-arrangements that have to manage many sources of control and influence’. The article poses the question: ‘Can evaluations advance organizational development processes in these institutions?’ The question is asked because so many of the reforms being encouraged in Higher Education require organisational change in governance, management and coordination processes. However, there is a tension in instruments that pursue national policy goals whilst also supporting autonomy and self-governance at both institutional and disciplinary levels. At the heart of all three HE evaluation systems that Simon and Knie discuss are peer reviewers, ‘who exercise a dominant role’ and are mainly concerned with disciplines rather than institutions. This is despite the fact that ‘some peers indeed see it as their task to make recommendations for the further development of institutes/faculties’. Furthermore, even if peers do not address organisational issues directly, governing bodies may draw organisational lessons even from recommendations that are mainly concerned with disciplinary or individual performance. The authors argue that although there are differences in RES across countries, organisational issues are insufficiently addressed. In their view there is a need to further strengthen the organisational development orientation of HE evaluation systems.
This journal showcases many different evaluation traditions: Gillian Fletcher and Suzanne Dyson are in the constructivist tradition, broadly following Guba and Lincoln’s Fourth Generation evaluation approach. They also emphasise participatory and collaborative forms of evaluation research, another interrelated and important strand in evaluation scholarship and practice. Fletcher and Dyson describe the methodological challenges they faced when evaluating projects that promote cultural change and social justice: one concerned with violence against women, and the other with ‘safety’ and ‘inclusion’ for ‘people of all sexualities and gender identities’. The authors ‘share a commitment to social justice and transformative learning’ and understand their methodology as creating a ‘transformative learning environment’. They adopted an ‘embedded’ role, closely engaging with project stakeholders: they see themselves as ‘unbiased’ but not ‘objective’. As such, their evaluative strategy aimed to support project functioning as well as demonstrate project successes and failings. Essentially the article falls firmly into the ‘improve’ rather than ‘prove’ category of evaluation purposes. In their ‘constructivist’ evaluative practice the ‘evaluator holds up a mirror, in which project process and assumptions can be re-viewed for ongoing learning’. Fletcher and Dyson discuss both the benefits and difficulties of the approach they adopted. Benefits, for example, included gaining greater access to ‘inside’ knowledge than would otherwise have been possible; difficulties included managing their role and the expectations of others. Nonetheless, the authors remain interested in outcomes. They claim that as a result of the evaluation there were ‘stronger outcomes than if the evaluation findings were presented only at the end of the project cycle, too late for reflection and review’.
How evaluation has been taken on by the Church of Sweden is pithily summed up by Verner Denvall and Stig Linde in the title of their article: ‘Knocking on heaven’s door’: The evaluation community goes to church. Like Christopher Pollitt, the authors are interested in ‘performance’, but in the single institutional setting of the Church of Sweden rather than in world-wide public administrations. They analyse how one particular set of global ideas about performance and quality improvement with a ‘customer’ focus - TQM (Total Quality Management) - was rebranded as Church Q in the Swedish Church. For the authors TQM is to be understood as an idea that ‘travels’. They identify tendencies in modern organisational thinking that argue that the diffusion of innovations leads to an ideal or even homogenised type of organisation, and other tendencies that support diversity and emphasise the translation and adaptation of common ideas in specific settings. Denvall and Linde are interested in how pre-existing ideas and practices transform ‘travelling’ ideas such as TQM in the Swedish Church. They studied three parishes within the Church, each of which exemplifies different starting conditions. None of the parishes takes on the full intent of Church Q. ‘The TQM focus on measurement, performance, and quality is lost in the translation …’ For example, the TQM emphasis on top-management commitment and leadership does not sit well with Church traditions of ‘parochial independence’. In the broader world of evaluation this article can also be read as contributing to debates about programme ‘fidelity’ and the customisation or contextualisation of interventions – albeit this is not always backed by the compliance mechanisms that faith-based organisations can sometimes deploy!
