Should pay-for-performance schemes be locally designed? Evidence from the Commissioning for Quality and Innovation (CQUIN) framework

Abstract

Objectives

It is increasingly recognized that the design characteristics of pay-for-performance schemes are important in determining their impact. One important but under-studied design aspect is the extent to which pay-for-performance schemes reflect local priorities. The English Department of Health White Paper High Quality Care for All introduced a Commissioning for Quality and Innovation (CQUIN) Framework from April 2009, under which local commissioners and providers were required to negotiate and implement an annual pay-for-performance scheme. In 2010/2011, these schemes covered 1.5% (£1.0bn) of NHS expenditure. Local design was intended to offer flexibility to local priorities and generate local enthusiasm, while retaining good design properties of focusing on outcomes and processes with a clear link to quality, using established indicators where possible, and covering three key domains of quality (safety; effectiveness; patient experience) and innovation. We assess the extent to which local design achieved these objectives.

Methods

Quantitative analysis of 337 locally negotiated CQUIN schemes in 2010/2011, along with qualitative analysis of 373 meetings (comprising 800 hours of observation) and 230 formal interviews (audio-recorded and transcribed verbatim) with NHS staff in 12 case study sites.

Results

The local development process was successful in identifying variation in local needs and priorities for quality improvement but the involvement of frontline clinical staff was insufficient to generate local enthusiasm around the schemes. The schemes did not in general live up to the requirements set by the Department of Health to ensure that local schemes addressed the original objectives for the CQUIN framework.

Conclusions

While there is clearly an important case for local strategic and clinical input into the design of pay-for-performance schemes, this should be kept separate from the technical design process, which involves defining indicators, agreeing thresholds, and setting prices. These tasks require expertise that is unlikely to exist in each locality. The CQUIN framework potentially offered an opportunity to learn how technical design influenced outcome but due to the high degree of local experimentation and little systematic collection of key variables, it is difficult to derive lessons from this unstructured experiment about the impact and importance of different technical design factors on the effectiveness of pay-for-performance. Balancing the policy goal of localism with the objective of improving patient outcomes leads us to conclude that a somewhat firmer national framework would be preferable to a fully locally designed framework.

Keywords

de-central design, pay-for-performance, P4P design

Introduction

The use of pay-for-performance (P4P) schemes in health care is now widespread¹ despite the lack of evidence that it improves health outcomes.²–⁵ A key lesson from the published literature is that the design of these schemes and the context into which they are introduced are key determinants of their impact.⁶–¹⁰

Design issues that have previously been addressed by the literature include the choice between process and outcome indicators, the number and mix of indicators to include in schemes, risk adjustment of indicators, whether to aim incentives at teams or individuals, the use of rewards or penalties, the size of the incentives, adoption of absolute or relative performance targets, and the frequency of performance evaluations.¹¹ How the themes to be addressed by an incentive scheme are selected has been given less attention so far.

It has been suggested that design of an effective P4P scheme requires the ‘right combination of information and ignorance’ of payers and providers.¹² If the payer has no knowledge of individual providers’ health production functions (the medical processes that improve health), how can an incentive scheme that will improve the quality of care be designed successfully? Involving local stakeholders in the design of P4P can reduce this asymmetry of information between payer and provider, and there is some evidence that schemes that involve stakeholders in the selection and definition of targets show larger positive effects than schemes that do not,⁴ although such results are potentially vulnerable to selection bias. These potential benefits should be balanced, however, against the moral hazard problem created by decentralizing P4P design: how can the central policymaker ensure that the locally developed schemes address their initially intended objectives?

This study contributes to the literature on P4P design, by offering a unique insight into the outcome (in terms of scheme design) of a decentralized design process that allowed local decision makers to design their own P4P schemes within a national framework.

We study the Commissioning for Quality and Innovation (CQUIN) scheme, an initiative to encourage English NHS organizations to focus on quality, announced in the Department of Health (DH) White Paper High Quality Care for All.¹³ The White Paper outlined how the CQUIN framework would introduce financial incentives for quality improvements by allowing commissioners to pay providers for improved outcomes, ‘build on national and international best practice’, and ‘be flexible to suit local circumstances’ (p.42).

The associated Impact Assessment,¹⁴ a mandatory assessment of the potential costs and benefits of new legislation undertaken prior to introduction and conducted as part of the policy design and approval process, considered options for how this aim of local flexibility could be best achieved. The option of locally negotiated payment schemes was compared to a nationally mandated option, and it was concluded that, although likely to generate higher costs for the development of schemes, locally negotiated schemes were to be preferred.

Echoing the literature on P4P design with respect to the potential benefits of decentralized design of incentive schemes, the DH put forward two main arguments for preferring a locally designed option. First, central information about local needs for quality improvement was lacking: locally developed schemes, it was thought, would identify heterogeneous goals and priorities for quality improvement across providers and ensure adaptation of schemes to local quality issues. Second, involving local stakeholders in the design process would generate high local enthusiasm and commitment.

Against this background, our research focused on three questions: whether local development of incentive schemes is successful in identifying heterogeneous local needs and priorities for quality improvement; in generating local enthusiasm; and in leading to schemes that address the original national objectives. In addition, we use our findings to derive lessons for the future design of P4P schemes that aim to address central objectives but decentralize the actual design of schemes.

To enable us to assess both the design outcome of local schemes and their likelihood of generating local enthusiasm, we apply a mixed methods approach.

The CQUIN framework and policy context

To assess the extent to which the central requirements for the locally developed schemes were followed, a detailed description of the motivation for, expectations of, and content and context of the CQUIN framework is required.

In the English NHS, health care is free at the point of delivery but is delivered to patients through an ‘internal market’ that splits the NHS into a commissioner and a provider side. At the time we conducted our analysis, 151 primary care trusts (PCTs) were responsible for commissioning health care services for their local populations from provider organizations supplying acute hospital, mental health, community and ambulance services. Commissioners and providers negotiated an agreement using the ‘national standard NHS contract’, a framework for legally binding contracts introduced in 2007.¹⁵,¹⁶ Providers could serve more than one commissioner. In this case, one ‘Co-ordinating PCT’ negotiated the contract on behalf of all PCTs using services from the provider.

The CQUIN payment framework required that the national standard contract made a proportion of provider income conditional on the achievement of locally agreed quality improvement and innovation goals for providers of acute, ambulance, community, mental health and learning disability services. According to the DH’s pre-introduction Impact Assessment,¹⁴ the intention of the policy was ultimately to support the provision of better patient outcomes through increased patient safety, clinical effectiveness, improved patient experience and more innovative service delivery. The policy was intended to support this aim through a ‘cultural shift’ by ‘embedding quality improvement and innovation as part of the commissioner-provider discussion everywhere through making the payment system reflect quality’.¹⁴

In the CQUIN terminology, a ‘scheme’ is a package of ‘goals’ and ‘indicators’ that providers agree with their commissioners. A ‘goal’ is a description of the intended objectives being incentivized by the CQUIN scheme. Each ‘goal’ must be measurable using at least one, but often more, defined ‘indicators’. For example, ‘discharge planning and communications’ is a frequently included goal, for which the proportion of patients being readmitted within a given time is a frequently used indicator.

By provider we mean a provider of either acute hospital, mental health, community or ambulance services. By commissioner we mean the PCT designated as the ‘Co-ordinating PCT’ to negotiate the annual contract on behalf of all PCTs. When referring to negotiations between providers and commissioners we mean the negotiation of the scheme between representatives from both sides. These comprised quality managers (often with clinical backgrounds), finance and information staff and, often, Medical and Nursing Directors. Occasionally, we shall also refer to the negotiators as the ‘scheme developers’.

The Impact Assessment considered two main options for CQUIN: a nationally mandated scheme with payments linked to performance on a set of national standardized indicators, or a scheme in which the operationalization of quality was agreed upon locally within a national framework.

The Impact Assessment suggested that a nationally standardized scheme was potentially counter-productive given that it would be ‘imposed upon the provider rather than coming directly from NHS staff and reflecting local priorities’.¹⁴ Locally negotiated schemes were thought to be more likely to generate local ‘enthusiasm’. It was thought that a nationally standardized set of indicators might also miss areas of quality and innovation important for local needs, and would result in different levels of ‘stretch’ for different organizations, although it was recognized that the latter could be addressed by setting different targets for different organizations. In addition, the evidence base for choosing national indicators for quality and innovation was not deemed sufficiently robust to decide on a set of indicators that would be likely to generate quality improvements across the country.

Locally agreed schemes were seen as being less likely to inappropriately distort resources away from areas not covered by the scheme. They were also seen as being more likely to set a realistic level of ‘stretch’ given providers’ current level of performance and would expose providers to less financial instability than a scheme with nationally set targets. Finally, it was feared that a national scheme would potentially induce gaming, for example in the form of adverse selection of patients to avoid financial risk. The Impact Assessment recognized that this could be mitigated by risk-adjusting the quality indicators. Nevertheless, it was assumed that locally agreed schemes would be less likely to lead to gaming, as schemes would be tailored to local circumstances.

The Impact Assessment did recognize that locally selected performance indicators could impede benchmarking across providers. However, it was anticipated that locally negotiated schemes would use existing national ‘standard definitions of indicators’ where possible, and merely supplement these with indicators designed to target specific local needs otherwise. It was thus hoped that a larger degree of standardization would arise over time and make benchmarking possible.

In the end, a model that emphasized local discretion in the design of CQUIN schemes was selected. The content of local schemes was to be negotiated between a commissioner and a provider but had to conform to specific requirements set out in guidance on how to use the framework¹⁷ to ensure alignment with the national objective of improving quality of care in the English NHS.

The ultimate goal of the CQUIN framework was to improve patient outcomes. However, the CQUIN guidance advised that process and structure indicators were also allowed if there was a ‘clear link to quality’.¹⁷ According to the CQUIN guidance, all schemes had to contain at least one area for improvement in each of the four dimensions of quality: safety; effectiveness; patient experience; and innovation. The innovation contribution was expected to be focused on the last two stages of the following three stages of innovation: invention (the initial conception of a new idea); adoption (first application of the idea to actual practice); and diffusion (adoption of the innovation by others).

Although the CQUIN scheme is mainly a framework for the development of local indicators, the CQUIN guidance for 2010/11 specified two national goals to be used in acute care. Strategic Health Authorities (regional bodies responsible for enacting DH policy) could also influence the contents of CQUIN schemes by mandating or suggesting goals and indicators.

The CQUIN framework specified a fixed share of contract value that had to be linked to achievement of the agreed quality indicators. Therefore, commissioners had the option of negotiating few goals/indicators, each of which would have relatively high power, or many goals/indicators, each of which would have relatively low power. The relative importance of different indicators could be reflected in the scheme by associating each indicator with a weight reflecting the proportion of the total CQUIN payment attached to that indicator. Commissioners also had to take account of the reporting and monitoring costs when deciding on the nature and number of indicators.
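The trade-off between the number of indicators and the power of each one can be illustrated with a short calculation. The sketch below is purely illustrative and not part of the CQUIN guidance: the £100m contract value, the equal weights and the function name `indicator_payments` are assumptions; only the 1.5% share for 2010/2011 is taken from the text.

```python
def indicator_payments(contract_value, cquin_share, weights):
    """Split the CQUIN pot across indicators in proportion to their weights."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    pot = contract_value * cquin_share  # fixed share of contract value
    return [pot * w / total_weight for w in weights]


# Hypothetical provider with a £100m annual contract (assumed figure).
# 0.015 is the 1.5% CQUIN share that applied in 2010/2011.
few = indicator_payments(100_000_000, 0.015, [1] * 3)    # 3 equally weighted indicators
many = indicator_payments(100_000_000, 0.015, [1] * 30)  # 30 equally weighted indicators

print(f"Per indicator with 3 indicators:  £{few[0]:,.0f}")
print(f"Per indicator with 30 indicators: £{many[0]:,.0f}")
```

Because the pot is fixed, a tenfold increase in the number of equally weighted indicators cuts the payment attached to each one by a factor of ten, which is the sense in which larger schemes have lower-powered incentives.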

The size of the incentive in the CQUIN scheme was increased from 0.5% of a provider’s annual contract income in 2009/2010 to 1.5% in 2010/2011 and to 2.5% from April 2012. For 2010/2011 this corresponds to £1.0bn¹⁸ making the financial value of the scheme comparable to that of the Quality and Outcomes Framework introduced for general practitioners in England in 2004¹⁹ which for 2010/2011 made £1.1bn available for bonuses.²⁰

For non-acute providers, participation in CQUIN was voluntary in Year 1 but became mandatory from April 2010. In addition, guidance for 2010/2011 emphasized stretch targets and outlawed payment for data collection alone (as opposed to improved performance). However, this was subsequently relaxed in recognition of the need to reward data collection effort as part of the first phase of quality improvement in some areas.

Methods

The paper relies on a data set collected for the evaluation we undertook for the Department of Health of the CQUIN framework.²¹ For this, we obtained copies of every CQUIN scheme from all but 10 providers that did not make their locally developed schemes available, even though it was a requirement from the DH for them to do so. The content of the missing schemes would not influence the main findings we present here. We classified the schemes according to headings agreed with the DH in order to provide a national, structured picture of CQUIN schemes. The data were collected from August 2010 to February 2011.

We use a mixed methods approach. Quantitative methods are used to address quantitative questions such as scheme size, complexity and agreement between policy intent and scheme design outcome. Qualitative methods are used to explain the quantitative findings and to address the research question on local enthusiasm, which cannot be answered on the basis of the available quantitative data.

Before undertaking the quantitative analysis of the CQUIN schemes we cleaned the data for differences in spelling and obvious misclassifications, and aggregated goals that had very similar content. For example, we combined the goals relating to ‘patient satisfaction’ and ‘patient involvement’ because, although these goals are distinct in principle, commissioners had used the same indicators to operationalize these goals. Then, to assess the size and complexity of the CQUIN schemes we counted the number of goals and indicators in each of the four sectors.

To assess which goals were perceived by commissioners and providers as most important, we identified the 10 goals that were included most frequently. As another indicator of importance we calculated the total weight attached to the indicators making up these goals.

We then identified all of the indicators used for one commonly included goal (discharge planning) to examine the amount of variation in local indicator development.

Finally, we analysed the proportion of local indicators classified by scheme developers as incentivizing structure/process, outcome, action plan or data collection. We also assessed the proportion of indicators classified by developers as covering each of the quality dimensions: safety, effectiveness, patient experience and innovation. As the guidance required CQUIN schemes to contain at least one indicator in each of these four dimensions, we further investigated the proportion of schemes that fulfilled this requirement and the proportion that lacked an indicator in each dimension.
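The compliance check described above amounts to a simple set comparison for each scheme. The following is a minimal sketch under stated assumptions, not the code used in the evaluation: the scheme identifiers and their classifications are invented for illustration, and the function names are hypothetical.

```python
REQUIRED_DIMENSIONS = {"safety", "effectiveness", "patient experience", "innovation"}


def missing_dimensions(indicator_dimensions):
    """Return the required dimensions not covered by any indicator in a scheme."""
    return REQUIRED_DIMENSIONS - set(indicator_dimensions)


def coverage_summary(schemes):
    """Share of fully compliant schemes, and share lacking each dimension."""
    n = len(schemes)
    compliant = sum(1 for dims in schemes.values() if not missing_dimensions(dims))
    lacking = {
        dim: sum(1 for dims in schemes.values() if dim not in dims) / n
        for dim in sorted(REQUIRED_DIMENSIONS)
    }
    return compliant / n, lacking


# Invented example: the dimension recorded for each indicator in three schemes.
schemes = {
    "scheme_A": ["safety", "effectiveness", "patient experience", "innovation"],
    "scheme_B": ["safety", "safety", "patient experience"],
    "scheme_C": ["effectiveness", "patient experience", "innovation"],
}

share_compliant, share_lacking = coverage_summary(schemes)
print(share_compliant)
print(share_lacking)
```

In this invented example only one of the three schemes covers all four dimensions, and the second dictionary shows which dimension is most often absent.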

Ethical approval for the qualitative arm of the study was obtained from Nottingham Research Ethics Committee 2 (MREC number 10/H0408/11). In this part of the study, we explored the processes surrounding CQUIN using observation of 373 meetings (comprising 800 hours of observation) in 12 case study sites, as well as 230 formal semi-structured interviews, digitally recorded and transcribed verbatim, with NHS staff from those sites. In addition, less formal conversations were conducted with NHS staff immediately prior to, or just after, the meetings. Of the formal interviews, 69 were from acute providers, 43 from community providers, 65 from mental health providers and 53 from commissioner organizations. Fifty-five staff at director or deputy director level and 128 managers were interviewed, although the latter covered a wide range of roles from finance and administrative staff to clinical managers (mainly nurses). Our approach was to observe meetings between commissioners and providers at which CQUIN was discussed. These discussions were usually part of a wider meeting to discuss performance against contract more generally. Attending these meetings on a regular basis enabled us to track the process of CQUIN scheme development and negotiation, implementation, monitoring and final payment. Combined with interviews, this approach also enabled us to compare the stated intentions of interviewees with what happened in practice.

One of the drawbacks of this approach was that frontline staff were usually not present at these meetings. To identify and recruit frontline clinicians, we asked meeting attendees for contacts and drew on support from NIHR clinical research network staff.

Furthermore, since we were undertaking a parallel study looking at another incentive initiative (Best Practice Tariffs),²² we also drew on findings from interviews conducted for that study (93 of which were outside of case study sites) to enable us to recruit and interview frontline clinicians beyond our 12 case study sites.

A case study was defined as a health economy, comprising a commissioner (including the related specialized commissioning input) and the providers from whom it commissioned care.

All interviews were digitally recorded and transcribed verbatim. Analysis initially involved coding transcripts using NVivo software and identifying themes. A constant comparative method was used to interpret the data.²³ Key concepts were identified using an open coding method. Once coding was complete, the codes that had common elements were merged to form categories. Analysis moved from within-case (i.e. a single health economy) to cross-case analysis to identify both site specific and more general issues.

The data were collected by five researchers and disagreements in interpretation were discussed until a consensus was achieved. This process involved meeting to discuss relevant data extracts, challenging assumptions that were not supported by or explicit in the data and constraining interpretation to exclude prior assumptions.

Results

Scheme size and complexity

The data set contains 337 CQUIN schemes. The schemes apply to 151 acute care providers, nine ambulance service providers, 93 community care providers and 84 mental health and learning disability service providers.

Overall, the goals in the schemes cover 113 distinct topics. Some of these topics occurred in more than one sector. For acute care there were 92 distinct topics, for ambulance care 29, for community care 63 and for mental health 57.

The local discretion in choosing goals for inclusion in CQUIN resulted in a diverse range of goals and indicators. Across all of the 337 schemes, the goals were operationalized using a total of 5001 indicators. Of these, 3142 indicators were uniquely defined, distinct, indicators. There were 1546 distinct indicators used in CQUIN schemes for acute care. In ambulance service, community care and mental health schemes respectively, the numbers of distinct indicators were 78, 999 and 645.
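The counts reported above can be combined in a short calculation. The sketch below uses only the figures in the text; the interpretation of the gap between the sector totals and the overall total is our reading and assumes that each sector figure counts indicators that are distinct within that sector.

```python
# Figures reported in the text for the 2010/2011 CQUIN schemes.
total_indicators = 5001
distinct_overall = 3142
distinct_by_sector = {
    "acute": 1546,
    "ambulance": 78,
    "community": 999,
    "mental health": 645,
}

# Share of all indicators that were uniquely defined.
share_distinct = distinct_overall / total_indicators

# The sector counts sum to more than the overall distinct count. Under the
# assumption stated above, the excess reflects indicators that appear in
# more than one sector (counted once for each additional sector).
sum_by_sector = sum(distinct_by_sector.values())
cross_sector_excess = sum_by_sector - distinct_overall

print(f"{share_distinct:.1%} of indicators were distinct")
print(f"Sector counts sum to {sum_by_sector}, an excess of {cross_sector_excess}")
```

On these figures roughly 63% of all indicators were distinct, and only a small number of distinct indicators appear to have been shared across sectors.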

Only acute care schemes contained national indicators. These constituted 12% of the total number of indicators in acute care schemes. Fifty-seven per cent of the indicators in the acute care schemes were locally developed, and the remaining 31% were regional indicators. In ambulance services the majority (90%) of indicators were regional, while the majority of indicators in community care (82%) and mental health (64%) were local.

The local flexibility in the design of CQUIN schemes also led to schemes that were often highly complex. Table 1 displays CQUIN scheme size and complexity as measured by the number of indicators per scheme in each of the four sectors. For acute care, a single CQUIN scheme could have up to 25 different goals with the median scheme containing 11 different goals. These goals were operationalized with up to 52 different indicators per scheme, with the median scheme containing 16 different indicators. In general, the schemes were less complex in other sectors. Using the number of indicators as a measuring rod for scheme complexity, mental health CQUIN schemes were the second most complex schemes, followed by community care and then ambulance care schemes.

Table 1.

Goals and indicators in CQUIN schemes by sector.

Service	Dimension	Mean	SD	Min	Median	Max	Number of schemes
Acute care	Indicators	18.4	9.4	3	16	52	151
Acute care	Goals	11.3	4.4	2	11	25
Ambulance care	Indicators	9.6	4.7	5	8	19	9
Ambulance care	Goals	6.8	3.2	4	6	12
Community care	Indicators	12.4	5.9	3	12	28	93
Community care	Goals	7.4	2.7	2	7	13
Mental health	Indicators	12.7	8.7	1	10	37	84
Mental health	Goals	7.7	3.8	1	7	18

Identification and operationalization of local needs for quality improvement

Analysing the frequency and commonality of locally selected indicators allows us to assess to what extent priorities vary across local health economies. Table 2 shows that the most frequently included goal was patient/user satisfaction and involvement. In addition to the nationally mandated patient experience goal included in all acute care schemes, 37% of acute care schemes included a local goal concerned with patient/user satisfaction or involvement. This goal was included in 89% of schemes in ambulance care, 77% of schemes in community care and 86% of schemes in mental health. The greatest commonality in goals across sectors was between acute and community care, where seven of the 10 most frequently occurring goals in each sector are used in both sectors. Ambulance services, and especially mental health, have more sector-specific goals amongst their top 10 goals.

Table 2.

Top 10 goals in local schemes by sector.

Goal	Acute		Ambulance		Community		Mental health
Goal	% of schemes	% of weights	% of schemes	% of weights	% of schemes	% of weights	% of schemes	% of weights
Patient/user satisfaction/involvement	37	4	89	15	77	15	86	21
End of life	39	3	44	6	58	10	0	0
Falls	37	3	56	6	38	4	0	0
Tissue viability/pressure ulcers	47	4	0	0	54	9	0	0
AMI &/stroke	41	3	56	8	0	0	0	0
Smoking	39	3	0	0	39	4	0	0
Discharge planning/communications	47	5	0	0	25	3	0	0
Maternity	37	3	0	0	27	3	0	0
HoNOS/PbR	0	0	0	0	0	0	57	10
Data quality	0	0	22	2	31	9	0	0
Recovery planning	0	0	0	0	0	0	39	7
Dementia	0	0	0	0	0	0	38	4
Essen climate scale	0	0	0	0	0	0	36	6
Long term conditions/care planning	0	0	0	0	34	6	0	0
Asthma	0	0	33	3	0	0	0	0
Alternate care pathways	0	0	33	17	0	0	0	0
Cardiac care	0	0	33	3	0	0	0	0
Surgery	32	2	0	0	0	0	0	0
Neonatal units	30	9	0	0	0	0	0	0
Nutrition	0	0	0	0	30	4	0	0
Structured activity	0	0	0	0	0	0	30	4
Access to services	0	0	0	0	0	0	29	2
Service specifications	0	0	0	0	0	0	29	8
Crisis resolution home treatment	0	0	0	0	0	0	24	2
Reduction in average length of stay	0	0	0	0	0	0	23	3
Staff development/improvement	0	0	22	6	0	0	0	0
Safeguarding	0	0	22	4	0	0	0	0

AMI: Acute myocardial infarction.

On average, in each sector, goals on a sector’s top-10 list occur in about 40% of the sector’s schemes. This suggests a relatively high degree of agreement within a sector on which local goals require attention. In addition, this degree of agreement between schemes suggests that providers could potentially be benchmarked against a solid base of peers with similar interests, if similar indicators had been used across schemes. However, the DH’s hope that locally negotiated schemes would rely on standard performance indicators where possible, and only develop local indicators when no suitable standard/national indicator was available, does not seem to have been fulfilled. Indeed, of the 5001 indicators in use in the 2010/11 CQUIN schemes, more than 60% were unique.
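Both the ‘about 40%’ figure and the overlap between the acute and community top-10 lists can be recomputed from Table 2. The sketch below transcribes the ‘% of schemes’ column for each sector’s ten most frequent goals; the variable names are ours and the code is illustrative rather than the analysis code used in the study.

```python
from statistics import mean

# "% of schemes" for each sector's ten most frequent goals, from Table 2.
top10 = {
    "acute": {
        "Patient/user satisfaction/involvement": 37, "End of life": 39,
        "Falls": 37, "Tissue viability/pressure ulcers": 47,
        "AMI &/stroke": 41, "Smoking": 39,
        "Discharge planning/communications": 47, "Maternity": 37,
        "Surgery": 32, "Neonatal units": 30,
    },
    "ambulance": {
        "Patient/user satisfaction/involvement": 89, "End of life": 44,
        "Falls": 56, "AMI &/stroke": 56, "Data quality": 22, "Asthma": 33,
        "Alternate care pathways": 33, "Cardiac care": 33,
        "Staff development/improvement": 22, "Safeguarding": 22,
    },
    "community": {
        "Patient/user satisfaction/involvement": 77, "End of life": 58,
        "Falls": 38, "Tissue viability/pressure ulcers": 54, "Smoking": 39,
        "Discharge planning/communications": 25, "Maternity": 27,
        "Data quality": 31, "Long term conditions/care planning": 34,
        "Nutrition": 30,
    },
    "mental health": {
        "Patient/user satisfaction/involvement": 86, "HoNOS/PbR": 57,
        "Recovery planning": 39, "Dementia": 38, "Essen climate scale": 36,
        "Structured activity": 30, "Access to services": 29,
        "Service specifications": 29, "Crisis resolution home treatment": 24,
        "Reduction in average length of stay": 23,
    },
}

# Mean share of schemes that include a top-10 goal, by sector.
mean_share = {sector: mean(goals.values()) for sector, goals in top10.items()}

# Goals that appear in both the acute and the community top-10 lists.
shared = sorted(set(top10["acute"]) & set(top10["community"]))

for sector, value in mean_share.items():
    print(f"{sector}: {value:.1f}%")
print(len(shared), "goals shared between acute and community care")
```

The sector means all fall between roughly 39% and 41%, and seven goals are common to the acute and community lists, consistent with the figures quoted in the text.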

As evident from the previous analysis, ‘discharge planning and communication’ was a popular goal included in about half (72) of the CQUIN schemes for acute care, in 22 community service schemes and in 10 mental health schemes. To operationalize the goal, the 74 acute care provider CQUIN schemes made use of 114 indicators, of which 82 were unique. The median number of indicators used to operationalize the discharge planning goal in local acute care schemes was three, with a minimum of one and a maximum of eight indicators used.

To illustrate the diversity in how local scheme developers approached the task of operationalizing the common goal of improving discharge planning and communication, Table 3 lists some of the indicators that were used to operationalize this goal. We only show the local indicators that appeared in more than one acute care scheme. The exact wording of the indicators used in the local schemes is reproduced in the table. The list reveals that several of the indicators are concerned with very similar aspects of discharge planning, but the freedom to develop local indicators meant that this commonality of interest could not lead to benchmarking exercises across providers, because very few schemes included the same indicator. For example, many of the indicators considered the information sent to patients’ GPs when patients are discharged. While the indicator in one scheme requires discharge summaries to be received by the patient’s GP within 24 hours, another scheme requires discharge letters to be received by the patient’s GP within two weeks of discharge, and yet another requires inpatient and outpatient letters to be received by the GP within one week.

Table 3.

Local indicators used to operationalize a discharge planning goal in more than one acute care CQUIN scheme.

Definition of Indicator	Frequency
Patients to receive a copy of their electronic discharge summary on day of discharge.	4
Discharge summaries to be received by the patient’s GP within 24 hours.	4
Discharge summaries to contain the recommended CRG minimum dataset.	4
Discharge letters to be received by patient’s GP within 2 weeks of discharge.	4
Agree a solution and timescale for the implementation of an electronic discharge summary.	4
Estimated date of discharge discussed within 24 h of admission.	4
Ready to go – no delays.	4
Discharge of inpatients prescribing.	3
Improved patient safety by implementation of electronic discharge summaries.	3
In-patient letters to be received in general practice within 1 week in 2010–11 as per standard NHS contract.	2
Accuracy of medicines on discharge.	2
Discharge information from A&E and day case surgery to GPs.	2
Increase effectiveness of accident and emergency discharge information.	2
Increase in nurse and midwifery led discharge.	2
Increase the number of patients in NHS provided care who have their discharge managed and led by a nurse or midwife where appropriate.	2
Managed discharges.	2

CRG: Clinical Reference Group.

The table also displays the difficulties that may arise when the responsibility of defining indicators is delegated to local levels. The lack of detail in the specification of indicators means that the evaluation of performance is impeded even when focusing only on the provider for which a specific indicator was selected. This issue was also identified in the qualitative part of our study:

‘The problem we had was when we signed off the original CQUIN. … it didn’t specify the definition of average length of stay, because it could be mean, median, mode … it wasn’t that we were reckless or negligent, it’s just that people...were just thinking, “We want to reduce average length of stay, that’s possibly a good thing” … when we agreed the CQUIN prior to sign-off the contracts, which would have been signed in March 2010, we hadn’t actually identified what the metrics were … and we didn’t agree those metrics until September 2010 …’ ID116 Director of Finance.
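The ambiguity described in this quote matters because the three candidate definitions of ‘average’ can give very different answers for the same patients. The sketch below is illustrative only: the lengths of stay and the four-day target are invented, not taken from any CQUIN scheme.

```python
from statistics import mean, median, mode

# Hypothetical lengths of stay, in days, for seven discharges (invented data).
stays = [1, 2, 2, 3, 4, 5, 30]

averages = {
    "mean": mean(stays),
    "median": median(stays),
    "mode": mode(stays),
}

# Hypothetical target: average length of stay of four days or fewer.
target_days = 4
target_met = {name: value <= target_days for name, value in averages.items()}

for name, value in averages.items():
    print(f"{name}: {value:.1f} days, target met: {target_met[name]}")
```

With one long-stay patient in the sample, the mean is about 6.7 days while the median is 3 and the mode is 2, so the same provider would miss the invented target under one definition and meet it under the other two.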

Indicator development is time consuming and ideally involves piloting.¹² The interviews indicated that negotiating and developing CQUIN schemes, combined with data collection and monitoring, involved a huge amount of effort. In most sites, however, expertise in indicator development was lacking, and the interviews showed that indicators were often developed in haste by people who had little or no experience in this area. As Table 3 and the quote illustrate, this led to indicators that varied greatly in their level of precision, which in turn led to in-year problems because of a lack of clarity about what was being measured and the late introduction of indicators. As indicated by the quote, it also led to indicators not being agreed until part way through the year in which performance was to be rewarded.

Potential for generating local enthusiasm

The decision to use local indicators was explicitly intended to enable local clinician involvement in indicator development, which was anticipated to generate local enthusiasm around CQUIN and subsequently lead to better outcomes. Indeed, this was one of the main reasons for selecting a model of locally negotiated schemes.

However, our qualitative evidence does not suggest that this was the case. For the first year of CQUIN schemes, the tight timetable was reported as leaving no room for engagement of frontline clinicians, and provider representatives entered into agreements with little or no input from these staff. Despite the declared intentions of those involved in constructing the schemes, the process for 2010/11 was again characterized by a lack of frontline clinician input.

Interviews with frontline staff highlighted their lack of involvement with the process, scepticism about the benefits and a belief that data collection detracted from time spent with patients. Managers recognized that more effort was needed to involve clinicians and saw this as a priority. However, despite this recurring theme year on year, little progress appeared to have been made in this respect. As one clinician explained it:

‘I can tell you it’s not been generated from the bottom up. It’s not even been generated from the middle up, it’s next to top …’ ID135 Clinical Director Provider

The quote below, whilst acknowledging the problem of engagement, suggests that new goals might improve this state of affairs:

‘Yes, I think that engagement with clinicians is not great with regards to the 10/11 CQUIN. … generally speaking I think they think they’re not very good, that it’s not a great tool to drive up quality. So we have to try and get some of that confidence back. Hopefully with the new set [of CQUIN goals] it will be better.’ ID52 Assistant Chief Nurse Provider

Compliance with guidelines for scheme design

To ensure that the locally developed schemes would contribute to achieving the national objectives for CQUIN, a number of requirements for the local schemes were set out, but the locally developed schemes largely failed to follow these guidelines.

Table 4 shows that, although CQUIN schemes were intended to generate changes in outcome, in all sectors, the majority of indicators were classified (by the scheme developers themselves) as process or structure indicators. As stated in the CQUIN guidance, process and structural indicators may still be justified provided they link clearly to outcomes. However, the dataset reveals that, based on the scheme developers’ own classifications, just 19% of locally selected structure or process indicators were evidence based.

Table 4.

Scheme developers’ classifications of local indicators by quality domain and dimension (%).

Sector		Acute	Ambulance	Community	Mental health	Total
Domain	Outcome	34	13	32	15	29
	Process/structure	62	88	62	76	65
	Action plan	5	0	6	8	6
	Data collection	2	0	1	3	2
Dimension	Safety	44	25	39	21	38
	Effectiveness	56	50	53	47	53
	Patient experience	35	88	39	41	37
	Innovation	8	13	12	15	10

Table 4 also shows that in all sectors other than ambulance services, effectiveness is the dimension most commonly addressed by local indicators, and that indicators for innovation are the least common.

In the qualitative interviews, respondents expressed concern with using CQUIN schemes to incentivize innovation because this might involve taking risks to develop new ways of working and such behaviour was discouraged when linked to a financial risk.

The interviews also revealed confusion about the definition of innovation, although an attempt was made in the CQUIN guidance to specify that indicators were not expected to promote the invention dimension of innovation, but rather drive adoption and diffusion:

‘I suppose it depends what sort of definition of innovation you might be using. We had one in 10/11 which we got in 11/12 again, which was about GP discharge summaries, … we’ve introduced an electronic discharge letter system … So you could say that that could be seen as innovation in terms of clinical innovation.’ ID113 Senior Manager Provider

One of the problems with differing definitions and interpretations of the classifications is that similar indicators might be classified differently in terms of domains, which raises questions about the extent to which asking for indicators to be classified in this way is meaningful or desirable. For example, the data show that indicators measuring processes of care, such as the proportion of stroke patients given a swallow test screening within 24 hours of admission, were sometimes classified as addressing patient safety, effectiveness and patient experience, but in other cases were not seen as addressing patient experience.

Table 5 confirms that confusion around the definition of, and reluctance to incentivize, innovation was a general problem. The table displays, by sector, the percentage of local schemes that failed to include at least one indicator from every quality domain, together with the percentage of schemes that lacked an indicator for each of the four individual domains. In all sectors but ambulance care, more than half of the locally negotiated schemes failed to fulfil the requirement of containing at least one indicator from each domain.

Table 5.

Percentages of local schemes not meeting CQUIN guidance design criteria.

Sector	Failed to contain one indicator in each domain	Lacked an indicator for safety	Lacked an indicator for effectiveness	Lacked an indicator for patient experience	Lacked an indicator for innovation
Acute	62	10	6	8	58
Ambulance	50	50	0	0	50
Community	70	23	20	17	59
Mental health	57	20	11	17	46
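
The compliance figures of the kind reported in Table 5 can be derived with a simple check, sketched below. This is an illustration under stated assumptions rather than the procedure used in the study: the scheme names, the data structure and the function names are hypothetical, and each indicator is assumed to carry the set of domains assigned to it by the scheme developers.

```python
# The four quality domains that every scheme was required to cover.
REQUIRED_DOMAINS = {"safety", "effectiveness", "patient experience", "innovation"}

# Hypothetical schemes: each scheme is a list of indicators, and each
# indicator is the set of domains its developers assigned to it.
schemes = {
    "scheme_a": [{"safety"}, {"effectiveness", "patient experience"}],
    "scheme_b": [{"safety", "effectiveness"}, {"patient experience"}, {"innovation"}],
    "scheme_c": [{"effectiveness"}],
}


def missing_domains(indicators):
    """Return the required domains not covered by any indicator in a scheme."""
    covered = set().union(*indicators)
    return REQUIRED_DOMAINS - covered


def percent_failing(all_schemes):
    """Percentage of schemes that miss at least one required domain."""
    failing = sum(1 for indicators in all_schemes.values() if missing_domains(indicators))
    return 100 * failing / len(all_schemes)


def percent_lacking(all_schemes, domain):
    """Percentage of schemes with no indicator assigned to the given domain."""
    lacking = sum(
        1 for indicators in all_schemes.values() if domain in missing_domains(indicators)
    )
    return 100 * lacking / len(all_schemes)


print(round(percent_failing(schemes)))                # share failing the overall requirement
print(round(percent_lacking(schemes, "innovation")))  # share lacking an innovation indicator
```

Because the check relies entirely on the developers' own domain assignments, its output inherits any inconsistency in how similar indicators were classified across schemes.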

Discussion

The CQUIN framework was intended to allow providers and commissioners to agree on P4P schemes locally, to ensure that performance goals and indicators included in the scheme would address local needs for quality improvement and generate local enthusiasm through local clinical involvement in the development of indicators. This process was anticipated to lead to quality improvements across the NHS, and over time. This paper has analysed whether the incentive schemes that emerged from this local development process successfully identified variation in local needs and priorities for quality improvement, generated the anticipated local enthusiasm, and led to schemes that address the original objectives for the CQUIN framework.

Regarding our first research question, which concerned the success of a decentralized scheme development process in identifying heterogeneous local needs and priorities for quality improvement, we found that the exercise initiated by the introduction of CQUIN was successful in doing so. The result is a wide and varied range of goals and indicators across localities, and it is unlikely that a centrally mandated scheme would have resulted in a similar selection of topics and adaptation to perceived local needs.

However, with respect to our two other research questions, our findings suggest that locally developed schemes did not generate the anticipated local enthusiasm, and that the type and prioritization of indicators generally did not live up to the standards set out centrally with the aim of achieving the national objectives for improvements in quality and innovation in the English NHS. Although this does not necessarily imply that the schemes will fail to deliver the desired outcomes, we have identified a number of aspects of the current implementation that may impede the schemes’ ability to generate the desired improvements in the quality of care.

The stated intention of involving local clinicians in the scheme design process to generate enthusiasm is well supported by the literature. In 1992, Pettigrew et al.²⁴ highlighted professional support as the single most significant factor for changing clinical practice, managing innovation and promoting team-working. Similar findings appear in more recent reviews on organizational change in the NHS and internationally,²⁵–²⁷ and a literature review of the relationship between organizational factors and performance²⁸ concluded that ‘organizational change needs to be developed from within, not just imposed from outside.’ However, although schemes were developed locally, clinicians expressed dissatisfaction with the scheme development process, which often did not involve frontline clinicians and was acknowledged by managers and clinicians to be deficient in this area. This, coupled with the absence of mechanisms for meaningful engagement, suggests that the schemes are unlikely to lead to the anticipated local enthusiasm. Wallace et al.²⁹ have emphasized that for strategies to be effective in changing clinical practice, those involved need to believe in the expected impact of the intervention on practice. Linked to this, the importance of managerial communication with staff in the NHS during a period of change³⁰ should not be underestimated.

The Impact Assessment published alongside the announcement of the CQUIN Framework anticipated that standardization of indicators across schemes would emerge. However, based on the second round of schemes, it seems unlikely that this will occur without firmer regulation in place. For example, although the CQUIN guidance stipulated that all schemes should include an indicator for each of four quality domains (safety, effectiveness, patient experience and innovation), the majority of schemes failed to meet this requirement. The variation in the classification of identical indicators across schemes calls into question whether letting scheme developers perform this classification is desirable. Also, although the guidance emphasized a focus on outcomes, many locally agreed indicators concern structure and processes and are based on, at best, weak evidence of effectiveness. This may also contribute to limited clinical engagement.

Furthermore, the indicators that emerged from the local development process were often unclear or imprecise, which made it difficult to follow up performance. The reliance on locally developed indicators also hinders benchmarking across providers, even for schemes with common goals for which national indicators did exist.

A potential solution for the costly and sometimes delayed process of developing indicators locally was suggested in the CQUIN guidance. The guidance advised commissioners and providers to use official indicators where possible, and only to use or develop local indicators when no national alternative was available. The guidance contained a link to the NHS Information Centre website, which lists indicators that have already been developed to measure quality. This list currently includes at least 18 indicators related to discharge planning. Although it is not possible to find national alternatives to all of the local indicators, it is clear that several of the locally developed indicators could have been replaced by a national indicator. For example, many of the locally developed indicators are concerned with patients’ experience of discharge or the level of information given to patients. In these cases there are obvious national alternatives in the Measuring for Quality Improvement (MQI) list, e.g. PE17 (patients who reported that they were involved in decisions about their discharge from hospital), PE18 (patients who reported that when leaving hospital they were given written or printed information about what they should or should not do), PE19 (patients who reported that staff explained the purpose of the medicines they were to take at home in a way they could understand) and PE25 (patients who reported they were told who to contact if they were worried about their condition or treatment after they left hospital). Similarly, schemes choosing to focus on the level of readmissions could have used the national indicators for the areas where these are defined.

Despite the existence of these national indicators, and the guidance to use them where possible, few interviewees suggested that they had consulted them when developing schemes. Freedom to use local indicators reflects the national policy direction, which emphasizes local freedoms and priorities in public services. Furthermore, in a context where policy makers are keen to engage clinicians in commissioning health services, emphasizing local freedom and discretion is more likely to achieve this aim than restrictive approaches that mandate sets of indicators. However, as our findings illustrate, the development of local indicators as part of the CQUIN process has been somewhat problematic, muting the potential impact of the policy.

The requirement that CQUIN schemes should be stretching for providers to achieve meant that new goals and schemes were introduced each year. However, regularly having to negotiate new goals and schemes leaves little time for engaging clinicians. This raises questions about the desirability of developing local indicators as well as the feasibility of engaging clinicians within such short timescales, especially as there was no opposition in principle to national goals. Some people saw national goals as preferable to local goals, because they could be used for benchmarking and/or because specific local goals had proved to be problematic or were viewed as inappropriate.

While there is clearly an important case for local strategic and clinical input into the design of pay-for-performance schemes, this should be separate from the technical design process, which involves defining indicators, agreeing thresholds, and setting reward levels. Defining good performance indicators requires evidence-based knowledge of the relationship between structures, processes and outcomes, and providers’ ability to affect these measures. Agreeing thresholds requires a baseline and understanding how sensitive performance is to the size of the financial incentive. Setting reward levels involves knowing the value of the desired performance improvements to the commissioner and society (essentially their willingness to pay) and the costs to the providers of making the improvements. Previous research only provides limited knowledge about how this technical process should be carried out. It is thus unlikely that such expertise can be expected in each locality. The CQUIN framework potentially offered an opportunity to learn how technical design influenced outcome, but due to the high degree of local experimentation and little systematic collection of key variables, it is difficult to derive lessons from this unstructured experiment about the impact and importance of different technical design factors on the effectiveness of pay-for-performance.
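
The reasoning about thresholds and reward levels can be sketched as follows. This is a simplified illustration under our own assumptions, not a description of how CQUIN payments were calculated: the function names and figures are hypothetical, the payment is assumed to be all-or-nothing at a single threshold, and the commissioner's valuation and the provider's cost are assumed to be known, which the paragraph above notes is rarely the case in practice.

```python
def feasible_reward_range(value_to_commissioner, cost_to_provider):
    """Return (lowest, highest) reward that both parties could accept.

    The reward must at least cover the provider's cost of improving and
    must not exceed what the improvement is worth to the commissioner.
    Returns None when no such reward exists.
    """
    if cost_to_provider > value_to_commissioner:
        return None
    return (cost_to_provider, value_to_commissioner)


def payment(performance, threshold, reward):
    """Pay the full reward only if measured performance reaches the threshold."""
    return reward if performance >= threshold else 0


# Hypothetical figures: the improvement is worth 100,000 to the commissioner
# and costs the provider 60,000 to deliver.
print(feasible_reward_range(100_000, 60_000))
# Hypothetical threshold of 90% against measured performance of 92%.
print(payment(performance=0.92, threshold=0.90, reward=80_000))
```

Even in this stylized form, each input (the baseline behind the threshold, the commissioner's valuation and the provider's cost) has to be estimated before a scheme can be priced sensibly.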

Balancing the policy goal of localism with the objective of improving patient outcomes leads us to conclude that a firmer national framework would be preferable. This might take the form, for example, of a ‘pick list’ of national indicators from which commissioners and providers can choose a subset to fit their current priorities. Our interviewees did not reveal major opposition to national indicators, and such a pick list could be combined with standardized reporting on the local schemes. This standardized reporting could include ex post reporting of the results and the levels of payment as well as ex ante reporting of the design of the schemes. Together, these would allow systematic learning across commissioners and providers on how to design a local pay-for-performance scheme, increasing the likelihood that the goals set out in High Quality Care for All could be achieved in the future.

Footnotes

Acknowledgements

This project was funded by the Department of Health Policy Research Programme, with additional support from the NIHR Health Services and Delivery Research Programme. The views expressed in this report are those of the authors and do not necessarily reflect those of the DH or NIHR.

Conflict of interest

None declared.

References

1. Paris V, Devaux M and Wei L. Health systems institutional characteristics: a survey of 29 OECD countries. OECD iLibrary, 2010.

2. Mehrotra, Damberg, Sorbero MES. Pay for performance in the hospital setting: what is the state of the evidence? Am J Med Qual 2009; 24: 19–19.

3. Rosenthal, Frank. What is the empirical basis for paying for quality in health care? Med Care Res Rev 2006; 63: 135–135.

4. Van Herck. Systematic review: effects, design choices, and context of pay-for-performance in health care. BMC Health Serv Res 2010; 10: 247–247.

5. Eijkenaar F, Emmert M, Scheppach M, et al. Effects of pay for performance in health care: a systematic review of systematic reviews. Health Pol 2013; 110(2,3): 115–130.

6. Epstein. Will pay for performance improve quality of care? The answer is in the details. N Engl J Med 2012; 367: 1852–1853.

7. Jha, Joynt, Orav. The long-term effect of premier pay for performance on patient outcomes. N Engl J Med 2012; 366: 1606–1615.

8. Ryan. Hospital-based pay-for-performance in the United States. Health Econ 2009; 18: 1109–1113.

9. Sutton. Reduced mortality with hospital pay for performance in England. N Engl J Med 2012; 367: 1821–1828.

10. Roland. Pay-for-performance: not a magic bullet. Ann Intern Med 2012; 157: 912–913.

11. Eijkenaar. Key issues in the design of pay for performance programs. Eur J Health Econ 2013; 14: 117–131.

12. Nicholson. Getting real performance out of pay-for-performance. Milbank Q 2008; 86: 435–457.

13. Department of Health. High quality care for all. London: Department of Health, 2008.

14. Department of Health. Impact assessment of commissioning for quality and innovation payment framework. London: Department of Health, 2008.

15. Petsoulas, Allen, Hughes. The use of standard contracts in the English National Health Service: a case study analysis. Soc Sci Med 2011; 73: 185–192.

16. Department of Health. Standard NHS contract for acute services. London: Department of Health, 2007.

17. Department of Health. Using the commissioning for quality and innovation (CQUIN) payment framework. London: Department of Health, 2008.

18. Department of Health. Exposition book 2011–2012. London: Department of Health, 2011.

19. Roland. Linking physicians’ pay to the quality of care – a major experiment in the United Kingdom. N Engl J Med 2004; 351: 1448–1454.

20. The Health and Social Care Information Centre. Quality and outcomes framework achievement data 2010/11 (2011, accessed July 2013).

21. McDonald R, Kristensen SR, Sutton M, et al. Evaluation of the commissioning for quality and innovation framework. Final report. Nottingham: University of Nottingham, 2013.

22. McDonald R, Allen T, Fichear A, et al. A qualitative and quantitative evaluation of the introduction of best practice tariffs: an evaluation report commissioned by the Department of Health. Nottingham: University of Nottingham, 2013.

23. Strauss A and Corbin J. Basics of qualitative research: grounded theory procedures and techniques. London: Sage, 1990.

24. Pettigrew, Ferlie, McKee. Shaping strategic change: making change in large-scale organisations. London: Sage, 1992.

25. Redfern, Christian. Achieving change in health care practice. J Eval Clin Pract 2003; 9: 225–238.

26. Ferlie. Large-scale organizational and managerial change in health care: a review of the literature. J Health Serv Res Policy 1997; 2: 180–189.

27. Shortell, Bennett, Byck. Assessing the impact of continuous quality improvement on clinical practice: what it will take to accelerate progress. Milbank Q 1998; 76: 593–624.

28. Sheaff. Achieving high performance in health care systems: the impact and influence of organisational arrangements. London: London School of Hygiene and Tropical Medicine, 2006.

29. Wallace, Freeman, Latham. Organisational strategies for changing clinical practice: how trusts are meeting the challenges of clinical governance. Qual Health Care 2001; 10: 76–82.

30. Tourish, Hargie ODW. Communication between managers and staff in the NHS: trends and prospects. Br J Manag 1998; 9: 53–71.