Abstract
Objectives
Performance measurement systems are increasingly used to reward and improve provider performance. However, such initiatives may also inadvertently induce a range of unintended and dysfunctional side-effects. This study explores the unintended and adverse consequences induced by the Iranian national hospital grading programme, which incorporates financial incentives for meeting nationally defined standards.
Methods
We interviewed key informants across four key groups with a legitimate interest in healthcare performance: four purposively selected hospitals; four health insurance organizations; the Iranian hospital accreditation body; and one grading agency. The transcribed interviews and field notes were analysed thematically, and subsequently, member checking was conducted.
Results
Seven dysfunctional consequences were identified: misrepresentation of data by hospitals; increased anxiety and stress among hospital employees; tunnel vision; financial pressures on poorly graded hospitals; incentives to purchase unnecessary equipment; erosion of public trust; and restricted patient access to hospital services. These were caused by the way the grading system was implemented: poor standards of audit; the way in which the audit process was conducted; and the timing of audits. The pay for performance element of the grading system and the focus on structural aspects in the standards made improvement in grading particularly difficult for those hospitals that had been assessed as under-performing.
Conclusion
Although the Iranian hospital grading system has resulted in a significant increase in the adoption of national standards, it has nevertheless induced a range of perverse outcomes. Mitigating these requires further refinement and recalibration of the system.
Introduction
Healthcare performance measurement systems apply different mechanisms for improving the performance of individuals and organizations. Across the globe, the publication of performance data and pay for performance (P4P) mechanisms are being used increasingly to stimulate and reward provider performance. 1 Evidence about the effectiveness of such mechanisms is mixed. Some commentators suggest there is a lack of robust evidence,2–4 while others conclude that the public release of performance data has triggered a range of quality improvement activities in hospitals, including revising staffing policies,5,6 improved assessment of patient care and operating room schedules,6–8 the redesign of service pathways,5,9 the development of new practice guidelines 7 and improved care for pneumonia and heart disease. 10 P4P mechanisms have also been linked to measurable improvement in quality of care,11,12 decreased mortality 13 and enhanced service delivery and responsiveness. 14
Performance measurement systems can inadvertently induce a range of unintended and dysfunctional side-effects. US evidence demonstrates that poor risk-adjustment and incentives can cause physicians to avoid taking on complex cases 15 or may result in inappropriate clinical care 16 such as the unnecessary prescription of antibiotics 17 or the inappropriate use of interventions to prevent venous thromboembolism. 18 In the English NHS, the hospital ‘star rating’ performance assessment system was shown to have created a range of unintended consequences. 19 Less evidence is available about dysfunctional consequences of P4P schemes, but what is available indicates that adverse effects include the exclusion of severe cases 20 or minority patients 21 when performance measures are not adjusted adequately for the case mix of patients. 22 Similar issues have been reported in developing countries. 23
Iran has been operating a hospital P4P system for several years. With a population of over 77 million, Iran has a highly centralized healthcare system. The Ministry of Health and Medical Education (MOHME) is the governing body for healthcare which is delivered mainly by public-governmental organizations (in charge of most primary health and secondary care) and a small private sector (secondary healthcare mainly in the big cities). People can choose their type of health insurance. There are two types of health insurance: basic and supplementary/complementary. Basic insurance organizations, all public, cover patients at public hospitals for basic services (such as care for general diseases, routine surgery and medicine) while the supplementary ones, as private entities, cover patients for services provided in private hospitals or allied services such as dental care. Iran has a strong history of primary healthcare, achieved mainly through its Primary Health Care Network. 24 However, secondary care has been challenged by lack of a referral system in urban areas, incomplete health insurance coverage and high out-of-pocket payment. 25
In Iran’s fee for service payment model of hospital financing, the P4P mechanism is a powerful financial incentive for hospitals. As shown in Table 1, a one-point increase in a hospital’s grading means that it can increase patient stay charges by 50% and other charges by 9%. According to the financial regulations, hospitals’ revenue can be used for development purposes and a part of this can be distributed as a financial bonus among staff, mainly the surgeons. Hence, the P4P mechanism can potentially work as a powerful incentive both at the level of the organization and the individual. The charges are paid by the health insurance organizations (nominally up to 90%) and the insured patients (co-payment, of at least 10%). The MOHME requires hospitals to display their grading certificate on notice boards in wards and departments in order to publicize the grades to patients and their families.
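The arithmetic behind this incentive can be sketched as follows. This is a minimal illustration only: the base charges are hypothetical round numbers, the function names are invented for the example, and the insurer share is set at its nominal 90% maximum. Only the 50%, 9%, 90% and 10% figures come from the text above.

```python
# Illustrative sketch only: the base charges used below are hypothetical.
# The uplift and cost-sharing percentages are the ones quoted in the text.

STAY_UPLIFT_PCT = 50    # patient stay charges rise by 50% for a one-point increase
OTHER_UPLIFT_PCT = 9    # other charges rise by 9% for a one-point increase
INSURER_SHARE_PCT = 90  # insurer pays nominally up to 90%; patient pays the rest


def charges_after_one_point_increase(stay_charge, other_charge):
    """Return (stay, other) charges after a one-point grading increase."""
    new_stay = stay_charge * (100 + STAY_UPLIFT_PCT) / 100
    new_other = other_charge * (100 + OTHER_UPLIFT_PCT) / 100
    return new_stay, new_other


def split_bill(total, insurer_share_pct=INSURER_SHARE_PCT):
    """Split a bill into (insurer payment, patient co-payment)."""
    insurer = total * insurer_share_pct / 100
    return insurer, total - insurer


# Hypothetical bill: 1000 units of stay charges and 1000 units of other charges.
stay, other = charges_after_one_point_increase(1000, 1000)
insurer, patient = split_bill(stay + other)
print(stay, other, insurer, patient)
```

Under these assumptions the stay component dominates the change in the bill, which is consistent with the description of grading as a strong lever on hospital revenue.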
Iran’s hospital grading system, which includes a mixture of clinical, administrative and structural standards and measures, has influenced hospital behaviour along a range of dimensions and triggered improvement activities, including better infection control, improvements to buildings, the purchase of new equipment, decreases in waiting time and better medical record keeping. 28 The P4P mechanism in particular has been a key incentive for such changes. 29 However, the public dissemination of the grading results does not appear to have influenced patients' choice of hospital. 30 So, in common with similar studies conducted in other countries, market share does not appear to be affected by the publication of hospital performance data.
The hospital grading system is Iran’s only on-going health sector organizational performance measurement programme administered mandatorily by the government. Against this background, we present the results of a study which focused on the potential unintended and dysfunctional consequences of the Iranian hospital grading system.
Methods
Characteristics of the interviewees and their organizations.
The interviews started with a general question about the respondent’s experience of hospital grading. Then we asked their views about the possible impact of the system on hospitals’ performance. Where the respondents mentioned any negative impact in their answers to these questions, the interviewer (AA) probed for more detail. Thirty-nine semi-structured interviews included discussion of potential dysfunctional consequences. The interviews were held in the interviewees’ office or workplace and were tape-recorded.
Field and observation notes were also used to provide contextual information and for cross-checking the veracity of responses. 33 These notes were based on observation of a grading audit in Hospital A and on three days of participation as a grading team member in a hospital not included in the sample, followed by a de-briefing session held among the surveyors at the end of the third day.
The recorded interviews were transcribed verbatim, and all text and notes were analysed thematically 34 in terms of types of dysfunctional consequences. Follow-up telephone interviews were conducted with a subset of the interviewees in order to validate and refine the emerging themes.
Results
Data were grouped into seven general themes in relation to the dysfunctional consequences of the grading system for the different respondents. The dysfunctional consequences included both those induced by the P4P element of the scheme as well as those caused by the public release of the grading results.
Misrepresentation of data and information to the auditors
Hospital staff reported that they or colleagues working in the hospital had made temporary changes specifically for the audit days in order to improve the hospital’s grading. Three types of misrepresentation were identified, ranging from mild to severe: relatively long-term changes in hospitals, but performed just before the grading process was conducted; short-term changes; and no change, but behaviour aimed to deceive the grading teams.
Relatively long-term changes in hospital activity, but immediately before the grading
Some changes in working practices, including the repair or purchase of medical equipment, or the purchase of uniforms and clothes, were delayed until a few days before the grading. This was felt to be a dysfunctional consequence because hospital managers deprived patients and staff of some of the services until the day of the grading in an attempt to present a more positive (but nevertheless erroneous) image of the hospital.
Sometimes our broken equipment, such as a radiography machine or air conditioner, is not repaired until the week before grading. Also some necessary instruments are purchased just in this time. (Head nurse, Hospital D)
Short-term changes
Many organizational changes were only temporary and specifically for the purposes of obtaining a higher grading rather than to enhance service quality. Hospital cleanliness, cleanliness of patient bedding and clothing, improvement in the quality of meals for patients and staff, the cleanliness of staff uniforms, the use of name badges on uniforms, the safer storage of medicines and tighter monitoring of expiry dates, more control over trolleys and surgical instruments, and improved behaviour around patients were reported to be the most common short-term changes. It was reported that following the grading sessions there was less concern to attend to these issues with services soon returning to ‘business-as-usual’.
On the audit days we change all bedding, patient clothes, and clean everywhere but on other days there is no action, just our routine. (Matron, Hospital A)
No real change, and attempting to conceal this from the grading teams (fraud)
In some areas, the hospital failed to make any real change but tried to deceive the grading teams by covering up manifest failings in service delivery. Examples included concealing from view medicines that had exceeded their expiry date; borrowing monitors from other hospitals for the audit days, sometimes without even bothering to install them properly; falsifying radiology archives; and purchasing modern imaging machines without obtaining the necessary film. In one extreme case hospital staff deceived auditors by role-playing:
We did not have a social worker in our hospital. On the grading day we made a fake room for social work with a fake sign on the door and an employee from the telephone operation room was set up as the social worker. We got the score that day. (Radiology technician, Hospital B)
Workplace stress and anxiety
Staff reported that they believed that stress was inevitable and generally did not complain about it, as a degree of stress is associated with any type of performance measurement system. However, some staff mentioned that the stress was the result of the scheduled surveys and mentioned that most of the work was done in a panic immediately before the auditors’ visit.
One week before grading, the hospital is on alert; I mean it is out of its routine and in a state of panic: different meetings [about:] what we lack, what we should buy, what we should change, where signboards are lacking. (Finance manager, Hospital B)
Grading is stressful because if the grade is not good some nurses will be fired. They know this so they are very nervous before the grading days until the results are announced. (Matron, Hospital D)
Tunnel vision
Hospitals neglected to focus on some important aspects of quality and performance as these were not measured and rewarded in the grading system. A lack of attention to the quality of nursing care, especially mental care, was one of the dysfunctional consequences reported by insurance organizations and nurses.
There is no focus on patient morale. They [grading auditors] ask hospitals just about doing patients’ injections and giving their medicine … They never ask nurses to spend half an hour with patients to see what their problems are and what they can do for them. (Health insurance inspector)
Indirect financial pressures on poorly graded hospitals
The financial incentives attached to the grading system appeared to serve more as a punishment for poorly graded hospitals than a reward for those with good grades due to the low charges administered by the MOHME. Most interviewees believed that the current economic situation was creating a difficult financial climate for hospitals, and if a hospital was awarded a low grade, it would exacerbate its financial difficulties further. If hospitals develop financial problems they may also lose their medical staff, as physicians opt to work elsewhere:
By grading us 2, they made a lot of problems for our debts, and our services. Due to our delay in paying physicians' per case, they prefer to take patients from the hospital to their private clinics, to earn money much quicker. (Finance manager, Hospital B)
Pressures to buy unnecessary equipment
A common belief at hospitals, insurance organizations and the grading organizations was that the grading system had increased competition among hospitals to invest in equipment and new buildings at the expense of funding other aspects of quality of care. Interviewees believed that some of this equipment was unnecessary but was purchased purely because it was included in the standard checklists:
Our checklists say that hospitals should have a basin washing machine, while some hospitals use disposable basins. We asked them to buy the machine. (Grading officer, medical university)
Erosion of patient trust
Erosion or loss of trust resulted from problems with misrepresentation. According to staff, patients lost their trust in hospital staff when they witnessed the temporary changes taking place merely for the purposes of obtaining a higher grading score, such as improvements in staff behaviour and better quality meals. Indeed, it was reported that patients were often suspicious of what appeared to be ‘honest’ staff behaviour during and immediately following the grading days.
Patients lose their trust in nursing staff, when they see that all these efforts are just for a few days of grading. Even our 80 year old patients realise. I have heard this everywhere and here as well that, for example, the quality of meals gets better just for these two days. (Matron, Hospital D)
Restricted access to hospital services
Access to public hospitals’ services was restricted because the supplementary insurance organizations had terminated their contracts with public hospitals. The key factor for this was the grading system’s P4P element. The supplementary health insurance organizations believed that the grading system surveys were superficial and resulted in over-generous results/grades in public hospitals. This meant that the supplementary insurers were paying more for lower quality of care. The resulting disputes between the supplementary health insurance organizations, public hospitals and the accreditation body finally led to the cancellation of contracts between supplementary insurance companies and many of the public hospitals in the large cities during the 2007–2010 period.
The public hospitals are getting very generous grades while they do not deserve such. You cannot find a physician in some of them at night… we would not pay them as we pay for a grade one hospital, so again we stopped our contracts… (Contracts officer, supplementary health insurance organization)
Discussion
We explored the unintended and perverse dysfunctional consequences induced by the Iranian hospital grading system. A range of dysfunctional consequences arose because of the design of the grading system, the P4P mechanism and specific contextual factors pertaining to different stakeholder groups. The dysfunctional consequence that was likely to have affected the greatest number of people was related to the restrictions in access to hospital services. The disputes between payers (supplementary insurance organizations) and public hospitals on the grades awarded resulted in termination of contractual agreements between the two parties. MOHME had no legal power over the supplementary insurers, as private entities, to force them to extend their contracts with hospitals or pay hospitals based on the government announced charges. Therefore patients paid more for hospital services when these contracts ceased as they were no longer covered by the basic insurance organizations. This occurred because of the associated P4P scheme and was exacerbated by concerns over the validity of grading results. International evidence shows that publishing performance information on hospitals may restrict access because physicians may prefer to avoid the risks of admitting severe or minority patients,15,20–22 which may distort their performance. In our study, the grading and P4P mechanism limited patients’ access for a different reason, namely, the dispute between payers and hospitals over the awarded grade and the rates of payment.
Misrepresentation of performance data by hospitals was reported by all groups, including hospital staff. Hospitals knew about grading visits in advance. This can be seen as a positive outcome of the grading system because it incentivises hospitals to make the desired improvements. However, due to the nature of the audit visits, many changes were merely temporary and not embedded in organizational processes and quality improvement activities. Even when hospitals did make long-term changes, in many cases the awarded grading was higher than the hospital could have achieved on a more typical day in the preceding year. Moreover, the superficial way in which the grading inspections were conducted (as exemplified by the case of the social worker at Hospital B) made gaming the system even easier for hospitals. Similar findings have been reported in other countries.23,35 Such behaviours in Iranian hospitals may thus erode trust in the system.
Increased levels of anxiety among hospital staff may be viewed as a natural consequence of the grading process. 36 Our study, however, found that in addition to normal levels of ‘exam stress’, staff experienced a heightened form of worry caused by hospitals being forced to reduce levels of services following a downgrading. Indeed, private hospital staff could be dismissed by hospital managers in order to reduce hospital costs as a result of the lower revenue linked to the P4P mechanism. Another source of stress was caused by the additional work to be undertaken by staff immediately before the grading visits.
Similar to some other accreditation and performance measurement systems,37–39 the Iranian hospital grading system also induced a form of tunnel vision among hospitals. Owing to the fact that most grading measures focus on structure, 29 other important aspects of care such as caring about patients with compassion, dignity and respect were relatively neglected. Our findings concur with studies in the US which have found that performance measurement causes ‘metric-driven harm’; i.e. caregivers feel that they just pass the standards rather than consider patients as people through their care. 37 Also hospitals’ clinical and managerial autonomy was decreased, as reported in UK primary care where GPs’ autonomy was affected by the P4P programme. 40
Financial dysfunctional consequences were also experienced by hospitals. As discussed earlier, the grading domains were dominated by structure-focused measures that incentivised hospitals to purchase equipment and develop buildings. Such pressure challenged hospitals financially and favoured more wealthy hospitals. Moreover, the grading system’s P4P scheme was thought to worsen the financial situation of poorly performing hospitals as they received lower rates of payment. Such hospitals would have fewer resources to make the necessary changes ahead of a future grading. In addition, the loss of revenue would delay payment to staff, especially physicians, who may then decide to shift patients from the hospital to their own private clinics or leave the hospital. This situation would make achieving an improved grade very difficult for poorly graded hospitals.
A summary of our findings and the relationship between the dysfunctional consequences and their causes is shown in Figure 1. The grading’s P4P scheme is the key contextual factor and the driver for changes in hospitals. However, the dysfunctional consequences are triggered by two characteristics of the grading system: the superficial way in which surveys were conducted and the announced surveys. Thus, the dysfunctionality is a result of the ‘regime’ in which the Iranian hospital grading system is implemented, 41 rather than being a direct result of a performance measurement system.
Figure 1. The unintended dysfunctional consequences and their interwoven relations in the Iranian national hospital grading system.
We suggest that efforts be made towards proofing the process against gaming. 26 This requires, first, improving the validity and reliability of the measures and the methods by which they are assessed. The superficial criteria for measuring performance and quality, especially the concentration on structural measures, can be addressed by the development of a ‘balanced scorecard’ of measures. In addition, auditors should monitor adherence to standards using a range of methods including direct observation, document analysis and patient and staff surveys for final judgement. 27 Nevertheless, even these changes will not prevent perverse effects in hospitals unless the grading organizations and hospitals develop a degree of mutual trust. 42 Second, the surveys should be unannounced so that hospitals are unable to make temporary arrangements. In addition, in the Iranian context, the P4P element of the system appears to handicap poorly graded hospitals rather than serve as an incentive for improvement. Indeed, the system assumes that all hospitals are equally able to meet equipment and building standards. We recommend that such physical standards be excluded from grading and checked only as necessary by the MOHME when hospitals apply for a licence.
This study is one of the first to explore the dysfunctional consequences of a national performance measurement system. We have attempted to provide a comprehensive evaluation based on information triangulated from the main groups involved. Although most of the dysfunctional consequences have previously been identified in relation to performance measurement systems in other healthcare systems,15,20–23,35–39 the antecedents and consequences are somewhat different in the Iranian context.
Acknowledgements
Aidin Aryankhesal was supported financially by Iran University of Medical Sciences (IUMS) and the Iranian Ministry of Health and Medical Education (MOHME).
