Abstract
Proposals for centralizing services are often justified on the basis of studies linking the volume of activity to the outcomes achieved. However, the evidence of such studies is far from demonstrating a causal link between volume and outcome. This article assesses the main reasons why volume and outcome studies do not in themselves demonstrate a causal link, and therefore do not provide adequate support for proposals for centralizing hospital services. It then sets out a number of precepts to guide those responsible for proposing centralization of services.
Introduction
For over 30 years, studies have appeared with the primary aim of identifying whether or not the volume of activity, i.e. the number of cases treated by a hospital or individual clinician, is linked to outcome, usually defined in terms of 30-day in-hospital mortality. There are now hundreds of articles from a wide range of countries, linking the volume of activity of a hospital or clinician – usually a surgeon – to the outcomes they achieve. In the majority of these, an inverse statistical relationship is found between level of activity and mortality rates.
Systematic reviews of this evidence 1–3 and expert commentary 4–6 from the 1990s onwards have, however, confirmed its unreliability as a guide to action.
The most recent review, a study carried out for the Cooperation and Competition Panel by the University of York, 7 concluded that the extensive evidence it assessed did not provide:
Consistent recommended minimum levels of procedure volume;
Proof that a correlation exists between higher volume and better outcomes;
Information on the causal links between volume and outcome.
Despite findings such as these, official documents continue to use volume/outcome results in support of the changes they propose. 8,9
The main aim of this paper is to show why it is unsafe to do so. It draws on the systematic reviews cited above and many of the studies reported in them. It also draws, selectively, on wider literature relating to the factors determining the quality of hospital services.
The first part of this paper shows why the evidence does not demonstrate a causal link between volume and outcome. In the second part it goes on to consider the process of change by which the possible gains of centralization might be realized. The third part proposes some guidelines for assessing the case for centralization of hospital services. The final part draws some brief conclusions on future research needs.
The link between volume and outcome: alternative explanations
In this part we consider a number of reasons why most studies fail to provide evidence of a causal link between volume of activity and outcomes. The literature commonly acknowledges that a statistical relationship between volume and outcome may arise because of differences in case-mix rather than a causal relationship. We consider this and other analytic issues first.
We then go on to consider the most commonly offered causal explanation of the statistical link: that practice makes perfect. Most volume and outcome studies do not employ time series data and hence cannot demonstrate whether or not there is a causal relationship. The few studies that do use time series do not in general find volume and outcome to be causally linked.
We then turn to consider other factors that might have a causal relationship to outcome. Typically, volume and outcome studies focus on the surgeon or the hospital as a whole and ignore the contribution of other factors of production, such as skilled nursing staff and other hospital resources. These may be more commonly found in larger units: the issue here is whether or not they can be made available at reasonable cost in smaller ones.
Finally, we consider cost factors. The vast majority of volume and outcome studies focus on elective or planned care. Emergency care, including services such as stroke care, has very different economic characteristics. Unlike planned care, high-quality emergency provision requires the capacity to respond at all times of the day or week. As a result, a volume effect may arise because this type of care can only be provided effectively in a large unit. We take these competing explanations in turn.
Case-mix and other analytic issues
As the numerous reviews cited above point out, what appear to be statistically sound relationships may be misleading. Many studies do not adjust for confounding factors such as differences in case-mix, or do so only partially, because they are reliant on purely administrative data that do not allow any adjustment for the physical condition of the patient or other characteristics that might affect outcomes. 10
Differences in case-mix may arise because of differences in the characteristics of the catchment areas of different hospitals 11 or because people with higher incomes and health status are more likely to travel to a more distant hospital, possibly because they believe that larger hospitals offer better quality of care and because they are, on average, better fitted to make the longer trip. When differences in case-mix are taken into account, the apparent benefits of scale may be reduced or even disappear. 12
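The arithmetic behind this point can be made concrete with a minimal sketch. All counts below are hypothetical, invented purely for illustration and not drawn from any of the studies cited; the function names are likewise our own. The sketch compares crude mortality with mortality within risk groups for one large and one small hospital.

```python
# Hypothetical (patients, deaths) counts per risk group; illustrative only.
hospitals = {
    "large": {"low_risk": (900, 18), "high_risk": (100, 10)},
    "small": {"low_risk": (100, 2), "high_risk": (100, 10)},
}


def crude_rate(strata):
    """Overall mortality, ignoring differences in case-mix."""
    patients = sum(n for n, _ in strata.values())
    deaths = sum(d for _, d in strata.values())
    return deaths / patients


def stratum_rates(strata):
    """Mortality within each risk group (a simple stratified comparison)."""
    return {group: d / n for group, (n, d) in strata.items()}


for name, strata in hospitals.items():
    print(name, f"crude={crude_rate(strata):.3f}", stratum_rates(strata))
```

With these invented figures the crude rate is 2.8% in the large hospital and 6.0% in the small one, yet mortality within each risk group is identical (2% and 10%). The apparent advantage of the large unit arises only because it treats a larger share of low-risk patients.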
Even where a statistically well-founded volume/outcome relationship takes case-mix into account, the overall result may hide the fact that some large units or surgeons carrying out a large number of operations produce poor results. For example, a study 13 of hospitals meeting volume standards as defined by the Leapfrog group of commissioners found that although they had better outcomes than hospitals carrying out lower volumes, 90-day mortality varied by a factor of five. Similarly, a study 14 of variations in the rate of re-operation after colorectal surgery in England found that outliers were to be found across the whole spectrum of hospital and surgical team caseload. There is also evidence that performance varies widely even among experienced surgeons. 15
Such results warn against assuming automatically, even where there is a well-founded relationship between volume and outcome in statistical terms, that better outcomes will result from centralization pursued as a national policy without regard to the outcomes actually achieved in practice. 16 If centralization is applied across the board, high-performing small units may be closed and poor-performing large units expanded.
Another reason for caution is that in some cases, small units match the outcomes of larger units. In the case of heart surgery, for example, a recent study from Japan 17 found that small hospitals were producing results in line with those from much larger units in the USA. A number of other studies have also found that small hospitals can produce results in line with larger units elsewhere. 18–21 These studies suggest that the apparent advantages of larger units may not be intrinsic but can be matched, at least in some circumstances.
This interpretation is further supported by studies which have failed to find a relationship, even for complex cases. Studies of revision of total hip arthroplasties, 22 gynaecological malignancy, 23 major colorectal surgery, 24 radical nephrectomy, 25 thoracotomy, 26 oesophagectomy, 27 coronary artery bypass graft (CABG), 28 interphalangeal (IP) fracture, 29 and a number of chronic conditions 30 have not found a statistical link between volume and outcome, although most of these would normally be regarded as complex procedures.
The weaknesses described above go far to explain why systematic reviews of volume and outcome studies tend to be cautious in deriving clear conclusions. There is a further reason, stemming from the way that hospital size or surgeon workload is defined. Typically, workload numbers are divided into bands defining low, medium or high levels of activity, but the width of the bands varies between studies, making it impossible to compare results and hard to define a clear threshold level below which no surgeon or hospital should provide the type of care concerned.
As a result, there is no agreed meaning attached to the term ‘high’ or ‘low’ volume and hence no consensus on the minimum or optimum level of activity for any given procedure. Moreover, even where a ‘high’ level has been defined, the threshold has been set at a volume that is very low by any standard, e.g. 15 or more operations per annum, 31,32 even for complex procedures. In addition, studies that have made a case for centralization may include hospitals or surgeons carrying out only one or two procedures a year. Few would be surprised that such low-volume providers could not match the performance of others.
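The comparability problem created by inconsistent bands can be illustrated with a short sketch. The two sets of cut points below are hypothetical and do not correspond to any particular study; the helper function `band` is our own. The first scheme treats 15 or more operations a year as ‘high’, in line with the threshold mentioned above, while the second uses much wider bands.

```python
import bisect


def band(volume, cut_points, labels=("low", "medium", "high")):
    """Assign an annual caseload to a band; cut_points are the lower bounds
    of the 'medium' and 'high' bands, in ascending order."""
    return labels[bisect.bisect_right(cut_points, volume)]


# Hypothetical banding schemes from two imaginary studies.
study_a = (5, 15)   # under 5 low, 5 to 14 medium, 15 or more high
study_b = (25, 50)  # under 25 low, 25 to 49 medium, 50 or more high

for volume in (2, 12, 20, 60):
    print(volume, band(volume, study_a), band(volume, study_b))
```

Under these assumptions a surgeon carrying out 20 operations a year counts as ‘high’ volume in the first scheme and ‘low’ volume in the second, so a finding about ‘high-volume’ providers in one study cannot be read across to the other.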
Does practice make perfect?
In the light of the evidence briefly set out here, it is hard to understand why volume and outcome studies continue to be used as a reliable guide to action. One reason may be that it seems intuitively reasonable that volume and outcome should be positively related, on the grounds that ‘practice makes perfect’.
However, most volume and outcome studies shed no light on whether it does or not. Only a very few use data extending over different time periods; the rest therefore cannot identify whether performance improves as experience accumulates. Where attempts have been made to test the learning-by-doing hypothesis directly using time series data, they have found no relationship. 33,34 Furthermore, it is clear from the evidence cited above on the variation in performance among high-volume providers that, in some cases, practice does not make perfect either at hospital or surgeon level.
Such findings do not of course mean that there is no ‘learning by doing’ when a new procedure is introduced. There is evidence for a collective learning effect as experience with a new procedure grows, and hence that the volume effect may be temporary. Some studies 35,36 have found that a volume effect may last only until what was once pioneering surgery becomes routine. This work suggests that while it may be right to restrict new procedures to a few sites, it may not be necessary as a long-term policy, once they become routine. Such a process reflects collective learning resulting in a set of routines known to give good results, rather than the learning of an individual surgeon or hospital.
Other factors determining quality
Despite the intuitive appeal of ‘practice makes perfect’, it is obvious that volume alone cannot account for superior performance, even where a strong statistical relationship is identified. Some studies have concluded that superior hospital resources explain why larger units may perform better. These resources may comprise not only physical assets but also specially trained staff as well as effective organization and use of defined protocols for the delivery of care during and after surgery. 37–41
These findings suggest that where better outcomes are associated with higher activity, the explanation might lie with factors that may or may not be linked to volume. Hence the key question is whether the factors determining outcomes can be reproduced at low volumes.
Some evidence suggests that they can be: for example, Schell et al. 42 found no volume effect when comparing a tertiary care centre with smaller hospitals and attributed this finding to the adoption by the latter of the care pathways of the former, and more generally to the effective transfer of expertise between them. Similarly, a Finnish study 43 of rectal cancer patients with an unfavourable stage profile treated in small hospitals found that mortality was low. This was attributed to the ability of the unit to transfer knowledge from a leading surgeon in the UK who had pioneered improved treatment methods for this condition.
Cost thresholds
Some of the factors determining outcomes at high-volume centres may not be transferable at acceptable cost. Palmer, 44 in a study in Southeast London, concluded that there would be gains to be won through reducing the number of hospital sites in that sector of London. In the case of emergency surgery, maternity and newborn care including premature babies, the minimum volume level for a high-quality service was high – relative to the existing pattern – because these services require sufficient skilled staff and equipment to be able to handle patients quickly at all times.
This minimum level is essentially a cost threshold, not a volume threshold. It is set by the level of resources required to deliver best practice care at all times. The threshold is likely to be high for any emergency service such as a major trauma or hyperacute stroke centre, where service standards require 24/7 availability of the human and physical resources required to deliver the service. Smaller units may not be able to provide such availability at acceptable cost.
However, they may still be able to deal effectively with part of the workload. Recent changes to services in London, for example, have resulted in only four major trauma centres and eight hyperacute stroke centres for a population of over eight million. However, a larger number of hospitals continue to treat less serious emergency cases and provide the bulk of care for stroke victims in specialized facilities within those hospitals, after diagnosis at the hyperacute centres. In other words, centralization is selective and may only benefit a limited class of patients. But if centralization is not selective, then services for some patients may be centralized when they do not benefit, or they may even suffer a worse standard of care. 45
Process of change
Most studies have focused on a narrow range of procedures, usually surgical interventions. They take no account of the implications for both the losing and the receiving hospitals of the effects of moving services between sites.
Implicitly, such studies assume that the higher-volume hospitals can absorb extra activity and maintain their supposed higher-quality levels. But staffing and physical constraints may make that impossible or, if possible, very slow to realize. Karthikesalingam et al. 46 have argued in relation to vascular surgery, where the statistical evidence for a relationship between volume and outcome is strong, that larger units do not always have all the facilities in place for best quality care and that, unless these shortfalls are made good, any shift of care may not be beneficial. In addition, the loss of income to hospitals that cease to carry out the procedures concerned may have knock-on effects on their other services, making them uncompetitive if costs cannot be reduced in line with the loss of income.
Although there is evidence that centralization has benefited patients, 47 Simunovic et al. 48 found that centralization had proved beneficial in one Canadian province but not in another. At minimum, this result suggests that centralization will not automatically produce benefits.
In Palmer's case study 44 there was no evidence to suggest that the hospitals losing activity were high performers. But centralization may involve the closure of high-performing units. A recent study 9 of paediatric surgery for congenital heart disease in England proposed that some high-performing units should be closed to reduce the number of sites where this service was available from 11 to six or seven.
This policy was based on volume and outcome evidence that was taken to mean that units should treat at least 400 patients a year. In addition, the study assumed that a geographical spread of units was desirable on grounds of access. Together these two principles led inexorably to proposals for closing one or more high-performing units.
In other situations, the units chosen for closure may be poor performers. But if the clinicians in low-volume hospitals are transferred to high-volume hospitals when their work is transferred, there may be no overall improvement. This would be so if, as some studies suggest, 49 surgical skill or specialist training is a key determinant of good outcomes, and if it is the presence of less skilled surgeons in the low-volume hospitals that explains their poor results. Implicitly, however, most volume and outcome studies assume that surgeons only differ in respect of the number of operations they carry out or that poor performers are randomly distributed between different providers.
Even if gains in outcomes are achieved by centralization, the longer journey times that it entails for some patients may offset them to some extent. One study 50 of stroke care found that the clinical risks of longer journeys outweighed the benefits of centralization. Nicholl et al. 51 found that for every mile a seriously injured person had to travel to hospital, the risk of death increased by one percent. Other work has found that longer journeys discourage use of health-care services. 52
Implications for centralization
The evidence cited here suggests that the case for centralizing services should not be based on volume and outcome studies alone, in part because the results they obtain may not be reliable and in part because, by their nature, they omit a wide range of factors that should be taken into account in a real-world situation.
But decisions cannot or will not always wait for better research. In this section we briefly set out some practical suggestions for determining when services should be centralized.
Firstly, given the limitations of most existing volume and outcome studies, a crucial first step is to determine the strength of the evidence relating to the services considered for reconfiguration. It is particularly important to check whether other measures, such as outcomes actually achieved, provide a better basis for centralization.
In the light of the weaknesses identified above in the evidence base, this should not need emphasizing. Unfortunately it does since, as noted in the introduction, proposals for change continue to cherry-pick the evidence or overstate the strength of the evidence they do cite.
Secondly, proposals for centralization of services should be based on a much broader-based analysis than volume and outcome relationships. They should at minimum include an assessment of the implications for access, short-run and long-run changes in costs, the availability of other factors, particularly skilled staff, and finally the implications for other services both at the losing and gaining institutions.
Thirdly, if after this stage it does appear reasonable to conclude that the evidence supports concentration of activity, the next requirement is to examine the scope for improving the performance of the smaller hospitals before considering a shift of their activity to another site. 53 This is particularly important where a shift in care would impose substantial access costs.
The critical issue is the extent to which the factors determining high quality, other than volume, can be made available in the smaller unit(s). In principle, some should be. Skilled clinical staff of all the relevant disciplines, effective knowledge transfer and well-defined care processes, for example, are transferable, i.e. they can be made available in hospitals with lower levels of activity. 54
Transferring some of these contributors to high quality may entail cost penalties. In tight financial circumstances, that may be regarded as a poor use of resources. But equally any extra costs may be judged to be acceptable as the price of good access.
Some contributors may not be directly transferable: large hospitals command a range of specialists and physical resources that smaller units are unlikely to be able to match. These may make it easier for them to deal with critical situations and for their clinical staff to obtain advice quickly when they occur. They may also find recruitment easier if they are regarded as prestige institutions.
Fourthly, where the factors determining outcomes are not transferable, options other than concentration should be considered. These may include: moving the individual clinician to the patient, which may be appropriate where a large hospital is surrounded by smaller satellites without in-house surgical staff; using telemedicine to allow local clinicians to seek advice from specialized units on the basis of electronic transfer of imaging, which is particularly appropriate in areas of low population density; or linking the smaller units into networks with agreed referral to ensure that the most difficult cases are transferred to better equipped units and the rest treated in smaller, more accessible facilities.
Fifthly, if none of these are judged to be feasible in clinical or cost terms, then the final stage is to consider how the process of change can be managed in such a way as to ensure that the potential benefits are realized in practice. A recent review from the Independent Reconfiguration Panel, 55 which advises on proposals for service reconfiguration within England, concluded that the process of change was often lengthy and the hoped-for benefits slow to materialize. Reviewing a number of cases, the Panel found that merging clinical teams was not always successful:
‘Clinical integration within multi-site acute trusts and a broader vision of integration into the whole health community has been weak. This has limited flexibility and encouraged site-based solutions rather than a broader vision of creating an excellent service for a whole community.’
This conclusion, which echoes many of the findings of the Ontario Health Services Restructuring Commission's Legacy Report 56 on the process of change in Canada, confirms how hard it may be in practice to achieve the benefits that concentration of services may promise. No change should be planned without an effective process in place to ensure that the desired results are achieved in practice.
Conclusion
We have argued that volume and outcome studies do not provide, in themselves, an adequate justification for centralizing hospital services. Further studies focusing purely on hospital or surgeon volumes will not advance understanding of why quality of outcome varies between hospitals, why some small units seem able to produce the same quality as much larger ones, why large units sometimes perform badly, or of the circumstances which determine whether the hoped-for gains from centralizing services are actually achieved.
As Lyman 57 has put it:
‘… a more promising avenue for exploration and potential improvement in patient outcomes exists. This is to examine the process of care elements that result in improved outcomes in higher volume or more specialized hospitals and identify ways to transfer these improvements from centres of excellence to other hospitals.’
This formulation of what research is needed recognizes that the apparent advantages of scale may be temporary and may also be transferable once the reasons for them are understood. This is likely to prove a far more valuable approach than simply expanding the number of volume and outcome studies that focus on only a fraction of the factors that make for high-quality care. That said, as we have shown above, a better understanding of the causal links between volume and outcome would provide only part of the information required to make an evidence-based case for centralization.
Acknowledgments
I would like to thank Sean Boyle and anonymous referees for their comments and suggestions for improving the paper.
