Abstract
We systematically reviewed the evidence on the effectiveness of telerehabilitation (TR) applications. The review included reports on rehabilitation for any disability other than mental health conditions and drug or alcohol addiction. All forms of telecommunications technology for TR and all types of study design were considered. Study quality was assessed using an approach that considered both study performance and study design. Judgements were made on whether each TR application had been successful, whether reported outcomes were clinically significant, and whether further data were needed to establish the application as suitable for routine use. Sixty-one scientifically credible studies that reported patient outcomes or administrative changes were identified through computerized literature searches on five databases. Twelve clinical categories were covered by the studies. Those dealing with cardiac or neurological rehabilitation were the most numerous. Thirty-one of the studies (51%) were of high or good quality. Study results showed that 71% of the TR applications were successful, 18% were unsuccessful and for 11% the status was unclear. The reported outcomes for 51% of the applications appeared to be clinically significant. Poorer-quality studies tended to have worse outcomes than high- or good-quality studies. We judged that further study was required for 62% of the TR applications and desirable for 23%. TR shows promise in many fields, but compelling evidence of benefit and of impact on routine rehabilitation programmes is still limited. There is a need for more detailed, better-quality studies and for studies on the use of TR in routine care.
Introduction
Telerehabilitation (TR) is the provision of rehabilitation services at a distance using telecommunications technology as the delivery medium. 1 TR has been proposed as a way of increasing accessibility and enhancing continuity of care for vulnerable populations with disabilities, with potential time- and cost-savings. 2
Reviews of TR have reported that there is promise in this area, but have drawn attention to the limitations of available evidence. Huis et al. 3 reported that evaluation of the effects of some categories of TR interventions is still in its early stages, with an emphasis on technical feasibility and acceptability. Russell also considered such limitations, noting that much research in telemedicine for physical rehabilitation has been technology-focused and has relied on small-sample research designs. 1 He concluded that demonstration of viable TR services in real-world environments using well-controlled research methodologies and large patient cohorts is required. Kairy et al. concluded that while evidence is mounting concerning the efficacy and effectiveness of TR, high-quality evidence regarding impact on resource allocation and costs is still needed to support clinical and policy decision-making. 2 A broad overview by Rogante et al., which included reviews and technical reports, referred to a lack of comprehensive studies providing evidence to support decision- and policy-makers in adopting TR technologies in clinical practice. 4
Reviews of TR in specific applications have also drawn attention to limitations in evidence. Schwaab concluded that trials of home-based cardiac rehabilitation (CR) were predominantly feasibility studies with few patients. 5 Also, most patients in the studies he reviewed were uncomplicated low-risk males, most of whom were not included in studies until weeks or months after the cardiac event. A statement from the American Heart Association and the American Stroke Association 6 notes that TR has the potential to provide timely and efficient post-acute care for stroke patients beyond the hospital and into their homes. However, more work is needed to demonstrate the efficacy of these methods in promoting in-home rehabilitation. A review of TR in stroke care found that most studies in post-stroke rehabilitation showed promising results in improving the health of patients, although the quality of evidence was low. 7 There was no evidence regarding the effects on resource utilization or cost-effectiveness. Van Dijk and Hermens found there were some promising applications of distance motor training, including use of virtual reality and robotic devices, but that the strength of evidence from these studies was poor. 8
We have conducted a systematic review of literature on the effectiveness of TR applications. Our intention was to provide guidance to decision-makers in health care, focusing on studies that provided an indication of the use or potential of TR in routine practice. We derived the scope of TR from the definition of rehabilitation medicine. Rehabilitation medicine is concerned with the prevention and reduction of functional loss, activity limitation and participation restriction arising from impairments; the management of disability in physical, psychosocial and vocational dimensions; and the improvement of lost function.
The focus of our review was on papers that reported health-related outcomes for patients and/or caregivers. All types of study design were considered. We excluded studies on applications that we judged to be routine monitoring or consultation as part of normal practice. The review included reports on rehabilitation for any disability other than mental health conditions and drug or alcohol addiction. All forms of telecommunications technology for TR were considered, including telephone, Internet, videoconferencing (VC) and virtual reality (VR) approaches. We did not include publications that reported outcomes only in terms of satisfaction with or acceptance of a telemedicine application.
Methods
The protocol for the review was developed by the authors, following approaches taken in previous telemedicine reviews, and approved by the Finnish Office for Health Technology Assessment and the Institute of Health Economics. Steps in the review process were performed independently by two or more reviewers, and any differences resolved by discussion. Initial screening of the identified articles was based on the information obtained from their abstracts. All abstracts were read independently by at least two of the authors and selection of relevant articles agreed upon in discussion. When an abstract did not give sufficiently precise information about the study, or such information was not available at all, the article was obtained for further review.
Each full-text article obtained for closer inspection was evaluated independently by at least two of the authors, who then reached a consensus on whether or not the article should be included in the final review. Data were extracted independently from each of the selected publications using a table that was created a priori. Any disagreements were resolved by consensus. Information extracted included the study objectives, design, type of comparison with the TR intervention, setting and duration, patient numbers and characteristics, and reported outcomes.
Literature searches covering publications up to November 2009 were performed using the following electronic databases: Cochrane Library, MEDLINE, EMBASE, PsycINFO and CINAHL. There was no other date limit or language restriction.
Articles were selected which described, in a scientifically valid manner, studies reporting clinical or administrative outcomes for patients or caregivers using TR applications in the management of somatic disorders. We included controlled studies in which TR was compared with a non-TR alternative, and non-controlled studies in which there were no fewer than 20 subjects.
We excluded studies on rehabilitation for mental health conditions or substance abuse, those in which the only outcome measures were related to satisfaction with TR, and reports on technical development or feasibility of a rehabilitation technology.
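The selection rules above can be sketched as a simple predicate. This is only an illustration, assuming each candidate study has already been coded by the reviewers; the class and field names are hypothetical, and the judgement of whether a study was described "in a scientifically valid manner" is not modelled.

```python
from dataclasses import dataclass

# "no fewer than 20 subjects" for non-controlled studies
MIN_UNCONTROLLED_SUBJECTS = 20


@dataclass
class StudyRecord:
    # Hypothetical field names; they only mirror the criteria stated in the text.
    controlled: bool                     # TR compared with a non-TR alternative
    n_subjects: int
    mental_health_or_substance: bool     # mental health condition or substance abuse
    satisfaction_only: bool              # only outcomes relate to satisfaction with TR
    technical_or_feasibility_only: bool  # technical development or feasibility report


def is_eligible(study: StudyRecord) -> bool:
    """Return True if the study passes the stated inclusion and exclusion rules."""
    if study.mental_health_or_substance:
        return False
    if study.satisfaction_only or study.technical_or_feasibility_only:
        return False
    if study.controlled:
        return True
    return study.n_subjects >= MIN_UNCONTROLLED_SUBJECTS
```

Under this sketch, a controlled study of any size passes, whereas a non-controlled series with 19 subjects does not.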
Strength of evidence was assessed with an approach used in previous telemedicine reviews that takes account of both study design and study quality. This provides five ratings of study quality and implications for decision-makers. 9
For study performance, five areas of interest were considered (patient selection, description/specification of the interventions, specification and analysis of the study, patient disposition and outcomes reported). When reviewing an article on a TR study, each of these areas was given a score of 0 (relevant information was missing or given in only minimal detail), 1 (reasonable detail was provided, but there were some important limitations) or 2 (information was satisfactory, with no significant limitations). A further score was allocated to each publication, according to the study design that had been used. Large RCTs, defined as those with at least 50 subjects in each arm, were given a score of 5. Smaller RCTs had a score of 3, prospective non-randomized studies a score of 2, retrospective comparative studies a score of 1 and non-comparative series a score of 0.
At least two authors independently assigned scores to each study. If there was disagreement on the study design classification or if individual scores for any performance item differed from each other by more than one, the discrepancies were discussed and resolved by consensus. For each publication, the mean of the authors’ individual scores was reported to the nearest 0.5.
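The reconciliation step can be illustrated with a short sketch. The function names are hypothetical, and the handling of exact ties when rounding (upward here) is an assumption, since the text does not specify it.

```python
import math


def needs_discussion(score_a: int, score_b: int) -> bool:
    """Flag a performance item whose two individual scores differ by more than one."""
    return abs(score_a - score_b) > 1


def round_to_half(value: float) -> float:
    """Round to the nearest 0.5; exact ties go upward (an assumption)."""
    return math.floor(value * 2 + 0.5) / 2


def reported_score(individual_scores: list[float]) -> float:
    """Mean of the reviewers' individual scores, reported to the nearest 0.5."""
    mean = sum(individual_scores) / len(individual_scores)
    return round_to_half(mean)
```

With two reviewers the mean of two whole-number scores is already a multiple of 0.5, so rounding only matters when three or more reviewers score the same study.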
The performance and design scores were then combined into an overall score for each publication to give an indication of the confidence that decision-makers should place in the findings that were reported. The maximum value was 15 (corresponding to a large RCT with no significant limitations). On the basis of the combined scores, we assigned each study to one of five categories to give an indication of the reliability of the findings that the study reported (Table 1).
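As a rough illustration of how the combined score is built up, the sketch below adds the five performance scores (0 to 2 each) to the design score. The identifiers are hypothetical labels for the areas and designs named in the text. The category boundaries of Table 1 are not reproduced here because they are not stated in the text.

```python
PERFORMANCE_AREAS = (
    "patient_selection",
    "intervention_description",
    "study_specification_and_analysis",
    "patient_disposition",
    "outcomes_reported",
)

DESIGN_SCORES = {
    "large_rct": 5,                   # at least 50 subjects in each arm
    "small_rct": 3,
    "prospective_non_randomized": 2,
    "retrospective_comparative": 1,
    "non_comparative_series": 0,
}


def overall_score(performance: dict[str, int], design: str) -> int:
    """Sum the five performance scores and add the study design score."""
    for area in PERFORMANCE_AREAS:
        if performance[area] not in (0, 1, 2):
            raise ValueError(f"score for {area} must be 0, 1 or 2")
    performance_total = sum(performance[area] for area in PERFORMANCE_AREAS)
    return performance_total + DESIGN_SCORES[design]
```

A large RCT scoring 2 in every area reaches the stated maximum of 15 (10 for performance plus 5 for design).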
Table 1. Study quality and implications for decision making 9
In addition to the appraisal of quality, judgements were made on whether the reviewed publications indicated that the TR applications had been successful. Success was defined in terms of whether TR had performed at least as well as a similar alternative intervention. The principal summary measure was the difference in means. For comparative studies, where the interventions in each group were similar, TR was considered successful when it provided better or equivalent outcomes to those for the comparator. In studies where the TR intervention included additional resources to those of the comparator intervention, it was considered successful when it provided better outcomes.
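The success rule for comparative studies can be written as a small decision function. This is a minimal sketch: the names are hypothetical, and it does not cover applications whose status was judged unclear or comparisons in which the TR intervention was less elaborate than the comparator.

```python
from enum import Enum


class Outcome(Enum):
    """Direction of the TR result relative to the comparator."""
    BETTER = "better"
    EQUIVALENT = "equivalent"
    WORSE = "worse"


def tr_successful(outcome: Outcome, tr_has_additional_resources: bool) -> bool:
    """Apply the stated success rule for a comparative study."""
    if tr_has_additional_resources:
        # A more elaborate TR intervention must outperform the comparator.
        return outcome is Outcome.BETTER
    # Similar interventions: better or equivalent outcomes count as success.
    return outcome in (Outcome.BETTER, Outcome.EQUIVALENT)
```

The asymmetry is the point of the rule: equivalence counts as success only when the TR arm did not receive additional resources.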
Judgements were also made on whether study findings included outcomes that were clinically significant (results such as a treatment effect large enough to be of practical importance to patients and health-care providers). We considered whether the clinical significance of the application had been explained or justified by the authors, effect sizes, relationship of treatment effect to the minimal clinically important difference, and any side effects of the treatment. Judgements were reported as Yes or No when the clinical significance of reported outcomes was apparent, and as Unclear when the data in the publication were insufficient to reach a conclusion.
For each paper, we considered whether additional data to those reported were needed to establish the TR method as suitable for routine use (ratings: yes, desirable or no). Factors informing our ratings included the success and clinical significance of the TR application, whether it was still under development, the size and composition of the patient or carer population that was studied and the length of follow-up after initiation of TR. A ‘yes’ rating indicated substantial limitations in the available evidence, due to factors such as small sample size, high drop-out rates, absence of long-term follow-up and incomplete reporting of outcomes. ‘Desirable’ referred to studies of adequate quality that had provided reasonable evidence of success and where further work was required (typically with larger and/or different patient populations) to confirm the findings. A ‘no’ rating was used both for studies that had provided a strong indication that TR was suitable for routine use, and for those that had indicated that an intervention was clearly unsuccessful.
Results
From 1870 publications identified in the literature search, 133 were retrieved for closer inspection. Sixty-six papers dealing with 61 unique studies were judged to meet the selection criteria and were included in the review.
Thirty-one of the studies (51%) were of high or good quality, 18 (30%) fair to good and the remainder poor to fair (Table 2). In the selected studies 69 comparisons were described. In one third of the comparisons, the interventions were similar – for example home- and hospital-based rehabilitation programmes with identical exercise routines and monitoring. In 56% of the comparisons the intervention in the TR arm was more elaborate than that offered to the control group. Typically, patients in TR were contacted more frequently, given additional interventions or provided with more information. In one study, the TR intervention was less elaborate than that used by the control group.
Table 2. Quality of reviewed studies by clinical category
An overview of judgements for the reviewed TR studies is given in Table 3. For a variety of populations and types of outcome, 71% of the TR applications were successful. For 11% the status was unclear and 18% were unsuccessful. The reported outcomes for 51% of the TR applications appeared to be clinically significant. Clinical significance was unclear in 20% and was not achieved or not reported in 29%. Further study was judged to be required for 62% of the TR applications and desirable for 23%. For two of the applications further study was not needed as there was sufficiently strong supporting evidence. For another seven applications (11%) additional research appeared unnecessary as the interventions were clearly unsuccessful. Further details of judgements made on the reviewed TR studies for each of the clinical categories are shown in Table 4.
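The percentages for the further-study judgements can be checked against the counts reported in the text. The sketch below assumes a denominator of 61 studies throughout. The counts of 38 (required) and 14 (desirable) are those given in the Discussion. The dictionary keys are descriptive labels only.

```python
TOTAL_STUDIES = 61

# Counts as stated in the text for the further-study judgements.
further_study_counts = {
    "required": 38,
    "desirable": 14,
    "not needed, strong supporting evidence": 2,
    "not needed, clearly unsuccessful": 7,
}


def percent(count: int, total: int = TOTAL_STUDIES) -> int:
    """Percentage of the total, rounded to the nearest whole number."""
    return round(100 * count / total)


for label, count in further_study_counts.items():
    print(f"{label}: {count}/{TOTAL_STUDIES} = {percent(count)}%")
```

The four counts sum to 61, so every study falls into exactly one category, and the rounded values for required, desirable and clearly unsuccessful agree with the 62%, 23% and 11% quoted in the text.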
Table 3. Overview of judgements on reviewed TR studies
*One study not successful for all outcomes
†Two studies not successful for all outcomes
‡[ ] indicates intervention was unsuccessful
Table 4. Details of judgements on reviewed studies
Discussion
For the present review we sought evidence on the effectiveness of TR applications, with an emphasis on studies that provided an indication of the use or potential of such approaches in routine practice. Using broad selection criteria, which included all types of study design, we located only 61 studies on all areas of rehabilitation for somatic disorders. The body of TR studies that provide useful data on clinical outcomes is still very small.
During the abstract selection process we noted a number of studies that provided descriptions or considered the feasibility of innovative approaches to TR, but which did not meet our criteria because patient outcomes were not provided or numbers of patients were very small. Many of these reports had been published some years previously but more detailed follow-up investigations have not yet emerged.
While the studies we reviewed included some that made use of recently developed technologies, it was notable that telephone-based interventions continue to play an important role in TR. Over 60% of the reviewed studies included use of telephone links for communication (over 80% in the cardiology group).
The approach used in the review illustrates various factors that should be taken into account in appraising evidence on telemedicine applications that are proposed for use in routine health services. Evaluation of study quality is an important first step but there is then a need to consider the context both of the studies and the health system to which they relate. Questions arise on the types of comparison being made with the telemedicine service, and whether the application has been shown to give equal or better outcomes than an alternative approach. Beyond these points are the issues of whether the study outcomes were clinically significant and if further study is needed before the application is approved for routine use. Finally, as indicated by Kairy et al., research on TR needs to be matched by an understanding of factors influencing the sustainability of TR programmes to be useful to clinical and policy decision-makers. 2 In a previous review we noted that the effect of telemedicine on organizational and health-care process changes may have a significant effect on the success of the programme, but that these issues were rarely discussed in the literature. 9
Over half of the studies in our review were of high or good quality, but 29% had limitations that should be considered in any implementation of their findings and a further 20% had substantial limitations. Limitations included the low power of some studies, high drop-out rates and short follow-up. Also, a general problem with studies on TR is the absence of blinding.
In the majority of the comparative studies the intervention provided through TR was more elaborate than that in the comparator. Patients were contacted more frequently or provided with additional services, so any advantages shown in these studies may be related to the use of a more elaborate intervention rather than to the method of delivery.
Overall, we found that TR had been shown to be successful in 71% of the studies. Even in the most studied areas (cardiac and neurological rehabilitation) success was not demonstrated for a large minority of applications. Outcomes appeared to be clinically significant, as well as statistically significant, in 51% of the reviewed studies. Given the variety of patient groups and conditions considered there may well be difficulties in defining clinical significance in some circumstances. Judgements on 38 of the studies (62%) suggested that further work would be needed to establish that the TR applications covered were suitable for routine use. Further work was seen as desirable for a further 14 applications.
An interesting feature of the literature we reviewed was that poorer-quality studies tended to have worse outcomes than high- or good-quality studies (Table 5). The usefulness of TR applications was especially evident in the good-quality studies. The significance of this distribution is uncertain, given the range of TR applications, patient populations and settings. Perhaps better-quality studies are needed to establish the true benefits of TR. Another possibility is that some good-quality studies with negative findings have not been reported, giving a publication bias.
Table 5. Study quality and success of TR applications
Our review identified a number of studies that demonstrated the success of individual TR applications, but in most reports there was little or no discussion of how these approaches might be integrated into health-care systems. Our findings are consistent with the conclusions of earlier reviews that further research on TR with stronger studies is required. It may be that some additional information was not available because of publication delays, but we suspect that the large, good-quality studies that have been called for have yet to be put in place. TR shows promise in many fields but compelling evidence of the benefit, and of the effect on routine rehabilitation, will probably need to await the availability of adequate research funding and a high level of commitment by rehabilitation professionals to engage in longer-term studies.
