Abstract
Objectives:
To evaluate, through a systematic review and meta-analysis using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach, the efficacy of Botulinum Toxin type A for limb spasticity in improving activity restriction and quality of life outcomes.
Data sources:
PubMed, CINAHL, AMED, Embase and Cochrane databases, limited to English-language publications and searched to January 2015.
Review methods:
All randomized, placebo controlled trials on adults with active function or quality of life measures for the arm and leg relating to spasticity of any origin and treated with a single dose of Botulinum Toxin A. Evidence quality was assessed by GRADE.
Results:
Twenty-five studies were reviewed. Meta-analysis was carried out on six upper limb and six lower limb studies. Evidence quality for the upper limb was low/very low. A significant result for Botulinum Toxin A was found at four to twelve weeks for the upper limb for active function (SMD 0.32, 95% CI 0.01, 0.62, P=0.04). These effects were maintained for up to six months for the Action Research Arm Test (ARAT) only (MD 1.87, 95% CI 0.53, 3.21, P=0.006).
Evidence quality was very low for the lower limb. No significant effect was found. Meta-analysis was not possible for quality of life measures.
Conclusion:
Botulinum Toxin A may improve active outcomes in the upper limb but further evidence is needed. No conclusion can be drawn about the effect on active outcomes for the lower limb or for quality of life measures in either limb.
Introduction
This review examines the efficacy of Botulinum Toxin A on active function and quality of life outcomes using the Grading of Recommendations Assessment, Development and Evaluation (GRADE) system. 1
Improvements in active function and quality of life are often key goals for patients with spasticity. However, despite some positive results in open trials, few randomized controlled trials have shown significant improvements in these areas following Botulinum Toxin injections in either the upper or lower limb. 2 Spasticity is only one part of the upper motor neuron syndrome and it may be that the negative aspects such as weakness, loss of dexterity, sensory loss or learned non use are more significant factors in the loss of active function. 3 However, spasticity is acknowledged as a hindrance to voluntary movement and by selectively and reversibly reducing spasticity, greater access to voluntary movement should be possible.
As spasticity can cause pain and discomfort, as well as psychological disturbance through poor self esteem, body image 4 or social isolation, quality of life measures are vital to a patient orientated approach. However, improvements in quality of life are difficult to demonstrate with such a focal treatment. Questionnaires tend to lack sensitivity, while goals can be difficult to make achievable and realistic in this complex and diverse area. 5
No previous systematic reviews have looked at this topic using the GRADE system of analysis.
This systematic review uses the GRADE approach to look at the evidence for improving activity restriction or quality of life in the upper and lower limb using Botulinum Toxin A.
The following clinical questions will be addressed: should Botulinum Toxin A be used in patients with spasticity to improve:
Active function in the upper/lower limb?
Quality of life issues?
Method
This study defined Botulinum Toxin as any preparation of the Type A toxin for clinical use to reduce spasticity.
Inclusion criteria: studies were included in the review if they were randomised controlled trials, included the use of Botulinum Toxin A versus a placebo/saline injection/control group, on either the upper or lower limb in adult inpatients or outpatients, with outcome measures relating to active function or quality of life. Muscle spasticity of any pathological origin was considered; this allows for a comprehensive review and is more representative of the clinical setting.
Outcomes considered were: all validated Quality of Life measures, Active Range of Movement, Action Research Arm Test, Functional Independence Measure, Frenchay Arm Test, Modified Rankin Test, Barthel Activities of Daily Living Index, Nine Hole Peg Test, Gait Speed, Gait Analysis, Gait Distance, Goal Attainment Scale (with active based goals). The outcomes selected are from recommendations in the Royal College of Physicians spasticity guidelines 6 and from initial literature searches.
Exclusion criteria: studies were excluded if they did not have a control group, were observational studies, were carried out in paediatric populations, used preparations of Botulinum Toxin other than Type A, or did not use one of the above outcome measures. Repeated injection studies were also excluded to provide specificity to the search results. All non-English language studies were excluded from this review.
Database search
A literature search was carried out for all relevant studies from 1989 (date of approval of Botulinum Toxin for clinical use) 4 up to January 2015. The primary database searched was Pubmed using the MeSH terms of “spasticity” and “Botulinum Toxin”. The limits and exclusions set were: humans, English language and clinical trials (see Appendix I and II). The search was repeated adding all individual outcome measure names but no new studies were found. Searches with the same terms and limitations were carried out for Embase, Amed, Cinahl, and Cochrane.
Study quality
The GRADE approach was used as detailed in a previous paper. 7 The following definition was used: “quality of evidence” relates to the extent to which one can be confident that an estimate of effect is correct. 1 One reviewer graded the quality of evidence using the GRADE approach on a pre-prepared form. A qualitative summary of appropriate studies was carried out and quantitative analysis was implemented where possible. Decisions on quality were made using guidance from GRADE publications and the website.1,8–18 Further details on the method of grading can be found in Appendix III.
Statistical analysis
Studies were chosen for meta analysis if they provided sufficient data as a group in either dichotomous form (when data allowed results to be divided into improved versus no change/worse) or as means and standard deviations. Data was used from between one and six months post intervention; results were then analysed for four to twelve weeks (to analyse the effect over the active time for Botulinum Toxin) and for twelve to twenty four weeks (to gauge any significant lasting effects of treatment). Results given for week twelve could be used in either analysis but only once for each study, i.e. not duplicated.
Dichotomous data was analysed using the Mantel-Haenszel method to provide risk ratios (RR). Continuous data was analysed using the inverse-variance method to give a Weighted Mean Difference (WMD) for individual outcome measures where possible. The standardised mean difference (SMD) was used to pool the results of all outcome measures together as it allows for a variety of measurement methods. Random effects models were used in the presence of significant unexplained heterogeneity.
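The continuous-data pooling described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the RevMan implementation used in the review: the function names `hedges_g` and `pool` are hypothetical, the standard error of the SMD uses a common large-sample approximation, the random effects step assumes the DerSimonian and Laird estimator of between-study variance, and the numbers in the usage lines are invented rather than taken from any included trial.

```python
import math


def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardised mean difference (Hedges' g) and an approximate standard error.

    Group 1 is taken as treatment and group 2 as control (an assumption of
    this sketch), so a positive value favours treatment.
    """
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sd_pooled
    j = 1 - 3 / (4 * df - 1)  # small-sample correction factor
    var_d = (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))
    return j * d, j * math.sqrt(var_d)


def pool(effects, ses):
    """Inverse-variance pooling with fixed and random effects estimates.

    Returns the fixed effect estimate, Cochran's Q, I-squared (%), the
    DerSimonian-Laird tau-squared, and the random effects estimate with
    its standard error.
    """
    weights = [1.0 / se**2 for se in ses]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w

    # Heterogeneity: Q statistic and the share of variation beyond chance.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Between-study variance (method of moments), truncated at zero.
    c = sum_w - sum(w**2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights add tau-squared to each study's variance.
    re_weights = [1.0 / (se**2 + tau2) for se in ses]
    sum_re = sum(re_weights)
    random = sum(w * y for w, y in zip(re_weights, effects)) / sum_re

    return {
        "fixed": fixed,
        "q": q,
        "i2": i2,
        "tau2": tau2,
        "random": random,
        "random_se": math.sqrt(1.0 / sum_re),
    }


# Invented summary data for two hypothetical trials (treatment vs placebo).
g1, se1 = hedges_g(12.0, 5.0, 30, 10.0, 5.0, 30)
g2, se2 = hedges_g(8.0, 4.0, 25, 7.5, 4.0, 25)
print(pool([g1, g2], [se1, se2]))
```

When the estimated between-study variance is zero, the random effects weights reduce to the fixed effect weights, so the two pooled estimates coincide.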
For dose ranging studies, treatment groups were combined to give a single pairwise comparison, to avoid potential bias in choosing results for analysis and as recommended by the Cochrane Handbook for Systematic Reviews of Interventions. 19 When results for multiple joints were presented, the most commonly measured joint was used and any variations noted. Significance was set at P <0.05.
Data was analysed using the statistical package RevMan 5.2 from the Cochrane Collaboration.
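Combining dose groups into a single arm can be illustrated as follows. This is a minimal sketch assuming the standard formulae for merging the sample size, mean and standard deviation of two groups; the function name `combine_groups` and all numbers are hypothetical and are not data from the included trials. For three or more dose groups the function can be applied repeatedly.

```python
import math
import statistics


def combine_groups(n1, m1, sd1, n2, m2, sd2):
    """Merge two groups' summary statistics into one group.

    Returns (n, mean, sd) for the combined group. The variance term adds
    the spread between the two group means to the within-group spread.
    """
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    var = (
        (n1 - 1) * sd1**2
        + (n2 - 1) * sd2**2
        + (n1 * n2 / n) * (m1 - m2) ** 2
    ) / (n - 1)
    return n, mean, math.sqrt(var)


# Invented raw scores for a hypothetical low-dose and high-dose arm.
low_dose = [1.0, 2.0, 3.0]
high_dose = [4.0, 5.0, 6.0, 7.0]

n, mean, sd = combine_groups(
    len(low_dose), statistics.mean(low_dose), statistics.stdev(low_dose),
    len(high_dose), statistics.mean(high_dose), statistics.stdev(high_dose),
)

# The merged summary should match statistics computed on the raw data.
all_scores = low_dose + high_dose
print(n, mean, sd)
print(len(all_scores), statistics.mean(all_scores), statistics.stdev(all_scores))
```

Merging the summary statistics in this way gives the same result as pooling the individual participant data, which is why it avoids having to pick one dose group for the comparison.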
Results
Twenty five studies were reviewed in full; eighteen were on the upper limb, six on the lower limb and one examined the effects of injection on the upper and lower limb. Eight used quality of life outcome measures and twenty-two measured active outcomes.
Qualitative summary
In general, all studies used varied measuring techniques, which were not always fully described or objective. For instance, only one Active Range of movement study reported using an electrogoniometer 20 and only one study used formal motion analysis for gait 21 with the others using subjective ratings from video analysis on varying rating scales.
The level of function amongst subjects also varied greatly, with only one study stipulating appropriate inclusion criteria surrounding existing active function; this study also included targeted physiotherapy in the treatment. 22 The majority of trials had inclusion criteria that tended to target low functioning subjects with limited active potential. The BOTOX Economic Spasticity Trial, which used the Goal Attainment Scale, did specify that subjects needed to demonstrate the potential for functional gains but did not describe how this was assessed. 23 Quality of Life issues were not stipulated in the inclusion criteria for any study.
All Action Research Arm Test trials20,24,25 included a concurrent, specified exercise programme following injection, although the programmes differed, while most other trials left the amount of therapy as a fully uncontrolled factor. The Botulinum Toxin dosage used in each study was another significant variable.
A further methodological weakness of the studies using Quality of Life measures is the lack of proven specificity and sensitivity of the scales for problems caused by spasticity. Only one study examined quality of life as the primary measure, 26 but this was not reflected in the inclusion criteria. As a result, the populations selected for each study were extremely diverse in terms of baseline quality of life scores (see Appendix V).
Active Range of Movement in the upper limb was examined in eight studies using stroke patients varying in duration from acute to chronic presentation.20,22,27–32 All measured elbow range and most also measured wrist and finger range of movement. Although half of trials did detect an increase in active movement following injection, nearly all found no statistically significant difference between placebo and treatment groups.
Active Range of Movement was measured in three lower limb studies;33–35 two looked at ankle movement in stroke patients and one at hip abduction in multiple sclerosis sufferers. Only one study demonstrated a significant improvement in the treatment group but did not fully present the results. 33
The Action Research Arm Test was used by three trials.20,24,25 One was on acute stroke patients, one on a mixed aetiology group and one on patients more than a month after stroke.
All found significant improvements in scores for the treatment groups. The acute stroke study found significant changes only in the sub group who had no arm function at the start of the study. The largest study found no difference at one month, their chosen end point, but a significant difference at 3 months.
The Barthel Index, which is a global scale rating 10 items of activities of daily living and mobility, was used by seven studies despite its generic nature.24,25,27–30,36 All, except one, which had a mixed aetiology, were on stroke patients. Six studies found no significant changes. One study, which had a higher level of baseline function in their study sample, detected a significant improvement. 24
Gait analysis was carried out in six studies;21,31,33,35,37,38 subjects included adults with stroke, brain injury and cerebral palsy. Two of the six studies found a significant difference in favour of the treatment group.33,38
A timed walk was recorded by seven studies.21,31,35,37–40 Five used a 10m walk, one a two minute timed walk and one a six minute walk test. Although improvements were noted in treatment groups in many of the studies, no statistically significant changes were documented.
Other active measures used in the reviewed studies were: Rivermead Motor Assessment,29,35,40 Nine Hole Peg Test,25,40 Functional Independence Measure,41–42 Frenchay Arm Test,31,39 Motor Assessment Scale, 30 Fugl-Meyer,38,41–43 Wolf Motor Function Test, 22 Motor Assessment Log, 27 and Goal Attainment Scale.23,26 The majority either found no significant differences or failed to present the results fully. Meythaler et al. 27 did find a significant improvement with Botulinum Toxin and physiotherapy combined when measured with the Motor Assessment Log; however, this was a small crossover study and, at present, no other studies support this finding (Appendix IV).
Quality of Life measures were recorded in eight studies, with five different measures being used in total: EuroQol-5D (EQ-5D),25,36 Stroke Impact Scale,22,25 Short Form (36) Health Survey (SF-36),21,27,41,44 Assessment of Quality of Life (AQoL) 26 and Oxford Handicap Scale. 25
Outcomes are difficult to compare because of the different sections to each scale and the number of scales used in each study. A significant difference was found for pain at three months and anxiety/depression at a year for the EuroQol-5D. 25 Rodgers et al. 25 found a significant difference in the communication domain at one year and Wolf et al. 22 in the hand function domain of the Stroke Impact Scale. The Short Form Health Survey demonstrated significant improvements in general health and emotional health, 27 in mental health 21 and social functioning.21,44 A significant difference was also found for treatment groups at three months and one year for the Oxford Handicap Scale. 25 No significant differences were found for the Assessment of Quality of Life measure.
Data was not sufficient for a meta-analysis to be carried out for quality of life (see Appendix V).
Quantitative analysis
A meta-analysis from 4–12 weeks was possible for Active Range of Movement of the upper limb, the Action Research Arm Test and the Barthel Index.20,24,25,28,29,31 Only the Barthel Index failed to show a significant difference in favour of Botulinum Toxin (see Table 1). A pooled analysis gave a standardized mean difference of 0.31 (95% CI 0.00, 0.62), P=0.05. A random effects analysis was used because heterogeneity was detected (I²=76%). With the Barthel Index results removed, on account of its more global focus in measurement, heterogeneity fell to I²=49% and the pooled result became SMD 0.32 (95% CI 0.01, 0.62), P=0.04 (Figure 1). This equates to a small treatment effect using Cohen’s descriptors. 45
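As a rough consistency check on the pooled estimates reported above, the standard error, z statistic and two-sided P value can be recovered from a point estimate and its 95% confidence interval. The sketch below assumes a normal approximation and a symmetric interval; the function name `p_from_ci` is hypothetical, and because the published limits are rounded to two decimal places the recomputed values are only approximate.

```python
import math


def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Recover (se, z, two-sided p) from an estimate and its 95% CI.

    Assumes the interval was built as estimate +/- z_crit * se.
    """
    se = (upper - lower) / (2 * z_crit)
    z = estimate / se
    # Two-sided p value under the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


# Pooled SMD with all three upper limb outcomes, as reported in the text.
se_all, z_all, p_all = p_from_ci(0.31, 0.00, 0.62)

# Pooled SMD with the Barthel Index removed, as reported in the text.
se_ul, z_ul, p_ul = p_from_ci(0.32, 0.01, 0.62)

print(round(p_all, 2), round(p_ul, 2))
```

To two decimal places the recomputed P values agree with the reported P=0.05 and P=0.04, which suggests the intervals and P values in the text are internally consistent.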
Table 1. Results of individual outcome measures.
* P<0.05 (significant). MD, mean difference; SMD, standardized mean difference; RR, risk ratio; CI, confidence interval; AROM, active range of movement; UL, upper limb; ARAT, action research arm test; BI, Barthel index; LL, lower limb. # Weeks 12–24: mixed analysis of gait speed and distance.

Figure 1. Standardized mean difference, Botulinum Toxin A versus placebo: improving active movement in the upper limb, 4–12 weeks (to the right of 0.00 favours Botulinum Toxin A).
Evidence quality: Low/Very Low (see Table 2) (Appendix IV)
Only analysis of the Action Research Arm Test was possible for 12–24 weeks.20,24,25 This gave a mean difference of 1.87 (95% CI 0.53, 3.21), P=0.006, I²=0% (Figure 2(a)).
Table 2. Grades of evidence for individual outcomes.
AROM, active range of movement; UL, upper limb; ARAT, action research arm test; BI, Barthel index; LL, lower limb; QoL, quality of life; RCT, randomized controlled trial; n, number; v low, very low.
Key: 0 = no downgrade, –1 = serious limitation, –2 = very serious limitation.

Figure 2(a). Mean difference, Botulinum Toxin A versus placebo: improving active movement in the upper limb as measured with ARAT, 12–24 weeks.
A meta-analysis was possible for gait speed, gait distance and lower limb Active Range of Movement at 4–12 weeks but with very small numbers.21,31,33–35,37 No significant difference was found for any outcomes (see Table 1).
Evidence quality: Very Low (see Table 2) (Appendix IV)
A pooled analysis for 12–24 weeks for gait speed and distance also showed no significant improvement (SMD 0.01, CI −0.23, 0.25, P=0.94) (Figure 2(b)).35,37

Figure 2(b). Standardised mean difference, Botulinum Toxin A versus placebo: improving active movement in the lower limb as measured by gait speed and distance, 12–24 weeks.
Discussion
This meta-analysis found a significant result in favour of Botulinum Toxin for active measures, when restricted to upper limb specific measures, but with a small effect size. Evidence quality was low to very low, primarily due to a lack of study directness and small sample sizes.
Fewer studies focus on the lower limb and there was little usable data to study in this analysis. A statistically significant effect was not found for the lower limb but again this was very low quality evidence; therefore no clinical conclusions can be drawn from the results.
The reasons for not detecting active change have been discussed in other articles,2,3 which cite poor study quality, inappropriate population and outcome measure selection, and the lack of concurrent targeted therapy programmes. The results of this review certainly lend support to these hypotheses. The Royal College of Physicians’ guidelines state that Botulinum Toxin should only be used in conjunction with targeted and goal orientated physiotherapy programmes, 6 yet only a few trials included any formal exercises, with the rest leaving existing physiotherapy, if present, as a large variable. It is notable that the best results came from studies which used the Action Research Arm Test and which all used some form of exercise programme in conjunction with the injection.
The BOTOX Economic Spasticity Trial is a comprehensive study of active outcomes but it did not find any significant differences between groups. 23 However, it was acknowledged that the physiotherapy given to both groups, which was uncontrolled, was likely to be a significant confounding factor. This trial did suggest that active outcomes were more likely to be achieved in younger subjects and those with a shorter duration since their stroke; 23 this may provide valuable guidance for future studies’ inclusion criteria.
The biomechanical properties of muscles need time to change; for example, fast glycolytic fibres often convert to slow oxidative fibres in the presence of hypertonia 46 and so would need time to convert back. Skill acquisition is also required, as neurological patients have to relearn how to perform even simple movements as well as everyday tasks. 47 This requires not only skilled guidance and repetition but also a significant amount of time and coordinated team approaches. Cohen 48 reports that with intense training of hand skill, reorganization of the cortical map occurs, yet many studies allow uncontrolled or unrecorded exercise programmes and take the primary measures at just four weeks post injection, leaving little time or opportunity for skill acquisition. Measurement of outcome at 24 weeks was only possible for the ARAT, which showed continuing benefit.
Outcome selection remains extremely mixed from study to study, with no consensus being followed. Measures such as the Functional Independence Measure and Barthel Activities of Daily Living Index rarely detect changes from focal treatments yet continue to be used. 3 Few studies used goal setting and, when it was utilized, it tended to focus on passive goals. However, future studies may start to recognize the value of using goals, particularly the Goal Attainment Scale, as a way to capture changes in active function that are meaningful to the patient and can cover the wide diversity in patterns of spasticity. 5 A large observational trial found that although active goals were set less often than passive ones (22% versus 29%), they were still achieved by up to 72% of subjects. 5
The directness of the studies’ results was a major factor in their downgrading. Patient population selection was not aimed at detecting changes in active movement; in fact, many groups were extremely low functioning with a limited likelihood of improving on active tasks. While outcome measures continue to be selected separately from the formation of appropriate inclusion criteria, the directness of results will remain limited, as will their validity to clinicians.
Despite these limitations, this review found some benefit for active measures in the upper limb. Perhaps with improved study planning and population selection a more conclusive outcome with a stronger quality of evidence can be found for both the upper and lower limbs.
Quality of life has been included in a number of studies but almost never as a primary measure. Many different questionnaires are used, which makes comparison difficult. Several detected some significant changes following Botulinum Toxin but the miscellany of methodologies and patient populations makes the results of limited use. Additionally, the query must be raised whether a generic quality of life questionnaire can detect improvements from such a focal treatment. Much more specific investigation is needed on quality of life issues for spasticity sufferers. Are quality of life issues related to spasticity impairments? Which issues are most affected? Which questionnaires detect these changes the best? Should a more specific questionnaire be developed? Once these questions are answered, future studies should be able to examine the impact of Botulinum Toxin injections on quality of life and direct clinical practice to help patients with this important area.
Observational studies may be better suited to assessing the evidence surrounding active outcome measures and quality of life, as they are not as restricted as randomized trials. They often stipulate more active inclusion criteria and include some form of rehabilitation therapy in conjunction with injections. As a result they have reported more positive outcomes in both active and quality of life measures.49–53 Therefore, a review of the observational literature using GRADE is recommended to investigate outcomes using Botulinum Toxin as part of a complex intervention with other therapies.
A major limitation recognised by this review is the use of only one author to carry out the search and critique the articles. PRISMA notes that using at least two investigators reduces the risk of rejecting relevant reports. 54 Every attempt was made to minimise bias by the use of pre-drawn up forms and criteria and by double checking results. A second author was used for statistical analysis and robustness. It is possible that relevant studies have been missed in the search or that unwitting bias has been allowed into some decisions. Other criticisms may be directed at some of the subjective decisions on grading the quality of evidence. The decisions are based on uniform criteria and any deviations are fully explained. As documented by GRADE, as long as decisions are justified with clear reasoning then different recommendations can be valid and must be judged by the clinician on this basis. 1 The decision to group dose ranging studies may also be an area for debate. A meta-analysis in this area is often problematic. Trials look at differing indications, injection techniques, doses and muscles as well as a wide variety of outcome measurement; this is often alongside the use of ordinal measures with parametric statistical tests. 55 Therefore any results and conclusions must take into account these limitations.
Nonetheless, some key gaps in current Botulinum Toxin research have been highlighted which, if properly addressed, will help clinicians and purchasers make informed decisions on clinical care in the future.
Clinical messages
There is low quality evidence that Botulinum Toxin A can improve active outcomes in the upper limb as measured by Active Range of Movement and Action Research Arm Test.
Evidence is inconclusive for active outcomes in the lower limb or quality of life measures in either limb.