A systematic review of case-mix adjustment models for stroke

Abstract

Objective:

To identify any externally validated prognostic model for predicting outcome in unselected populations following acute stroke comprising variables feasible for collection in routine care.

Data sources:

Searches were run in MEDLINE, EMBASE, CINAHL, PsycInfo, AMED and ISI Web of Science with no limits on publication date or language.

Review methods:

Any study describing the development or external validation of a discernible prognostic model to predict any valid outcome following acute stroke was included. Papers were retained if they met pre-specified inclusion criteria identified from previous reviews and pertinent discussion papers. Data extraction focused on methodological quality of model development, generalizability and feasibility of variable collection. Model performance was examined through consideration of external validation studies.

Results:

Seventeen externally validated models were identified from 43 papers fulfilling inclusion criteria. Quality of studies describing model development was variable and model performance in external validation studies was generally poor. Models were generally constructed through secondary use of randomized trial or stroke database data. Prognostic variables broadly encompassed markers of stroke severity, pre-stroke function and comorbidities. One model that fulfilled the review criteria and had extensive external validation in a range of post-stroke populations was identified (the Six Simple Variables model).

Conclusion:

The Six Simple Variables model performed well in six external validation studies, although prediction of outcome in patients with milder strokes was less reliable. Other models identified in this review have been developed using robust methodology but comprise more complex clinical variables which may limit their utility in routine stroke care.

Keywords

Stroke, risk adjustment, statistical models, prognosis

Introduction

Stroke is a heterogeneous clinical syndrome in which the clinical course and outcomes for individual patients are dependent not only on the site and size of the pathological lesion, but on the context of the injury in relation to combinations of mediating factors that are unique to individuals. Pre-stroke function, comorbidities, social, environmental and personal factors are all likely to affect an individual’s functional, cognitive and social outcomes following a stroke and limit the validity of direct comparisons between individuals or populations. If comparisons of individual patient or population outcomes are to be attempted, as may occur for the purposes of audit, observational studies or performance monitoring, it is necessary to control for confounding factors and attempt to statistically homogenize the groups. This process is called case-mix adjustment. Inadequate case-mix adjustment has been highlighted as a major factor limiting between-site comparisons of patient outcome (particularly mortality).¹ However, such comparisons are common.² As routine collection of outcomes, specifically patient-reported outcomes, is likely to expand as a mechanism to record patient-centred quality of care,^3,4 the need for robust case-mix adjustment models becomes more urgent.

Previous reviews have been undertaken to identify prognostic models specifically in stroke^5–12 and have found the models to be generally poor. The review by Counsell et al., published in 2001,⁵ identified studies describing models to predict stroke survival, survival in an independent state, or alive and at home. The vast majority of the 83 discrete prognostic models identified demonstrated significant flaws in statistical or internal validities and none was fit for purpose to case-mix adjust in routine clinical care.⁵ Other authors have attempted to identify prognostic models which were developed to predict functional outcomes following stroke.^6,11 However, these have tended to be limited to prediction of activities of daily living (most commonly the Barthel Index), which has limitations due to its marked and well-documented ceiling effects.¹³

Since these reviews were performed, clear evidence demonstrating the benefits of organized specialist multidisciplinary stroke care over general ward care¹⁴ has led to the widespread adoption of this model and fundamental changes to the delivery and monitoring of stroke care across healthcare systems.^15–18 It is possible that prognostic factors previously unknown or overlooked are important in determining patient outcomes and these should be modelled explicitly. In addition, increasing scrutiny of the quality of prognostic research^19–21 and more sophisticated statistical modelling techniques (e.g. multilevel modelling, latent variable analysis and structural equation modelling) are likely to have altered the type and quality of models to predict outcomes following stroke.

We undertook a systematic review of the literature to ascertain whether any robust, externally validated prognostic model exists to predict outcomes in unselected post-stroke populations comprising variables that are feasible for prospective collection in routine care and observational research.

Methods

Any study or review describing the use of a discernible prognostic or case-mix adjustment model to predict valid and reliable stroke outcomes at a fixed time point following ischaemic or haemorrhagic stroke was considered. Studies referring to ‘adjustment for baseline variables’ were excluded unless a discernible model was further qualified. Studies using cohort data from stroke registers, prospective observational studies performed with the primary objective of model development and studies that described the secondary use of data obtained through randomized controlled trials were all included if there were further studies that validated the models in independent prospective cohorts. Papers describing models which lacked external validation were excluded. Previous systematic reviews were identified and examined to identify any externally validated models that had otherwise been overlooked.

No restriction was placed on patient age or stroke severity. However, studies describing models developed in populations unlikely to be representative of the wider stroke population (e.g. exclusion of the oldest old or patients at the extremes of stroke severity) were excluded. Studies with a focus on transient ischaemic attack or subarachnoid haemorrhage were not considered in this review.

Information sources

A comprehensive search strategy was generated in collaboration with colleagues at the University of Leeds library. The initial strategy included the stroke terms as used by the Cochrane stroke group,²² key terms from relevant papers of which we were already aware and terms arising from discussion within the research team. This strategy was run on the MEDLINE database and reviewed to ascertain whether appropriate papers were being captured. Following this review, the search strategy was amended (Appendix 1 online). The search was then run through MEDLINE, EMBASE, CINAHL, PsycInfo, AMED and ISI Web of Science with no date or language limits on 30 May 2009. Handsearching of references of included studies and previous reviews was performed to identify further potentially relevant citations.

Study selection

Titles were reviewed independently by two researchers (AF and AR) and obviously irrelevant titles excluded. Where there was no consensus, titles were examined by a third reviewer (ET). Titles and abstracts of potentially relevant papers where there was agreement between at least two of the three reviewers were then re-examined by AF and ET and included through consensus. Citations not fulfilling inclusion criteria were discarded (Figure 1).

Figure 1.

Identification of models for inclusion in review.

Data extraction

Data extraction was performed in duplicate by two independent researchers (ET and RL). Further examination of models was only performed if there was evidence that they had been externally validated in independent populations.^23,24

Quality of the prognostic studies was considered against criteria used by Counsell et al.⁵ in their 2001 review of prognostic models in stroke and a framework to assess model internal validity and potential biases as described by Hayden et al.²⁰ Independent statistical appraisal of model development methodology and performance was performed by TM using criteria selected from those suggested by Counsell et al.⁵

Details regarding the name of the case-mix model, author, model variables, reference population (inception cohort and study exclusion criteria), prospective or retrospective data collection, losses to follow-up, outcome measures (and time point of measurement), sample size, external validation of model and feasibility of collection of independent variables were extracted if available. Studies describing the development of models and subsequent validation studies were then grouped together.

Data items: model characteristics

Case-mix adjustment or prognostic models containing independent variables which were considered neither clinically relevant nor feasible for routine collection in non-specialist settings (e.g. magnetic resonance imaging or invasive imaging) were excluded. Models developed on study populations with an inception cohort (time from stroke onset to data collection) of greater than two weeks were also excluded. Models where outcomes were not assessed at a fixed time point following stroke onset were excluded.

Adequate sample size for model development was assumed if the ‘events per variable (EPV)’ was greater than ten for dichotomous outcomes.²⁵ The EPV is calculated by dividing the number of discrete outcome events (e.g. number of deaths) by the number of independent (including dummy) variables in the model. For a dichotomous dependent variable, the lower frequency outcome from the pair is used to calculate EPV.²⁵ Models developed through adoption of a valid method of variable selection (clinical reasoning and forwards or backwards stepwise selection for logistic regression modelling) were retained. Models were excluded if there was no specific consideration of multicollinearity and interaction terms,²⁶ or if linearity assumptions were not tested or addressed if not met.
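The EPV rule described above can be illustrated with a minimal sketch. The function names and the example counts are hypothetical and are not taken from any of the included studies; the sketch assumes only the definition given in the text (lower frequency outcome divided by the number of model terms, dummy variables included, with adequacy taken as EPV greater than ten).

```python
def events_per_variable(n_events: int, n_non_events: int, n_model_terms: int) -> float:
    """Return EPV for a dichotomous outcome.

    The numerator is the less frequent of the two outcomes; the denominator
    is the number of independent variables in the model, counting each dummy
    variable separately.
    """
    if n_model_terms <= 0:
        raise ValueError("n_model_terms must be a positive integer")
    return min(n_events, n_non_events) / n_model_terms


def has_adequate_epv(n_events: int, n_non_events: int, n_model_terms: int,
                     threshold: float = 10.0) -> bool:
    """Apply the 'EPV greater than ten' rule used in this review."""
    return events_per_variable(n_events, n_non_events, n_model_terms) > threshold


# Hypothetical cohort: 60 deaths and 240 survivors.
epv_six_terms = events_per_variable(60, 240, 6)   # 60 / 6
epv_five_terms = events_per_variable(60, 240, 5)  # 60 / 5
print(epv_six_terms, has_adequate_epv(60, 240, 6))
print(epv_five_terms, has_adequate_epv(60, 240, 5))
```

In this hypothetical cohort a six-term model sits exactly on the threshold and would not be classed as adequate, whereas dropping one term would bring it above ten.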

Summary measures

Measures of model performance were extracted from external validation studies of identified models and comprise measures of discriminatory function (e.g. c statistic), sensitivity/specificity analysis, the coefficient of multiple determination (R squared statistic) and calibration in external datasets.
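As an illustration of the sensitivity/specificity analysis referred to above, the following sketch computes both measures from observed and predicted dichotomous outcomes. The function name and the ten-patient example are hypothetical; coding the poor outcome as 1 is an assumption made for the example only.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(observed: Sequence[int],
                            predicted: Sequence[int]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for outcomes coded 1 (event) or 0 (no event)."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)  # events correctly predicted
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)  # events missed
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)  # non-events correctly predicted
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)  # non-events predicted as events
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both outcome classes must be present in 'observed'")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical validation sample of ten patients (1 = poor outcome).
observed = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(observed, predicted)
print(f"Sens {sens:.0%}, Spec {spec:.0%}")
```

Both measures depend on the threshold used to turn a predicted probability into a predicted outcome, which is one reason why values reported by different validation studies are difficult to compare directly.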

Risk of bias

Potential sources of bias from individual studies (retrospective secondary use of data from randomized controlled trials (RCTs) or stroke registers, selected populations that may limit generalizability or overfitting of models) are highlighted and discussed qualitatively. Quantitative assessment of bias has not been performed. Comparisons of models based on performance are limited by disparate measures of performance, data sources and variable methodological quality of external validation studies.

Presentation of results

Quantitative synthesis has not been possible in this review; models are presented in tabular form with their relative performance against each of the quality and statistical criteria outlined. The variables included in each of the models are also provided when they have been reported in the original studies.

Results

The initial search identified 21 592 titles. Removal of duplicates reduced this to 19 867. Screening of titles and abstracts led to two independent reviewers agreeing to the retention of 176 citations. In 487 citations where consensus between these two reviewers was not met, the opinion of a further independent reviewer (ET) was sought with the retention of a further 183 citations. Following discussion (AF and ET), 119 papers were examined in full text. A further 15 potentially relevant citations were identified through handsearching of the reference lists of these papers (ET). In addition, five review articles were retained to identify any models that may have been overlooked.^5–9 Examination of previous reviews did not identify any additional externally validated models fulfilling inclusion criteria and comprising variables that were feasible for collection in routine care. A total of 43 papers were retained for data extraction including three discussion papers (Figure 1).

Twenty-one discrete prognostic or case-mix adjustment models were identified predicting mortality, dependency and functional outcomes following stroke. In addition, two studies described the use of three existing impairment scales to predict post-stroke outcome (Table 1). These were the National Institute of Health Stroke Scale (NIHSS), the Canadian Neurological Scale (CNS) and the Middle Cerebral Artery Neurological Score (MCANS), also known as the Orgogozo score.^27,28 Of the 21 models identified, four^29–32 had been validated by the authors using a ‘split sample technique’. Here the model is developed using data from a proportion of the sample (training set) and validated on the remainder (validation or test set). This represents a form of internal (not external) validation²³ and these models were not considered further.
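The split sample technique can be sketched as follows. This is an illustrative example only: the 70/30 split, the fixed random seed and the function name are assumptions and do not describe the procedure used in any of the four studies.

```python
import random
from typing import Iterable, List, Tuple


def split_sample(patient_ids: Iterable[int],
                 train_fraction: float = 0.7,
                 seed: int = 0) -> Tuple[List[int], List[int]]:
    """Randomly divide one cohort into a training set and a test set."""
    ids = list(patient_ids)
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    rng.shuffle(ids)
    cut = int(round(len(ids) * train_fraction))
    return ids[:cut], ids[cut:]


# Hypothetical cohort of 100 patients drawn from a single study population.
training_set, test_set = split_sample(range(100))
print(len(training_set), len(test_set))
```

Because both subsets come from the same cohort, they share its setting, case mix and data collection methods, which is why performance in the test set is regarded as internal rather than external validation.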

Table 1.

Twenty-one models and three existing scoring systems identified through review process

Model	Citation
Anderson	Anderson et al. (1994)²⁹
Belfast	Fullerton et al. (1983)⁴¹
Bristol	Wade et al. (1983)³⁴
Edinburgh	Prescott et al. (1982)³⁶
G score	Gompertz et al. (1994)⁴²
Guys	Allen (1984)⁴⁰
Johnston	Johnston et al. (2000)⁴⁸
Lincoln	Lincoln et al. (1989)³³
Masiero	Masiero et al. (2007)³⁰
Modified National Institute of Health Stroke Scale (mNIHSS)	Lyden et al. (2001)⁴⁹
Shortened National Institute of Health Stroke Scale (NIHSS_8)	Tirschwell et al. (2002)⁵¹
National Institute of Health Stroke Scale + age (NIHSS+age)	Weimar et al. (2004)⁴⁶
Orpington	Kalra et al. (1993)⁴³
Six Simple Variables	Counsell et al. (2002)³⁸
Tilling	Tilling et al. (2001)⁵⁰
Uppsala	Frithz et al. (1976)⁴⁷
Wang	Wang et al. (2003)³¹
Weimar Ischaemic Stroke Model (Weimar)	Weimar et al. (2002)⁴⁵
Weimar Intracerebral Haemorrhage Model (Weimar_ICH)	Weimar et al. (2006)⁴⁴
Williams	Williams et al. (2000)³²
Young	Young et al. (2001)³⁵
Existing prognostic models
Canadian Neurological Scale (CNS)	Muir (1996)²⁸
Middle Cerebral Artery Neurological Score (Orgogozo score)	Muir (1996)²⁸
National Institute of Health Stroke Scale (NIHSS)	Muir (1996),²⁸ Lai (1998)²⁷

Data extraction was performed from papers describing development and validation of 17 prognostic models. Detailed tables examining studies describing development and validation of individual models are available from the authors. Table 2 offers a summary of the key features of each externally validated model.

Table 2

Performance of identified models against quality criteria

	Reference	Sample size	Data source	Adequate inception cohort	Less than 10% loss to follow-up	No systematic difference in patients lost to follow-up?	Valid outcome measured at fixed time point	Modelling methods	Adequate EPV	Linearity assumptions tested and met	Control for collinearity
Belfast	Fullerton (1983)⁴¹	206	P	+	+		−	CDA	0	−	−
Bristol	Wade (1983)³⁴	162	P	−	−	+	+	LinR	+	0	+
Edinburgh	Prescott (1982)³⁶	155	T	−	+		−	LinR	−	−	−
G score	Gompertz (1994)⁴²	361	P	+	−	0	+	N/A	+	0	0
Guys	Allen (1984)⁴⁰	148	P	+	+	0	−	S LR	−	−	+
Johnston	Johnston (2000)⁴⁸	256	T	+	−	0	+	LR	+	+	−
Lincoln	Lincoln (1989)³³	70	P	−	−	0	−	S LR	−	−	+
mNIHSS	Lyden (2001)⁴⁹	291	T	+	0	0	+	FA	NA	NA	NA
NIHSS_8	Tirschwell (2002)⁵¹	233	T	+	0	0	+	S LR	−	−	+
NIHSS+age	Weimar (2004)⁴⁶	1079	D	+	+		+	S LR	+	+	+
Orpington	Kalra (1993)⁴³	96	P	+	+		+	LinR	+	−	−
SSV	Counsell (2002)³⁸	530	D	+	+		+	S LR	+	+	+
Tilling	Tilling (2001)⁵⁰	299	T	+	0	0	+	MM	+	+	0
Uppsala	Frithz (1976)⁴⁷	344	CN	+	+		+	LR	+	0	+
Weimar	Weimar (2002)⁴⁵	1754	D	+	+		+	S LR	+	+	+
Weimar_ICH	Weimar (2006)⁴⁴	260	P	+	−	+	+	S LR	+	0	+
Young	Young (2001)³⁵	207	T	−	+		+	S LR	−	+	+

P, prospective data collection; T, retrospective use of RCT data; D, data extracted from database or cohort study; CN, data extracted from casenotes; (S) LR, (stepwise) logistic regression; LinR, linear regression; MM, multilevel modelling; FA, factor analysis; CDA, canonical discriminant analysis; N/A or NA, not applicable; +, condition met; −, condition not met; 0, unclear from study reports.

Models marked ‘−’ in the ‘Adequate inception cohort’ column (highlighted in the original table) were not developed on an adequate inception cohort and are not considered further.

Three models (Lincoln, Bristol and Young models)^33–35 were developed on cohorts where model variables were collected on admission to (or discharge from) a rehabilitation facility and therefore the inception cohort (time from stroke event to assessment) was not uniform. Measurement of variables for the development of the Edinburgh model was at four weeks from admission to the acute hospital³⁶ and this is likely to limit the usefulness of this model in the acute stroke setting.

For the remaining 13 models, the inception cohort for model development was less than two weeks. The Six Simple Variables model was developed using data from the Oxford Community Stroke Project (OCSP) cohort.^37,38 Within the original OCSP cohort about three-quarters of assessments were performed within two weeks of the stroke event (median time to assessment 4 days).³⁷ However, the Six Simple Variables model was developed using data on assessments performed up to 30 days following stroke. The proportion of assessments performed after 14 days was small and their inclusion is unlikely to limit the usefulness of the Six Simple Variables as a model.

Models should ideally be developed from prospective data collected according to a protocol with the express purpose of model development.³⁹ There were three main sources of data used for model development and validation: prospective data collection for the purposes of model development, retrospective use of data collected within stroke registers, and the secondary use of data from previously conducted randomized controlled trials.

Five of the 13 remaining studies described data collection with an a priori intention of developing a prognostic model (Belfast, G score, Guys, Orpington, Weimar_ICH).^40–44 These tended to be small studies (sample size 96–361, median 206). The G score⁴² and Weimar_ICH⁴⁴ models both reported loss to follow-up of greater than 10%, but the characteristics of non-responders as compared to patients with complete outcomes data were considered (and the differences found to be non-significant) only by Weimar et al.⁴⁴

Two of the studies described models derived from data held within a large stroke database (Weimar and NIHSS+age)^45,46 and one model (Six Simple Variables) was derived from existing data from prospective cohort studies.³⁸ The Uppsala model was developed through extracting data from patient case-notes.⁴⁷

The four remaining models used data from previously conducted RCTs (Johnston, mNIHSS, NIHSS_8, Tilling).^48–51 These model development studies all excluded participants where outcomes data were not available at one or more time points. One of these studies (Johnston⁴⁸) reported the numbers of patients excluded through incomplete outcomes data, but none compared the characteristics of patients excluded through missing data with the model development study population or reported the approach to missing outcomes data adopted in these trials (Table 2). Exclusion of patients with incomplete data (or those lost to follow-up) through secondary use of data (from RCTs, databases or previously conducted cohort studies) may introduce the risk of systematic bias. Moreover, inclusion or exclusion criteria of RCTs (e.g. exclusion of patients unable to transfer from bed to chair⁵⁰ or exclusion of patients with contraindications to thrombolysis⁵¹) may affect the ability of models developed from trial data to predict outcomes in the groups that were excluded from the training dataset. If validation studies were also performed in selected populations, the performance of models in empirical populations may remain untested and uncertain.

Modelling methodology

The case-mix adjustment models identified in this review are based largely on logistic regression modelling techniques to predict dichotomized outcomes (Table 2). The Tilling model was developed through multilevel modelling which allows consideration of the hierarchical structure of data to predict individual patient recovery trajectories (Barthel Index) over time.⁵⁰ Seven of the 13 models were developed to predict dichotomized continuous (or ordinal) dependent variables (Johnston, G score, Weimar_ICH, Weimar, NIHSS+age, NIHSS_8, mNIHSS).^{42,44–46,48,49,51} Dichotomizing continuous variables into two extremes results in loss of information⁵² and prevents the prediction of more complex outcomes. However, treating an ordinal variable (such as the Barthel Index) as interval data may fail to account for non-linear relationships between the independent and dependent variables unless this is addressed specifically. The Orpington model was developed through linear regression modelling with no report of specific consideration as to whether or not linearity assumptions were met.⁴³
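The loss of information from dichotomization can be shown with a small sketch. The cut-off of 95 mirrors the ‘BI <95’ outcome used in several of the validation studies in Table 3; the function name and the five example scores are hypothetical, and a 0–100 Barthel Index scale is assumed.

```python
def dichotomize_barthel(score: int, cutoff: int = 95) -> int:
    """Return 1 (poor outcome) if the Barthel Index is below the cut-off, else 0."""
    return 1 if score < cutoff else 0


# Hypothetical Barthel Index scores for five patients (0-100 scale assumed).
scores = [0, 50, 90, 95, 100]
dichotomized = [dichotomize_barthel(s) for s in scores]
print(dichotomized)

# Five distinct levels of function collapse into two categories.
print(len(set(scores)), "->", len(set(dichotomized)))
```

After dichotomization a patient scoring 0 and a patient scoring 90 are treated as having the same outcome, so a model fitted to the dichotomized variable cannot distinguish between them.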

Of the 13 remaining models, two (Guys, NIHSS_8)^40,51 had an EPV of less than 10 during model development. One further study (Belfast model) reported insufficient information to determine if an EPV of 10 had been achieved.⁴¹ If the number of observed outcome events (or whichever outcome is fewer for dichotomous dependent variables) per variable is less than 10, the model coefficients are likely to be unstable and the model unreliable.²⁵ This may be reflected in the relatively poor sensitivity⁵³ and specificity^42,53 of the Guys model in external validation studies (Table 3).

Table 3.

Summary of external validation studies for identified models with adequate inception cohorts

	Cumulative sample size	Study	Sample size	Outcome assessed	Model performance
Belfast	102	Gladman (1992)⁵³	102	3 month mortality	Sens 94%, Spec 29%
Guys	871	Gompertz (1994)⁴²	361	6 month BI (dichotomized)	Prediction of poor outcome: Sens 72%, Spec 63%
		Gladman (1992)⁵³	102	3 month mortality	Sens 58%, Spec 83%
		Muir et al. (1996)²⁸	408	Alive at home vs. in care or dead at 3 months	Measured in a model including the NIHSS
					Sens 70%, Spec 89%
G score	No specific validation studies for the G score which is identical to the Guys prognostic score with simplified covariate weightings
Johnston	914	Johnston (2003)⁷¹	299	Excellent outcome	c statistic
				NIHSS ≤1	0.85 Over optimistic predictions
				BI ≥95	0.83
				GOS = 1	0.81
				Devastating outcome	c statistic
				NIHSS ≥20	0.75
				BI <60	0.84
				GOS >2	0.89
		Johnston (2004)⁷²	615	Devastating outcome with NIHSS; dichotomized BI; dichotomized GOS	Study describes use of model rather than validation or performance
mNIHSS	27	Meyer (2002)⁷³	27	To predict NIHSS	Examined reliability/validity of mNIHSS rather than performance. Good inter-rater reliability and concurrent validity. Valid predictor of NIHSS
NIHSS_8	531	Tirschwell (2002)⁵¹	531	Composite good/poor outcome based on NIHSS <1, GOS = 1, BI >95 at 3 months	c statistic for prediction of good outcome = 0.77
NIHSS+age	7150	Weimar (2004)⁴⁶	1307	BI <95 (120 days)	Sens 63%, Spec 83%
				120 day mortality	Sens 58%, Spec 92%
		König et al. (2008)⁷⁴	5843	BI <95 (90 days)	c statistic 0.808
				90 day mortality	c statistic 0.706
Orpington	814	Lai (1998)²⁷	184	BI (treated as interval variable) at 1, 3 and 6 months	R² = 0.62 at 1 month, <0.5 at 3 and 6 months
		Studenski (2001)⁷⁵	413	5 (study-specific) markers of functional independence at 3 and 6 months	c statistic >0.8 for all outcomes at 3 months, 0.74–0.8 at 6 months
		Kalra et al. (1994)⁷⁶	217	Independent living at discharge	Sens 96%, Spec 36%
SSV	8964	Counsell et al. (2002)³⁸	538 (community cohort)	30 day survival	c statistic 0.88
				6 month independent survival	c statistic 0.84
			1330 (hospital cohort)	30 day survival	c statistic 0.86
				6 month independent survival	c statistic 0.84
		FOOD trial (2003)⁶⁴ Dennis et al. (2006)⁶² Dennis et al. (2003)⁶³	2955	6 month independent survival	c statistic 0.79
		Lewis et al. (2008)⁶⁵	537	30 day survival	c statistic 0.73 (Variables collected within 6 hours)
				6 month independent survival	c statistic 0.82
		Reid et al. (2007)⁶⁶	538	6 month independent survival	c statistic 0.79 (all stroke subtypes) Subgroup analysis: Performs well (c statistic > 0.75) for prediction of outcome in ischaemic and haemorrhagic stroke, hyperacute variable collection (<6 h) and in patients both under and over 75
					Performance variable according to stroke severity, tends to perform better in more severe strokes (c statistic 0.78) and is no better than chance for mild strokes (c statistic 0.46)
		Weir et al. (2001)⁶⁷	2774	6 month mortality	c statistic 0.84
		Weir et al. (2003)⁶⁸	292	Inter-rater reliability of variable collection	Kappa statistic for prospective and retrospective variable collection >0.6 (except ability to walk κ 0.55)
Tilling	710	Tilling et al. (2001)⁵⁰	710	Barthel Index (recovery trajectory)	Average difference between observed and predicted BI = −0.4 (limits of agreement −7 to +6)
Uppsala	102	Gladman (1992)⁵³	102	3 month mortality	Sens 30%, Spec 96%
Weimar	1470	German Stroke Study Collaboration (2004)⁷⁷	1470	BI <95	Sens 68%, Spec 86%
				120 day mortality	Sens 47%, Spec 96%
Weimar_ICH	173	Weimar (2006)⁴⁴	173	Dichotomized BI at 100 days	c statistic 0.876
Studies using existing impairment scales to predict outcome
NIHSS	592	Muir et al. (1996)²⁸	408	Alive at home vs. dead or in care at 3 months	Prediction of poor outcome Sens 71%, Spec 90%
		Lai (1998)²⁷	184	BI (treated as interval data) at 1, 3, 6 months	R² = 0.56 at 1 month, <0.5 at 3 and 6 months
Canadian Neurological Scale (CNS)	408	Muir et al. (1996)²⁸	408	Alive at home vs. dead or in care at 3 months	Prediction of poor outcome when added to model with NIHSS Sens 71%, Spec 89%
Middle Cerebral Artery Neurological Score (Orgogozo score)	408	Muir et al. (1996)²⁸	408	Alive at home vs. dead or in care at 3 months	Prediction of poor outcome when added to model with NIHSS Sens 71%, Spec 89% Predictive accuracy 82%

Variables included in each of the models are presented in Table 4. These variables fit into three broad categories: markers of stroke severity, pre-stroke function and comorbidities. All of the models contain a marker of post-stroke motor function and most feature age and conscious level.

Table 4.

Variables included in individual models (only models with adequate inception cohorts are listed)

	Variables included in model
Belfast⁴¹	Albert’s test score, leg function, conscious level, arm power, weighted mental score, non-specific ECG changes
Guys⁴⁰ (and G score)⁴²	Limb paralysis, higher cerebral dysfunction + hemiparesis + hemianopia, drowsy, age, unconscious at onset, uncomplicated hemiparesis
Johnston⁴⁸	Age, NIHSS score, small vessel stroke, previous stroke, diabetes, pre-stroke disability, infarct volume
mNIHSS⁴⁹	Items 1B, 1C, 2, 3, 5a&b, 6a&b, 8, 9, 11 from the NIHSS. Conscious level, gaze, visual fields, upper and lower limb power, sensory function, language and neglect
NIHSS_8⁵¹	NIHSS_15 items 1a, 2, 3, 4, 6a&b, 9, 10. Conscious level, gaze, visual fields, facial paresis and lower limb motor scores, language and dysarthria
NIHSS+age⁴⁶	Age, NIHSS
Orpington⁴³	Arm power, proprioception, balance, cognition
SSV³⁸	Age, living alone, independent pre-stroke, normal GCS verbal score, able to lift both arms, able to walk
Tilling⁵⁰	Age, sex, ethnicity, pre-stroke handicap, limb weakness, dysphasia, dysarthria, incontinence, conscious, swallowing deficit, stroke subtype
Uppsala⁴⁷	Adaptation of Mathew’s score (0–100) Conscious level, orientation, dysphasia, conjugate gaze palsy, facial weakness, arm power, Performance Disability Scale, reflexes, sensation
Weimar_ICH⁴⁴	Age, NIHSS
Weimar⁴⁵	Model 1: Neurological complications, fever, lacunar infarct, diabetes, previous stroke, sex, age, mRS, NIHSS score on admission
	Model 2: Fever, age, NIHSS score on admission

Variable selection was often performed through entering variables reaching statistical significance in univariate analysis into multivariable models. This data-driven approach risks inclusion of variables which are of statistical, but questionable clinical significance (for example the inclusion of non-specific ECG changes as a predictor in the Belfast model⁴¹). Multiple univariate analyses may also identify or disregard correlations through the role of chance.⁵⁴ Forwards or backwards stepwise variable selection was common in model development (Guys, NIHSS_8, NIHSS+age, Six Simple Variables, Weimar, Weimar_ICH)^{38,40,44–46,51} and helps to circumvent problems with collinearity as the effect on model residuals of inclusion and exclusion of combinations of individual variables is considered in turn.

Model performance was quantified through presentation of the c statistic (a measure of the ability of a model to discriminate correctly between the two members of a pair of patients with different outcomes)⁵⁵ or sensitivity and specificity. In the Orpington model development study, the amount of variation in Barthel Index explained by the Orpington model was presented as the R-squared statistic.⁴³
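The c statistic can be computed directly from its pairwise definition, as in the sketch below. The function name and the six-patient example are hypothetical; counting tied predictions as half-concordant is a common convention and is assumed here rather than taken from the included studies.

```python
from typing import Sequence


def c_statistic(observed: Sequence[int], predicted_risk: Sequence[float]) -> float:
    """Proportion of event/non-event pairs in which the event has the higher predicted risk.

    Ties in predicted risk count as half-concordant (an assumed convention).
    """
    events = [r for o, r in zip(observed, predicted_risk) if o == 1]
    non_events = [r for o, r in zip(observed, predicted_risk) if o == 0]
    if not events or not non_events:
        raise ValueError("both outcome classes must be present in 'observed'")
    concordant = 0.0
    for risk_event in events:
        for risk_non_event in non_events:
            if risk_event > risk_non_event:
                concordant += 1.0
            elif risk_event == risk_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical sample: three patients with the event, three without.
observed = [1, 1, 1, 0, 0, 0]
predicted_risk = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
c = c_statistic(observed, predicted_risk)
print(round(c, 3))
```

A value of 0.5 corresponds to discrimination no better than chance and 1.0 to perfect discrimination. Here eight of the nine possible pairs are ranked correctly.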

Table 3 presents data on the performance of the models in the external validation studies identified in this review.

Factors potentially limiting feasibility of variable selection

National clinical audits conducted in a number of countries have revealed variable access to stroke unit care. In the recent Royal College of Physicians National Sentinel Stroke Clinical Audit (2010) of England, Wales and Northern Ireland, 88% of included patients spent some of their inpatient stay on a specialist stroke unit.⁵⁶ National audits in New Zealand and Australia performed in 2009⁵⁷ and Scotland in 2011⁵⁸ revealed that 52%, 74% and 82% respectively of included patients spent time on a stroke unit (these figures exclude patients admitted to hospitals without a stroke unit). These audits occurred at different times and the expectation is that access to specialized stroke unit care will continue to improve over time; however, the general point remains that specialist stroke unit care is not currently universally achieved.

The feasibility of collecting variables within individual models is dependent on the setting and the skills and experience of those performing assessment. The availability of staff trained to perform complex clinical assessments (e.g. the NIHSS) may currently limit the use of some models to specialist settings. In addition, data collection is resource dependent. In funded research projects feasibility of assessments is likely to differ from that of routine clinical care. Nine of the 24 models and prognostic scores identified in this review require complex clinical assessments for completion (Johnston, Lincoln, NIHSS, Weimar, Weimar_ICH, Belfast, NIHSS+age, Uppsala and mNIHSS)^{28,33,41,44–49} and are therefore unlikely to be feasible for collection in non-specialist settings. In addition, three of these models require collection of variables within 6 hours of the stroke event (Johnston, NIHSS+age and Weimar_ICH models).^44,46,48 The feasibility of hyperacute data collection in this way may limit the use of these models to patients admitted directly to the specialist stroke setting.

Discussion

There are currently no universally accepted criteria to assess the quality of prognostic studies used to develop case-mix adjustment models.^19,21,54 However, there is both generic and disease-specific literature that has identified key clinical and statistical criteria that should be considered in model development and assessment.^{5,6,19,20,26,39,54,59–61} The 21 models identified in this review have been assessed against many of these criteria.

There are methodological weaknesses in the development and validation of many of the models considered in this review, and clinical feasibility of collection of complex variables may limit the use of some of the models in the routine care setting.

The extent to which models have been validated in independent cohorts also varied and this will affect the confidence with which these models may be used in settings other than those in which the validation studies were performed. Most (11) models were validated in just one external validation study, and these were often performed by the authors of the instrument. Further validation of these models may have been performed in studies that have not been identified in this review.

The Weimar models to predict dichotomized Barthel Index at 100 days following ischaemic stroke⁴⁵ and the NIHSS+age⁴⁶ are well-developed models, but their performance in independent cohorts has been variable and this may restrict their usefulness as case-mix adjustment models. In addition, the variables used to construct these models incorporate the NIHSS, which requires specific training to administer.

The Six Simple Variables models are robust in terms of model development and have been extensively externally validated in six independent post-stroke populations, including community- and hospital-based cohorts,^38,62–67 and with both prospective and retrospective data extraction. The inter-rater reliability of collection of the six variables has also been shown to be acceptable.⁶⁸ The Six Simple Variables model may be used to predict 30-day mortality and six-month independent survival with c statistics (a measure of the ability of a model to discriminate between patients with good and poor outcomes) consistently greater than 0.75. Model performance where variables are collected hyperacutely (within 6 hours of the event) has been shown to be reasonable (c statistic >0.80⁶⁵,⁶⁶), although the models perform less well in milder strokes.⁶⁶ Calibration of models (agreement between predicted and observed outcomes in independent populations) is generally good, although the Six Simple Variables model tended to make over-pessimistic predictions of survival and living at home⁶³ or independent survival⁶⁵ in moderate to severe strokes. However, it is in these patients that particular uncertainty may exist as regards prognosis and where prognostic models may be most useful.

It has been suggested that the Six Simple Variables model can be used for stratifying patients for RCTs and for the adjustment of observational cohorts,^38,63,65 although the models are not sufficiently robust for predicting outcomes in individual patients following stroke.^38,65 Prognostic scoring systems developed from population-level data should only be used as a guide to predict outcome in individuals as they fail to account for factors such as the recovery trajectory. The multilevel modelling approach using repeated measures of function as adopted by Tilling et al. may address some of these issues.⁶⁹ It should also be considered that there are many markers of recovery that are more likely to be of interest to individual patients and their carers as post-stroke outcomes than the hard endpoints of death and dependency.³⁸ These may include markers of physical or social functioning, mood or quality of life. However, the prediction of these outcomes is further complicated by the complex nature of these endpoints.
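The two performance measures discussed above, discrimination (the c statistic) and calibration, can be illustrated with a minimal sketch in Python. The function names and the eight-patient data set are hypothetical and are not drawn from any study included in this review. The calibration measure shown is the simplest available (mean predicted probability minus observed proportion), chosen here for illustration only.

```python
from itertools import product


def c_statistic(predicted, observed):
    """Proportion of (good, poor) outcome pairs in which the patient with the
    good outcome received the higher predicted probability; ties count 0.5."""
    good = [p for p, y in zip(predicted, observed) if y == 1]
    poor = [p for p, y in zip(predicted, observed) if y == 0]
    if not good or not poor:
        raise ValueError("need at least one patient with each outcome")
    score = 0.0
    for g, b in product(good, poor):
        if g > b:
            score += 1.0
        elif g == b:
            score += 0.5
    return score / (len(good) * len(poor))


def calibration_in_the_large(predicted, observed):
    """Mean predicted probability minus observed proportion with a good outcome.
    A negative value indicates over-pessimistic predictions."""
    n = len(observed)
    return sum(predicted) / n - sum(observed) / n


# Hypothetical data: predicted probability of independent survival at six
# months, and the observed outcome (1 = alive and independent, 0 = otherwise).
predicted = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
observed = [1, 1, 0, 1, 1, 0, 0, 0]

c = c_statistic(predicted, observed)
gap = calibration_in_the_large(predicted, observed)
print(f"c statistic = {c:.3f}, calibration-in-the-large = {gap:+.3f}")
```

In this invented example, 14 of the 16 possible pairs of one good-outcome and one poor-outcome patient are ranked correctly, giving a c statistic of 0.875. A c statistic of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect ranking.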

This review provides a systematic overview of available externally validated prognostic models in stroke, updating previous reviews^5,6 to include more recent models and modelling methodologies. It was based on a comprehensive and replicable search strategy that yielded a large volume of literature for consideration. Despite this process, it is possible that relevant citations describing model development or validation of existing models have been overlooked. In addition, models that are yet to be externally validated and may yet prove to be good predictors of patient outcome have been excluded from the review. Information regarding modelling techniques may not have been reported in detail in individual studies, and where this detail was lacking we have not attempted to obtain this information directly from authors. It is therefore possible that further robust models may have been excluded. Apparently poor performance of individual models in independent populations may reflect the methodology of external validation studies. It has not been possible to offer a quantitative summary of the performance of individual models in external populations due to the heterogeneity of external validation studies. Instead, validation studies have been presented individually to allow comparative assessment of their methodological quality and generalizability.

Conclusions

This review has identified that the Six Simple Variables model demonstrates statistical robustness and good discriminatory function in external validation studies, and comprises variables that are clinically feasible to collect at ward level by non-specialist staff. However, the model predicts the hard endpoints of death or dependency, which do not capture the nuances of complex rehabilitation outcomes or patient-reported outcomes that are likely to be of more interest to patients and their carers following stroke (e.g. reintegration or social functioning).

Alternative modelling approaches for case-mix adjustment, such as latent class analysis,⁷⁰ structural equation modelling or decision trees, may allow exploration of the heterogeneity in the recovery of individuals following stroke, and allowing for the interrelationships among prognostic factors may improve on existing models. Further work is needed to explore the feasibility and utility of these alternative modelling approaches to adjust for case-mix in large unselected populations of patients admitted to hospital with acute stroke.

Clinical messages

Many existing prognostic models in stroke require complex assessments that may limit their feasibility for use in routine stroke care.

The Six Simple Variables prognostic model comprises variables that are feasible to collect in routine settings, is statistically robust and has been extensively externally validated in post-stroke populations.

Many existing prognostic models in stroke (including the Six Simple Variables model) predict mortality or dependency. These endpoints may be of less interest to individual patients and their carers than more complex rehabilitation outcomes.

Footnotes

Acknowledgements

With thanks to Deirdre Andre at the University of Leeds Library for assistance with developing the search strategy, Anita Rajendram (University of Leeds) for initial screening of citations, and Ruth Lambley (Academic Unit of Elderly Care and Rehabilitation, University of Leeds) for assistance with data extraction.

Conflict of interest

None declared.

Funding

We are pleased to acknowledge funding from the National Institute for Health Research (NIHR) Collaborations for Leadership in Applied Health Research and Care (CLAHRC) at Leeds, York and Bradford. The views and opinions expressed in this paper are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health.

References

1. Lilford, Mohammed, Spiegelhalter, Thompson. Use and misuse of process and outcome data in managing performance of acute medical care: avoiding institutional stigma. Lancet 2004; 363: 1147–1154.

2. Dr Foster Health. Dr Foster Report Card. http://www.drfosterhealth.co.uk/quality-reports/ (accessed 29 April 2011).

3. Department of Health. The NHS Outcomes Framework 2011/12. http://www.dh.gov.uk/prod_consum_dh/groups/dh_digitalassets/@dh/@en/@ps/documents/digitalasset/dh_123138.pdf (accessed 29 April 2011).

4. PROMIS: Patient Reported Outcomes Measurement Instrument Systems. http://www.nihpromis.org/ (accessed 14 June 2011).

5. Counsell, Dennis. Systematic review of prognostic models in patients with acute stroke. Cerebrovasc Dis 2001; 12: 159–170.

6. Kwakkel, Wagenaar, Kollen, Lankhorst. Predicting disability in stroke – a critical review of the literature. Age Ageing 1996; 25: 479–489.

7. Meijer, Ihnenfeldt, de Groot, van Limbeek, Vermeulen, de Haan. Prognostic factors for ambulation and activities of daily living in the subacute phase after stroke. A systematic review of the literature. Clin Rehabil 2003; 17: 119–129.

8. Meijer, van Limbeek, Kriek, Enfeldt, Meulen. Prognostic social factors in the subacute phase after a stroke for the discharge destination from the hospital stroke-unit. A systematic review of the literature. Disabil Rehabil 2004; 26: 191–197.

9. Meijer, Ihnenfeldt, van Limbeek, Vermeulen, de Haan. Prognostic factors in the subacute phase after stroke for the future residence after six months to one year. A systematic review of the literature. Clin Rehabil 2003; 17: 512–520.

10. Hier, Edelstein. Deriving clinical prediction rules from stroke outcome research. Stroke 1991; 22: 1431–1436.

11. Jongbloed. Prediction of function after stroke: a critical review. Stroke 1986; 17: 765–776.

12. Segal, Whyte. Modeling case mix adjustment of stroke rehabilitation outcomes. Am J Phys Med Rehabil 1997; 76: 154–161.

13. Salter, Jutai, Zettler, Moses, McClure, Foley, Teasell. Evidence based review of stroke rehabilitation: Outcome measures in stroke rehabilitation. 2011. http://www.ebrsr.com/uploads/Module-21_outcomes.pdf (accessed 8 November 2011).

14. Stroke Unit Trialists’ Collaboration. Organised inpatient (stroke unit) care for stroke. Cochrane Database Syst Rev 2007; (4): CD000197.

15. American Stroke Association’s Task Force on the Development of Stroke Systems. Recommendations for the establishment of stroke systems of care. Stroke 2005; 36: 690–703.

16. Lindsay, Gubitz, Bayley. On behalf of the Canadian Stroke Strategy Best Practices and Standards Writing Group. The Canadian best practice recommendations for stroke care (update 2010). 2010. Ottawa: Canadian Stroke Network.

17. Thomassen, Thorén, Leys, Roine, Anderson. On behalf of the 6th Karolinska Stroke Update, Stockholm Sweden. Organised acute stroke care. Karolinska Stroke Update Consensus Statement, 2006. www.strokeupdate.org/Cons_organised_2006.aspx (accessed 7 June 2011).

18. The Department of Health. The National Stroke Strategy. 2007. http://www.dh.gov.uk/prod_consum_dh/groups/dh_digitalassets/documents/digitalasset/dh_081059.pdf (accessed 24 February 2010).

19. Altman. Systematic reviews in health care: systematic reviews of evaluations of prognostic variables. BMJ 2001; 323: 224–248.

20. Hayden JA, Côté, Bombardier. Evaluation of the quality of prognosis studies in systematic reviews. Ann Intern Med 2006; 144: 427–437.

21. Hemingway, Riley, Altman. Ten steps towards improving prognosis research. BMJ 2010; 339: 410–414.

22. Cochrane Database of Systematic Reviews Stroke Review Group. Stroke search strategy. 2009. http://onlinelibrary.wiley.com/o/cochrane/clabout/articles/STROKE/frame.html (accessed 10 June 2009).

23. Altman, Royston. What do we mean by validating a prognostic model? Stat Med 2000; 19: 453–473.

24. Altman, Vergouwe, Moons. Prognosis and prognostic research: validating a prognostic model. BMJ 2009; 338: 1432–1435.

25. Peduzzi, Concato, Kemper, Holford, Feinstein. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol 1996; 49: 1373–1379.

26. Harrell, Lee, Mark. Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med 1996; 15: 361–387.

27. Lai, Duncan, Keighley. Prediction of functional outcome after stroke: comparison of the Orpington Prognostic Scale and the NIH Stroke Scale. Stroke 1998; 29: 1838–1842.

28. Muir KW, Weir, Murray GD, Povey, Lees. Comparison of neurological scales and scoring systems for acute stroke prognosis. Stroke 1996; 27: 1817–1820.

29. Anderson, Jamrozik, Broadhurst, Stewart-Wynne. Predicting survival for 1 year among different subtypes of stroke: Results from the Perth community stroke study. Stroke 1994; 25: 1935–1944.

30. Masiero, Avesani, Armani, Verena, Ermani. Predictive factors for ambulation in stroke patients in the rehabilitation setting: a multivariate analysis. Clin Neurol Neurosurg 2007; 109: 763–769.

31. Wang, Lim, Heller, Fisher, Levi. A prediction model of 1-year mortality for acute ischemic stroke patients. Arch Phys Med Rehabil 2003; 84: 1006–1011.

32. Williams, Jiang. Development of an ischemic stroke survival score. Stroke 2000; 31: 2414–2420.

33. Lincoln, Blackburn, Ellis. An investigation of factors affecting progress on a stroke unit. J Neurol Neurosurg Psychiatry 1989; 52: 493–496.

34. Wade, Skilbeck, Langton Hewer. Predicting Barthel ADL score at 6 months after an acute stroke. Arch Phys Med Rehabil 1983; 64: 24–28.

35. Young, Bogle, Forster. Determinants of social outcome measured by the Frenchay Activities Index at one year after stroke onset. Cerebrovasc Dis 2001; 12: 114–120.

36. Prescott, Garraway, Akhtar. Predicting functional outcome following acute stroke using a standard clinical examination. Stroke 1982; 13: 641–647.

37. Bamford, Sandercock, Dennis. A prospective study of acute cerebrovascular disease in the community: the Oxfordshire Community Stroke Project 1981–86. J Neurol Neurosurg Psychiatry 1988; 51: 1373–1380.

38. Counsell, Dennis, McDowall, Warlow. Predicting outcome after acute and subacute stroke: development and validation of new prognostic models. Stroke 2002; 33: 1041–1047.

39. Wyatt. Acquisition and use of clinical data for audit and research. J Eval Clin Pract 1995; 1: 15–27.

40. Allen CMC. Predicting the outcome of acute stroke: a prognostic score. J Neurol Neurosurg Psychiatry 1984; 47: 475–480.

41. Fullerton, MacKenzie, Stout. Prognostic indices in stroke. Q J Med 1988; 66: 147–162.

42. Gompertz, Pound, Ebrahim. Predicting stroke outcome: Guy’s prognostic score in practice. J Neurol Neurosurg Psychiatry 1994; 57: 932–935.

43. Kalra, Crome. The role of prognostic scores in targeting stroke rehabilitation in elderly patients. J Am Geriatr Soc 1993; 41: 396–400.

44. Weimar, Roth, Willig, Kostopoulos, Benemann, Diener. Development and validation of a prognostic model to predict recovery following intracerebral hemorrhage. J Neurol 2006; 253: 788–793.

45. Weimar, Ziegler, Konig, Diener. Predicting functional outcome and survival after acute ischemic stroke. J Neurol 2002; 249: 888–895.

46. Weimar, Konig, Kraywinkel, Ziegler, Diener, German Stroke Study Collaboration. Age and National Institutes of Health Stroke Scale Score within 6 hours after onset are accurate predictors of outcome after cerebral ischemia: development and external validation of prognostic models. Stroke 2004; 35: 158–162.

47. Frithz, Werner. Studies on cerebrovascular strokes II. Clinical findings and short-term prognosis in a stroke material. Acta Med Scand 1976; 199: 133–140.

48. Johnston, Connors, Wagner, Knaus, Wang, Clarke. A predictive risk model for outcomes of ischemic stroke. Stroke 2000; 31: 448–455.

49. Lyden, Levine, Brott, Broderick, NINDS rtPA Stroke Study Group. A modified National Institutes of Health Stroke Scale for use in stroke clinical trials: preliminary reliability and validity. Stroke 2001; 32: 1310–1317.

50. Tilling, Sterne, Rudd, Glass, Wityk, Wolfe. A new method for predicting recovery after stroke. Stroke 2001; 32: 2867–2873.

51. Tirschwell, Longstreth Jr, Becker. Shortening the NIH Stroke scale for use in the prehospital setting. Stroke 2002; 33: 2801–2806.

52. Altman, Royston. Statistics notes: the cost of dichotomising continuous variables. BMJ 2006; 332: 1080.

53. Gladman, Harwood, Barer. Predicting the outcome of acute stroke: prospective evaluation of five multivariate models and comparison with simple methods. J Neurol Neurosurg Psychiatry 1992; 55: 347–351.

54. Mallett, Royston, Dutton, Waters, Altman. Reporting methods in studies developing prognostic models in cancer: a review. BMC Med 2010; 8: 20.

55. Royston, Moons KGM, Altman, Vergouwe. Prognosis and prognostic research: developing a prognostic model. BMJ 2008; 338: 1373–1377.

56. Intercollegiate Working Party. National Sentinel Stroke Clinical Audit 2010, Round 7, Public Report for England, Wales and Northern Ireland. 2011, Royal College of Physicians. http://www.rcplondon.ac.uk/sites/default/files/national-sentinel-stroke-audit-2010-public-report-and-appendices_0.pdf (accessed 23 June 2011).

57. Stroke Foundation of New Zealand. National Acute Stroke Services Audit 2009. 2010, Wellington, New Zealand: Stroke Foundation of New Zealand. http://www.stroke.org.nz/resources/SFNZ-NASSA-2009.pdf (accessed 30 June 2011).

58. SCCA Steering Committee. Scottish Stroke Care Audit, 2011 National Report, Stroke Services in Scottish Hospitals. 2011; Edinburgh: ISD Scotland Publications. http://www.strokeaudit.scot.nhs.uk/Downloads/2011_Report/SSCA-report-2011-web-version_new.pdf (accessed 8 November 2011).

59. Laupacis, Sekar, Stiell. Clinical prediction rules: A review and suggested modifications of methodological standards. JAMA 1997; 277: 488–494.

60. Mallett, Royston, Waters, Dutton, Altman. Reporting performance of prognostic models in cancer: a review. BMC Med 2010; 8: 21.

61. Perel, Edwards, Wentz, Roberts. Systematic review of prognostic models in traumatic brain injury. BMC Med Inform Decis Mak 2006; 6: 38.

62. Dennis, Lewis, Cranswick, Forbes. FOOD: A multicentre randomized trial evaluating feeding policies in patients admitted to hospital with a recent stroke. Health Technol Assessment 2006; 10: 1–91.

63. Dennis, Cranswick, Fraser. Performance of a statistical model to predict stroke outcome in the context of a large, simple, randomized, controlled trial of feeding. Stroke 2003; 34: 127–133.

64. Food Trial Collaboration. Poor nutritional status on admission predicts poor outcomes after stroke: observational data from the FOOD trial. Stroke 2003; 34: 1450–1456.

65. Lewis, Sandercock PAG, Dennis. Predicting outcome in hyper-acute stroke: Validation of a prognostic model in the Third International Stroke Trial (IST3). J Neurol Neurosurg Psychiatry 2008; 79: 397–400.

66. Reid, Gubitz, Dai. External validation of a six simple variable model of stroke outcome and verification in hyper-acute stroke. J Neurol Neurosurg Psychiatry 2007; 78: 1390–1391.

67. Weir, Dennis, and the Scottish Stroke Outcomes Study Group. Towards a national system for monitoring the quality of hospital-based stroke services. Stroke 2001; 32: 1415–1421.

68. Weir, Counsell, McDowall, Gunkel, Dennis. Reliability of the variables in a new set of models that predict outcome after stroke. J Neurol Neurosurg Psychiatry 2003; 74: 447–451.

69. Tilling, Sterne, Wolfe. Multilevel growth curve models with covariate effects: application to recovery after stroke. Stat Med 2001; 20: 3474–3486.

70. West, Hill, Hewison, Knapp, House. Psychological disorders after stroke are an important influence on functional outcomes: a prospective cohort study. Stroke 2010; 41: 1723–1727.

71. Johnston, Connors Jr, Wagner, Haley Jr. Predicting outcome in ischemic stroke: external validation of predictive risk models. Stroke 2003; 34: 200–202.

72. Johnston, Connors Jr, Wagner, Haley Jr. Risk adjustment effect on stroke clinical trials. Stroke 2004; 35.

73. Meyer, Hemmen, Jackson, Lyden. Modified National Institute of Health Stroke Scale for use in stroke clinical trials: prospective reliability and validity. Stroke 2002; 33: 1261–1266.

74. Konig, Ziegler, Bluhmki. Predicting long-term outcome after acute ischemic stroke: a simple index works in patients from controlled clinical trials. Stroke 2008; 39: 1821–1826.

75. Studenski, Wallace, Duncan, Rymer, Lai. Predicting stroke recovery: three- and six-month rates of patient-centered functional outcomes based on the Orpington Prognostic Scale. J Am Geriatr Soc 2001; 49: 308–312.

76. Kalra, Dale, Crome. Evaluation of a clinical score for prognostic stratification of elderly stroke patients. Age Ageing 1994; 23: 492–498.

77. German Stroke Study Collaboration. Predicting outcome after acute ischemic stroke: an external validation of prognostic models. Neurology 2004; 62: 581–585.
