Psychometric Properties of the Satisfaction With Life Scale in People With Traumatic Brain, Spinal Cord, or Burn Injury: A National Institute on Disability, Independent Living, and Rehabilitation Research Model System Study

Abstract

This study evaluated the measurement properties of the Satisfaction With Life Scale (SWLS) in a sample of 17,897 people with spinal cord injury (48%, n = 8,566), traumatic brain injury (44%, n = 7,941), and burn injury (8%, n = 1,390), 1 year following injury. We examined measurement invariance across the groups, unidimensionality, local independence, reliability from a classical test and item response theory (IRT) framework, and fit to a unidimensional IRT model. The results support unidimensionality and local independence of the SWLS. Reliability was adequate from a classical test and IRT perspective. IRT analysis found that the SWLS could be improved by using only five response categories rather than seven and by removing the fifth item, “If I could live my life over, I would change almost nothing.” This item functions poorly and reduces instrument reliability. With these revisions, the SWLS is a useful instrument to monitor an important outcome of trauma rehabilitation.

Keywords

measurement invariance, item response theory, Satisfaction With Life Scale, traumatic brain injury, spinal cord injury, burn injury

Satisfaction with life (SWL) reflects a global evaluation of one’s life and is a component of subjective well-being (Diener, 1984; Lee et al., 2013) and health-related quality of life (HR-QOL; Guyatt, Feeny, & Patrick, 1993; Patrick & Erickson, 1988). SWL is a universal construct that allows us to examine how people’s cognitive judgments of their life vary across populations, studies, treatments, and time. It is widely studied in fields ranging from economics and policy research (Dolan & White, 2007; Ng & Diener, 2014) to health (Lee et al., 2013). For instance, economists have examined the effect of changes in household income on life satisfaction (Stutzer, 2004) and evaluated the causal relationship between social leisure and SWL, making important recommendations related to successful retirement policies (Becchetti, Ricca, & Pelloni, 2012). SWL is also important for international comparisons, and is used to evaluate the effects of health, environmental, and economic factors on individuals’ lives (Diener & Tay, 2015). It is especially valuable as a general measure of well-being in people living with chronic illness or disability.

General measures of well-being are an important part of monitoring the impact of traumatic injury on people’s lives. Individuals with traumatic injuries, such as spinal cord injury (SCI), traumatic brain injury (TBI), or burn injury (BI) generally report lower SWL postinjury than the general population or people without injury (Jacobsson & Lexell, 2013; McMahon et al., 2013; Moergeli, Wittmann, & Schnyder, 2012; Post, Van Dijk, Van Asbeck, & Schrijvers, 1998), although one study in TBI suggested that after an adjustment period SWL can increase to or exceed preinjury levels (Schulz-Heik et al., 2016). In all, SWL is a key outcome used to better understand and improve well-being of people, both healthy people and those with chronic conditions and disabilities.

The Satisfaction With Life Scale (SWLS; Diener, Emmons, Larsen, & Griffin, 1985) is one of the most widely used instruments for measuring SWL. The SWLS has been widely translated and used in a number of populations, and is a National Institutes of Health Common Data Element for general life satisfaction for rehabilitation populations with moderate to severe disabilities (Biering-Sorensen et al., 2015; Maas et al., 2010). Longitudinal databases tracking health outcomes of people with traumatic injury, such as the Model Systems funded by the National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR), have used the SWLS in large samples of people with SCI, TBI, and BI who are tracked longitudinally (Dijkers, 1999; Goverman et al., 2016; Putzke, Richards, Hicken, & DeVivo, 2002; Williamson et al., 2016). In SCI research, it has been used to evaluate the effect of factors, such as weight or mechanical ventilation use, on overall HR-QOL (Charlifue et al., 2011; Y. Chen, Cao, Allen, & Richards, 2011; Hartoonian et al., 2014; Tate et al., 2015) and is one of the most frequently used patient-reported measures of quality of life (Wilson, Hashimoto, Dettori, & Fehlings, 2011). The SWLS has also been included in a number of studies with people with TBI (Cicerone & Azulay, 2007; Corrigan, Kolakowsky-Hayner, Wright, Bellon, & Carufel, 2013; Davis et al., 2012). A few studies have evaluated the course, correlates, and predictors of SWL in burn survivors (Costa et al., 2003; Gilboa, Bisk, Montag, & Tsur, 1999; Patterson, Ptacek, Cromes, Fauerbach, & Engrav, 2000). The SWLS has become an important part of tracking and evaluating HR-QOL in individuals with traumatic injury.

This study makes an important contribution to the existing literature. First, it examines if SWL functions the same in each of the three injury populations. Without this evidence, comparisons of SWL across populations may not be valid or interpretable. Second, an in-depth evaluation of the functioning of the scale using the tools provided by item response theory (IRT) provides new ways of examining psychometric properties. Third, it offers specific ways in which the psychometric properties of the scale can be improved.

Studies have generally found good psychometric properties when evaluating the SWLS using classical test theory (CTT) methodology or evidence of convergent validity when examining correlations with instruments of related constructs, such as other measures of life satisfaction, morale, psychological well-being, and positive affect (Pavot, Diener, Colvin, & Sandvik, 1991; Rosengren, Jonasson, Brogårdh, & Lexell, 2015; Urry et al., 2004). Overall, psychometric evaluation of the scale in persons with traumatic injury has been limited. In people with SCI, the SWLS showed good reliability (α = .83), and evaluations of correlations and response patterns supported the validity of SWLS scores (Post, Van Leeuwen, Van Koppenhagen, & De Groot, 2012). However, CTT evaluations have inherent limitations. All estimates, including item difficulty and discrimination, are sample dependent. The assessment of response category functioning is limited to observing response frequencies, and the estimation of reliability, using Cronbach’s alpha, is confined to a single point estimate that is the same for all levels of life satisfaction.

IRT methods allow for more rigorous evaluation of the SWLS by addressing shortcomings of CTT approaches, including sample-dependent estimates of item characteristics and a point estimate for the reliability of the entire scale. Cronbach’s alpha systematically underestimates reliability or provides an estimate with unknown biases (Hambleton & Van Der Linden, 1982), making it difficult to use for power analyses. Higher reliability reduces the sample size needed to detect an effect and allows for greater confidence that differences in scores result from real individual differences in the construct, SWL (Raykov, 2004). The ability to differentiate between the SWL of individuals over time or mean levels of SWL within groups is essential in research aimed at improving HR-QOL outcomes. For this, measurement error must be small and items must provide stable scores (Sijtsma, 2015). The IRT framework provides better tools for examining reliability and improving all of the scale’s psychometric properties.

In the IRT framework, information corresponds to reliability in the CTT framework. As the reciprocal of the squared standard error of the estimate, the amount of information in an item or scale provides a measure of the precision for estimating a person’s level of SWL. Unlike Cronbach’s alpha, information varies across different levels of life satisfaction. By using IRT, the reliability at different levels of SWL can be estimated, and better recommendations for the use of the SWLS can be made (especially regarding the sample size required to detect a treatment effect of a certain size). IRT also facilitates more in-depth evaluations of item functioning, including response options and item reliability.

The goal of this study was to use a combination of CTT, IRT, and factor analytic methods to examine the unidimensionality, functioning of the response options, and reliability of the SWLS in people with multiple traumatic injuries. Given the widespread use of the SWLS in evaluating HR-QOL, and in traumatic injury in particular, plus its use in a much wider body of research, an in-depth evaluation of item and scale functioning is warranted. Recommendations for the improvement and appropriate use of the scale can then be made.

Method

Participants

The sample included individuals with SCI, TBI, or BI, participating in the Model Systems. Participating centers follow people with traumatic injury to examine their long-term outcomes at regular intervals. Informed consent was given at time of enrollment in the Model Systems. Data were collected by the individual institutions participating in Model Systems, and subsequently transferred to the respective national database. The data were collected through in-person, phone interview, or mailed questionnaire (Y. Chen, DeVivo, Richards, & SanAgustin, 2016; Dijkers, Harrison-Felix, & Marwitz, 2010; Klein et al., 2007). All data for the current study were collected at 1 year following injury onset.

Participation by individual institutions in each of the Model Systems is based on a competitive grant process and funded by NIDILRR. Institutions are selected for their ability to provide care and conduct research, among other criteria. Descriptions of the individual institutions and the inclusion criteria for each Model System are reported below.

The goal of the SCI Model System (SCIMS) is to study the longitudinal course of traumatic SCI and factors that affect that course, as well as examine trends and rehabilitation treatment outcomes over time. The SCIMS has been enrolling participants since 1973. Current inclusion and exclusion criteria require that participants have recent traumatic SCI, be admitted to acute inpatient rehabilitation at a SCIMS center within 1 year of injury, and have completed inpatient rehabilitation or achieved a neurological status of normal or minimal deficit. Since its inception, a total of 30 rehabilitation hospitals have served as SCIMS centers across the United States, including 14 current sites (Y. Chen et al., 2016).

The TBI Model System (TBIMS) began data collection in 1987. Participants are enrolled if they have moderate to severe TBI and are admitted to inpatient rehabilitation. They must have presented to the designated TBIMS acute care hospital within 72 hours of injury, received both acute medical and acute rehabilitation care within the same system, been 16 years of age or older, provided informed consent, or if unable, family or legal guardian provided informed consent, and sustained a TBI with at least one of the following characteristics: Glasgow Coma Scale score <13 on emergency admission (not due to intubation, sedation, or intoxication), posttraumatic amnesia >24 hours, or trauma-related intracranial abnormality on neuroimaging. A total of 22 sites have participated in the TBIMS at some point, with 16 currently active sites (Dijkers et al., 2010).

The Burn Model System (BMS) centers program began a longitudinal database on outcomes in 1993. Enrolled participants are required to have received surgery for BIs. Database enrollment criteria for adult participants include meeting one of the following criteria: greater than 10% total body surface area (TBSA) burned and ≥65 years of age, greater than 20% TBSA burned and 18 to 64 years of age, electrical high-voltage/lightning injury, or BI to the hand, face, or feet. Database structure, enrollment, follow-up strategies, and data verification processes have been described in detail (Klein et al., 2007). Four institutions currently participate in the BMS program; five have participated since its inception.

Measures

The SWLS was developed to quantify global quality of life and satisfaction (Diener et al., 1985). Initial validity evidence was obtained from samples of undergraduates and older adults and later was expanded to several patient populations, including those with SCI and TBI (Braden et al., 2012; Post et al., 2012). It has been translated into numerous languages and has demonstrated adequate validity and reliability in a variety of populations (Schumaker, Shea, Monfries, & Groth-Marnat, 1993; Simpson, Schumaker, Dorahy, & Shrestha, 1996). The questionnaire consists of five items describing SWL that respondents rate on a 7-category Likert-type scale ranging from strongly disagree (1) to strongly agree (7). SWLS scores are calculated by summing item responses, with total scores ranging from 5 to 35. Higher scores indicate greater life satisfaction. SWLS scores are assigned qualitative grades by increments of 5 points (e.g., 5-9 = extremely dissatisfied; 31-35 = extremely satisfied). Some previous studies have used a 5-point rating scale (Kobau, Sniezek, Zack, Lucas, & Burns, 2010).
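The scoring rule described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the study; the function name `swls_total` and the validation behavior are assumptions made for the example.

```python
def swls_total(responses):
    """Sum five SWLS item responses, each rated 1 (strongly disagree) to 7 (strongly agree).

    Hypothetical helper for illustration: it rejects incomplete or
    out-of-range response sets rather than imputing anything.
    """
    if len(responses) != 5:
        raise ValueError("the SWLS has exactly five items")
    for r in responses:
        if not isinstance(r, int) or not 1 <= r <= 7:
            raise ValueError("each response must be an integer from 1 to 7")
    # Total scores therefore range from 5 to 35; higher means greater satisfaction.
    return sum(responses)
```

For example, a respondent answering 4, 5, 3, 6, and 2 would receive a total score of 20.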

Analyses

Multiple psychometric properties of the SWLS were examined, including measurement invariance across the three injury groups, unidimensionality, local independence, reliability from both a CTT and an IRT framework, and fit to a unidimensional IRT model.

Measurement Invariance

Because the sample is composed of people with three types of injury, we completed a series of multigroup confirmatory factor analyses (MG-CFAs) to test for measurement invariance; doing so ensures that the construct of SWL is the same for the SCI, TBI, and BI groups. As an initial step, a single-factor confirmatory factor analysis (CFA) was fit to each group. Next, three levels of invariance, configural, metric, and scalar (Steenkamp & Baumgartner, 1998), each with increasing restrictions placed on the model, were assessed (Millsap, 2011). Configural invariance tests that the three populations have the same pattern of zero and nonzero loadings, while allowing the model parameters to be freely estimated. Metric invariance imposes equality of factor loadings across the populations. Retaining metric invariance does not ensure the same latent mean structure across populations, as the intercepts and residuals are free to vary. To confirm the equivalence of latent means across populations, scalar invariance was tested by restricting the pattern and estimates of loadings, as well as intercepts, to be equal across groups. Evidence of scalar invariance suggests that the SWLS functions the same among the three groups, providing support for combining the data for the IRT analysis.

Using the MG-CFA approach, we evaluated the fit of the baseline configural model using χ², root mean square error of approximation (RMSEA), comparative fit index (CFI), and Tucker–Lewis index (TLI). The χ² statistic is a measure of deviation of the fitted model from the unrestricted null model. A nonsignificant χ² statistic indicates the fit of the restricted model is no worse than the unrestricted model. The χ² statistic is sensitive to small deviations in model fit with large samples (Hu & Bentler, 1995). With a sample of almost 18,000, we expected this statistic to be significant so other statistics were used to evaluate measurement invariance. The RMSEA is a parsimony corrected, approximate fit index, with smaller values indicating better fit. Values less than 0.05 are considered to indicate a close approximate fit, while values between 0.05 and 0.08 are considered a reasonable fit (Browne & Cudeck, 1993). The 90% confidence intervals for the RMSEA were also created and evaluated. CFI and TLI are incremental fit indices that compare the fitted model with a baseline model. Hu and Bentler (1999) recommended values of 0.95 or higher as indicators of good fit. Additionally, in keeping with recommendations made by R. B. Kline (2010), we examined the full range of model estimates in addition to fit indices to evaluate the appropriateness of the model.
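As a rough illustration of how the RMSEA relates to χ², degrees of freedom, and sample size, the sketch below uses the conventional single-group point-estimate formula, RMSEA = sqrt(max(χ² − df, 0) / (df × (N − 1))). This formula is an assumption of the example; the text does not state how the software computed the reported values, and estimators can differ in detail. The function name `rmsea` is hypothetical.

```python
import math


def rmsea(chi2, df, n):
    """Conventional single-group RMSEA point estimate (an assumed formula).

    chi2: model chi-square statistic
    df:   model degrees of freedom
    n:    sample size
    """
    # Excess misfit beyond what is expected by chance, floored at zero.
    excess = max(chi2 - df, 0.0)
    # Scale by degrees of freedom (parsimony correction) and sample size.
    return math.sqrt(excess / (df * (n - 1)))
```

Under this formula, the burn group values reported in Table 2 (χ² = 64.522, df = 5, n = 1,390) give an RMSEA that rounds to 0.093, in line with the tabled value.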

If the fit of the configural model is judged adequate, the fit of the more restrictive model is then compared with the less restrictive model. If the fit is not significantly degraded by the imposition of additional restrictions, then the more restrictive model is retained, accepting the restrictions. The change in the χ² (Δχ²) statistic is one way to judge differences in model fit. In Mplus, the DIFFTEST option is used to evaluate the Δχ² statistic when the mean- and variance-adjusted weighted least squares (WLSMV) estimator is used. Because χ² is sensitive to sample size, we judged change in model fit using the approach recommended by Cheung and Rensvold (2002): change in CFI (ΔCFI). Values less than 0.01 indicate a nonsignificant change between models, allowing the more restrictive model to be retained.
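The ΔCFI decision rule can be expressed as a short check. The sketch below is illustrative only; the function name `retain_restricted_model` is hypothetical, and treating ΔCFI as the absolute difference between the two models is an assumption of the example.

```python
def retain_restricted_model(cfi_less_restricted, cfi_more_restricted, cutoff=0.01):
    """Apply the delta-CFI rule for comparing nested invariance models.

    Returns True when the change in CFI is below the cutoff, meaning the
    added equality constraints did not meaningfully degrade fit.
    """
    # Assumption: delta-CFI is taken as an absolute difference.
    delta_cfi = abs(cfi_less_restricted - cfi_more_restricted)
    return delta_cfi < cutoff
```

For instance, a change from a CFI of 0.992 to 0.993 would lead to retaining the more restrictive model, whereas a drop from 0.990 to 0.970 would not.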

Unidimensionality

Sufficient unidimensionality is an assumption of all scales that provide one summary score, whether developed using a CTT or IRT framework. Sufficient unidimensionality means that the summary score is mainly driven by one primary dimension, in this case, SWL. The unidimensional structure of the scale was evaluated using a one-factor CFA. All CFAs were run in Mplus 7.2 (Muthén & Muthén, 1998-2012), using the mean- and variance-adjusted weighted least squares (WLSMV) estimator. A CFI of 0.95 or higher was interpreted as support for unidimensionality (Reise, Scheines, Widaman, & Haviland, 2013).

Local Independence

Local independence is a necessary assumption for multi-item scales that provide a summary score. It assumes that once SWL is taken into account, the items should be independent of each other. Locally dependent items are redundant: they contain less information than the IRT model would predict and may bias item parameter estimates (W.-H. Chen & Thissen, 1997). Local independence violations can be evaluated using a matrix of residual correlations from the CFA. Residual correlations of less than 0.10 were interpreted as support for the absence of local dependence (Kim, De Ayala, Ferdous, & Nering, 2011).

Reliability Analysis

Classical and IRT methods were used to evaluate the reliability of the SWLS score. CTT reliability was evaluated using Cronbach’s alpha and item-total correlations. Cronbach’s alpha is a measure of internal consistency and shows how closely related the scale items are as a group. Cronbach’s alpha provides a single estimate for reliability that is constant across the measured construct (i.e., SWL). Values ranging from 0.70 to 0.90 indicate good reliability (Dunn, 1989). Values higher than 0.90 can indicate redundancy among the items (Boyle, 1991). The contribution of each item to the overall scale reliability was evaluated by examining Cronbach’s alpha once an item was deleted from the scale. Ideally, alpha should decrease if an item is removed, demonstrating a positive contribution of an item to the reliability of the scale. Item-total correlations were also calculated; values greater than 0.30 provide evidence of scale reliability (P. Kline, 1979).
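To make the CTT quantities concrete, the following sketch computes Cronbach’s alpha and alpha-if-item-deleted from a small matrix of responses. It is a minimal illustration using only the Python standard library, not the software used in the study; the function names and the toy data in the usage note are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item removed in turn (one value per item)."""
    k = len(rows[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(row) if j != drop] for row in rows])
        for drop in range(k)
    ]
```

With toy data such as `[[1, 1, 2], [2, 2, 1], [3, 3, 2], [4, 4, 1]]`, where the third item is unrelated to the first two, alpha rises when the third item is dropped. This is the pattern that flags an item as not contributing positively to scale reliability.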

Reliability from an IRT perspective is conceptualized as the amount of information in a given item or entire scale. When there is more information available, there is less variability. Lower variability increases the accuracy and, in turn, reliability of the entire scale. Reliability in the IRT framework varies along the continuum of SWL. We evaluated reliability from an IRT perspective by examining the test information curve; a value of 5 corresponds to a reliability of 0.80 in CTT, while test information of 10 equals a reliability of 0.90.
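The correspondence between test information and reliability quoted above follows from the relation reliability = 1 − 1/information, which assumes the latent trait is scaled to unit variance (an assumption of this sketch, and the usual IRT convention). The function names below are hypothetical.

```python
import math


def reliability_from_information(information):
    """Approximate reliability at a trait level, assuming unit trait variance."""
    return 1.0 - 1.0 / information


def standard_error_from_information(information):
    """Standard error of the trait estimate: 1 / sqrt(information)."""
    return 1.0 / math.sqrt(information)


def information_for_reliability(reliability):
    """Information needed to reach a target reliability (inverse of the first function)."""
    return 1.0 / (1.0 - reliability)
```

Under this relation, information of 5 corresponds to a reliability of 0.80 and information of 10 to a reliability of 0.90, matching the benchmarks used here.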

IRT Analysis

The data were fit to Samejima’s (1969) graded response (GR) model using IRTPRO 3 after confirming that the IRT assumptions of unidimensionality and local independence were met. A number of approaches were taken to assess the fit and adequacy of the GR model. Overall, model fit was judged by the M₂ statistic (Maydeu-Olivares & Joe, 2005) and by its associated RMSEA value (Maydeu-Olivares & Joe, 2014). Item-level fit was evaluated using Orlando and Thissen’s (2003) S − χ², a review of the estimated item parameters, and an inspection of item trace lines and item information curves. Item parameters of interest include the ability measure (θ) and the slope. θ is located on the continuum of the underlying trait being measured, akin to CTT’s item difficulty. The slope is essentially the discrimination, or how well an item differentiates between individuals at different ability levels. The trace lines are curves that indicate the probability of selecting a particular response option along the trait continuum. Ideally, the curve for each response option should be the highest at some point along the continuum, not covered by another response curve. The item information curve shows how the information changes across different levels of the trait, as measured by the item.
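The trace lines of the GR model can be sketched directly from an item’s slope and thresholds. The example below is a minimal illustration in the standard logistic parameterization (slope times the distance between θ and each threshold), which is an assumption about how the reported estimates are scaled; the function names are hypothetical, and the parameter values are the Item 1 estimates reported in Table 4. It is not the IRTPRO implementation.

```python
import math


def grm_category_probs(theta, slope, thresholds):
    """Category probabilities under the graded response model at trait level theta."""
    # Cumulative probability of responding in category k or higher;
    # padded with 1.0 (lowest category or higher) and 0.0 (above the top category).
    cumulative = (
        [1.0]
        + [1.0 / (1.0 + math.exp(-slope * (theta - b))) for b in thresholds]
        + [0.0]
    )
    # Each category probability is the difference of adjacent cumulative curves.
    return [cumulative[k] - cumulative[k + 1] for k in range(len(thresholds) + 1)]


def modal_categories(slope, thresholds, lo=-4.0, hi=4.0, steps=801):
    """Categories (1-based) that are the most likely response somewhere on a theta grid."""
    modal = set()
    for i in range(steps):
        theta = lo + (hi - lo) * i / (steps - 1)
        probs = grm_category_probs(theta, slope, thresholds)
        modal.add(probs.index(max(probs)) + 1)
    return sorted(modal)


# Item 1 estimates as reported in Table 4.
ITEM1_SLOPE = 2.88
ITEM1_THRESHOLDS = [-1.01, -0.35, -0.04, 0.19, 0.60, 1.57]
```

Calling `modal_categories(ITEM1_SLOPE, ITEM1_THRESHOLDS)` lists the response options whose trace line is highest at some point on the grid; any option missing from that list is one whose curve is always covered by another option’s curve.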

Results

Participants

The demographic characteristics of the three samples are presented in Table 1. The sample included 17,897 individuals with traumatic injury, 48% (n = 8,566) with SCI, 44% (n = 7,941) with TBI, and 8% (n = 1,390) with BI participating in the Model Systems. The mean age for the combined sample was 38.85 years, ranging from 35.10 for individuals with BI to 39.40 for those with TBI. Higher proportions of males than females were observed in all groups, with an average of 75.14% males. The percentage of individuals identifying as White ranged from 59.03% to 71.40%, with an overall proportion of 69.27%. A greater proportion of the TBI sample (38.48%) reported some college education or more than the SCI sample (20.92%). The majority of participants were married at time of injury. Education level and marital status are not available for individuals with BI.

Table 1.

Descriptive Statistics by Condition.

	Total	TBI	SCI	BMS
N	17,897	7,941	8,566	1,390
	M (SD)	M (SD)	M (SD)	M (SD)
Age, years	38.85 (17.58)	39.40 (18.26)	38.94 (16.44)	35.10 (19.88)
	N (%)	N (%)	N (%)	N (%)
Male	13,448 (75.14)	5,707 (71.87)	6,766 (78.99)	975 (70.14)
White	12,398 (69.27)	5,670 (71.40)	5,911 (69.01)	817 (59.03)
Some college or more	4,827 (29.24)	3,035 (38.48)	1,792 (20.92)	—
Single at time of injury	7,727 (46.81)	3,728 (46.9)	3,999 (46.7)	—
Employed at time of injury	11,304 (63.16)	5,233 (65.9)	5,316 (62.1)	755 (54.6)
Injury severity	M (SD)			M (SD)
Total body surface area burned	24.75 (19.86)	—	—	24.75 (19.86)
Category of neurological impairment	N (%)		N (%)
Paraplegia
Incomplete	1,640 (19.1)		1,640 (19.1)
Complete	2,082 (24.3)		2,082 (24.3)
Minimal deficit	27 (0.3)	—	27 (0.3)	—
Tetraplegia
Incomplete	3,082 (36.0)		3,082 (36.0)
Complete	1,310 (15.3)		1,310 (15.3)
Minimal deficit	43 (0.5)		43 (0.5)
Normal neurologic	6 (0.1)		6 (0.1)
Unknown	376 (4.4)		376 (4.4)

Note. TBI = traumatic brain injury; SCI = spinal cord injury; BMS = Burn Model System.

Measures of injury severity for SCI and BI are included for interested readers; no such measure is available for people with TBI. One measure of burn severity used by the BMS is the TBSA burned. Higher percentages of TBSA burned indicate greater injury severity. The SCIMS includes a measure of the neurologic impairment associated with the injury. Individuals are classified as having paraplegia (paralysis and/or loss of sensation in the legs and lower body), tetraplegia (paralysis and/or loss of sensation in all four limbs and torso), or normal neurologic status. Paraplegia and tetraplegia are further subclassified as complete injury, incomplete injury, or minimal deficit. Complete injury refers to an absence of sensory or motor function in the lowest sacral segment. Incomplete injury refers to a partial preservation of sensory and/or motor function below the neurological level and includes the lowest sacral segment. Minimal deficit refers to minimal neurological damage with no significant or incapacitating loss of function. Finally, normal neurologic status refers to individuals who have no demonstrable muscular weakness or impaired sensation and are free of other significant neurologic complications.

Measurement Invariance

The results for the one-factor model are presented in Table 2. The χ² and RMSEA indicated poor model fit; however, the CFI and TLI showed good model fit. The results of the MG-CFAs provided support for configural, metric, and scalar measurement invariance and are presented in Table 3. The fit of the initial configural model was acceptable, indicating the same pattern of loadings for each subgroup. The test of metric invariance, with loadings constrained to be equal between subgroups, showed good model fit. The Δχ² statistic indicated a significant difference in model fit, suggesting the metric model provided a worse fit to the data, Δχ²(10) = 103.48, p < .001. However, the ΔCFI = 0.001 provided evidence that the fit was not meaningfully worse for the metric model. Thus, the model was retained and metric invariance was supported.

Table 2.

Model Fit Statistics for One-Factor Model for Each Injury Group.

	BMS	TBI	SCI
χ²(df)	64.522(5)*	444.030(5)*	64.522(5)*
RMSEA	0.093	0.105	0.157
90% CI	[0.073, 0.113]	[0.097, 0.114]	[0.149, 0.165]
CFI	0.998	0.993	0.983
TLI	0.995	0.985	0.967

Note. BMS = Burn Model System; TBI = traumatic brain injury; SCI = spinal cord injury; RMSEA = root mean square error of approximation; CI = confidence interval; CFI = comparative fit index; TLI = Tucker–Lewis index; df = degrees of freedom.

*p < .001.

Table 3.

Model Fit Statistics for the Multigroup Confirmatory Factor Analysis.

	Configural	Metric	Scalar
χ²(df)	1223.383(53)*	1019.090(63)*	1060.112(73)*
Δχ²(df)	NA	103.482(10)*	144.203(10)*
RMSEA	0.061	0.050	0.048
90% CI	[0.058, 0.064]	[0.048, 0.053]	[0.045, 0.050]
CFI	0.992	0.993	0.993
ΔCFI	NA	0.001	0.000
TLI	0.995	0.997	0.997

Note. RMSEA = root mean square error of approximation; CI = confidence interval; CFI = comparative fit index; TLI = Tucker–Lewis index; df = degrees of freedom.

*p < .001.

The fit statistics indicated a close approximate fit for the scalar model. As with the metric model, the Δχ² statistic indicated a significant difference in model fit for the scalar model, Δχ²(10) = 144.20, p < .001, but the ΔCFI < 0.001 suggests the scalar model fits equally well and was retained. The results of the tests of measurement invariance suggested the scale functions in the same manner for participants with BI, SCI, or TBI. Individuals with similar levels of SWL tend to respond in similar ways, no matter their condition. This finding lends support to the use of SWLS scores for comparisons between the conditions. Consequently, the groups were combined and analyzed together in the IRT analysis.

Unidimensionality

The results of the initial single-factor model provided evidence for the essentially unidimensional nature of the scale, with a CFI of 0.99. This result suggests that the single-factor model provides an adequate representation of the latent structure, supporting both the use of the unidimensional IRT model and the use of one summary score.

Local Independence

The residual correlations between items ranged from −0.04 to 0.09, indicating low or no local dependence.

Reliability

The CTT analysis supported adequate reliability (α = .85) of the SWLS. Item 5, “If I could live my life over, I would change almost nothing,” did not contribute positively to the overall reliability, with α increasing to .86 with the item’s removal. Item-total correlations ranged from 0.52 (Item 5) to 0.75 (Item 3).

Evaluation of the total information curve (Figure 1) found strong reliability in the center, near the mean SWLS score. The information function was slightly negatively skewed, providing better reliability a bit above the mean than below. The estimated reliability was greater than 0.80 from approximately 1.8 SDs below mean SWL to 2.0 SDs above, that is, approximately between the scores of 6 and 34, encompassing nearly the full range of possible SWLS scores. Of the current sample, 3,544 individuals, or 19.8%, would fall outside of this range.

Figure 1.

Total information curve and standard error.

IRT Analysis

The data for the whole sample were fit to the GR model. We anticipated poor item-level fit due to the sensitivity of the S − χ² statistic with such a large sample size. As expected, the S − χ² statistic was significant for each item, ps < .001. The slopes and information for Items 1 to 3 were good, providing evidence of their usefulness in the model and the scale. The slopes (Table 4) for Items 4 and 5 are noticeably smaller, indicating less discrimination and providing less information throughout the item, as seen by the dotted lines in Figures 2 to 6.

Table 4.

Parameter Estimates for the Graded Response Model.

Item	Slope	Threshold
Item	Slope	1	2	3	4	5	6
1	2.88	−1.01	−0.35	−0.04	0.19	0.60	1.57
2	3.41	−1.09	−0.35	−0.04	0.18	0.59	1.44
3	3.51	−1.25	−0.59	−0.32	−0.13	0.25	1.19
4	1.81	−1.76	−0.87	−0.53	−0.30	0.17	1.43
5	1.33	−1.32	−0.30	0.08	0.31	0.71	1.92

Figure 2.

Trace lines and item information function for Item 1.

Figure 3.

Trace lines and item information function for Item 2.

Figure 4.

Trace lines and item information function for Item 3.

Figure 5.

Trace lines and item information function for Item 4.

Figure 6.

Trace lines and item information function for Item 5.

The trace lines in Figures 2 to 6 indicate that one or more categories in each item were unlikely to be chosen by any participants, regardless of the level of a person’s SWL, indicating that only some response options are useful in differentiating SWL levels (Adams, Wu, & Wilson, 2012). Seven appears to be too many response options.

Item 5 made little contribution to the reliability of the scale, as indicated by the low item information seen in Figure 6 and an increase of Cronbach’s alpha with the item’s removal. Combined with the failure of some categories to differentiate individuals, this item appears to add little to the overall scale.

Discussion

The results of this study generally support adequate psychometric properties of the SWLS. First, we found support that the construct of SWL functions in a similar way for all the three traumatic injury populations. Thus, scores on the SWLS can be compared between individuals with different traumatic injuries, opening additional avenues of HR-QOL research in individuals with traumatic injuries.

Second, the results supported unidimensionality and local independence of the SWLS items, suggesting that the items all measure a single construct. This finding corroborates the use and interpretation of a single summary score for the SWLS in people with traumatic injuries. These findings also support the use of the IRT model, and interpretation of item scores.

Third, the adequate reliability of the scale was supported by CTT and IRT evidence. The IRT analysis showed adequate reliability within approximately two standard deviations of the mean SWL score. Of the fifth of the sample whose scores did not reach adequate reliability, the vast majority (about 81%) had scores more than two standard deviations above the mean, indicating that high levels of SWL were not reliably measured by the SWLS. Caution should therefore be exercised when reporting and interpreting scores outside of this range, as increased levels of error would undermine certainty in the results.

The results of the IRT analysis indicate the scale could be improved in two ways. First, seven response options are not supported. For all items, at least two response options have a lower likelihood of ever being endorsed than the other options; the trace lines for these options are covered by the trace lines for other response options. The study results support a maximum of five response categories (strongly disagree, disagree, neither agree nor disagree, agree, strongly agree) rather than the current seven, confirming previous research using the SWLS with a 5-point scale (Kobau et al., 2010) and in keeping with research showing that five response options tend to maximize reliability (Bandalos & Enders, 1996). The use of a 5-point scale would also bring the SWLS in line with other widely used HR-QOL measures, like the Patient-Reported Outcomes Measurement Information System measures (Carle et al., 2011), increasing uniformity of response options across scales, simplifying instructions, and potentially decreasing respondent burden (Bradburn, 1978). The burden placed on respondents by different aspects of surveys has long been a concern (Sharp & Frankel, 1983). As the Model Systems surveys, and those included in other HR-QOL research, include a number of scales in addition to demographic information, the streamlining of responses and lessening of burden could have a meaningful impact.

Second, the fifth item, “If I could live my life over, I would change almost nothing,” does not function well and reduces the reliability of the scale. Other researchers have also found issues with the item (Heinemann, Sokol, Garvin, & Bode, 2002). Tulsky et al. (2011) noted that some individuals who have suffered a traumatic injury may find Item 5 offensive, as it asks whether the person would change nothing about their life if given a choice. Partly because of this concern, the appropriateness of the scale for individuals with traumatic injury has been called into question (Hill, Noonan, Sakakibara, & Miller, 2010). Although previous research has identified this issue, researchers continue to use the full scale (Goverman et al., 2016; Tate et al., 2015). Removal of Item 5 would improve the psychometric properties of the scale and obviate concerns about the appropriateness of the scale for individuals with traumatic injury. It would remove a potential source of construct-irrelevant variance and make the scale more universally comparable, improving its utility in research.

A consequence of implementing these recommendations is the loss of comparability with previous reports. This limitation could be overcome by developing a crosswalk table to convert the original scores into modified scale scores, or by developing an alternate scoring of the four items. In this way, the psychometric properties of the SWLS could be improved while still maintaining continuity with scores obtained in previous studies using the original items and scoring.
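One crude form of alternate scoring for the four retained items is simple proration, which rescales the four-item sum onto the range of the original five-item total. The sketch below is an illustrative assumption, not a method evaluated in this study; a formal crosswalk would more likely be derived from the IRT model or from score linking. The function name is hypothetical.

```python
def prorated_total(four_item_responses):
    """Rescale a four-item SWLS sum (7-point items) to the 5-35 metric.

    Assumption: simple proration (sum * 5 / 4), which treats the
    omitted item as if it equaled the mean of the other four.
    """
    if len(four_item_responses) != 4:
        raise ValueError("expected exactly four item responses")
    if any(not 1 <= r <= 7 for r in four_item_responses):
        raise ValueError("responses must be in the range 1-7")
    return sum(four_item_responses) * 5 / 4
```

Proration preserves the original score range but ignores differences in how individual items function, which is why a model-based crosswalk would generally be preferable.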

Limitations

The results of this study need to be considered within the context of its limitations. No evidence for the validity of the SWLS scores was collected or evaluated during this study. The evaluation of validity evidence for the use of the SWLS scores in the three populations was outside the scope of this study. Validity evidence for the use of SWLS scores in individuals with SCI is available. However, there is no such evidence in TBI and BI samples. The removal of Item 5 would obviate concern for the applicability of the SWLS to persons with traumatic injury (Hill et al., 2010; Tulsky et al., 2011). More general concerns for the use of the scale in populations with traumatic injury warrant further research into the validity of the SWLS scores.

The samples used in the current study were unbalanced, with fewer cases with BI than either SCI or TBI. In other types of analysis, the results might be unduly influenced by the larger groups. However, the measurement invariance in the SWLS across the three groups indicates that the unbalanced sample did not affect the IRT results.

Finally, the samples for each of the injury types, SCI, TBI, and BI, are not necessarily representative of the larger population of individuals with such injuries, though Corrigan et al. (2012) provide evidence of the TBIMS’ representativeness. The specific recruitment and inclusion conditions for each of the model systems make the samples less general than the whole of individuals with SCI, TBI, or BI.

Conclusion

This study provided support for the comparison of SWLS scores in populations with SCI, TBI, and BI, as well as recommendations for improving the functioning of the scale. The SWLS could be improved by using five rather than seven response options and by removing Item 5.

Footnotes

Authors’ Note

The contents of this publication do not necessarily represent the policy of National Institute on Disability, Independent Living and Rehabilitation Research, Administration for Community Living, Department of Health and Human Services, and endorsement by the Federal Government should not be assumed.

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The contents of this publication were developed in part under grants from the National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR Grant Numbers H133P120002, 90DP0053, 90DP0031, 90SI5006). NIDILRR is a center within the Administration for Community Living and Department of Health and Human Services.

References

Adams, R. J., Wu, M. L., & Wilson (2012). The Rasch rating model and the disordered threshold controversy. Educational and Psychological Measurement, 72, 547-573. doi:10.1177/0013164411432166

Bandalos, D. L., & Enders, C. K. (1996). The effects of nonnormality and number of response categories on reliability. Applied Measurement in Education, 9, 151-160.

Becchetti, Ricca, E. G., & Pelloni (2012). The relationship between social leisure and life satisfaction: Causality and policy implications. Social Indicators Research, 108, 453-490.

Biering-Sorensen, Alai, Anderson, Charlifue, Chen, DeVivo, . . . Jakeman, L. B. (2015). Common data elements for spinal cord injury clinical research: A National Institute for Neurological Disorders and Stroke project. Spinal Cord, 53, 265-277. doi:10.1038/sc.2014.246

Boyle, G. J. (1991). Does item homogeneity indicate internal consistency or item redundancy in psychometric scales? Personality and Individual Differences, 12, 291-294. doi:10.1016/0191-8869(91)90115-R

Bradburn (1978). Respondent burden. In Proceedings of the Survey Research Methods Section of the American Statistical Association (pp. 35-40). Retrieved from http://ww2.amstat.org/sections/srms/Proceedings/papers/1978_007.pdf

Braden, C. A., Cuthbert, J. P., Brenner, Hawley, Morey, Newman, . . . Harrison-Felix (2012). Health and wellness characteristics of persons with traumatic brain injury. Brain Injury, 26, 1315-1327. doi:10.3109/02699052.2012.706351

Browne, M. W., & Cudeck (1993). Alternative ways of assessing model fit. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 136-162). Newbury Park, CA: Sage.

Carle, A. C., Cella, Cai, Choi, S. W., Crane, P. K., Curtis, S. M., . . . Hays, R. D. (2011). Advancing PROMIS’s methodology: Results of the Third Patient-Reported Outcomes Measurement Information System (PROMIS®) Psychometric Summit. Expert Review of Pharmacoeconomics & Outcomes Research, 11, 677-684. doi:10.1586/erp.11.74

Charlifue, Apple, Burns, S. P., Chen, Cuthbert, J. P., Donovan, W. H., . . . Pretz, C. R. (2011). Mechanical ventilation, health, and quality of life following spinal cord injury. Archives of Physical Medicine and Rehabilitation, 92, 457-463. doi:10.1016/j.apmr.2010.07.237

Chen, W.-H., & Thissen (1997). Local dependence indexes for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22, 265-289. doi:10.2307/1165285

Chen, Cao, Allen, & Richards, J. S. (2011). Weight matters: Physical and psychosocial well being of persons with spinal cord injury in relation to body mass index. Archives of Physical Medicine and Rehabilitation, 92, 391-398. doi:10.1016/j.apmr.2010.06.030

Chen, DeVivo, M. J., Richards, J. S., & SanAgustin, T. B. (2016). Spinal cord injury model systems: Review of program and national database from 1970 to 2015. Archives of Physical Medicine and Rehabilitation, 97, 1797-1804. doi:10.1016/j.apmr.2016.02.027

Cheung, G. W., & Rensvold, R. B. (2002). Evaluating goodness-of-fit indexes for testing measurement invariance. Structural Equation Modeling, 9, 233-255. doi:10.1207/S15328007SEM0902_5

Cicerone, K. D., & Azulay (2007). Perceived self-efficacy and life satisfaction after traumatic brain injury. Journal of Head Trauma Rehabilitation, 22, 257-266. doi:10.1097/01.HTR.0000290970.56130.81

Corrigan, J. D., Cuthbert, J. P., Whiteneck, G. G., Dijkers, M. P., Coronado, Heinemann, A. W., . . . Graham, J. E. (2012). Representativeness of the Traumatic Brain Injury Model Systems National Database. Journal of Head Trauma Rehabilitation, 27, 391-403. doi:10.1097/HTR.0b013e3182238cdd

Corrigan, J. D., Kolakowsky-Hayner, Wright, Bellon, & Carufel (2013). The Satisfaction With Life Scale. Journal of Head Trauma Rehabilitation, 28, 489-491. doi:10.1097/HTR.0000000000000004

Costa, B. A., Engrav, L. H., Holavanahalli, Lezotte, D. C., Patterson, D. R., Kowalske, K. J., & Esselman, P. C. (2003). Impairment after burns: A two-center, prospective report. Burns, 29, 671-675.

Davis, L. C., Sherer, Sander, A. M., Bogner, J. A., Corrigan, J. D., Dijkers, M. P., . . . Seel, R. T. (2012). Preinjury predictors of life satisfaction at 1 year after traumatic brain injury. Archives of Physical Medicine and Rehabilitation, 93, 1324-1330. doi:10.1016/j.apmr.2012.02.036

Diener (1984). Subjective well-being. Psychological Bulletin, 95, 542-575. doi:10.1037/0033-2909.95.3.542

Diener, Emmons, R. A., Larsen, R. J., & Griffin (1985). The Satisfaction With Life Scale. Journal of Personality Assessment, 49, 71-75. doi:10.1207/s15327752jpa4901_13

Diener & Tay (2015). Subjective well-being and human welfare around the world as reflected in the Gallup World Poll. International Journal of Psychology, 50, 135-149. doi:10.1002/ijop.12136

Dijkers, M. P. (1999). Correlates of life satisfaction among persons with spinal cord injury. Archives of Physical Medicine and Rehabilitation, 80, 867-876. doi:10.1016/S0003-9993(99)90076-X

Dijkers, M. P., Harrison-Felix, & Marwitz, J. H. (2010). The traumatic brain injury model systems: History and contributions to clinical service and research. Journal of Head Trauma Rehabilitation, 25, 81-91. doi:10.1097/HTR.0b013e3181cd3528

Dolan & White, M. P. (2007). How can measures of subjective well-being be used to inform public policy? Perspectives on Psychological Science, 2, 71-85.

Dunn (1989). Design and analysis of reliability studies: The statistical evaluation of measurement errors. New York, NY: Edward Arnold.

Gilboa, Bisk, Montag, & Tsur (1999). Personality traits and psychosocial adjustment of patients with burns. Journal of Burn Care & Rehabilitation, 20, 340-346.

Goverman, Mathews, Nadler, Henderson, McMullen, Herndon, . . . Schneider, J. C. (2016). Satisfaction with life after burn: A Burn Model System National Database Study. Burns, 42, 1067-1073. doi:10.1016/j.burns.2016.01.018

Guyatt, G. H., Feeny, D. H., & Patrick, D. L. (1993). Measuring health-related quality of life. Annals of Internal Medicine, 118, 622-629.

Hambleton, R. K., & Van Der Linden, W. J. (1982). Advances in item response theory and applications: An introduction. Applied Psychological Measurement, 6, 373-378. doi:10.1177/014662168200600401

Hartoonian, Hoffman, J. M., Kalpakjian, C. Z., Taylor, H. B., Krause, J. K., & Bombardier, C. H. (2014). Evaluating a spinal cord injury-specific model of depression and quality of life. Archives of Physical Medicine and Rehabilitation, 95, 455-465. doi:10.1016/j.apmr.2013.10.029

Heinemann, A. W., Sokol, Garvin, & Bode, R. K. (2002). Measuring unmet needs and services among persons with traumatic brain injury. Archives of Physical Medicine and Rehabilitation, 83, 1052-1059.

Hill, M. R., Noonan, V. K., Sakakibara, B. M., & Miller, W. C. (2010). Quality of life instruments and definitions in individuals with spinal cord injury: A systematic review. Spinal Cord, 48, 438-450. doi:10.1038/sc.2009.164

Hu, L.-t., & Bentler, P. M. (1995). Evaluating model fit. In R. H. Hoyle (Ed.), Structural equation modeling: Concepts, issues, and applications (pp. 76-99). Thousand Oaks, CA: Sage.

Hu, L.-t., & Bentler, P. M. (1999). Cutoff criteria for fit indexes in covariance structure analysis: Conventional criteria versus new alternatives. Structural Equation Modeling, 6, 1-55.

Jacobsson & Lexell (2013). Life satisfaction 6-15 years after a traumatic brain injury. Journal of Rehabilitation Medicine, 45, 1010-1015. doi:10.2340/16501977-1204

Kim, De Ayala, R. J., Ferdous, A. A., & Nering, M. L. (2011). The comparative performance of conditional independence indices. Applied Psychological Measurement, 35, 447-471. doi:10.1177/0146621611407909

Klein, M. B., Lezotte, D. L., Fauerbach, J. A., Herndon, D. N., Kowalske, K. J., Carrougher, G. J., . . . Engrav, L. H. (2007). The National Institute on Disability and Rehabilitation Research burn model system database: A tool for the multicenter study of the outcome of burn injury. Journal of Burn Care & Research, 28, 84-96.

Kline (1979). Psychometrics and psychology. New York, NY: Academic Press.

Kline, R. B. (2010). Principles and practice of structural equation modeling (3rd ed.). New York, NY: Guilford Press.

Kobau, Sniezek, Zack, M. M., Lucas, R. E., & Burns (2010). Well-being assessment: An evaluation of well-being scales for public health and population estimates of well-being among US adults. Applied Psychology: Health and Well-Being, 2, 272-297. doi:10.1111/j.1758-0854.2010.01035.x

Lee, Vlaev, King, Mayer, Darzi, & Dolan (2013). Subjective well-being and the measurement of quality in healthcare. Social Science & Medicine, 99, 27-34. doi:10.1016/j.socscimed.2013.09.027

Maas, A. I., Harrison-Felix, C. L., Menon, Adelson, P. D., Balkin, Bullock, . . . Schwab (2010). Common data elements for traumatic brain injury: Recommendations from the interagency working group on demographics and clinical assessment. Archives of Physical Medicine and Rehabilitation, 91, 1641-1649. doi:10.1016/j.apmr.2010.07.232

Maydeu-Olivares & Joe (2005). Limited- and full-information estimation and goodness-of-fit testing in 2ⁿ contingency tables: A unified framework. Journal of the American Statistical Association, 100, 1009-1020.

Maydeu-Olivares & Joe (2014). Assessing approximate fit in categorical data analysis. Multivariate Behavioral Research, 49, 305-328. doi:10.1080/00273171.2014.911075

McMahon, P. J., Hricik, Yue, J. K., Puccio, A. M., Inoue, Lingsma, H. F., . . . Vassar, M. J. (2013). Symptomatology and functional outcome in mild traumatic brain injury: Results from the prospective TRACK-TBI study. Journal of Neurotrauma, 31, 26-33. doi:10.1089/neu.2013.2984

Millsap, R. E. (2011). Statistical approaches to measurement invariance. New York, NY: Routledge.

Moergeli, Wittmann, & Schnyder (2012). Quality of life after traumatic injury: A latent trajectory modeling approach. Psychotherapy and Psychosomatics, 81, 305-311.

Muthén, L. K., & Muthén, B. O. (1998-2012). Mplus user’s guide (7th ed.). Los Angeles, CA: Author.

Ng & Diener (2014). What matters to the rich and the poor? Subjective well-being, financial satisfaction, and postmaterialist needs across the world. Journal of Personality and Social Psychology, 107, 326-338. doi:10.1037/a0036856

Orlando & Thissen (2003). Further investigation of the performance of S − X²: An item fit index for use with dichotomous item response theory models. Applied Psychological Measurement, 27, 289-298. doi:10.1177/0146621603027004004

Patrick, D. L., & Erickson (1988). Assessing health-related quality of life for clinical decision making. In S. R. Walker & R. M. Rosser (Eds.), Quality of life: Assessment and application (pp. 9-19). Lancaster, England: MTP Press.

Patterson, D. R., Ptacek, J. T., Cromes, Fauerbach, J. A., & Engrav (2000). The 2000 clinical research award: Describing and predicting distress and satisfaction with life for burn survivors. Journal of Burn Care & Rehabilitation, 21, 490-498.

Pavot, Diener, Colvin, C. R., & Sandvik (1991). Further validation of the Satisfaction With Life Scale: Evidence for the cross-method convergence of well-being measures. Journal of Personality Assessment, 57, 149-161. doi:10.1207/s15327752jpa5701_17

Post, M. W., Van Dijk, A. J., Van Asbeck, F. W., & Schrijvers, A. J. (1998). Life satisfaction of persons with spinal cord injury compared to a population group. Scandinavian Journal of Rehabilitation Medicine, 30(1), 23-30.

Post, M. W., Van Leeuwen, C. M., Van Koppenhagen, C. F., & De Groot (2012). Validity of the Life Satisfaction Questions, the Life Satisfaction Questionnaire, and the Satisfaction With Life Scale in persons with spinal cord injury. Archives of Physical Medicine and Rehabilitation, 93, 1832-1837. doi:10.1016/j.apmr.2012.03.025

Putzke, J. D., Richards, J. S., Hicken, B. L., & DeVivo, M. J. (2002). Predictors of life satisfaction: A spinal cord injury cohort study. Archives of Physical Medicine and Rehabilitation, 83, 555-561.

Raykov (2004). Behavioral scale reliability and measurement invariance evaluation using latent variable modeling. Behavior Therapy, 35, 299-331. doi:10.1016/S0005-7894(04)80041-8

Reise, S. P., Scheines, Widaman, K. F., & Haviland, M. G. (2013). Multidimensionality and structural coefficient bias in structural equation modeling: A bifactor perspective. Educational and Psychological Measurement, 73, 5-26. doi:10.1177/0013164412449831

Rosengren, Jonasson, S. B., Brogårdh, & Lexell (2015). Psychometric properties of the Satisfaction With Life Scale in Parkinson’s disease. Acta Neurologica Scandinavica, 132, 164-170. doi:10.1111/ane.12380

Samejima (1969). Estimation of latent ability using a response pattern of graded scores (Psychometric Monograph No. 17). Richmond, VA: Psychometric Society.

Schulz-Heik, R. J., Poole, J. H., Dahdah, M. N., Sullivan, Date, E. S., Salerno, R. M., . . . Harris (2016). Long-term outcomes after moderate-to-severe traumatic brain injury among military veterans: Successes and challenges. Brain Injury, 30, 271-279. doi:10.3109/02699052.2015.1113567

Schumaker, Shea, Monfries, & Groth-Marnat (1993). Loneliness and life satisfaction in Japan and Australia. Journal of Psychology, 127, 65-71. doi:10.1080/00223980.1993.9915543

Sharp, L. M., & Frankel (1983). Respondent burden: A test of some common assumptions. Public Opinion Quarterly, 47, 36-53.

Sijtsma (2015). Delimiting coefficient α from internal consistency and unidimensionality. Educational Measurement: Issues and Practice, 34(4), 10-13. doi:10.1111/emip.12099

Simpson, Schumaker, Dorahy, & Shrestha (1996). Depression and life satisfaction in Nepal and Australia. Journal of Social Psychology, 136, 783-790. doi:10.1080/00224545.1996.9712255

Steenkamp & Baumgartner (1998). Assessing measurement invariance in cross-national consumer research. Journal of Consumer Research, 25, 78-90. doi:10.1086/209528

Stutzer (2004). The role of income aspirations in individual happiness. Journal of Economic Behavior & Organization, 54, 89-109. doi:10.1016/j.jebo.2003.04.003

Tate, D. G., Forchheimer, Bombardier, C. H., Heinemann, A. W., Neumann, H. D., & Fann, J. R. (2015). Differences in quality of life outcomes among depressed spinal cord injury trial participants. Archives of Physical Medicine and Rehabilitation, 96, 340-348. doi:10.1016/j.apmr.2014.09.036

Tulsky, Kisala, P. A., Victorson, Tate, Heinemann, Amtmann, & Cella (2011). Developing a contemporary patient-reported outcomes measure for spinal cord injury. Archives of Physical Medicine and Rehabilitation, 92, S44-S51. doi:10.1016/j.apmr.2011.04.024

Urry, H. L., Nitschke, J. B., Dolski, Jackson, D. C., Dalton, K. M., Mueller, C. J., . . . Davidson, R. J. (2004). Making a life worth living: Neural correlates of well-being. Psychological Science, 15, 367-372.

Williamson, M. L., Elliott, T. R., Bogner, Dreer, L. E., Arango-Lasprilla, J. C., Kolakowsky-Hayner, S. A., . . . Perrin, P. B. (2016). Trajectories of life satisfaction over the first 10 years after traumatic brain injury: Race, gender, and functional ability. Journal of Head Trauma Rehabilitation, 31, 167-179. doi:10.1097/HTR.0000000000000111

Wilson, J. R., Hashimoto, R. E., Dettori, J. R., & Fehlings, M. G. (2011). Spinal cord injury and quality of life: A systematic review of outcome measures. Evidence-Based Spine-Care Journal, 2(1), 37-44. doi:10.1055/s-0030-1267085