Differential Item Functioning by HIV Status and Sexual Orientation of the Center for Epidemiological Studies–Depression Scale: An Item Response Theory Analysis

Abstract

The Center for Epidemiological Studies–Depression Scale (CES-D) is the most widely used instrument to assess depressive symptoms in people living with HIV. However, its differential item functioning (DIF) by HIV status and sexual orientation has yet to be explored. This study examined DIF and measurement invariance of the CES-D using an item response theory (IRT) framework and a more traditional factor analytic approach. Data from 841 HIV-infected and HIV-uninfected individuals from Miami, Florida, were analyzed. Uniform DIF by HIV status was detected in Items 4, 12, and 16 from the Positive Affect factor. Nonuniform DIF was detected in Items 13 and 17. Uniform DIF by sexual orientation was detected in Items 2, 15, and 19, two of them from the Interpersonal factor. Nonuniform DIF was detected in Item 2. Using a factor analytic approach, the CES-D was invariant at the configural and metric levels by HIV status and sexual orientation. Using IRT, however, the magnitudes of DIF were negligible, and the CES-D was somewhat invariant using factor analytic methods. Overall, these findings indicate that the CES-D may be reliably used to compare groups by HIV status or sexual orientation.

Keywords

CES-D, HIV, sexual orientation, item response theory, DIF

Depression is the most prevalent mental health problem in people living with HIV (PLWH; Adams, Zacharia, Masters, Coffey, & Catalan, 2016; Cholera et al., 2017; Nacher et al., 2010; Nanni, Caruso, Mitchell, Meggiolaro, & Grassi, 2015), and the prevalence of depression in this population ranges from 36% to 42% in the United States (Rabkin, 2008), twice as high as in the general population (Ciesla & Roberts, 2001). Depressive symptoms are predictors of negative outcomes for the health and well-being of PLWH, and depression has been associated with poor adherence to antiretroviral treatment (Mayston, Kinyanda, Chishinga, Prince, & Patel, 2012) and adverse clinical events, such as increased HIV symptoms, detectable viral load, and a faster disease progression (Brandt, Bakhshaie, Zvolensky, Grover, & Gonzalez, 2015; Norcini Pala & Stecca, 2011; Sumari-de Boer, Sprangers, Prins, & Nieuwkerk, 2012). As such, depression is an important target for interventions to improve PLWH’s retention in HIV care, health outcomes, and overall quality of life.

In the context of HIV, valid and reliable instruments are essential in the assessment of depressive symptoms and their severity. The Center for Epidemiological Studies–Depression Scale (CES-D; Radloff, 1977) is used to evaluate the frequency and severity of depressive symptoms; it also includes cutoff scores to enable its use as a screening instrument. The CES-D is one of the most widely used instruments in the public domain (Gay, Kottorp, Lerdal, & Lee, 2016), and it is the most commonly used measure of depression in HIV research (Simoni et al., 2011). Initially, Radloff (1977) identified four factors in the CES-D using principal components analysis: Depressed or Negative Affect, Positive Affect, Somatic Symptoms/Retarded Activity, and Interpersonal. However, the author has since argued against excessive emphasis on separate factors and suggested using the total score to measure depressive symptomatology.

In addition to the sound psychometric properties of the CES-D (Devins et al., 1988; Shafer, 2006), the extensive use of this scale requires that the measure functions similarly across groups. Differential item functioning (DIF) refers to unequal probabilities of giving a certain response on an item for members of different groups, after matching on the attribute the test is intended to measure. The absence of DIF indicates that groups can be meaningfully compared (Choi, Gibbons, & Crane, 2011). The presence of DIF indicates that differences in scores may be attributable to response bias rather than to actual differences in the frequency or severity of depressive symptoms, and may also suggest that variables other than the latent construct of interest are influencing the scores. This may compromise the reliable interpretation and validity of the scores and the comparison across groups (Yang & Jones, 2007). DIF also indicates that the items in a scale operate inconsistently across groups and, therefore, lack measurement equivalence (Hambleton, Swaminathan, & Rogers, 1991). In this sense, DIF may suggest a lack of measurement invariance. CES-D DIF has been examined across a variety of characteristics with disparate and inconsistent results, for example, gender (Gay et al., 2016; Yang & Jones, 2007), age (Covic, Pallant, Conaghan, & Tennant, 2007), and ethnicity, cultural background, and/or level of acculturation (Jang, Kwag, & Chiriboga, 2010).

Despite the widespread use of the CES-D in people with HIV and the prevalence of depressive symptoms in this population, few studies have examined CES-D DIF in HIV. Using an item response theory approach, Gay et al. (2016) analyzed DIF of the CES-D in a sample of PLWH. DIF was found for eight items according to race, gender, and AIDS diagnosis. For example, the item “I felt hopeful about the future” was more easily endorsed by people who had not been diagnosed with AIDS, whereas African Americans were more likely to endorse the item “I felt that people disliked me.” As differential responding may cause scores to vary systematically across subgroups, it is essential to determine if differences are clinically meaningful for diagnosis and if cutoffs should be adjusted to be more appropriate for specific subgroups, for a more reliable interpretation of scores. In addition to this, to our knowledge, DIF has not been evaluated by HIV diagnosis, although HIV-uninfected and HIV-infected individuals are frequently compared, and this comparison is necessary to establish evidence-based practices tailored to this population. The absence of DIF would support the validity of mean-level comparisons across HIV-infected and HIV-uninfected individuals.

Moreover, little research has examined DIF of the CES-D associated with sexual orientation. The limited evidence available indicates that specific items, such as those related to interpersonal problems or relations (e.g., people were unfriendly), are more frequently endorsed by transgender persons (Gay et al., 2016). Similarly, the same effect was observed among other highly stigmatized groups, such as African Americans (Yang & Jones, 2007). Evidence has consistently shown the association between stigma and depression in sexual minorities (Herek, Saha, & Burack, 2013; McDowell, Hughto, & Reisner, 2019). Excessive chronic social stress, due to the experience or constant expectation of stigma and discrimination, known as minority stress, underlies the high levels of emotional distress and depressive symptoms in sexual minorities (Meyer, 2003; Valentine & Shipherd, 2018). Within an intersectionality framework, co-occurring stigmas, such as sexual orientation and HIV-related stigmas, interact, exacerbating and increasing their negative impact on health and well-being (Logie, James, Tharao, & Loutfy, 2011). This may lead to even higher levels of depression in sexual minority individuals with HIV. Thus, these individuals are expected to endorse the Interpersonal items more readily than individuals from the general population, which may be due to actual experiences of rejection from others, independent of their level of depression. Therefore, they may endorse these items more frequently even in the presence of lower levels of depression. As a result of differential functioning of these items, comparison of CES-D scores across groups by sexual orientation may be limited. Conversely, the absence of DIF would suggest that mean-level comparisons between individuals who identify as a sexual minority and those who do not are valid.

These studies highlight the importance of exploring DIF of the CES-D by HIV status and sexual orientation. The objective of this study was to examine DIF of the CES-D using an item response theory framework (Bernstein, Ahluvalia, Pogge, & Handelsman, 1997; Bernstein et al., 2003), as well as a more traditional factor analytic approach. It was hypothesized that nonuniform DIF and measurement non-invariance would be present by sexual orientation and HIV status. Sexual minorities were anticipated to be more likely to endorse Interpersonal items, whereas HIV negative individuals were expected to endorse Positive Affect items more readily.

Method

Study Recruitment and Design

To maximize the inclusion of sexual minorities and increase sample heterogeneity, which increases the stability of the estimated parameters in item response theory (Hambleton et al., 1991), data from two studies were aggregated for analyses. Study 1 focused on HPT (hypothalamic pituitary thyroidal) and HPA (hypothalamic pituitary adrenal) axis functioning among MSM (men who have sex with men) from ethnically and racially diverse backgrounds recruited from Miami, Florida (N = 347); this study has been previously described (Carrico, Rodriguez, Jones, & Kumar, 2017). Study 2 focused on predictive biomarkers of cardiovascular disease among HIV-infected and HIV-uninfected individuals in Miami, Florida (N = 494) and has been previously described (Rodriguez et al., 2019).

Ethical Approval

Ethical approval from the University of Miami Miller School of Medicine institutional review board was obtained prior to study onset.

Participants and Procedures

All participants (N = 841) completed all measures in person at the study site on Qualtrics, a web-based data collection platform. Participants were compensated US$50 for their time and transportation to the study site.

Measures

Demographic Characteristics

Participants completed a demographic questionnaire that included age, gender, race, ethnicity, and sexual orientation.

Depressive Symptoms

The CES-D (Radloff, 1977) was used to measure depressive symptoms. The CES-D is a 20-item scale that has been shown to have adequate internal consistency and concurrent validity (Shinar et al., 1986; Wells, Klerman, & Deykin, 1987). The CES-D asks participants to rate how often they felt or behaved in particular ways during the past week, with responses ranging from 0 = rarely (less than 1 day a week) to 3 = most or all of the time (up to 7 days a week). The original scale has four factors, Depressive Affect (α = .89), Somatic Symptoms (α = .82), Positive Affect (α = .84), and Interpersonal (α = .76), which demonstrated good to adequate internal consistency in this sample. The full scale demonstrated excellent reliability (α = .91).
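As an illustration of how a total score can be obtained from the 20 responses, the following minimal Python sketch sums the items after reversing the four positively worded items (4, 8, 12, and 16). The function name and input layout are hypothetical and are not taken from the study's analysis code.

```python
# Hypothetical helper, for illustration only: total CES-D score from
# 20 item responses coded 0-3, in item order.
REVERSE_CODED = {4, 8, 12, 16}  # positively worded items (1-based numbers)


def score_cesd(responses):
    """Return the total score (0-60) for one participant."""
    if len(responses) != 20:
        raise ValueError("expected 20 item responses")
    total = 0
    for item_number, value in enumerate(responses, start=1):
        if value not in (0, 1, 2, 3):
            raise ValueError(f"item {item_number}: response must be 0-3")
        # Reverse-coded items contribute 3 - value, so that a higher score
        # always reflects more depressive symptomatology.
        total += (3 - value) if item_number in REVERSE_CODED else value
    return total
```

Under this sketch, a participant answering 0 to every item would receive 3 points on each of the four reverse-coded items.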

Statistical Analysis

Descriptive Statistics and Unidimensionality

Univariate analyses, such as frequencies, means, and standard deviations, were used to describe participants. The internal consistency of the CES-D was evaluated using Cronbach’s α coefficient. Then, a single-factor confirmatory factor analysis (CFA) was conducted to evaluate the assumption of unidimensionality of the scale. Mardia’s multivariate normality test, as implemented in the MVN package, Version 5.7 (Korkmaz, Goksuluk, & Zararsiz, 2014), was used to decide on the appropriate estimation procedure. To assess model fit, a chi-square test of model fit, the comparative fit index (CFI), the Tucker–Lewis index (TLI), and the root mean square error of approximation (RMSEA) were used. CFI and TLI values ≥0.90 or RMSEA values <0.05 indicate adequate fit of the model to the data (Kline, 2015).
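For readers unfamiliar with the internal consistency coefficient, a minimal Python sketch of the standard Cronbach's α formula is given below. It assumes complete data arranged as one list of item scores per participant; the function name is hypothetical, and this is not the software used in the study.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant item-score lists.

    Assumes every row has the same number of items (k >= 2) and that
    there are no missing values.
    """
    k = len(rows[0])
    # Sample variance of each item across participants.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of the participants' total scores.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

Items that covary strongly make the variance of the total score large relative to the sum of the item variances, which pushes the coefficient toward 1.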

Measurement Invariance

Tests of measurement invariance were performed according to guidelines provided by Brown (2014) and Chen (2007). A preliminary step in testing measurement invariance is to ensure that the factor structure of the CES-D applies to the different groups by HIV status and sexual orientation. Measurement invariance tests involve increasingly restrictive levels, tested in the recommended order (Brown, 2014): (1) configural invariance (equal form), (2) metric invariance (equality of factor loadings), (3) scalar invariance (equality of indicator thresholds), and (4) strict invariance (equality of indicator residual variances). The fit indices previously described were used to test measurement invariance. According to Chen (2007), when the sample size is N > 300, a decrease in CFI of 0.010 or more, in addition to an increase in RMSEA of 0.015 or more or an increase in standardized root mean square residual (SRMR) of 0.030 or more, would suggest noninvariance of factor loadings. For intercept or residual invariance, a decrease in CFI of 0.010 or more, in addition to an increase in RMSEA of 0.015 or more or an increase in SRMR of 0.010 or more, would suggest noninvariance.
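The decision rule attributed to Chen (2007) can be sketched as a small Python function. The function name, argument names, and sign convention (each change is the more constrained model minus the less constrained model, so a worsening CFI is negative) are assumptions made for this illustration, and the inputs in the usage note are illustrative values.

```python
def suggests_noninvariance(d_cfi, d_rmsea, d_srmr, level):
    """Apply the cutoffs described in the text for N > 300.

    d_cfi, d_rmsea, d_srmr: change in each index, computed as the more
    constrained model minus the less constrained model.
    level: "loadings" (metric step) or "intercepts" (scalar/residual step).
    """
    srmr_cutoff = {"loadings": 0.030, "intercepts": 0.010}[level]
    cfi_worsened = d_cfi <= -0.010
    # Noninvariance requires the CFI criterion plus either the RMSEA
    # criterion or the SRMR criterion.
    return cfi_worsened and (d_rmsea >= 0.015 or d_srmr >= srmr_cutoff)
```

For example, `suggests_noninvariance(-0.02, 0.007, 0.044, "intercepts")` returns `True`, whereas `suggests_noninvariance(-0.01, 0.001, 0.008, "loadings")` returns `False`. Because published fit indices are rounded, changes that fall exactly on a cutoff should be interpreted with caution.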

Differential Item Functioning

DIF by HIV status and sexual orientation was tested using the lordif 0.3-3 package in R Version 1.1.453 (Choi et al., 2011), which utilizes ordinal logistic regression to test DIF, as indicated by tests of the models below:

Model 0: logit P(ui ≥ k) = αk

Model 1: logit P(ui ≥ k) = αk + β1 * CES-D

Model 2: logit P(ui ≥ k) = αk + β1 * CES-D + β2 * HIV (or sexual minority status)

Model 3: logit P(ui ≥ k) = αk + β1 * CES-D + β2 * HIV + β3 * CES-D * HIV (or sexual minority status)
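To make the role of β2 and β3 concrete, the following Python sketch evaluates the cumulative response probability under Model 3 for two groups. All coefficient values in the usage note are hypothetical, and the function names are introduced here for illustration only.

```python
import math


def cumulative_prob(alpha_k, b1, b2, b3, score, group):
    """P(u_i >= k) under Model 3 for a given CES-D score and group (0 or 1).

    Setting b3 = 0 gives Model 2; setting b2 = b3 = 0 gives Model 1.
    """
    logit = alpha_k + b1 * score + b2 * group + b3 * score * group
    return 1.0 / (1.0 + math.exp(-logit))


def log_odds_gap(alpha_k, b1, b2, b3, score):
    """Difference in log-odds between group 1 and group 0 at one score."""
    def to_logit(p):
        return math.log(p / (1.0 - p))

    p_group1 = cumulative_prob(alpha_k, b1, b2, b3, score, 1)
    p_group0 = cumulative_prob(alpha_k, b1, b2, b3, score, 0)
    return to_logit(p_group1) - to_logit(p_group0)
```

With hypothetical values such as `alpha_k = -1.0`, `b1 = 0.8`, and `b2 = 0.5`, the gap between groups equals `b2` at every score when `b3 = 0`, whereas a nonzero `b3` makes the gap equal to `b2 + b3 * score`, so it changes with the level of depressive symptoms.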

Total DIF effect is said to be present when there is a statistically significant difference between Models 1 and 3 at α = .01, which would suggest that the model with an interaction is significantly better than a model without the interaction between the scores and HIV status or sexual orientation. Uniform DIF, which suggests consistent item performance across all score groups, is said to be present when a statistically significant difference exists between Models 1 and 2 at α = .01. Nonuniform DIF, which is the more problematic form of DIF, indicates a probability of responding that is not constant across different score groups and is therefore denoted by a statistically significant difference between Models 2 and 3 at α = .01 (Choi et al., 2011).
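The three nested comparisons can be sketched in Python as likelihood ratio tests on the log-likelihoods of Models 1 to 3. This sketch assumes a binary grouping variable, so that the comparisons have 1, 2, and 1 degrees of freedom, respectively; the function names are hypothetical, the log-likelihood values in the usage note are invented for illustration, and the code does not reproduce the lordif implementation.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square statistic for df = 1 or 2."""
    if x <= 0:
        return 1.0
    if df == 1:
        return math.erfc(math.sqrt(x / 2.0))
    if df == 2:
        return math.exp(-x / 2.0)
    raise ValueError("only df = 1 or 2 is needed for a binary group")


def mcfadden_delta_r2(ll0, ll_small, ll_large):
    """Change in McFadden pseudo R-squared between two nested models.

    ll0 is the log-likelihood of the intercept-only Model 0.
    """
    return (ll_small - ll_large) / ll0


def dif_tests(ll1, ll2, ll3, alpha=0.01):
    """Flag uniform, total, and nonuniform DIF from log-likelihoods."""
    p12 = chi2_sf(2.0 * (ll2 - ll1), 1)  # Models 1 vs 2: uniform DIF
    p13 = chi2_sf(2.0 * (ll3 - ll1), 2)  # Models 1 vs 3: total DIF
    p23 = chi2_sf(2.0 * (ll3 - ll2), 1)  # Models 2 vs 3: nonuniform DIF
    return {
        "uniform": p12 < alpha,
        "total": p13 < alpha,
        "nonuniform": p23 < alpha,
        "p_values": (p12, p13, p23),
    }
```

For instance, with invented log-likelihoods of -1000.0, -995.0, and -994.9 for Models 1, 2, and 3, adding the group term improves fit substantially while the interaction adds almost nothing, so the sketch flags uniform and total DIF but not nonuniform DIF.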

With large sample sizes, −2 likelihood ratio chi-square tests may overestimate DIF. As such, empirical thresholds to identify DIF were derived from Monte Carlo simulations in DIF-free samples (α = .01, 1,000 replications). The highest empirical threshold derived from simulations was used to identify uniform DIF and nonuniform DIF in this study (Choi et al., 2011).
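The thresholding step can be illustrated with a short Python sketch. It assumes that a DIF statistic (for example, a pseudo ΔR² value) has already been computed for every item in every simulated DIF-free data set; the simulation itself is not shown. The nearest-rank quantile used here is a simplification chosen for this illustration and is not claimed to match the quantile method used by lordif.

```python
def empirical_threshold(simulated, alpha=0.01):
    """Highest per-item cutoff across items.

    simulated: dict mapping an item label to the list of statistic values
    obtained for that item across DIF-free replications.
    """
    cutoffs = []
    for values in simulated.values():
        ordered = sorted(values)
        n = len(ordered)
        # Nearest-rank cutoff: about alpha * n simulated values lie above it.
        rank = n - int(alpha * n)
        cutoffs.append(ordered[rank - 1])
    # Using the largest cutoff across items is the conservative choice
    # described in the text.
    return max(cutoffs)
```

An observed statistic would then be flagged only if it exceeds the returned value, which limits false positives that arise purely from the large sample size.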

Results

Demographic Characteristics of Participants

Participants were an average of 36 years of age (SD = 9.69). Nearly three fourths of participants (71%) were men. Approximately half (47%) of participants identified as White American, 49% as Black African American, and 5% as other; 34% identified as Hispanic. In terms of HIV status, 60% of participants were HIV-uninfected, and 40% were HIV-infected. About one third (36%) identified as sexual minority (either gay, lesbian, bisexual, or MSM). Nearly half of participants (n = 409, 49%) identified as heterosexual and were HIV-uninfected; 16% (n = 133) identified as heterosexual and were HIV-infected. HIV-uninfected participants who identified as sexual minority constituted 11% of the sample (n = 92); HIV-infected participants who identified as sexual minority were 25% of the sample. The number of participants by ethnicity, race, and sexual orientation or HIV status is presented in Table 1.

Table 1.

Count and Percentage of White American, Black American, or Other by Ethnicity, HIV Status, and Sexual Orientation (N = 841).

Ethnicity			Race
			White American, n (%)	Black American, n (%)	Other, n (%)	Total, n (%)
Ethnicity by race by HIV status
Not Hispanic	HIV status	HIV uninfected	68 (12.20)	217 (38.90)	30 (5.40)	315 (56.50)
		HIV infected	52 (9.30)	180 (32.30)	11 (2.00)	243 (43.50)
Hispanic	HIV status	HIV uninfected	178 (62.90)	8 (2.80)	—	186 (65.70)
		HIV infected	94 (33.20)	3 (1.10)	—	97 (34.30)
Total	HIV status	HIV uninfected	246 (29.30)	225 (26.80)	30 (3.60)	501 (59.60)
		HIV infected	146 (17.40)	183 (21.80)	11 (1.30)	340 (40.40)
Ethnicity by race by sexual orientation
Not Hispanic	Sexual minority	No	57 (10.20)	287 (51.40)	28 (5.00)	372 (66.70)
		Yes	63 (11.30)	110 (19.70)	13 (2.30)	186 (33.30)
Hispanic	Sexual minority	No	161 (56.90)	9 (3.20)	—	170 (60.10)
		Yes	111 (39.20)	2 (0.70)	—	113 (39.90)
Total	Sexual minority	No	218 (25.90)	296 (35.20)	28 (3.30)	542 (64.40)
		Yes	174 (20.70)	112 (13.30)	13 (1.50)	299 (35.60)

Scale Descriptive Statistics

The counts and proportions for CES-D items are displayed in Table 2. As shown in Table 2, there was no need to merge any categories; as such, the original scale response options were used for analysis. There were no missing data in any of the items.

Table 2.

Center for Epidemiological Studies–Depression Scale Items Count and Proportions (N = 841).

Item	Rarely or none of the time (<1 day), n (%)	Some or a little of the time (1-2 days), n (%)	Occasionally or a moderate amount of time (3-4 days), n (%)	Most or all of the time (5-7 days), n (%)
1. I was bothered by things that usually don’t bother me.	473 (56.2)	162 (19.3)	125 (14.9)	81 (9.6)
2. I did not feel like eating; my appetite was poor.	504 (59.9)	151 (18.0)	113 (13.4)	73 (8.7)
3. I felt that I could not shake off the blues even with help from my family or friends.	523 (62.2)	157 (18.7)	99 (11.8)	62 (7.4)
4. I felt I was just as good as other people.	349 (41.5)	182 (21.6)	146 (17.4)	164 (19.5)
5. I had trouble keeping my mind on what I was doing.	402 (47.8)	164 (19.5)	151 (18.0)	124 (14.7)
6. I felt depressed.	434 (51.6)	170 (20.2)	110 (13.1)	127 (15.1)
7. I felt that everything I did was an effort.	266 (31.6)	169 (20.1)	174 (20.7)	232 (27.6)
8. I felt hopeful about the future.	352 (41.9)	172 (20.5)	158 (18.8)	159 (18.9)
9. I thought my life had been a failure.	554 (65.9)	135 (16.1)	80 (9.5)	72 (8.6)
10. I felt fearful.	522 (62.1)	137 (16.3)	102 (12.1)	80 (9.5)
11. My sleep was restless.	349 (41.5)	174 (20.7)	147 (17.5)	171 (20.3)
12. I was happy.	276 (32.8)	233 (27.7)	207 (24.6)	125 (14.9)
13. I talked less than usual.	543 (64.6)	123 (14.6)	78 (9.3)	97 (11.5)
14. I felt lonely.	479 (57.0)	132 (15.7)	116 (13.8)	114 (13.6)
15. People were unfriendly.	573 (68.1)	141 (16.8)	57 (6.8)	70 (8.3)
16. I enjoyed life.	340 (40.4)	194 (23.1)	162 (19.3)	145 (17.2)
17. I had crying spells.	556 (66.1)	152 (18.1)	72 (8.6)	61 (7.3)
18. I felt sad.	382 (45.4)	227 (27.0)	134 (15.9)	98 (11.7)
19. I felt that people dislike me.	591 (70.3)	122 (14.5)	59 (7.0)	69 (8.2)
20. I could not get “going.”	527 (62.7)	142 (16.9)	93 (11.1)	79 (9.4)

Note. Items 4, 8, 12, and 16 are reverse-coded.

Unidimensionality

The results of Mardia’s multivariate normality test revealed nonnormal multivariate data (Mardia’s estimation of multivariate skewness = 6482.9, p < .001; Mardia’s estimation of multivariate kurtosis = 74.29, p < .001). Therefore, a WLSMV (mean and variance adjusted weighted least squares) estimator with theta parameterization was used, given that it does not assume a normal distribution and, as such, is the preferred method for modeling ordinal data (Brown, 2014). A CFA was used to investigate the unidimensionality of the CES-D using a four-factor model of depressive symptoms, (1) Depressive Affect, (2) Somatic Symptoms, (3) Positive Affect, and (4) Interpersonal, under a general depressive symptoms factor. Table 3 includes the standardized factor loadings of the CES-D, their standard errors, and their respective levels of significance for the CES-D factors. The CFA of depressive symptoms had an adequate fit to the data, χ² = 910.89, p < .001, CFI = 0.91, TLI = 0.90, and RMSEA = 0.073. The bifactor model had a slightly better fit, χ² = 754.07, p < .001, CFI = 0.93, TLI = 0.92, and RMSEA = 0.071. A hierarchical model had the best fit, χ² = 418.30, p < .001, CFI = 0.97, TLI = 0.96, and RMSEA = 0.043. Because the hierarchical model had the best fit to the data, it was used in subsequent analyses (see Table 3).
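The comparison of the three models can be laid out in a brief Python sketch that applies the adequacy criterion stated in the Statistical Analysis section to the reported indices. Reading that criterion as "CFI and TLI both at least 0.90, or RMSEA below 0.05" is an assumption of this sketch, as are the function name and the model labels.

```python
def adequate_fit(cfi, tli, rmsea):
    """Adequacy rule as read here: (CFI and TLI >= 0.90) or RMSEA < 0.05."""
    return (cfi >= 0.90 and tli >= 0.90) or rmsea < 0.05


# Fit indices as reported in the text: (CFI, TLI, RMSEA).
MODELS = {
    "first CFA": (0.91, 0.90, 0.073),
    "bifactor": (0.93, 0.92, 0.071),
    "hierarchical": (0.97, 0.96, 0.043),
}

adequate = {name: adequate_fit(*indices) for name, indices in MODELS.items()}
lowest_rmsea = min(MODELS, key=lambda name: MODELS[name][2])
highest_cfi = max(MODELS, key=lambda name: MODELS[name][0])
```

Under this reading, all three models meet the adequacy rule, and the hierarchical model has both the lowest RMSEA and the highest CFI.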

Table 3.

Center for Epidemiological Studies–Depression Scale (CES-D) Item Factor Loadings (N = 841).

Factor	Factor loading	Standard error (SE)	Factor loading/SE	p
Depressive Affect
I felt that I could not shake off the blues even with help from my family or friends.	0.77	0.02	48.70	<.001
I felt depressed.	0.86	0.01	81.87	<.001
I thought my life had been a failure.	0.71	0.02	38.57	<.001
I felt fearful.	0.66	0.02	31.96	<.001
I felt lonely.	0.72	0.02	39.97	<.001
I had crying spells.	0.62	0.02	27.54	<.001
I felt sad.	0.81	0.01	61.52	<.001
Somatic Symptoms
I was bothered by things that usually don’t bother me.	0.71	0.02	37.29	<.001
I did not feel like eating; my appetite was poor.	0.65	0.02	29.57	<.001
I had trouble keeping my mind on what I was doing.	0.67	0.02	32.33	<.001
I felt that everything I did was an effort.	0.53	0.03	20.04	<.001
My sleep was restless.	0.62	0.02	27.04	<.001
I talked less than usual.	0.44	0.03	14.70	<.001
I could not get “going.”	0.75	0.02	43.64	<.001
Positive Affect
I felt I was just as good as other people.	0.67	0.02	29.46	<.001
I felt hopeful about the future.	0.71	0.02	34.05	<.001
I was happy.	0.80	0.02	47.95	<.001
I enjoyed life.	0.84	0.02	52.93	<.001
Interpersonal Relations
People were unfriendly.	0.71	0.02	29.24	<.001
I felt that people dislike me.	0.86	0.02	38.30	<.001
CES-D factors
Depressive Affect	1.00	0.01	75.16	<.001
Somatic Symptoms	0.97	0.01	67.55	<.001
Positive Affect	0.28	0.04	7.79	<.001
Interpersonal Relations	0.71	0.03	27.45	<.001

Note. Items 4, 8, 12, and 16 are reverse-coded.

Measurement Invariance

As noted in Table 4, a hierarchical structure of the CES-D was supported in HIV-infected and HIV-uninfected participants, χ²(332) = 505.81 (p < .001), RMSEA = 0.035, SRMR = 0.040, CFI = 0.97, and TLI = 0.96. Given that configural invariance was established, metric invariance was then tested by constraining factor loadings to be equal in HIV-infected and HIV-uninfected participants. Metric invariance was supported, ΔCFI = 0.01, ΔRMSEA = −0.01, ΔSRMR = 0.01, χ²(20)_diff = 39.98, p = .005. Because metric invariance was supported, scalar invariance was tested. Scalar invariance was not supported, ΔCFI = 0.02, ΔRMSEA = 0.01, ΔSRMR = 0.04, χ²(20)_diff = 121.78, p < .001. Therefore, strict invariance was not tested.

Table 4.

Tests of Measurement Invariance of the Center for Epidemiologic Studies–Depression Scale by HIV Status.

	χ²	df	χ²_diff	Δdf	RMSEA [90% CI]	SRMR	CFI	TLI
Measurement invariance
Equal form (configural invariance)	505.81***	332			0.035 [0.029, 0.041]	0.040	0.97	0.96
Equal factor loadings (metric invariance)	545.79***	352	39.98**	20	0.036 [0.030, 0.042]	0.048	0.96	0.96
Equal indicator intercepts (scalar invariance)	667.57***	372	121.78***	20	0.043 [0.038, 0.049]	0.092	0.94	0.92
Equal indicator error variances (strict invariance)	—	—	—	—	—	—	—	—

Note. χ²_diff = nested difference; RMSEA = root mean square error of approximation; 90% CI = 90% confidence interval for RMSEA; SRMR = standardized root mean square residual; CFI = comparative fit index; TLI = Tucker–Lewis index.

*p < .05. **p < .01. ***p < .001.

To test measurement invariance by sexual orientation, an attempt was made to similarly use a hierarchical structure. However, when testing configural invariance by sexual orientation, the model would not converge. As such, two separate hierarchical CFAs were conducted by group. The model with a hierarchical structure for heterosexual participants converged, but it did not converge for sexual minority participants. A bifactor structure also did not converge among sexual minority participants. As such, a single-factor structure of the CES-D was used to test measurement invariance by sexual orientation. As shown in Table 5, the one-factor structure of the CES-D was supported in heterosexual and sexual minority participants, χ²(334) = 891.13 (p < .001), RMSEA = 0.063, SRMR = 0.078, CFI = 0.89, and TLI = 0.87. Because configural invariance was established, metric invariance was then tested by constraining factor loadings to be equal in heterosexual and sexual minority participants. Metric invariance was supported, ΔCFI = 0.01, ΔRMSEA = 0.01, ΔSRMR = 0.01, χ²(20)_diff = 45.32, p < .001. Scalar invariance was not supported, however, ΔCFI = 0.04, ΔRMSEA = 0.01, ΔSRMR = 0.05, χ²(20)_diff = 219.69, p < .001. Because scalar invariance was not supported, strict invariance was not tested.

Table 5.

Tests of Measurement Invariance of the Center for Epidemiologic Studies–Depression Scale by Sexual Orientation.

	χ²	df	χ²_diff	Δdf	RMSEA [90% CI]	SRMR	CFI	TLI
Measurement invariance
Equal form (configural invariance)	891.13***	334			0.063 [0.058, 0.068]	0.078	0.89	0.87
Equal factor loadings (metric invariance)	936.45***	354	45.32***	20	0.063 [0.058, 0.067]	0.088	0.88	0.87
Equal indicator intercepts (scalar invariance)	1156.14***	374	219.69***	20	0.071 [0.066, 0.075]	0.135	0.84	0.84
Equal indicator error variances (strict invariance)	—	—	—	—	—	—	—	—

*p < .05. **p < .01. ***p < .001.

Differential Item Functioning by HIV Status

A total of five items (4, 12, 13, 16, and 17) of the CES-D were flagged for DIF by HIV status at α = .01. Model comparisons for Models 1 versus 2, Models 1 versus 3, and Models 2 versus 3 for all 20 items are shown in Table 6. Uniform DIF in the CES-D was detected in Items 4 (“I felt I was just as good as other people”), 12 (“I was happy”), and 16 (“I enjoyed life”), p < .01 (Table 6). Nonuniform DIF, which indicates that DIF is not constant across different levels of depressive symptoms, was detected in Items 13 (“I talked less than usual”) and 17 (“I had crying spells”). Item true score functions by HIV status for all five items are presented in Figure 1. The test characteristic curves for all items and DIF items are presented in Figure 2.

Table 6.

Differential Item Functioning (DIF) Log-Likelihood Chi-Square Model Comparisons by HIV Status for 20 Items of the Center for Epidemiological Studies–Depression Scale.

Item	Uniform DIF: Models 1 versus 2, p	McFadden pseudo ΔR²	Total DIF: Models 1 versus 3, p	Nonuniform DIF: Models 2 versus 3, p	McFadden pseudo ΔR²
1. I was bothered by things that usually don’t bother me.	.210	—	.442	.806	—
2. I did not feel like eating; my appetite was poor.	.350	—	.213	.136	—
3. I felt that I could not shake off the blues even with help from my family or friends.	.952	—	.231	.087	—
4. I felt I was just as good as other people.	.005	.0035	.009	.210	.0000
5. I had trouble keeping my mind on what I was doing.	.481	—	.676	.592	—
6. I felt depressed.	.110	—	.279	.956	—
7. I felt that everything I did was an effort.	.363	—	.451	.382	—
8. I felt hopeful about the future.	.820	—	.673	.389	—
9. I thought my life had been a failure.	.463	—	.493	.349	—
10. I felt fearful.	.264	—	.407	.457	—
11. My sleep was restless.	.270	—	.539	.890	—
12. I was happy.	.003	.0040	.002	.060	.0016
13. I talked less than usual.	.955	.0000	.006	.002	.0058
14. I felt lonely.	.114	—	.287	.960	—
15. People were unfriendly.	.239	—	.093	.067	—
16. I enjoyed life.	.000	.0057	.000	.031	.0021
17. I had crying spells.	.004	.0051	.000	.007	.0045
18. I felt sad.	.931	—	.922	.693	—
19. I felt that people dislike me.	.485	—	.483	.325	—
20. I could not get “going.”	.595	—	.522	.313	—

Note. Boldfaced values denote statistically significant model comparison at α = .01.

Figure 1.

Item true score functions for items exhibiting DIF by HIV status.

Figure 2.

Test characteristic curves by HIV status.

The highest Monte Carlo simulation–derived empirical threshold from DIF-free samples was McFadden ΔR² = .0048 for uniform DIF. The empirical threshold for nonuniform DIF was also McFadden ΔR² = .0048. According to these empirical thresholds, Items 16 (ΔR² = .0057) and 17 (ΔR² = .0051) exhibited uniform DIF. Only Item 13 (ΔR² = .0058) exhibited nonuniform DIF. The Monte Carlo threshold for the beta change was 0.1304. Only Item 16 (Δβ = 0.1761) met this threshold for HIV status.
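As a check on the paragraph above, the following Python sketch compares the McFadden pseudo ΔR² values reported in Table 6 for the five flagged items with the empirical threshold of .0048. The variable and function names are introduced here for illustration only.

```python
# McFadden pseudo delta R-squared values from Table 6, keyed by item number.
UNIFORM_DELTA_R2 = {4: 0.0035, 12: 0.0040, 13: 0.0000, 16: 0.0057, 17: 0.0051}
NONUNIFORM_DELTA_R2 = {4: 0.0000, 12: 0.0016, 13: 0.0058, 16: 0.0021, 17: 0.0045}
THRESHOLD = 0.0048  # empirical threshold reported for both DIF types


def items_exceeding(values, threshold):
    """Return the item numbers whose value is above the threshold."""
    return sorted(item for item, value in values.items() if value > threshold)


print(items_exceeding(UNIFORM_DELTA_R2, THRESHOLD))     # [16, 17]
print(items_exceeding(NONUNIFORM_DELTA_R2, THRESHOLD))  # [13]
```

The result agrees with the text: Items 16 and 17 exceed the threshold for uniform DIF, and only Item 13 exceeds it for nonuniform DIF.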

Differential Item Functioning by Sexual Orientation

A total of three items (2, 15, and 19) of the CES-D were flagged for DIF by sexual orientation at α = .01 (see Table 7). Model comparisons for Models 1 versus 2, Models 1 versus 3, and Models 2 versus 3 for all 20 items are shown in Table 7. Uniform DIF in the CES-D was detected in Items 2 (“I did not feel like eating; my appetite was poor”), 15 (“People were unfriendly”), and 19 (“I felt that people dislike me”). Nonuniform DIF, which indicates that DIF is not constant across different levels of depressive symptoms, was detected in Item 2 (“I did not feel like eating; my appetite was poor”). Item true score functions by sexual orientation for the three items exhibiting DIF are presented in Figure 3. The characteristic curves for all items and DIF items are presented in Figure 4.

Table 7.

Differential Item Functioning (DIF) Log-Likelihood Chi-Square Model Comparisons by Sexual Orientation for 20 Items of the Center for Epidemiological Studies–Depression Scale.

Item	Uniform DIF: Models 1 versus 2, p	McFadden pseudo ΔR²	Total DIF: Models 1 versus 3, p	Nonuniform DIF: Models 2 versus 3, p	McFadden pseudo ΔR²
1. I was bothered by things that usually don’t bother me.	.019	—	.060	.695	—
2. I did not feel like eating; my appetite was poor.	.007	.0039	.000	.002	.0050
3. I felt that I could not shake off the blues even with help from my family or friends.	.889	—	.616	.330	—
4. I felt I was just as good as other people.	.372	—	.223	.138	—
5. I had trouble keeping my mind on what I was doing.	.166	—	.382	.943	—
6. I felt depressed.	.570	—	.789	.697	—
7. I felt that everything I did was an effort.	.911	—	.855	.584	—
8. I felt hopeful about the future.	.097	—	.039	.053	—
9. I thought my life had been a failure.	.033	—	.038	.159	—
10. I felt fearful.	.172	—	.196	.238	—
11. My sleep was restless.	.648	—	.825	.673	—
12. I was happy.	.546	—	.245	.118	—
13. I talked less than usual.	.026	—	.066	.506	—
14. I felt lonely.	.146	—	.323	.702	—
15. People were unfriendly.	.004	.0052	.005	.134	.0014
16. I enjoyed life.	.347	—	.531	.536	—
17. I had crying spells.	.411	—	.276	.168	—
18. I felt sad.	.215	—	.447	.787	—
19. I felt that people dislike me.	.005	.0050	.014	.361	.0000
20. I could not get “going.”	.571	—	.809	.750	—

Note. Boldface values denote statistically significant model comparison at α = .01.

Figure 3.

Item true score functions for items exhibiting DIF by sexual orientation.

Figure 4.

Test characteristic curves by sexual orientation.

The highest Monte Carlo simulation–derived empirical threshold from DIF-free samples was McFadden ΔR² = .0046 for uniform DIF. The empirical threshold for nonuniform DIF was also McFadden ΔR² = .0046. According to these empirical thresholds, Items 15 (ΔR² = .0052) and 19 (ΔR² = .0050) exhibited uniform DIF. Only Item 2 (ΔR² = .0050) exhibited nonuniform DIF. The Monte Carlo threshold for the beta change was 0.1662. No items met this threshold by sexual orientation.

Discussion

This study examined DIF of the CES-D (Radloff, 1977) by HIV status and sexual orientation. Based on past research and theory (Gay et al., 2016; Meyer, 2003; Valentine & Shipherd, 2018), it was hypothesized that DIF would be present by sexual orientation and HIV status. This hypothesis was only partially supported; DIF was found for sexual orientation in the two items of the Interpersonal factor and for HIV status in three items of the Positive Affect factor, which suggests that these items may be interpreted differently by different populations, leading to bias in scores and limited meaningful comparison between groups. However, the level of DIF found is negligible in magnitude (Jodoin & Gierl, 2001). This supports the conclusion that the CES-D may be somewhat invariant given that it achieved invariance at the configural and metric levels.

As anticipated, sexual orientation was associated with uniform DIF for the two items from the Interpersonal factor, “People were unfriendly” and “I felt that people dislike me.” In contrast with previous research, sexual minorities were less likely to endorse these items than heterosexual individuals. The same effect in the Interpersonal factor has also been noted in other stigmatized populations, particularly in Black African American individuals (Gay et al., 2016; Kim, Chiriboga, & Jang, 2009), suggesting that these items may be related to other constructs, such as experiences of discrimination and rejection from others (Perreira, Deeb-Sosa, Harris, & Bollen, 2005). In the present study, most of the participants who did not belong to a sexual minority group identified as Black African American or another racial minority (59.7%), whereas more than half of the sexual minority participants were White American (58.1%). It is possible that stigma related to race and ethnicity influenced responses to these items more strongly than sexual orientation did, leading to more frequent endorsement by non–sexual minority participants. However, despite these differential response patterns, the total level of DIF was negligible.

Along the same lines, it is important to note that several stigmatized attributes were present in this sample: sexual orientation, HIV status, race, and ethnicity. The intersection and interaction of multiple stigmas can exacerbate their negative impact (Logie et al., 2011) and increase bias in responses to items related to interpersonal rejection and discrimination. Even though these results are not conclusive in this respect, they suggest the relevance of continuing to study DIF of the CES-D Interpersonal items in highly stigmatized populations in order to promote reliable use of its scores and a more accurate estimation of depressive symptoms in specific groups. It is also important to highlight that neither the hierarchical nor the bifactor model converged among sexual minority participants, suggesting different factor structures by sexual orientation. This suggests that the CES-D total score, rather than separate factor scores, may be better suited for measuring depressive symptoms among sexual minorities.

In the comparison by HIV status, uniform DIF was found for three of the four items of the Positive Affect factor. These items (“I felt I was just as good as other people,” “I was happy,” and “I enjoyed life”) were more frequently endorsed by HIV-uninfected individuals. In line with this result, Gay et al. (2016) found DIF for one item of the Positive Affect factor, which was more easily endorsed by people with no AIDS diagnosis. Such differential responding may reflect the adverse impact that an HIV diagnosis, which is stressful and potentially traumatic, can have on self-concept and on attitudes toward one’s life and future (Park, 2013; Yu, Chen, Ye, Li, & Lin, 2016). However, DIF in these items may also result from internal characteristics of the CES-D and not simply from characteristics of the respondents. Previous research has shown that items in the Positive Affect factor were poorly correlated with the rest of the scale and were not useful to discriminate depressed from nondepressed adults with HIV (Schroevers, Sanderman, van Sonderen, & Ranchor, 2000; Stansbury, Ried, & Velozo, 2006). In this sense, some evidence suggests advantages to removing the Positive Affect items from the scale, as they may be measuring a completely different construct (Stansbury et al., 2006). Tsutsumi et al. (2009) concluded that removal of items with DIF can increase comparability between groups. However, it can also result in loss of useful information. Future research could help clarify the usefulness of maintaining or removing the Positive Affect items for specific populations, such as PLWH.

The present findings have implications for the assessment of the frequency and severity of depressive symptoms in medical and mental health care settings and for research on depression among PLWH and sexual minority populations. In line with the recommendations of Gay et al. (2016), results suggest the importance of testing the clinical cutoffs of the CES-D that are currently employed across populations for diagnosis. Given the presence of DIF across groups, cutoffs established for the general population may not apply equivalently to specific subgroups, leading to inaccurate estimates of the presence and severity of depressive symptoms. Therefore, practitioners should be cautious when using CES-D scores to diagnose or draw conclusions about depression in patients living with HIV or sexual minority patients. In a similar vein, researchers should be particularly careful when comparing the CES-D scores of PLWH with those of other groups. This is particularly relevant since the CES-D is the most widely used measure of depressive symptoms in HIV research (Simoni et al., 2011) and comparisons between HIV-infected and HIV-uninfected individuals are frequent. Consequently, future studies should explore and adjust cutoffs so that they are appropriate for specific groups.

Nonetheless, as noted, the CES-D reached metric measurement invariance. The levels of DIF found, though statistically significant, were low in magnitude, which is consistent with this level of invariance. Because differences across these groups are minimal, this evidence also supports the conclusion that the CES-D continues to be a valid measure of depressive symptoms and their severity. Though caution is recommended when comparing depression across these groups, the possibility of comparisons between them is not severely compromised.

Study limitations should be taken into consideration when interpreting these results. The sample was predominantly male, which may limit the generalizability of the present findings; however, this is generally representative of HIV-infected samples. Nevertheless, it is possible that different results would have been obtained if more women had been included. Results regarding sexual minorities may also be limited by the high proportion of individuals living with HIV, which may not be representative of the sexual minority population as a whole. Despite these limitations, original contributions from the present study should be noted. Previous studies have primarily assessed DIF by AIDS diagnosis, but not HIV diagnosis (Gay et al., 2016), and these findings help fill this gap in the literature. In addition, item response theory analyses were used to detect DIF, which has the advantage of being sample independent when there is sufficient heterogeneity in the sample (Hambleton et al., 1991). This ethnically and racially diverse sample contributed to that end, including a substantial number of White and Black African Americans, as well as individuals of Hispanic ethnicity.

Conclusions

DIF by sexual orientation and HIV status was found for several items of the CES-D, but these levels of DIF were negligible (Jodoin & Gierl, 2001). Given that the CES-D continues to be one of the most widely used measures of the frequency and severity of depressive symptoms, it is important to continue reviewing and improving this measure, especially as depression is a critical variable in HIV, sexual minority, and mental health research. Depressive symptoms, for instance, are related to poor adherence to treatment and more adverse clinical events, among other negative outcomes for PLWH (Brandt et al., 2015; Mayston et al., 2012). For this reason, it is fundamental that depression be reliably screened and assessed in these populations without the influence of other factors that are unrelated to its measurement.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This study was funded by the National Institute on Drug Abuse, National Institutes of Health (Grant Nos. R01DA034589 and R01-DA031201). Part of this work was carried out under a Ford Foundation Fellowship to Violeta J. Rodriguez.

ORCID iD

Pablo D. Radusky

References

Adams, Zacharia, Masters, Coffey, & Catalan (2016). Mental health problems in people living with HIV: Changes in the last two decades: The London experience 1990-2014. AIDS Care, 28, 56-59. doi:10.1080/09540121.2016.1146211

Bernstein, D. P., Ahluvalia, Pogge, & Handelsman (1997). Validity of the Childhood Trauma Questionnaire in an adolescent psychiatric population. Journal of the American Academy of Child & Adolescent Psychiatry, 36, 340-348. doi:10.1097/00004583-199703000-00012

Bernstein, D. P., Stein, J. A., Newcomb, M. D., Walker, Pogge, Ahluvalia, . . . Zule (2003). Development and validation of a brief screening version of the Childhood Trauma Questionnaire. Child Abuse & Neglect, 27, 169-190.

Brandt, C. P., Bakhshaie, Zvolensky, M. J., Grover, K. W., & Gonzalez (2015). The examination of emotion dysregulation as a moderator of depression and HIV-relevant outcome relations among an HIV+ sample. Cognitive Behaviour Therapy, 44, 9-20. doi:10.1080/16506073.2014.950323

Brown, T. A. (2014). Confirmatory factor analysis for applied research. New York, NY: Guilford Press.

Carrico, A. W., Rodriguez, V. J., Jones, D. L., & Kumar, M. K. (2017). Short circuit: Disaggregation of adrenocorticotropic hormone and cortisol levels in HIV-positive, methamphetamine-using men who have sex with men. Human Psychopharmacology: Clinical and Experimental, 33. doi:10.1002/hup.2645

Chen, F. F. (2007). Sensitivity of goodness of fit indexes to lack of measurement invariance. Structural Equation Modeling, 14, 464-504. doi:10.1080/10705510701301834

Choi, S. W., Gibbons, L. E., & Crane, P. K. (2011). Lordif: An R package for detecting differential item functioning using iterative hybrid ordinal logistic regression/item response theory and Monte Carlo simulations. Journal of Statistical Software, 39(8), 1-30.

Cholera, Pence, B. W., Gaynes, B. N., Bassett, Qangule, Pettifor, . . . Miller, W. C. (2017). Depression and engagement in care among newly diagnosed HIV-infected adults in Johannesburg, South Africa. AIDS and Behavior, 21, 1632-1640. doi:10.1007/s10461-016-1442-6

Ciesla, J. A., & Roberts, J. E. (2001). Meta-analysis of the relationship between HIV infection and risk for depressive disorders. American Journal of Psychiatry, 158, 725-730.

Covic, Pallant, J. F., Conaghan, P. G., & Tennant (2007). A longitudinal evaluation of the Center for Epidemiologic Studies-Depression Scale (CES-D) in a rheumatoid arthritis population using Rasch analysis. Health and Quality of Life Outcomes, 5, Article 41. doi:10.1186/1477-7525-5-41. Retrieved from https://hqlo.biomedcentral.com/articles/10.1186/1477-7525-5-41

Devins, G. M., Orme, C. M., Costello, C. G., Binik, Y. M., Frizzell, Stam, H. J., & Pullin, W. M. (1988). Measuring depressive symptoms in illness populations: Psychometric properties of the Center for Epidemiologic Studies Depression (CES-D) Scale. Psychology & Health, 2, 139-156. doi:10.1080/08870448808400349

Gay, C. L., Kottorp, Lerdal, & Lee, K. A. (2016). Psychometric limitations of the Center for Epidemiologic Studies-Depression Scale for assessing depressive symptoms among adults with HIV/AIDS: A Rasch analysis. Depression Research and Treatment, 2016. doi:10.1155/2016/2824595. Retrieved from https://www.hindawi.com/journals/drt/2016/2824595/

Hambleton, R. K., Swaminathan, & Rogers, H. J. (1991). Fundamentals of item response theory (Vol. 2). Thousand Oaks, CA: Sage.

Herek, G. M., Saha, & Burack (2013). Stigma and psychological distress in people with HIV/AIDS. Basic and Applied Social Psychology, 35, 41-54. doi:10.1080/01973533.2012.746606

Jang, Kwag, K. H., & Chiriboga, D. A. (2010). Not saying I am happy does not mean I am not: Cultural influences on responses to positive affect items in the CES-D. Journal of Gerontology: Psychological Sciences, 65B, 684-690. doi:10.1093/geronb/gbq052

Jodoin, M. G., & Gierl, M. J. (2001). Evaluating type I error and power rates using an effect size measure with the logistic regression procedure for DIF detection. Applied Measurement in Education, 14, 329-349. doi:10.1207/S15324818AME1404_2

Kim, Chiriboga, D. A., & Jang (2009). Cultural equivalence in depressive symptoms in older White, Black and Mexican-American adults. Journal of the American Geriatrics Society, 57, 790-796.

Kline, R. B. (2015). Principles and practice of structural equation modeling. New York, NY: Guilford Press.

Korkmaz, Goksuluk, & Zararsiz (2014). MVN: An R package for assessing multivariate normality. The R Journal, 6, 151-162.

Logie, C. H., James, Tharao, & Loutfy, M. R. (2011). HIV, gender, race, sexual orientation, and sex work: A qualitative study of intersectional stigma experienced by HIV-positive women in Ontario, Canada. PLoS Medicine, 8(11), e1001124. doi:10.1371/journal.pmed.1001124

Mayston, Kinyanda, Chishinga, Prince, & Patel (2012). Mental disorder and the outcome of HIV/AIDS in low-income and middle-income countries: A systematic review. AIDS, 26, 117-135. doi:10.1097/QAD.0b013e32835bde0f

McDowell, M. J., Hughto, J. M. W., & Reisner, S. L. (2019). Risk and protective factors for mental health morbidity in a community sample of female-to-male trans-masculine adults. BMC Psychiatry, 19(1), 16. doi:10.1186/s12888-018-2008-0

Meyer, I. H. (2003). Prejudice, social stress, and mental health in lesbian, gay and bisexual populations: Conceptual issues and research evidence. Psychological Bulletin, 129, 674-697. doi:10.1037/0033-2909.129.5.674

Nacher, Andriouch, Godard Sebillotte, Hanf, Vantilcke, El Guedj, . . . Couppié (2010). Predictive factors and incidence of anxiety and depression in a cohort of HIV-positive patients in French Guiana. AIDS Care, 22, 1086-1092. doi:10.1080/09540121003599232

Nanni, M. G., Caruso, Mitchell, A. J., Meggiolaro, & Grassi (2015). Depression in HIV infected patients: A review. Current Psychiatry Reports, 17, 530-541. doi:10.1007/s11920-014-0530-49

Norcini Pala & Stecca (2011). Negative emotion and illness perception predict low CD4 counts and high viral load. Psychology & Health, 26(Suppl. 2), 183.

Park, C. L. (2013). The meaning making model: A framework for understanding meaning, spirituality, and stress-related growth in health psychology. The European Health Psychologist, 15, 40-47.

Perreira, K. M., Deeb-Sosa, Harris, & Bollen (2005). What are we measuring? An evaluation of the CES-D across race/ethnicity and immigrant generation. Social Forces, 83, 1567-1602. doi:10.1353/sof.2005.0077

Rabkin, J. G. (2008). HIV and depression: 2008 review and update. Current HIV/AIDS Reports, 5, 163-171.

Radloff, L. S. (1977). The CES-D Scale: A self-report depression scale for research in the general population. Applied Psychological Measurement, 1, 385-401. doi:10.1177/014662167700100306

Rodriguez, V. J., Butts, S. A., Mandell, L. N., Weiss, S. M., Kumar, & Jones, D. L. (2019). The role of social support in the association between childhood trauma and depression among HIV-infected and HIV-uninfected individuals. International Journal of STD & AIDS, 30, 29-36.

Schroevers, M. J., Sanderman, van Sonderen, & Ranchor, A. V. (2000). The evaluation of the Center for Epidemiologic Studies Depression (CES-D) scale: Depressed and positive affect in cancer patients and healthy reference subjects. Quality of Life Research, 9, 1015-1029.

Shafer, A. B. (2006). Meta-analysis of the factor structures of four depression questionnaires: Beck, CES-D, Hamilton, and Zung. Journal of Clinical Psychology, 62, 123-146. doi:10.1002/jclp.20213

Shinar, Gross, C. R., Price, T. R., Banko, Bolduc, P. L., & Robinson, R. G. (1986). Screening for depression in stroke patients: The reliability and validity of the Center for Epidemiologic Studies Depression Scale. Stroke, 17, 241-245.

Simoni, J. M., Safren, S. A., Manhart, L. E., Lyda, Grossman, C. I., Rao, . . . Wilson, I. B. (2011). Challenges in addressing depression in HIV research: Assessment, cultural context, and methods. AIDS and Behavior, 15, 376-388. doi:10.1007/s10461-010-9836-3

Stansbury, J. P., Ried, L. D., & Velozo, C. A. (2006). Unidimensionality and bandwidth in the Center for Epidemiologic Studies Depression (CES-D) Scale. Journal of Personality Assessment, 86, 10-22. doi:10.1207/s15327752jpa8601_03

Sumari-de Boer, I. M., Sprangers, M. A. G., Prins, J. M., & Nieuwkerk, P. T. (2012). HIV stigma and depressive symptoms are related to adherence and virological response to antiretroviral treatment among immigrant and indigenous HIV infected patients. AIDS and Behavior, 16, 1681-1689. doi:10.1007/s10461-011-0112-y

Tsutsumi, Iwata, Watanabe, de Jonge, Pikhart, Fernández-López, J. A., . . . Siegrist (2009). Application of item response theory to achieve cross-cultural comparability of occupational stress measurement. International Journal of Methods in Psychiatric Research, 18, 58-67. doi:10.1002/mpr.277

Valentine, S. E., & Shipherd, J. C. (2018). A systematic review of social stress and mental health among transgender and gender non-conforming people in the United States. Clinical Psychology Review, 66, 24-38. doi:10.1016/j.cpr.2018.03.003

Wells, V. E., Klerman, G. L., & Deykin, E. Y. (1987). The prevalence of depressive symptoms in college students. Social Psychiatry, 22, 20-28.

Yang, F. M., & Jones, R. N. (2007). Center for Epidemiologic Studies–Depression Scale (CES-D) item response bias found with Mantel-Haenszel method was successfully replicated using latent variable modeling. Journal of Clinical Epidemiology, 60, 1195-1200. doi:10.1016/j.jclinepi.2007.02.008

Yu, N. X., Chen, Ye, Li, & Lin (2016). Impacts of making sense of adversity on depression, posttraumatic stress disorder, and posttraumatic growth among a sample of mainly newly diagnosed HIV-positive Chinese young homosexual men: The mediating role of resilience. AIDS Care, 29, 79-85. doi:10.1080/09540121.2016.1210073