Abstract
Introduction
The validity and reliability of the Japanese Interest Checklist for the Elderly were examined.
Method
687 participants responded, using the new scale system: ‘currently participate based on interest,’ ‘participate less because of health status’ or ‘no interest.’ The convergent and discriminant validity of the factorial structure were examined using two-stage Confirmatory Factor Analysis approaches. The discriminant validity and reliability of the scale system were examined using two-stage Item Response Theory approaches.
Results
The first Confirmatory Factor Analysis stage indicated values representing good (factor loadings: 0.99–0.75; Comparative Fit Index: 0.99; Tucker-Lewis Index: 0.98) to adequate (Root Mean Square Error of Approximation: 0.054) fit levels. Both discriminant validity and convergent validity were high. The second Confirmatory Factor Analysis stage, with a Path analysis and consideration of age and gender, indicated values representing a good fit (factor loadings: 0.99–0.78; Comparative Fit Index: 0.99; Tucker-Lewis Index: 0.98; Root Mean Square Error of Approximation: 0.048). The first Item Response Theory stage indicated values for discriminant validity in the expected ranges; however, it displayed lower reliability in some activities. The second Item Response Theory stage, with the latent-class model-based multi-group Item Response Theory, confirmed the pattern of invariance.
Conclusion
The factorial structure was valid across different groups of people. The scale system has to be improved.
Introduction
According to the Model of Human Occupation (MOHO; Kielhofner, 2008), interests are defined as what a client finds enjoyable or satisfying to do. Inquiring about the client’s interests is essential to a successful initial occupational therapy evaluation and to ongoing treatment planning from a client-centered perspective (Kielhofner, 2008). The Neuropsychiatric Institute Interest Checklist (NPIIC; Matsutsuyu, 1969), the classical interest checklist, was developed to assess a client’s self-reported interests. The NPIIC proposed five factors (activities of daily living; manual skills; cultural/educational; physical sports; and social recreation) and a scale system (strong, casual or no interest).
Issues in interest checklists
The NPIIC has been used routinely in association with the MOHO. However, the NPIIC has been criticized in terms of its adaptability to developmental levels and gender (Katz, 1988; Klyczek et al., 1997; Rogers et al., 1978). Moreover, the underlying factors are not theoretically consistent with the way that the MOHO construes interests within the broader category of volition. Various interest checklists based on the NPIIC have been developed for specific target populations; however, issues involving inconsistency in the factorial structure of these instruments and problems with adaptability of the scale system remain.
Given these difficulties with the NPIIC (Katz, 1988; Klyczek et al., 1997; Rogers et al., 1978), Nakamura-Thomas and Yamada (2011) studied the relevance of the Japanese Interest Checklist for the Elderly (JICE) in an elderly population. The JICE demonstrated independence between factors in a sample of 967 community-dwelling older Japanese people. The identified six factors were activities parallel to daily living (APDL), pleasurable outings, cultural/educational activities, entertainment activities, nature-related activities and social activities (Nakamura-Thomas and Yamada, 2011).
In terms of the NPIIC scale system, a bigger issue seems to be a gap between the scale system and the propositions (for instance, interests can sustain action and interests reflect self-perception). Rogers et al. (1978) proposed a 5-point Likert scale system (like very much, like, indifferent, dislike or dislike very much) for adolescent clients. The system (Rogers et al., 1978) seemed to be guided by one of the NPIIC’s propositions (interests evoke affective response); it was adopted in the Occupational Questionnaire (Smith et al., 1986) but not in later interest checklists.
Kielhofner (2008) proposed a scale system that asks participants about their degree of interest in each activity in the past year, whether they currently participate and whether they would like to participate in the future. The scale is rooted in the theoretical concept of the MOHO that interests reflect tastes generated from the cycle of anticipating, choosing, experiencing and interpreting one’s action (Kielhofner, 2008). Using this system, a study (Nakamura-Thomas, Kyougoku and Forsyth, 2014) employed a new set of scales (having or not having interest, current participation or desire for future participation) to identify relationships between these scales. Responses in interest, current participation and desire for future participation were analyzed across each activity and gender. Participants were 375 community-dwelling older Japanese people. Interest, current participation and desire for future participation correlated significantly and positively in each activity for both genders. Moreover, numbers in interest parameters (having or not having interest, current participation and wish for future participation) for each JICE factor, excluding the factor of entertainment activities, correlated significantly and positively with the scores in subjective health-related quality of life, measured by MOS Short-Form 36-Item Health version 2 (Nakamura-Thomas and Kyougoku, 2013).
Aims
Interest checklists, as one of the most frequently used assessments (Lee et al., 2008), have to be rigorously examined in various ways to provide useful information for clinical application. The JICE therefore had to be examined to verify that it is a promising instrument and to identify potential directions for further revision.
The first objective of this study was to verify the JICE’s factorial structure through a Confirmatory Factor Analysis (CFA) approach, a multivariate statistical method that is commonly used to test whether a proposed structure fits the sample data (Brown, 2006). The second objective was to examine a new scale system, established for this study to address the cycle proposed in the MOHO (Kielhofner, 2008), through an Item Response Theory (IRT) approach. In this study, the activities listed in the JICE are referred to as items. The third objective was to confirm the robustness of the results for the first two objectives through a Path analysis and a multi-group analysis based on a latent class model.
Method
Ethics approval
This research protocol was reviewed and approved by the Research Ethics Committee of the first author’s institution.
Participants
Participants were community-dwelling, retired older Japanese people in a large suburban area of the capital city. They were recruited through flyers and posters distributed by municipal officers in the area. They were enrolled as study participants after providing informed consent. People unable to respond independently were excluded from this study.
Data collection
Data collection format.
APDL = Activities Parallel to Daily Living, PLE = Pleasurable Outings, CUL = Cultural/Educational Activities, NAT = Nature-related Activities, SOC = Social Activities, ENT = Entertainment Activities.
Data analysis
The responses were converted to numerical form as follows: 1 = currently participate on the basis of my interest, 2 = participate less because of health status, 3 = do not bother to participate because of no interest. Only data from participants with no missing responses were analyzed. To observe overall patterns, response frequencies (%) in each scale were examined for each activity. We conducted the following analyses with Mplus version 7.3 (Muthén and Muthén, 2001).
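The coding and screening steps described above can be sketched in a few lines of Python. This is only an illustration: the shortened label strings, the helper names and the four example participants are assumptions made for the sketch, and the study itself used Mplus rather than this code.

```python
from collections import Counter

# Numerical coding described in the text; the shortened labels are assumed wording.
CODES = {
    "currently participate": 1,
    "participate less": 2,
    "no interest": 3,
}

def encode(responses):
    """Map one participant's answers to codes; an unknown or missing answer becomes None."""
    return [CODES.get(r) for r in responses]

def complete_cases(rows):
    """Keep only participants with no missing data."""
    return [row for row in rows if None not in row]

def response_frequencies(rows, item_index):
    """Percentage of each code (1, 2, 3) for one activity."""
    counts = Counter(row[item_index] for row in rows)
    total = sum(counts.values())
    return {code: 100.0 * counts.get(code, 0) / total for code in (1, 2, 3)}

# Hypothetical data: four participants, two activities.
raw = [
    ["currently participate", "participate less"],
    ["currently participate", "no interest"],
    ["participate less", None],  # dropped: one answer is missing
    ["currently participate", "participate less"],
]
coded = complete_cases([encode(r) for r in raw])
freq_first = response_frequencies(coded, 0)
freq_second = response_frequencies(coded, 1)
```

Dropping incomplete cases before computing frequencies keeps the descriptive percentages and the later model estimates based on the same set of participants.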
Overall approaches
We used CFA and IRT approaches to meet the objectives in this study for the following reasons. Firstly, a polychoric correlation was used for examining a correlation between factors with a Likert scale system in IRT approaches. A Pearson correlation and Cronbach’s alpha have traditionally been used in factor analysis; however, they make the determination of factor numbers and a model fit level uncertain for factors with such a system (Flora et al., 2012; Schmitt, 1996). Secondly, invariance levels were examined using IRT approaches, as CFA approaches account for covariance between items (Reise et al., 1993). Thirdly, the robustness of item parameter estimation was confirmed using IRT approaches (Rose et al., 2014). Lastly, an IRT approach is free from sample dependency. It identifies item characteristics by providing higher sensitivity of estimation in items (item discrimination and difficulty) and graphs which depict the true relationship between a latent trait and the responses to the item (Yang and Kao, 2014).
Factorial structure
We performed two-stage CFA approaches using a robust weighted least squares estimation method. This estimator performs best in the CFA modeling of categorical data (Brown, 2006). Model fit indices were factor loadings, Comparative Fit Index (CFI), Tucker-Lewis Index (TLI) and Root Mean Square Error of Approximation (RMSEA). No limitation was set up to obtain those values.
Factor loadings indicate the relationship between latent variables and observed variables (Brown, 2006). A value above 0.95 was expected both in CFI and TLI as representing a good fit. CFI assesses goodness-of-fit adjusted for model complexity or parsimony. TLI is based on comparisons between a null model and an incrementally more complex model. A value below 0.05 represents a good fit and a value between 0.05 and 0.08 represents an adequate fit in RMSEA. It is an absolute measure of fit, not considered to be highly sensitive to sample size or distribution (Munro, 2005).
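The cut-offs stated above can be written as a small decision rule. The sketch below is a minimal illustration, assuming hypothetical function names and an assumed label for values that fall outside the ranges the text defines; it is applied to the fit values reported for the two CFA stages.

```python
def classify_incremental(value, good=0.95):
    """CFI or TLI: a value above `good` is read as a good fit."""
    return "good" if value > good else "below good-fit threshold"

def classify_rmsea(value, good=0.05, adequate=0.08):
    """RMSEA: below 0.05 is good, 0.05 to 0.08 is adequate.

    The label for larger values is an assumption; the text defines
    only the good and adequate ranges.
    """
    if value < good:
        return "good"
    if value <= adequate:
        return "adequate"
    return "outside stated ranges"

# Fit values reported for the first and second CFA stages.
first_stage = {"CFI": 0.99, "TLI": 0.98, "RMSEA": 0.054}
second_stage = {"CFI": 0.99, "TLI": 0.98, "RMSEA": 0.048}

first_rmsea = classify_rmsea(first_stage["RMSEA"])
second_rmsea = classify_rmsea(second_stage["RMSEA"])
```

Keeping the thresholds as default arguments makes it easy to see how the verdict would change under a stricter or looser convention.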
Convergent validity
We performed a simple CFA approach as the first stage in this process. We then observed values in the average square of the standardized coefficients for each factor and the r-square for each activity. The standardized coefficients reflect the weight associated with standardized scores on the variables (Munro, 2005). For convergent validity, a value of 0.5 or higher is recommended (Hair et al., 2015). An r-square for each activity, ranging between 0 and 1, indicates the proportion of information in the target variable that is explained. When an r-square is 0.927, 93% of the activity can be explained by the factor, and the remaining 7% is explained by other factors (Kosugi and Shimizu, 2014).
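As a worked illustration of these two quantities, the sketch below computes the average square of the standardized coefficients for one factor and the r-square of a single activity. The loadings are hypothetical, and the r-square calculation assumes a standardized solution in which each activity loads on one factor only, so that the r-square equals the squared loading.

```python
def average_squared_loading(loadings):
    """Average of the squared standardized coefficients for one factor."""
    return sum(l ** 2 for l in loadings) / len(loadings)

def r_square(loading):
    """Share of an activity's variance explained by its factor.

    Assumes a standardized solution with a single loading per activity.
    """
    return loading ** 2

# Hypothetical standardized loadings for one factor (not values from the study).
loadings = [0.99, 0.90, 0.80, 0.75]
ave = average_squared_loading(loadings)

# The convergent validity check described in the text: 0.5 or higher.
convergent_ok = ave >= 0.5

# A loading of about 0.963 corresponds to the r-square of 0.927 used as an example.
example_r2 = r_square(0.927 ** 0.5)
unexplained = 1.0 - example_r2
```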
Discriminant validity
We compared the values in the average square of the standardized coefficients for each factor and the square of the correlation coefficient (r2) between factors. An r2 between factors is a useful value in linear regression. It indicates the estimation which represents how much of the variability in the response variable is explained by the explanatory variable, ranging between 0 and 1 (Lang and Secic, 2006).
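One common way to make this comparison is to require that each factor's average squared standardized coefficient exceed the squared correlation it shares with another factor. The text does not spell out the decision rule, so the direction of the inequality below is an assumption, and the function name and numbers are hypothetical.

```python
def discriminant_ok(ave_a, ave_b, correlation):
    """True when both factors' average squared loadings exceed
    the squared inter-factor correlation (assumed decision rule)."""
    shared_variance = correlation ** 2
    return ave_a > shared_variance and ave_b > shared_variance

# Hypothetical values: two factors with average squared loadings 0.75 and 0.70.
weakly_related = discriminant_ok(0.75, 0.70, correlation=0.60)    # r2 = 0.36
strongly_related = discriminant_ok(0.75, 0.70, correlation=0.90)  # r2 = 0.81
```

Under this rule, two factors count as distinct when each explains more variance in its own activities than it shares with the other factor.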
Robustness
We performed the second stage of a CFA approach with a Path analysis and consideration of age and gender. We established a Path diagram, using Ωnyx version 1.0 (University of Virginia & Max Planck Institute for Human Development).
Scale system
We performed two-stage IRT approaches using the maximum likelihood with robust standard error as the estimator.
Discriminant validity
We performed the first stage of an IRT approach to examine item discrimination and difficulty in each activity with a graded response model in this process. Item discrimination allows for the determination of how well items identify respondents at different levels of the latent trait, ranging between 0.5 and 2.5 (Yang and Kao, 2014).
Item difficulty describes how difficult it is to achieve a 0.5 probability of a correct response for a specific item given the respondent’s score on the latent variable, ranging between −4.0 and 4.0. A higher value means the item is difficult to respond to, while a lower value means the reverse (Yang and Kao, 2014).
In this study, item difficulty was obtained in the second and third scales for each activity. We intended to revise the factorial structure and scale system if the values of those indices fell outside the expected ranges: a value between 0.5 and 2.5 for item discrimination and a value between −4.0 and 4.0 for item difficulty.
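To show how discrimination and difficulty shape the responses on a three-scale item, the sketch below evaluates a graded response model in its usual logistic form, with one discrimination value and two ordered difficulty values (for the second and third scales). The parameter values are hypothetical, and the parameterization is an assumption: Mplus reports thresholds in a different but equivalent form.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def grm_category_probs(theta, a, thresholds):
    """Probability of each response category under a graded response model.

    theta: latent trait level; a: item discrimination;
    thresholds: ordered item difficulties, one per boundary between scales.
    """
    # Cumulative probabilities of responding in category k or higher.
    cum = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

def in_expected_ranges(a, thresholds):
    """Ranges stated in the text: 0.5 to 2.5 and -4.0 to 4.0."""
    return 0.5 <= a <= 2.5 and all(-4.0 <= b <= 4.0 for b in thresholds)

# Hypothetical item: discrimination 1.5, difficulties -0.5 and 1.0.
probs = grm_category_probs(theta=0.0, a=1.5, thresholds=[-0.5, 1.0])
at_first_threshold = grm_category_probs(theta=-0.5, a=1.5, thresholds=[-0.5, 1.0])
```

When the trait level equals a difficulty value, the probability of responding at or above that boundary is exactly 0.5, which is the definition of difficulty given above.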
Reliability
We observed the following graphical data: test information curve (TIC); item information curves (IIC); and item response category characteristics curve (IRCCC). TIC is derived from the sum of the individual items’ information curves and indicates the total test information. IIC reports the amount of information in an item, showing how much information each item produces for measuring the latent trait. IRCCC reports an item’s characteristics, indicating how well the scale system works. In IRCCC, the horizontal axis is fixed and ranges from −3 to +3 logits. Curves represent the ordinal categorical data (Toyoda, 2012).
A curve tilted to the left means the item is relatively easy to respond while a curve tilted to the right means the reverse. A steep curve means that the item discrimination is higher and differences of response probabilities between the ordinal categorical data are bigger, suggesting higher reliability. A slow curve means the reverse (Toyoda, 2012).
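The relationship between the item curves and the test curve can be sketched as follows. The code evaluates the information function of a graded response model at one trait level and sums it over items to obtain the test information. The item parameters are hypothetical and the logistic parameterization is an assumption, not the study's estimated model.

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def grm_item_information(theta, a, thresholds):
    """Item information at theta: sum over categories of (dP/dtheta)^2 / P."""
    cum = [1.0] + [logistic(a * (theta - b)) for b in thresholds] + [0.0]
    # Slope of each cumulative curve; it is zero for the fixed end points.
    dcum = [a * p * (1.0 - p) for p in cum]
    info = 0.0
    for k in range(len(cum) - 1):
        p = cum[k] - cum[k + 1]
        dp = dcum[k] - dcum[k + 1]
        info += dp * dp / p
    return info

def total_information(theta, items):
    """Test information: the sum of the item information values."""
    return sum(grm_item_information(theta, a, b) for a, b in items)

# Hypothetical items as (discrimination, [difficulties]) pairs.
items = [(1.5, [-0.5, 1.0]), (0.8, [0.0, 1.5]), (2.0, [-1.0, 0.5])]
grid = [x / 2.0 for x in range(-6, 7)]  # -3 to +3 logits, as in the text
tic = [total_information(t, items) for t in grid]
```

Because information grows with the square of the discrimination, a steeper curve yields a higher and narrower information peak, which is why steep curves are read as higher reliability.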
Robustness
We performed a latent-class model-based multi-group IRT as the second stage in this process to examine invariance levels. We examined pattern invariance (to verify whether the groups had the same factor structure), metric invariance (the same factor loadings) and measurement invariance (the same indicator intercepts, such as factor loadings, variance and covariance). The model with the smaller values of the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) was selected (Munro, 2005).
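The selection rule can be illustrated with the standard definitions of the two criteria, AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of free parameters, n the sample size and ln L the log-likelihood. In the sketch below, the log-likelihoods and parameter counts are hypothetical placeholders; only the sample size of 687 comes from the study.

```python
import math

def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    return n_params * math.log(n_obs) - 2 * log_likelihood

def preferred(criterion_by_model):
    """Name of the model with the smallest criterion value."""
    return min(criterion_by_model, key=criterion_by_model.get)

# Hypothetical (log-likelihood, number of free parameters) per model.
candidates = {
    "pattern": (-10500.0, 100),
    "metric": (-10690.0, 78),
    "measurement": (-10700.0, 75),
}
n_obs = 687  # sample size reported in the study

aic_values = {m: aic(ll, k) for m, (ll, k) in candidates.items()}
bic_values = {m: bic(ll, k, n_obs) for m, (ll, k) in candidates.items()}
best_by_aic = preferred(aic_values)
best_by_bic = preferred(bic_values)
```

BIC penalizes each extra parameter by ln n rather than 2, so with several hundred participants it favors more constrained models more strongly than AIC does.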
Results
Participant demographics
There were 687 participants in total: 468 (68.1%) were women and 219 (31.9%) were men. We compared participants according to age groups as older people are grouped in the younger group (between 65 and 74 years old) and older group (75 years old and older) under the current social system in Japan. Among the women, 50.2% were in the younger group and 49.8% were in the older group. Among the men, 44.3% were in the younger group and 55.7% were in the older group. There was no statistically significant difference between genders in the mean age of each age group (younger women, 69.2 ± 3.1, and younger men, 69.1 ± 3.2; older women, 79.9 ± 3.8, and older men 80.0 ± 3.8).
Overall patterns of responses
Across activities, responses in the third scale (do not bother to participate) were below or slightly above 10%. The only exception was sewing among men (Figure 1).
Response frequencies (%) in each scale according to activities (n = 687).
Factorial structure
The first CFA stage indicated values representing a good fit as follows: between 0.99 and 0.75 for factor loadings (all p < 0.001), 0.99 for CFI and 0.98 for TLI. Only RMSEA indicated an adequate fit level: 0.054 (90% CI: 0.050, 0.058).
Values in the average square of the standardized coefficient in each factor and r2 between factors.
Values in the r-square in each activity.
Figure 2 shows the path diagram. Taken along with the results of the first stage, the robustness of the factorial structure was supported by the following values. The second CFA stage indicated values representing a good fit as follows: between 0.99 and 0.78 for factor loadings (all p < 0.001), 0.99 for CFI, 0.98 for TLI and 0.048 (90% CI: 0.044, 0.052) for RMSEA. Correlations between gender and each factor (APDL, pleasurable outings, cultural/educational, nature-related, social and entertainment activities) were −0.28, 0.14, 0.08, 0.11, 0.03 and 0.05 respectively. Correlations between age and each factor were 0.26, 0.30, 0.24, 0.30, 0.03 and 0.25 respectively. Those correlations suggested that gender and age influenced interest responses in the JICE’s factors at very low or negligible levels.
A path diagram.
Scale system
Values in item parameters in each activity.
SE = Standard error, 2nd scale = participate less, 3rd scale = do not bother to participate.
The TIC displayed a steep curve, suggesting higher reliability. The peak was tilted slightly to the right on the horizontal axis, suggesting that, as a general characteristic, the system was slightly difficult to respond to (Figure 3).
The test information curve.
Figure 4 presents a part of the individual IIC for each activity. As an overall characteristic, the peak of the graph was tilted slightly to the right on the horizontal axis, suggesting the system was slightly difficult to respond to. A steep curve without prominent bimodality was observed in the following three activities, suggesting higher reliability: visiting acquaintances, traveling and driving. In other activities, a steep curve with bimodality or a slow curve was observed, suggesting the importance of observing the individual IRCCC.
Item information curves (a part).
Figure 5 presents a part of the individual IRCCC for each activity based on the estimated item parameters in Table 3. The example IRCCC are enjoying literature activities as an ideal one and watching TV/movies as a least ideal one. In the following four activities, nearly ideal curves were obtained: socializing with the opposite gender, enjoying literature activities, playing gate ball and playing ground-golf. In those activities, the observed metric curve had a higher peak and the peak was near zero on the horizontal axis, indicating that each scale worked well and suggesting higher reliability. In terms of gardening/growing vegetables and watching TV/movies, the metric curve was prominently lower, indicating that the second scale (participate less) did not work well and suggesting lower reliability. In the remaining nineteen activities, the peak of the metric curve varied in a moderate range (between relatively higher and relatively lower), suggesting that the reliability was moderate and varied between activities.
The item response category characteristics curve in each activity (a part).
Among the invariance levels examined, pattern invariance yielded smaller AIC and BIC values than the others, supporting the robustness of the results obtained for the factorial structure. The obtained values were as follows: 21289.9 in AIC, 21747.6 in BIC for the pattern invariance; 21554.9 in AIC, 21908.5 in BIC for the metric invariance; 21550.9 in AIC, 21895.4 in BIC for the measurement invariance.
Discussion
This study investigated the factorial structure and scale system of the JICE through examination with several approaches. Our most important finding was the confirmation of pattern invariance, indicating that the factorial structure was valid across different groups of people. This study, meanwhile, identified that factor loadings, variance and covariance differed between groups. This result has to be considered in future studies.
The first stage of our CFA approach displayed good to adequate fit levels, suggesting that we were able to understand the responses using the factorial structure. The second stage, which took gender and age into consideration, proved the robustness of the results obtained in the first stage. Interest responses have been previously reported to be dependent on gender (Katz, 1988; Klyczek et al., 1997; Nakamura-Thomas and Yamada, 2011; Rogers et al., 1978). Our new insight makes the JICE free from the complexity of analyzing data separately between men and women in future studies.
The indicated higher discriminant validity in the first IRT approach suggested that the scale system was able to be used for understanding a client’s activity participation according to the scales. An interest checklist is used to identify whether individuals participate in activities based on their interests or not (Kielhofner, 2008). In order to identify the activity participation, a previous JICE scale system (strong, casual or no interest) had to be revised. Our results are consistent with the aim of the utilitarian, suggesting the higher discriminant validity of the JICE was beneficial for clinical application.
Our TIC indicated higher reliability as an overall characteristic of the scale system; meanwhile, the second scale (participate less because of health status) did not work well in some activities. The scale system in this study did not employ additional questions to ask participants to indicate their degree of ‘less.’ Asking about frequency is difficult because the frequency of participation would differ between activities, for instance, daily, weekly or monthly (Kielhofner, 2008). No interest checklist defines the frequency. This ambiguity and complexity may have lowered the reliability. Meanwhile, among those activities in which the second scale did not work well, watching TV/movies, listening to music and singing, grouped in the factor of entertainment activities, seemed to be an exception to the frequency issue, as the factor did not correlate significantly with subjective health-related quality of life (Nakamura-Thomas and Kyougoku, 2013).
The JICE has to be further revised for both clinical friendliness and usefulness in understanding a client’s interest-based activity participation. The JICE is designed for older people with potential issues in occupational participation; therefore, a system with fewer scale points and a simple form is ideal for those clients. The number of qualitative studies based on interest checklists is limited (Nakamura-Thomas and Yamada, 2008), although the importance of individual interviews is clearly stated in the NPIIC (Matsutsuyu, 1969). The aim of individual interviews in the NPIIC is to summarize a client’s history to guide the relationship between perceived performance capacity and activity choice. The importance of individual interviews is underscored by the finding that some participants preferred not to participate in an activity in the future although they currently participated based on their interest, and some participants responded the reverse (Nakamura-Thomas et al., 2014). Individual interviews may reveal the cycle a client perceives (Kielhofner, 2008). Taken along with those concerns, the JICE has to be improved, including the name (a checklist).
Limitations and future research directions
Random sampling is recommended in many studies. This study is expected to overcome the limitation as it uses a multi-group analysis based on a latent-class model. Future studies may randomly select samples from a data bank in which the authors have accumulated large samples from many areas.
Conclusion
This study examined the pre-specified factorial structure and the newly employed scale system, using CFA and IRT approaches. The factorial structure was valid and the robustness of the results in terms of the factorial structure was sustained. The system was able to discriminate between the people who participated in activities based on interest and the people who did not; however, the scale was found to need improvement because of lower reliability in some activities. An interview guide may have to be created to truly understand a client’s interest-based activity participation.
Key findings
The pre-specified factorial structure in the target instrument was valid across different groups of people. The scale system possesses higher discriminant validity, although it has to be improved.
What the study has added
An interview guide may have to be created to truly comprehend interest-based activity participation to address the cycle of interests, taken along with the concern about the scale system.
Footnotes
Acknowledgements
The authors express their appreciation to the participants and research assistants for their cooperation.
Research ethics
Ethical approval was given by Saitama Prefectural University, Saitama, Japan (No. 22025, July 5th 2013).
Declaration of conflicting interests
The authors confirm that there are no conflicts of interest.
Funding
This research received no specific grant support from any funding agency in the public, commercial, or not-for-profit sectors.
