Persian version of executive function performance test for people with multiple sclerosis: A psychometric study

Abstract

Introduction:

Multiple Sclerosis (MS) is one of the most common neurological disorders, and it has a wide effect on performance. MS affects higher cognitive functions, interfering with independence and participation. The Executive Function Performance Test (EFPT) is a functional test that assesses executive function in performance. This study aims to translate and prepare a preliminary version of EFPT, assess its face and content validity, and examine inter-rater and intra-rater reliability and internal consistency.

Methods:

The English version of EFPT was translated into Persian using the IQOLA standard method. To assess face validity, participants evaluated the test’s appearance and functionality. The content validity ratio and index were analyzed using the Lawshe method. For reliability assessment, 30 individuals with MS were evaluated over a 2-week interval, and agreement between the two therapists was also examined.

Results:

For each category of the EFPT, the ICC ranged from 0.964 to 0.998 between the two examiners (inter-rater) and from 0.758 to 0.872 within examiners (intra-rater). For test-retest reliability, the ICC of each score ranged from 0.877 to 0.977. Cronbach’s alpha was above 0.7 for all items.

Conclusion:

The results showed that the Persian version of the EFPT was valid and reliable for participants with MS.

Keywords

psychometric study, multiple sclerosis, reliability and validity, occupational therapy, executive function

Introduction

Executive function is essential for effective performance and participation in daily activities (Bass et al., 2024). It includes skills such as sequencing, planning, and organization (Baughman et al., 2015; Carlson et al., 1999), all of which are crucial for creating, maintaining, and retaining meaningful occupations in life (Jefferson et al., 2006).

Executive function underlies higher-level cognitive abilities such as decision-making, self-correction, and judgment (Burgess, 2000; Burgess et al., 2006). These abilities are necessary for independent living, as they help individuals establish life goals, plan how to achieve them, and strive to accomplish them (Kaye et al., 1990; Lezak, 2004; Rose et al., 2012). Deficits in executive function can affect Activities of Daily Living (ADLs) and may also impact Instrumental Activities of Daily Living (IADLs), which involve more complex executive tasks (Roley et al., 2008). As a result, such deficits can directly influence an individual’s participation and, consequently, their quality of life.

Individuals with Multiple Sclerosis (MS) face various challenges in their daily lives, particularly regarding their IADLs (Goverover et al., 2009). These difficulties may stem from physical or cognitive issues associated with MS symptoms (Baughman et al., 2015; Goverover et al., 2005; Rao et al., 1991). Previous studies indicate that individuals with cognitive disorders experience difficulties in their occupations; however, there is insufficient evidence regarding the impact of cognitive deficits on IADLs in people with MS (Allataifeh et al., 2020; Baughman et al., 2015). Activities such as parenting, work, or social engagement all require skills such as attention, organization, planning, initiation, and other executive function components. This highlights the need to use standardized executive function scales to assess people whose participation in IADLs is limited.

A few cognitive assessments have been developed and are used by occupational therapists to assess cognitive skills within activities, such as leather lacing in the Allen Cognitive Level test (Velligan et al., 1998), or functional IADL tools such as the Assessment of Motor and Process Skills (Oakley and Sunderland, 1997). However, none of these tools can assess the level of cognitive skills if the client cannot perform the activity alone and requires external assistance. The Kitchen Task Assessment (KTA) assesses functional executive function within a cooking activity (Baum and Edwards, 1993).

Although each of these tools provides valuable feedback on executive functioning within a controlled setting, executive function is closely tied to the type of occupation a person pursues, and cultural and societal norms can strongly affect how it is defined, how it is performed, and what counts as a deficit. For example, planning or punctuality might be a highly valued trait in one culture, so that impairment of this behavior shapes a client’s life; in another culture, it may be ignored or considered less important. In addition, some occupations, such as cooking, are culturally more important in certain contexts and require closer examination to ensure proper functioning for the client. Given these cultural and contextual factors, current assessments may not predict an individual’s functional abilities in real-life situations, because real life is too complex and multidimensional to be evaluated with non-functional scales.

The Executive Function Performance Test (EFPT) is a functional assessment designed to address and resolve issues related to the KTA. In addition, the EFPT includes several activities, such as cooking, managing medications, using a phone, and paying bills. This test also allows for the assessment of cognitive skills while utilizing external support and assistance (Abdollahipour et al., 2016). This test has been studied in individuals with schizophrenia, traumatic brain injuries, cerebrovascular accidents, and substance abuse (Baum et al., 2008; Katz et al., 2007; Raphael-Greenfield, 2012).

Recognizing the importance of participation, specifically in IADL occupations, for maintaining independence, and the impact of executive function on IADLs, we aimed to evaluate the psychometric properties of the EFPT for individuals with MS. Given the importance of context for performance, we adapted the assessment to the cultural context of our participants. This assessment is intended to improve the identification of cognitive skills related to daily living and to ensure timely occupational therapy services, supporting the health care team in deciding to discharge people with MS once they are identified as able to live on their own. The primary goal of this study was to determine the validity and reliability of the EFPT for people with MS within a Persian context. This validation contributes to the global applicability of the EFPT, enabling culturally sensitive, evidence-based occupational therapy assessments in diverse populations. The objectives of this study were:

To translate the test into Persian and obtain a culturally relevant test

To assess the face and content validity of the Persian version of EFPT

To determine the inter-rater and intra-rater reliability of the Persian version of EFPT

To examine the internal consistency of the Persian version of EFPT

To determine the test-retest reliability of the Persian version of EFPT

Methods

Design

This is a psychometric study.

Participants with MS

People with MS aged 18 to 60 years were recruited by convenience sampling from clinics in Saqqez, Kurdistan. After signing the informed consent form, participants were screened with the Mini-Mental State Examination (MMSE) and were included if they scored higher than 24, indicating no cognitive disorder. Their medical records were checked to confirm that they did not have severe depression, as it can contribute to IADL difficulties. People whose medical history showed a major neurological disorder such as Parkinson’s disease, dementia, or cerebrovascular accident (CVA) were excluded. Although we collected data on MS type, recruitment was not restricted by MS type.

Expert participants

Twelve experts in the field of neurological rehabilitation were recruited through snowball sampling to take part in the content validity study. Of these 12 experts, 4 had a PhD in occupational therapy, and 8 had a master’s degree in occupational therapy.

Ten occupational therapists working in adult neurology clinics were also recruited through snowball sampling to take part in the face validity study.

EFPT full definitions

The EFPT assesses various aspects of executive function, including self-initiation, organization, sequencing, judgment, and accomplishment. Each of these components will be scored individually, and a cumulative score will also be calculated.

Participants will engage in four primary tasks after a handwashing screening. If they successfully complete the handwashing task, they will proceed to the next tasks, which include simple cooking, using a phone, managing medication, and paying two bills. The cooking task has been adapted to reflect Iranian culture by incorporating a meal more common in that context, since the original test was not designed for Persian speakers.

The test generates three types of scores: one for each executive function component, one for each individual task, and one overall score. In each of the four tasks, performance is rated in the sub-categories of self-initiation, organization, sequencing, judgment, and accomplishment, with each category scored on a scale from 0 to 5. Consequently, each executive function component score, summed across the four tasks, ranges from 0 to 20; each task score, summed across the five components, ranges from 0 to 25; and the overall score ranges from 0 to 100.
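The scoring arithmetic described here can be sketched in a few lines of Python. This is only an illustration, assuming five components rated 0 to 5 on each of four tasks; the function name, dictionary keys, and example ratings are hypothetical and are not taken from the EFPT manual.

```python
# Hypothetical labels for the four tasks and five components named in the text.
TASKS = ("simple_cooking", "using_phone", "managing_medication", "paying_bills")
COMPONENTS = ("self_initiation", "organization", "sequencing",
              "judgment", "accomplishment")


def summarize_efpt(scores):
    """Return (task totals, component totals, overall total).

    scores maps each task name to a dict of component -> rating (0 to 5).
    """
    for task in TASKS:
        for component in COMPONENTS:
            rating = scores[task][component]
            if not 0 <= rating <= 5:
                raise ValueError(f"rating out of range for {task}/{component}")
    # Sum over the five components within each task.
    task_totals = {t: sum(scores[t][c] for c in COMPONENTS) for t in TASKS}
    # Sum over the four tasks within each component.
    component_totals = {c: sum(scores[t][c] for t in TASKS) for c in COMPONENTS}
    overall = sum(task_totals.values())
    return task_totals, component_totals, overall


# Illustrative data only: every component rated 1 on every task.
example = {t: {c: 1 for c in COMPONENTS} for t in TASKS}
task_totals, component_totals, overall = summarize_efpt(example)
print(task_totals["simple_cooking"], component_totals["sequencing"], overall)
```

With every rating set to 5, the same function returns the largest possible totals, which is a quick way to check the score ranges implied by the rating scale.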

Study process

This study began in 2019, and sampling took place during the COVID-19 pandemic; therefore, assessments were conducted online. Ethical approval was obtained from the ethical committee of the University of Social Welfare and Rehabilitation Sciences. The study had three main phases: first, the test was translated and adapted for Persian users; second, content and face validity were assessed; and third, test-retest and inter-rater reliability were examined. All participants provided informed consent.

Translation

In this phase, the English version was translated into Persian using the IQOLA standard translation process. Two translators, fluent in Persian and English, translated the test and created a list of possible translations for some words. They then met to study the translations and agreed on a Persian version. Following that, two other translators translated the Persian version back into English. These two English versions were then discussed and combined into a single unified version.

A meeting was held to review and compare the English version to the original. All translators participated and agreed that the translation was accurate.

Validity

Two types of validity were assessed in this study: face and content validity.

Face validity

Ten experts assessed the translated tool’s appearance to evaluate its face validity. The face validity of the EFPT was gauged using the item impact score. Experts scored each item on a Likert scale from 1 (not important at all) to 5 (very important). The item impact score was then calculated as: item impact score = mean importance × proportion of experts who scored the item 4 or 5. Any item with an impact score of more than 1.5 was deemed appropriate.
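As a rough illustration of the item impact score, the sketch below multiplies the mean importance rating by the share of experts who rated the item 4 or 5. It assumes that this share is expressed as a proportion between 0 and 1 rather than as a percentage; the function name and the ratings are hypothetical.

```python
def item_impact_score(ratings, high=4):
    """Mean importance times the proportion of ratings at or above `high`."""
    if not ratings:
        raise ValueError("at least one rating is required")
    proportion_high = sum(1 for r in ratings if r >= high) / len(ratings)
    mean_importance = sum(ratings) / len(ratings)
    return proportion_high * mean_importance


# Illustrative ratings from ten hypothetical experts on a 1-to-5 scale.
ratings = [5, 4, 4, 3, 2, 5, 4, 4, 1, 5]
score = item_impact_score(ratings)
print(round(score, 2), "accept" if score > 1.5 else "revise")
```

Here seven of the ten ratings are 4 or 5 and the mean is 3.7, so the score is 0.7 × 3.7 = 2.59, which is above the 1.5 cut-off.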

Content validity

The Lawshe method, a widely recognized quantitative approach for determining content validity, was used in this study (Lawshe, 1975). In this method, a panel of experts assesses the significance and necessity of the items in the test. Both the content validity ratio (CVR) and the content validity index (CVI) were computed. For the CVR, each item is rated for its necessity in one of three categories (not necessary, beneficial but not necessary, and necessary), while for the CVI each item is rated for its relevance, simplicity, and clarity.

Following the collection of expert opinions, CVR is calculated using the formula:

\mathrm{CVR} = \frac{n_e - N/2}{N/2}

where n_e is the number of experts who selected “necessary” and N is the total number of experts.

According to the established standard scores, the acceptable CVR for a study involving 12 participants should be at least 0.56.

For CVI, the following formula was employed:

\mathrm{CVI} = \frac{\text{number of experts who scored the item 3 or 4}}{N}

where N is the total number of experts.
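The two content validity statistics can be sketched as follows. The CVR function follows Lawshe's (1975) ratio, (n_e − N/2)/(N/2), where n_e is the number of experts choosing “necessary” and N is the panel size; the CVI function assumes ratings on a scale where 3 and 4 count as endorsing the item. Function names and the example ratings are hypothetical.

```python
def content_validity_ratio(n_necessary, n_experts):
    """Lawshe's CVR: (n_e - N/2) / (N/2)."""
    if n_experts <= 0 or not 0 <= n_necessary <= n_experts:
        raise ValueError("counts must satisfy 0 <= n_necessary <= n_experts")
    half = n_experts / 2
    return (n_necessary - half) / half


def content_validity_index(ratings, endorsing=(3, 4)):
    """Share of experts whose rating is one of the endorsing values."""
    if not ratings:
        raise ValueError("at least one rating is required")
    return sum(1 for r in ratings if r in endorsing) / len(ratings)


# Illustrative case: 10 of 12 experts rate an item "necessary".
cvr = content_validity_ratio(10, 12)
print(round(cvr, 2), cvr >= 0.56)  # 0.67 True

# Illustrative ratings from 12 hypothetical experts; 11 of them are 3 or 4.
clarity = [4, 4, 3, 3, 4, 4, 3, 4, 4, 3, 4, 2]
print(content_validity_index(clarity))
```

The CVR runs from −1 (no expert selects “necessary”) to 1 (every expert does) and equals 0 when exactly half the panel does, which is why a minimum threshold that depends on panel size is needed.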

Reliability

To assess reliability, both internal consistency and test-retest reliability were examined. In addition, inter-rater and intra-rater reliability were examined to determine the consistency of scores between and within raters.

Internal consistency

To measure internal consistency, we calculated Cronbach’s alpha. A Cronbach’s alpha value above 0.9 is considered perfect, between 0.7 and 0.9 is considered good, between 0.6 and 0.7 is considered acceptable, and below 0.6 is considered poor (Nunnally and Bernstein, 1994).
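For readers unfamiliar with the statistic, Cronbach’s alpha can be computed from the item variances and the variance of the total score. The sketch below uses the standard formula with sample variances; the function names and data are hypothetical, and the assignment of boundary values (exactly 0.6, 0.7, or 0.9) to a category is an assumption, since the text does not specify it.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per participant."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two participants")
    # Sample variance of each item across participants.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of each participant's total score.
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def label_alpha(alpha):
    """Categories as listed in the text; boundary handling is assumed."""
    if alpha > 0.9:
        return "perfect"
    if alpha >= 0.7:
        return "good"
    if alpha >= 0.6:
        return "acceptable"
    return "poor"


# Illustrative data: three items that rank four participants identically.
rows = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
print(label_alpha(cronbach_alpha(rows)))
```

When all items move together, as in this toy data set, alpha reaches its maximum of 1; uncorrelated items push it toward 0.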

Test-retest reliability

To assess test-retest reliability, the Persian version of the EFPT was administered to 30 participants aged 18 to 60. After an initial assessment, the same participants were reassessed two weeks later. Their scores were compared using the absolute agreement method with a two-way random model to calculate the intraclass correlation coefficient (ICC). Reliability was categorized as very good if the ICC was higher than 0.8, moderate if between 0.6 and 0.8, and poor if less than 0.6 (Munro, 2005).
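A two-way random, absolute-agreement ICC can be computed from the mean squares of a participants-by-sessions table. The sketch below implements the single-measurement form, often written ICC(2,1); whether the study used single or average measures is not stated, so that choice is an assumption, as are the function name and the toy data.

```python
def icc_two_way_random_absolute(scores):
    """Single-measurement ICC, two-way random effects, absolute agreement.

    scores: one row per participant, one column per session or rater.
    """
    n = len(scores)
    k = len(scores[0])
    if n < 2 or k < 2:
        raise ValueError("need at least two participants and two columns")
    if any(len(row) != k for row in scores):
        raise ValueError("all rows must have the same length")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Partition the total sum of squares into rows, columns, and error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Illustrative data: identical scores at both sessions.
identical = [[1, 1], [2, 2], [3, 3]]
# Illustrative data: every participant scores one point higher at retest.
shifted = [[1, 2], [2, 3], [3, 4]]
print(round(icc_two_way_random_absolute(identical), 3))  # 1.0
print(round(icc_two_way_random_absolute(shifted), 3))    # 0.667
```

The second example shows why the absolute agreement definition matters: the two sessions are perfectly correlated, yet the systematic one-point shift lowers the coefficient, which a consistency-type ICC would not detect.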

Results

This study involved 114 participants. The descriptive statistics are presented in Table 1. The participants ranged in age from 18 to 60 years, with a mean age of 38 (±8.9) years. Women made up 66.7% of the participants (n = 76). In addition, 46.5% of the participants had higher education, and 41.2% were aged 35–45.

Table 1.

Descriptive statistics.

Variables		Number of participants	Percentage
Gender	Female	76	66.7
Gender	Male	38	33.3
Age	Less than 25	18	15.8
	25–35	22	19.3
	35–45	47	41.2
	More than 45	27	23.7
Level of education	High-school level	17	14.9
	Under-graduated	44	38.6
	Graduated	53	46.5

Face validity

No item was deleted in this stage. All scores were more than 1.5. See Table 2 for detailed scores of each item’s importance and impact.

Table 2.

Face validity.

Test’s items	Importance (number)	Importance (average)	Impact score	Results
(1) Simple Cooking	6	2.76	1.65	accept
(2) Using the Telephone	8	3.52	2.81	accept
(3) Taking Medication	8	3.84	3.07	accept
(4) Paying Two Bills	8	3.52	2.81	accept

Content validity was determined by CVI and CVR (see Table 3). The minimum acceptable score for CVR was 0.56, and all items scored 0.67. Table 3 shows the CVR and CVI scores, including the relevance, clarity, and simplicity of each item.

Table 3.

Content validity.

Test’s items	CVR (minimum acceptable score: 0.56)	CVR result	CVI: relevance	CVI: clarity	CVI: simplicity
(1) Simple Cooking	0.67	Accept	1	1	1
(2) Using the Telephone	0.67	Accept	1	1	0.91
(3) Taking Medication	0.67	Accept	0.83	0.91	0.91
(4) Paying Two Bills	0.67	Accept	1	1	1

Internal consistency for each item was calculated using Cronbach’s alpha, and all items had scores between 0.71 and 0.86, as shown in Table 4.

Table 4.

Test-retest, inter-rater, and intra-rater reliability (ICC with 95% confidence interval).

Items	Test-retest			Inter-rater			Intra-rater		
Items	Upper limit	Lower limit	ICC	Upper limit	Lower limit	ICC	Upper limit	Lower limit	ICC
Simple cooking	0.939	0.853	0.886	0.995	0.843	0.980	0.930	0.839	0.872
Taking Medication	0.925	0.827	0.877	0.994	0.937	0.982	0.926	0.850	0.867
Using the phone	0.998	0.922	0.977	0.988	0.891	0.964	0.901	0.801	0.831
Paying bills	0.986	0.858	0.914	0.999	0.994	0.998	0.886	0.706	0.758

Reliability was examined in three ways: test-retest, inter-rater, and intra-rater. The ICC ranged from 0.877 to 0.977 for test-retest reliability, from 0.964 to 0.998 for inter-rater reliability, and from 0.758 to 0.872 for intra-rater reliability. Details are shown in Table 4.

Discussion

This study aimed to assess the psychometric properties of the Persian version of the EFPT in individuals with MS. The results indicated that hand washing is an effective screening tool. In a clinical setting, observing a client wash their hands can be a simple indicator of readiness to transition to independent living, which can be vital at decision-making points such as discharge planning. Managing bills and medication had the highest face and content validity, together with high inter-rater, intra-rater, and test-retest reliability, indicating that the Persian version of the EFPT was a good representative of what it aimed to measure. The items on phone use and cooking also showed good Cronbach’s alpha, indicating that the Persian version of the EFPT has the potential to support clinicians in assessing clients’ executive function in MS.

The Persian translation was deemed accurate because the final back-translated version closely resembled the original. The face and content validity of the Persian version were found to be appropriate. This finding was consistent with other studies, which reported an average validity of 88% for the EFPT (Cederfeldt et al., 2011).

In this study, the ICC was used to assess intra-rater reliability. The highest score was observed for cooking (0.87), and the lowest score for paying bills (0.76). Overall, all items showed correlations greater than 0.75, which is the threshold for accepting a test’s reliability (Scheel et al., 2018).

For inter-rater reliability, all subscales had ICCs over 0.96, with the highest in the “paying bills” subscale. However, because of the lack of other studies on people with MS and the disorder’s specific characteristics, we cannot compare these findings with other evidence. Nevertheless, a Brazilian study of people with CVA also showed agreement over 0.7 between raters (Conti and Brucki, 2018). Similar results were found in another study involving breast cancer patients (Boone and Wolf, 2021).

The test-retest reliability results showed that the highest scores were for simple cooking and medication management, while the lowest score was for paying bills. Since this study is the first of its kind using the EFPT in people with MS, it is not possible to make direct comparisons. However, it is important to note that medication management is a crucial aspect of the daily life of people with MS, and the good reliability scores reflect the importance of medication use and daily living for people with MS (Silavanich et al., 2019). Using medication is a learned behavior, and changing that pattern is not easy. Therefore, lower agreement between the two assessments was expected (Rottman et al., 2017). Also, paying bills can be done in different ways, often requiring an internet connection. Family members may also assist in paying bills, which could explain discrepancies between the two tests.

Cronbach’s alpha was used to assess internal consistency, and all items scored higher than 0.7 (0.71–0.86), indicating good consistency (Tavakol and Dennick, 2011). These findings were consistent with other studies, which found Cronbach’s alpha between 0.58 and 0.8 for people with substance-induced disorders (Juntorn et al., 2020) and more than 0.88 for people with schizophrenia (Katz et al., 2007).

In general, we can conclude that the Persian version of the EFPT has appropriate psychometric properties for use in people with MS.

Conclusion

EFPT is a test that assesses higher cognitive skills essential for daily living. This test evaluates executive function in real-life situations, making it very valuable. This study introduced a Persian version of EFPT and demonstrated its validity and reliability among people with MS, based on participant data. This tool can help determine discharge plans for individuals with MS, enabling better predictions of their independence after discharge. More research is needed on predictors of discharge plans using EFPT.

Limitations

Although standard guidelines such as the COSMIN framework exist, we did not fully follow this framework in this study. Several measurement properties, such as reliability and validity, were assessed, but the study was not designed as a full COSMIN-based psychometric evaluation, and some components (e.g., construct validity) were beyond the scope of the present design.

Footnotes

Acknowledgements

Thanks to all the participants, from experts to people with MS and their families, for participating in this study.

Ethical considerations

Ethical approval was received from the ethical committee of the University of Social Welfare and Rehabilitation Sciences.

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors declared no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Minoo Dabiri Golchin

Patient and public involvement data

During the development, progress, and reporting of the submitted research, Patient and Public Involvement in the research was not included at any stage of the research.

References

Abdollahipour, Alizadeh Zarei, Akbar Fahimi, et al. (2016) Study of face and content validity of the Persian version of behavior rating inventory of executive function, preschool version. Archives of Rehabilitation 17(1): 12–19. Available at: https://rehabilitationj.uswr.ac.ir/article-1-1779-en.html (accessed April 2025).

Allataifeh, Khalil, Almhdawi, et al. (2020) The clinical correlates of participation levels in people with multiple sclerosis. NeuroRehabilitation 47(2): 153–160.

Bass, Marchant, de Sam Lazaro, et al. (2024) Application of the person–environment–occupation–performance model: A scoping review. OTJR: Occupational Therapy Journal of Research 44(3): 521–540.

Baughman, Basso, Sinclair, et al. (2015) Staying on the job: The relationship between work performance and cognition in individuals diagnosed with multiple sclerosis. Journal of Clinical and Experimental Neuropsychology 37(6): 630–640.

Baum and Edwards (1993) Cognitive performance in senile dementia of the Alzheimer’s type: The Kitchen Task Assessment. The American Journal of Occupational Therapy 47(5): 431–436.

Baum, Connor, Morrison, et al. (2008) Reliability, validity, and clinical utility of the Executive Function Performance Test: A measure of executive function in a sample of people with stroke. The American Journal of Occupational Therapy 62(4): 446–455.

Boone and Wolf (2021) Initial development and evaluation of the executive function performance test-enhanced (EFPT-E) in women with cancer-related cognitive impairment. The American Journal of Occupational Therapy 75(2): 7502345020p7502345021–7502345020p7502345027.

Burgess (2000) Strategy application disorder: The role of the frontal lobes in human multitasking. Psychological Research 63(3–4): 279–288.

Burgess, Alderman, Forbes, et al. (2006) The case for the development and use of “ecologically valid” measures of executive function in experimental and clinical neuropsychology. Journal of the International Neuropsychological Society 12(2): 194–209.

Carlson, Fried, Xue Q-L, et al. (1999) Association between executive attention and physical functional performance in community-dwelling older women. The Journals of Gerontology Series B: Psychological Sciences and Social Sciences 54(5): S262–S270.

Cederfeldt, Widell, Andersson, et al. (2011) Concurrent validity of the Executive Function Performance Test in people with mild stroke. British Journal of Occupational Therapy 74(9): 443–449.

Conti and Brucki SMD (2018) Executive function performance test: Transcultural adaptation, evaluation of psychometric properties in Brazil. Arquivos de Neuro-Psiquiatria 76(11): 767–774. https://doi.org/10.1590/0004-282x20180127

Goverover, Chiaravalloti, Gaudino-Goering, et al. (2009) The relationship among performance of instrumental activities of daily living, self-report of quality of life, and self-awareness of functional status in individuals with multiple sclerosis. Rehabilitation Psychology 54(1): 60.

Goverover, Kalmar, Gaudino-Goering, et al. (2005) The relation between subjective and objective measures of everyday life activities in persons with multiple sclerosis. Archives of Physical Medicine and Rehabilitation 86(12): 2303–2308.

Jefferson, Paul, Ozonoff, et al. (2006) Evaluating elements of executive functioning as predictors of instrumental activities of daily living (IADLs). Archives of Clinical Neuropsychology 21(4): 311–320.

Juntorn, Thichanpiang, Wangkawan, et al. (2020) Reliability and validity of culturally adapted executive function performance test for Thai people with substance-induced disorders. Journal of Associated Medical Sciences 54(1): 35–43. Available at: https://he01.tci-thaijo.org/index.php/bulletinAMS/article/view/244332 (accessed April 2025).

Katz, Tadmor, Felzen, et al. (2007) Validity of the Executive Function Performance Test in individuals with schizophrenia. OTJR: Occupation, Participation and Health 27(2): 44–51.

Kaye, Grigsby, Robbins, et al. (1990) Prediction of independent functioning and behavior problems in geriatric patients. Journal of the American Geriatrics Society 38(12): 1304–1310.

Lawshe (1975) A quantitative approach to content validity. Personnel Psychology 28(4): 563–575.

Lezak (2004) Neuropsychological Assessment. Oxford University Press, USA.

Munro (2005) Statistical Methods for Health Care Research, Vol. 1. Lippincott Williams & Wilkins.

Nunnally and Bernstein (1994) Psychometric Theory. McGraw-Hill, p. 136.

Oakley and Sunderland (1997) Assessment of motor and process skills as a measure of IADL functioning in pharmacologic studies of people with Alzheimer’s disease: A pilot study. International Psychogeriatrics 9(2): 197–206.

Rao, Leo, Ellington, et al. (1991) Cognitive dysfunction in multiple sclerosis: II. Impact on employment and social functioning. Neurology 41(5): 692–696.

Raphael-Greenfield (2012) Assessing executive and community functioning among homeless persons with substance use disorders using the executive function performance test. Occupational Therapy International 19: 135–143.

Roley, DeLany, Barrows, et al. (2008) Occupational therapy practice framework: Domain and process. The American Occupational Therapy Association 62(6): 625–683.

Rose, Feldman and Jankowski (2012) Implications of infant cognition for executive functions at age 11. Psychological Science 23(11): 1345–1355.

Rottman, Marcum, Thorpe, et al. (2017) Medication adherence as a learning process: Insights from cognitive psychology. Health Psychology Review 11(1): 17–32.

Scheel, Mecham, Zuccarello, et al. (2018) An evaluation of the inter-rater and intra-rater reliability of OccuPro's functional capacity evaluation. Work 60(3): 465–473.

Silavanich, Nathisuwan, Phrommintikul, et al. (2019) Relationship of medication adherence and quality of life among heart failure patients. Heart & Lung 48(2): 105–110.

Tavakol and Dennick (2011) Making sense of Cronbach’s alpha. International Journal of Medical Education 2: 53–55.

Velligan, Bow-Thomas, Mahurin, et al. (1998) Concurrent and predictive validity of the Allen Cognitive Levels Assessment. Psychiatry Research 80(3): 287–298.