Abstract
The ABILHAND is among the most widely used questionnaires in upper limb rehabilitation. This study aimed to evaluate whether the self-report procedure of the ABILHAND-Stroke is concordant with the performance observation-based procedure. Two assessments were performed with each patient on the same day using the Beninese version of the ABILHAND. The intraclass correlation coefficient (ICC2,1) and a Bland–Altman plot were used to evaluate the agreement and the relationships between ABILHAND measures. A total of 123 people with chronic stroke were included in the study. The ICC was .77 (95% confidence interval [CI] = [.67, .82]; p < 10⁻⁶), demonstrating good concordance between both assessment methods despite a significant difference between patients’ mean measures (self-report = −0.06 ± 2.64 logit; performance-based = 1.28 ± 3.57 logit; p value < .0001). Results confirmed the concordance of the self-report measures with the performance-based measures. In clinical routine, self-report administration of the ABILHAND scale might be useful for initial screening purposes, whereas the performance observation-based procedure should be preferred for further investigation.
Plain Language Summary
In this study, we evaluated the reliability of the self-report administration of the ABILHAND-Stroke questionnaire compared with clinician-observed performance measures in people with chronic stroke. We found that, although both administration procedures yielded similar results, the performance-observed procedure should be preferred over the self-reported procedure.
Introduction
Stroke is the leading cause of neurological disability in adults worldwide and is a major public health issue in Africa due to the epidemiological transition (Adoukonou et al., 2020, 2021; Cossi et al., 2012; Kossi et al., 2016; Youkee et al., 2023). The upper limb function impairments are the most frequently encountered after a stroke and they are harder to restore than the lower limb functions (Alt Murphy et al., 2022; Y. W. Kim, 2022; Zhi et al., 2022). Several years post stroke, almost 50% of stroke patients will still present some functional upper limb impairments that might negatively impact the functional independence and quality of life of patients (Y. W. Kim, 2022; Niama Natta et al., 2019).
Several tools are dedicated to the assessment of upper extremity activity after a stroke, such as the Action Research Arm Test, the Arm Motor Ability Test, or the ABILHAND (Penta et al., 2001). Currently, the ABILHAND is among the most widely used questionnaires in hand and upper limb rehabilitation (H. Kim & Shin, 2022). The ABILHAND scale was originally developed in 1998 to evaluate the ability of people with rheumatoid arthritis to manage daily activities that require the use of the upper limbs, whatever the strategies involved (Penta et al., 1998). Since then, its psychometric qualities have been examined in several pathologies and sociocultural contexts, including Benin (Barrett et al., 2013; Başakci Çalik et al., 2019; Niama Natta et al., 2019; Simone et al., 2011). The ABILHAND Stroke Benin was recently proposed as an interview-based scale to assess perceived manual ability within the activity domain of the International Classification of Functioning, Disability and Health (ICF) in people with chronic stroke (Niama Natta et al., 2019). The ABILHAND Stroke Benin scale is a disease-specific scale with good psychometric properties, including unidimensionality and invariance. As a Rasch-built scale, it allows a linear transformation of the ordinal raw scores.
Several advantages are associated with objective measures based on the observation of the patient’s performance. These advantages include more precise and valid results with an increased sensitivity to changes over time. However, objective measures are usually difficult to apply in clinical routine because they tend to be resource- and time-consuming (Kossi et al., 2024; Nunes et al., 2015; Prince et al., 2008). In contrast, patient-reported outcome measures (PROMs) are subjective and are often prone to a number of biases, such as memory biases (Kossi et al., 2025; Nunes et al., 2015; Prince et al., 2008). However, PROMs are generally quick and easy to administer. They are therefore an effective way to reach larger groups at low cost (Nunes et al., 2015). Specifically, self-report might be useful for initial screening purposes in clinical routine. Despite their usefulness, the limitations inherent to PROMs call for caution in their use. It is therefore important to ensure that the answers provided by patients to a questionnaire do not deviate significantly from their actual performance, and hence to check the concordance of PROMs in addition to their validity (Pallant & Tennant, 2007; Tennant & Conaghan, 2007).
Avelino et al. (2020) validated the telephone-based application of the ABILHAND as a patient-reported outcome measure (PROM) in the Brazilian context, although without comparison with the observational application. This methodology was subsequently replicated in another study to investigate factors associated with the state of disability after hemiparesis in the chronic phase of stroke (Martins Dos Santos et al., 2024). Although the validity and psychometric properties of the ABILHAND-Stroke have been investigated, more knowledge is needed about the scale in the Beninese context, such as the concordance of different assessment methods (self-report, observation of performance, etc.), test–retest reliability, and responsiveness. When a new measuring technique is introduced, an assessment of its concordance with respect to several measurement approaches is critically important (Kwiecien et al., 2011). Indeed, although PROMs are increasingly well integrated in routine clinical practice and research, several personal and sociocultural factors are potential sources of bias for these procedures (Kossi et al., 2018, 2020; Vandervelde et al., 2010).
This study aimed to assess the concordance of the self-report procedure of the ABILHAND-Stroke Benin scale with the performance-based, clinician-reported procedure in people with chronic stroke.
Method
Study Design and Setting
This is a cross-sectional study carried out between June and September 2022. The study was conducted at the Physiotherapy and Rehabilitation Department of the University Hospital of Parakou, Benin.
Ethics Considerations
This study was performed in accordance with the principles of the Declaration of Helsinki. Approval was granted by the Ethics Committee of the University of Parakou, Benin (Date: February 28, 2022; certificate number: 0541/CLERB-UP/P/SP/R/SA). All participants included in the study provided their informed consent individually.
Participants and Inclusion Criteria
We consulted patients’ admission records to identify potential participants. Eligible participants were contacted by phone and invited to participate in the study on the basis of the following inclusion criteria: (a) clinical diagnosis of a primary or recurring unilateral stroke that occurred at least 6 months previously, (b) age ≥ 18 years, and (c) ability to read French, the official language in Benin. Patients with a major cognitive impairment (Community Screening Interview for Dementia score ≥ 7; Hall et al., 2000) were excluded from the study, as were those with other permanent degenerative neurological impairments (Parkinson’s disease, Alzheimer’s disease, etc.).
Assessment Procedure and Variables
Assessment of Manual Ability
Two assessments were performed with each patient on the same day using the ABILHAND-Stroke Benin questionnaire, which is a Rasch-built 16-item questionnaire validated in 2019 in a sample of 233 people with chronic stroke recruited in the city of Cotonou, southern Benin. The ABILHAND-Stroke is unidimensional, invariant, and allows a transformation of ordinal raw scores into linear measures in logit using a conversion table (Niama Natta et al., 2019).
The first assessment was based on self-report and the second on observation of performance. The patients first rated their manual ability by completing the ABILHAND-Stroke Benin questionnaire (self-reported [SR]), rating each item as “impossible,” “difficult,” or “easy.” One hour later, the patients were asked to perform each of the tasks described across the 16 items of the ABILHAND-Stroke Benin. This individual performance was observed and rated by a medical doctor (performance-based [PB]) on a separate ABILHAND-Stroke Benin sheet. The medical doctor who rated the patients was blinded to the SR data. As with SR, the assessor rated each of the 16 tasks on a three-category rating scale as the patient performed them. To avoid bias, patients were only informed that the study was intended to investigate some psychometric qualities of the ABILHAND-Stroke questionnaire. They were given the full and detailed purpose of the study only after having completed both the self-report and the performance-based assessments. Participants were instructed to maintain a comfortable speed while performing the tasks.
Assessment of Overall Disability
A medical doctor who had more than 2 years of experience in assessing patients with stroke evaluated the overall disability level among study participants using the modified Rankin scale (mRS; Rankin, 1957). The mRS is a seven-level ordinal clinician-rated scale that categorizes the severity of disability on the basis of observation. A score of 0 indicates no disability, 5 indicates disability requiring constant care for all needs, and 6 indicates death.
Sampling Strategy and Sample Size
We used a purposive sampling technique to recruit participants. All stroke patients meeting the inclusion criteria during the data collection period were invited to participate in the study. The minimum sample size was determined using the Schwartz formula with a 2% margin of error, a 95% confidence level, and a 1.13% stroke prevalence (Adoukonou et al., 2020). Allowing a 15% margin for exclusions due to incomplete data, the minimum sample size required was estimated at 123 participants.
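As a rough check on the sample size reported above, the calculation can be sketched as follows. This is an illustrative sketch rather than the authors' actual computation: it assumes that the Schwartz formula refers to the usual single-proportion formula n = z²p(1 − p)/d² with z = 1.96, and that the 15% margin was applied by multiplying the result by 1.15. The function name is hypothetical.

```python
def schwartz_sample_size(prevalence, margin, z=1.96):
    """Single-proportion sample size, n = z^2 * p * (1 - p) / d^2.

    Assumption: this is the form of the Schwartz formula used in the study.
    """
    return z ** 2 * prevalence * (1 - prevalence) / margin ** 2


# Inputs taken from the text: 1.13% prevalence, 2% margin of error, 95% confidence.
n_base = schwartz_sample_size(prevalence=0.0113, margin=0.02)

# Assumption: the 15% allowance for incomplete data is a multiplicative inflation.
n_with_allowance = n_base * 1.15

print(f"Base sample size: {n_base:.1f}")
print(f"With 15% allowance: {n_with_allowance:.1f}")
```

Under these assumptions, the formula gives about 107 participants before the allowance and about 123 after it, which is consistent with the reported minimum sample size.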
Statistical Analyses
ABILHAND-Stroke raw scores for both the self-report and performance-based methods were converted to linear measures in logit using the conversion table (Niama Natta et al., 2019). Concordance between ABILHAND-Stroke self-report and performance-based measures was analyzed using intraclass correlation coefficients (ICC2,1). ICCs were interpreted using the Cicchetti guidelines, with agreement classified as poor (ICC < .40), fair (.40 ≤ ICC < .60), good (.60 ≤ ICC < .75), or excellent (.75 ≤ ICC ≤ 1.00; Koo & Li, 2016). We then constructed a Bland–Altman plot and used a paired t test to compare the mean ABILHAND-Stroke measures obtained from the self-report with those obtained from the performance-based assessment. Statistical analyses were performed using RStudio software version 4.2.3. For all analyses, a p value < .05 was considered statistically significant.
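For readers unfamiliar with ICC(2,1), the coefficient can be sketched from the mean squares of a two-way layout (subjects by assessment methods). The sketch below assumes the Shrout and Fleiss definition of ICC(2,1), that is, two-way random effects, absolute agreement, single measurement. It is not the authors' R code; the function name and the example logit values are hypothetical.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: list of rows, one row per subject, one column per method or rater.
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of methods or raters
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects, methods, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical (self-report, performance-based) measures in logit, not study data.
example_pairs = [[-1.5, -2.1], [0.1, 0.4], [0.8, 1.9], [1.7, 3.2], [2.9, 4.64]]
example_icc = icc_2_1(example_pairs)
print(f"ICC(2,1) for the hypothetical pairs: {example_icc:.2f}")
```

Because the between-method mean square appears in the denominator, a systematic shift between self-report and performance-based measures lowers this absolute-agreement coefficient, whereas a consistency-type ICC would ignore such a shift.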
Results
Sociodemographic and Clinical Characteristics of the Sample
Figure 1 gives the flow chart of identification and inclusion of participants. A total of 123 people with chronic stroke (60.16% males; mean ± SD age of 53.90 ± 11.33 years) met the inclusion criteria and were included in the study. The sociodemographic and clinical characteristics of the participants are presented in Table 1. Overall, 52.03% of participants had an ischemic stroke, 42.28% had a left affected hemisphere, and 70.74% had a deep (capsulo-lenticular) location. The overall disability levels varied considerably among the study participants. About one quarter (26.83%) of participants had moderate disability, whereas 13.82% had mild disability and 14.63% had moderately severe disability. Regarding the ABILHAND measure, the mean location of persons was −0.06 ± 2.64 logits for the self-report and 1.28 ± 3.57 logits for observed performance. For both administration procedures, the ABILHAND-Stroke measures ranged from −4.37 to 4.64 logits, corresponding to an interval of 9.01 logits.

Figure 1. Flow Chart of Eligibility and Inclusion.
Table 1. Socio-Demographic and Clinical Characteristics of the Sample.
Reliability of the Self-Report Assessment Method
Figure 2 shows the relationship between self-report and performance-based manual ability. The ICC was .77 (95% confidence interval [CI] = [.67, .82]; p < 10⁻⁶), demonstrating good concordance between both assessment methods despite a significant difference between patients’ mean measures (self-report = −0.06 ± 2.64 logit; performance-based = 1.28 ± 3.57 logit; p value < .0001).

Figure 2. Correlation Between Self-Reported and Performance-Based Manual Ability.
Figure 3 gives the Bland–Altman plot of differences between PB and SR, which were normally distributed (p = .319) with respect to the means of both assessments. Almost all values were within the 95% limits of agreement, which ranged from −2.76 to 4.95 logit.

Figure 3. Bland–Altman Plot of Agreement Between Performance-Based Score (PB) and Self-Reported Score (SR).
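The quantities behind a Bland–Altman analysis of this kind can be sketched as follows. This is a minimal sketch, assuming the conventional limits of agreement (mean difference ± 1.96 SD of the differences) and the standard paired t statistic; the function name and the example values are hypothetical and are not the study data.

```python
from math import sqrt
from statistics import mean, stdev


def bland_altman_summary(pb, sr):
    """Bias, 95% limits of agreement, and paired t statistic for PB minus SR.

    Assumes the paired differences are not all identical (SD > 0).
    """
    if len(pb) != len(sr):
        raise ValueError("pb and sr must contain the same number of measures")
    diffs = [p - s for p, s in zip(pb, sr)]
    bias = mean(diffs)   # systematic difference between the two methods
    sd = stdev(diffs)    # sample SD of the paired differences
    return {
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
        "t": bias / (sd / sqrt(len(diffs))),
    }


# Hypothetical measures in logit for five patients, not study data.
pb_logits = [-2.1, 0.4, 1.9, 3.2, 4.64]
sr_logits = [-1.5, 0.1, 0.8, 1.7, 2.9]
summary = bland_altman_summary(pb_logits, sr_logits)
print({name: round(value, 2) for name, value in summary.items()})
```

In a plot, each difference is drawn against the mean of the two measures for that patient. A trend in the differences across the range of means, rather than a flat scatter around the bias, points to a proportional error in addition to the systematic one.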
Discussion
To the best of our knowledge, this is the first study to assess the concordance of the self-report procedure of the ABILHAND-Stroke questionnaire with the performance-based, clinician-reported procedure. Results showed that, despite yielding lower values than the performance-based, clinician-reported procedure, self-report is a good method to administer the ABILHAND-Stroke Benin questionnaire, as shown by the Bland–Altman plot.
The ABILHAND-Stroke Benin questionnaire was validated in 2019 in a sample of 233 people with chronic stroke recruited in the city of Cotonou, southern Benin. The characteristics of that sample were as follows: mean age 54 ± 9.7 years, males 66.8%, ischemic stroke 35.4%, paretic hand being the dominant one 48.4%, median time since stroke 27 months, and mean person location on the ABILHAND scale 1.0 ± 1.37 logit (expected: between −0.5 and +0.5 logit; Niama Natta et al., 2019). These characteristics are relatively similar to those of the sample involved in the present study. In addition, in the present study, the ABILHAND-Stroke measures for both administration procedures ranged from −4.37 to 4.64 logits, corresponding to an interval of 9.01 logits, against an interval of 5.91 logits found in the validation study (Niama Natta et al., 2019). Practically, these results show that the ABILHAND-Stroke Benin is well targeted and has the potential for measuring manual ability beyond the levels encountered in the validation study (Kossi et al., 2018). In addition, our findings showed that 43% of participants reached the maximal measure of 4.64 logits on the performance-based assessment, whereas 8% reached the maximum on the self-reported assessment. The results of the performance-based measure are in line with the modified Rankin scores, which showed that 44.72% of participants presented with no disability or no significant disability despite symptoms in the upper limb.
In the validation study of the ABILHAND-Stroke Benin, the authors examined the internal reliability of the scale against the Rasch mathematical model. They concluded that the ABILHAND-Stroke Benin questionnaire fits the model well regarding the hierarchy of the difficulty of the 16 items and the proficiency of the 233 participants (Niama Natta et al., 2019). In the present study, our results showed that the self-report method yields significantly lower values than the clinician-reported, performance-based method. However, these discrepancies did not significantly alter the concordance of the self-report method, as shown by the Bland–Altman plot. Importantly, our results demonstrated good concordance between both assessment methods (ICC = .77). Nonetheless, the agreement between the two assessment methods should be interpreted conservatively. Although ICC = .77 indicates good agreement, it does not imply that the two methods are similar. The Bland–Altman plot and paired t test revealed a systematic error, with clinician-rated scores being significantly higher than self-reported scores. In addition, this measurement error showed a specific pattern: for individuals with poorer manual ability, clinician ratings tended to be lower than self-reports, whereas for those with better manual ability, clinician ratings tended to be higher than self-reports (as indicated by the bottom-left to top-right distribution in the Bland–Altman plot). This suggests a risk of overestimation or underestimation in self-reported scores for certain subgroups. Indeed, the actual disability in people with stroke depends on complex interactions between upper limb function and compensatory behaviors of the person, such as using the unaffected limb or dividing complex movements into simpler ones. Moreover, the learning and application of new motor processes is influenced by the subject’s motivational and emotional status, which is likely to be impaired by the stroke-related impairments (Wade, 1997).
Recently, Ekstrand et al. (2023) examined the clinical interpretation and cut-off scores for manual ability measured by the ABILHAND questionnaire in people with stroke. The authors found that the ABILHAND-Stroke scores matched well with the self-reported Stroke Impact Scale Hand, but discrepancies were found with the observed Fugl-Meyer Assessment for Upper Extremity and Action Research Arm Test (Ekstrand et al., 2023). These results seem to raise some ambiguities as to the concordance of the self-administration of the ABILHAND-Stroke questionnaire. The findings of the present study reassure clinicians and researchers about the concordance of the ABILHAND-Stroke scale regarding the self-report administration procedure.
Acquiring accurate information regarding functional outcomes including manual ability after stroke is essential as it can serve as a baseline for health care planning. It can also facilitate the establishment of therapeutic guidelines and interventions (Amanzonwé et al., 2024; Nindorera et al., 2023; Noukpo et al., 2024). In this perspective, several other psychometric properties of the ABILHAND-Stroke questionnaire should be further explored in future studies.
A potential limitation of this study is its limited generalizability to the whole stroke population, given the exclusion of those who cannot read French, the official language in Benin. Results from a 2015 study in 31 sub-Saharan African countries showed that, notably in Western and Central Africa, a relatively small percentage of adult women can read, even after several years of primary school (Smith-Greenaway, 2015). Overall, even if self-administered measures tend to be less resource- or time-consuming, linguistic and readability problems could complicate self-completion for some patients in the African context (Smith-Greenaway, 2015). Interviewer-administered questionnaires may also be problematic to apply, as some stroke patients may be unable to respond in an interview due to cognitive problems that are common in stroke survivors (Adoukonou et al., 2018; Kossi et al., 2021). As a consequence, observation-based measures and the possibility of using proxies (a relative or close friend) seem to be relevant when measuring latent variables in the stroke population in the African context (Kossi et al., 2020; Pereira et al., 2024). Another potential limitation of the study is that there was only one observer. A concordance analysis involving two or more raters would give more insight. In addition, there were no other upper limb capacity tests with which to compare the data, such as the Action Research Arm Test or the Arm Motor Ability Test. However, these comparisons had already been made in the validation study of the ABILHAND-Stroke Benin (Niama Natta et al., 2019); in our opinion, they would therefore not necessarily add any new information on the concurrent validity of the questionnaire. Finally, the study sample may be subject to selection bias. Approximately one-third of the participants had a modified Rankin Scale score of 0 (no symptoms), which led to a ceiling effect in the performance-based ABILHAND scores (43% of participants achieved the maximum score). This bias may not only affect ICC estimates and error calculations but also limit the generalizability of our findings to the whole stroke population.
Conclusion
The results of this study complement those of the validation of the ABILHAND-Stroke Benin scale. Different statistical methods were used to demonstrate that the self-report administration procedure is in accordance with the performance observation-based procedure for measuring manual ability in everyday activities in people with chronic stroke. Self-report might be especially useful for initial screening purposes in clinical routine. However, for further investigation or for research purposes, the performance observation-based procedure should be preferred over self-report. In addition, further studies should investigate the test–retest reliability, interrater reliability, concordance of the proxy-respondent procedure, and responsiveness of the ABILHAND-Stroke questionnaire in the Beninese context.
What Is Already Known on This Topic
ABILHAND is among the most widely used questionnaires in hand and upper limb rehabilitation;
Several advantages are associated with objective measures based on the observation of the patient’s performance. These advantages include precise and valid results. However, objective measures are usually difficult to apply in clinical routine because they tend to be resource and time-consuming;
Patient-reported outcome measures (PROMs) are subjective and they are often prone to a number of biases such as memory biases. However, PROMs are quick and easy to administer. They are therefore an effective way to reach larger groups at low costs.
What This Study Adds
The ABILHAND-Stroke Benin questionnaire can be administered either by self-report or by performance observation procedure;
The self-report method yields significantly lower values than the clinician-reported, performance-based method. However, these discrepancies did not significantly alter the concordance of both methods in the ABILHAND-Stroke Benin questionnaire;
The self-report method might be especially useful for initial screening purposes in clinical routine, but for further investigation or for research purposes the performance observation-based procedure should be preferred over self-report.
Footnotes
Author Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by O.K., M.A., V.T., G.G.T., and T.A. The first draft of the manuscript was written by O.K. and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics Considerations
This study was performed in line with the principles of the Declaration of Helsinki. Approval was granted by the Biomedical Ethics Committee of University of Parakou, Benin (Date: February 28, 2022; certificate number: 0541/CLERB-UP/P/SP/R/SA). Informed consent was obtained from all individual participants included in the study.
Declaration of Conflicting Interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
