A Comparison of Test-Retest Reliability and Practice Effects of Short Portable Mental State Questionnaire and Montreal Cognitive Assessment in Patients with Stroke

Abstract

Objective

To compare the test-retest reliabilities and minimal detectable change (MDC) of the Short Portable Mental State Questionnaire (SPMSQ) and the Montreal Cognitive Assessment (MoCA) in patients with stroke.

Methods

63 patients were recruited from 1 medical center. The SPMSQ and MoCA were administered twice, 2 weeks apart.

Results

Both measures showed good intraclass correlation coefficients (SPMSQ: 0.87; MoCA: 0.89) and acceptable MDC%s (SPMSQ: 14.8%; MoCA: 19.6%). A small correlation (r = 0.30) was found between the absolute difference and the average of each pair of assessments in the SPMSQ, which was close to the criterion for heteroscedasticity. A small practice effect was observed in the MoCA (Cohen’s d = 0.30).

Conclusion

The SPMSQ demonstrated smaller random measurement error and an absence of practice effect. When comparing the psychometric properties of the SPMSQ and MoCA as outcome measures for assessing cognitive function in patients with stroke, the SPMSQ appears to be a more suitable choice than the MoCA.

Keywords

Short Portable Mental State Questionnaire, Montreal Cognitive Assessment, cognitive function, stroke, test-retest reliability, random measurement error, practice effect

Introduction

Cognitive decline is common in stroke survivors.^1,2 Decline in cognition may cause an inability to perform activities of daily living and influence patients’ quality of life.³ Assessing cognitive function is essential for identifying cognitive decline, selecting proper interventions, and managing treatment outcomes. Therefore, a measure that can assess patients’ cognitive function reliably is needed.

The Short Portable Mental State Questionnaire (SPMSQ) and the Montreal Cognitive Assessment (MoCA) are widely used to detect cognitive decline in the elderly, particularly dementia-related decline.^4,5 The SPMSQ is more efficient because it contains 10 items only and can be finished within 10 minutes.⁴ On the other hand, the MoCA contains 30 items assessing diverse domains of cognition (eg, executive functions and visuospatial abilities) and requires approximately 15-20 minutes for completion.⁶ Both measures have acceptable sensitivity and specificity to classify patients with cognitive decline and are useful in clinical and research settings.^7,8

Although both the SPMSQ and MoCA are used to classify patients with dementia-related cognitive decline, several studies have employed them as outcome measures in patients with stroke.^9-14 Sound outcome measures should not only demonstrate good reliability and validity, but also remain unaffected by practice effects over repeated assessments.^15,16 The existing evidence suggests that the SPMSQ and MoCA have acceptable psychometric properties in non-stroke populations.^17-20 Consequently, users can employ both measures as outcome measures. However, for patients with stroke, their utility may be limited by 2 concerns. First, while the test-retest reliability and the random measurement error of the MoCA have been examined in patients with stroke,²¹ those of the SPMSQ remain unknown. Moreover, these properties of the 2 measures have not been examined simultaneously, which makes it difficult for users to choose the better measure for cognitive assessment in patients with stroke. Second, the potential influence of practice effects on these 2 measures has not been investigated. Scores on cognitive measures can increase on subsequent assessments because of familiarization with the test material rather than real improvement in cognitive function, particularly in the domains of working memory, executive functioning, and attention.^22,23 Furthermore, psychometric properties inherently depend on the specific sample from which they are drawn.²⁴ Considering the unique characteristics of patients with stroke, evidence is lacking for the psychometric properties of these 2 measures in this population when they are used as outcome measures. Without addressing these concerns, changes in cognitive scores over repeated assessments may not be interpreted appropriately.

To address these concerns, the primary purpose of this study was to examine and compare the practice effects of the SPMSQ and MoCA in patients with stroke. The secondary purpose was to validate the test-retest reliability and MDCs of these measures. The findings of this study may help clinicians and researchers confirm the utility of the SPMSQ and MoCA as outcome measures when applied to patients with stroke and interpret the results appropriately. Given the exploratory nature of this study, our expectations were based on previous research indicating that both the SPMSQ and MoCA have good test-retest reliability and acceptable random measurement error in a non-stroke population.¹⁹ Additionally, we anticipated that MoCA scores might be slightly higher on subsequent assessments because of repeated exposure to items in working memory or executive functioning, which could lead to practice effects.²³

Method

Participants

The participants were outpatients with stroke. All the participants were recruited from December 2018 to February 2019 via convenience sampling at the outpatient department of a medical center in Central Taiwan. The inclusion criteria were: (1) a diagnosis of ischemic stroke or cerebral hemorrhage; (2) being post-stroke for at least 6 months, to ensure that the participant’s cognitive function remained stable during the enrollment and study process; and (3) the ability to follow verbal instructions to complete assessments. Participants were excluded if they had: (1) a recurrent stroke between the first and second assessments of the SPMSQ and MoCA, as this could introduce additional cognitive changes not associated with the initial stroke; (2) comorbidities such as tumor or Parkinson’s disease that might independently affect cognitive performance; (3) aphasia (as determined by medical records), because of the resulting inability to provide oral responses, such as recalling and verbalizing words and repeating sentences; or (4) severe hearing or visual impairments, or other major diseases that might inhibit their engagement with the 2 cognitive measures. The sample size was determined to be at least 10 participants to reach a minimal standard of reliability, as recommended in a previous study.²⁵ However, the final target was set to at least 50 participants based on COSMIN guidelines to ensure robust results.²⁶

Procedure

The study was approved by the research ethics committee of the medical center. Informed consent was obtained from the participants personally and/or their caregivers.

The SPMSQ and MoCA were administered to the patients by a trained research assistant twice at an interval of 2 weeks. The National Institutes of Health Stroke Scale (NIHSS) was also administered by the assistant at the first assessment.

Instruments

National Institutes of Health Stroke Scale (NIHSS)

The NIHSS is commonly used to assess the severity of stroke.²⁷ The NIHSS contains 11 items, with a score range of 0-42. A higher score on the NIHSS indicates a more severe stroke (<5, mild stroke; 5-14, moderate stroke; 15-24, moderate to severe stroke; >24, very severe stroke). The NIHSS has acceptable reliability and validity.^28,29

Short Portable Mental State Questionnaire (SPMSQ)

The SPMSQ is a screening test for cognitive decline.⁴ The SPMSQ consists of 10 items, assessing orientation to surroundings, memory, information about current events, and the capacity to perform serial mathematical tasks. The score of the SPMSQ is computed from the total number of error items, ranging from 0-10. A higher score indicates more severe cognitive decline (0-2, normal cognitive function; 3-4, mild cognitive decline; 5-7, moderate cognitive decline; 8-10, severe cognitive decline).⁴

Montreal Cognitive Assessment (MoCA)

The MoCA was used to measure a broad range of cognitive functions, including visuospatial skills, executive functioning, language, attention, memory, and orientation to time and place.⁵ The MoCA has 30 items. The total score on the MoCA ranges from 0-30. A cut-off score of 20 to 27 might be more appropriate for detecting cognitive decline in patients with stroke.⁶ The MoCA has demonstrated sound psychometric properties.¹⁸

Data Analysis

The intraclass correlation coefficient (ICC) was used to examine the test-retest reliability of the SPMSQ and MoCA.²⁴ In the present study, the ICC was computed using a two-way random-effects model with absolute agreement (ie, $ICC_{2,1}$).³⁰ An ICC value ≥0.90 indicated high reliability; 0.80-0.89, good reliability; and 0.70-0.79, fair reliability.²⁴
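As an illustration only, the ICC for a two-way random-effects model with absolute agreement and a single measurement can be sketched from the mean squares of a two-way ANOVA. The function name `icc_2_1` and the example scores are hypothetical; they are not the software or the data used in this study.

```python
from statistics import mean

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `scores` is a list of rows, one per participant; each row holds that
    participant's score at every assessment (2 assessments in this design).
    """
    n = len(scores)       # number of participants
    k = len(scores[0])    # number of assessments
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]

    # Sums of squares from the two-way ANOVA decomposition.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-participant mean square
    ms_cols = ss_cols / (k - 1)                 # between-assessment mean square
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )

# Hypothetical scores for 5 participants at 2 assessments (not study data).
example = [[20, 22], [15, 16], [27, 27], [10, 13], [24, 25]]
icc = icc_2_1(example)
```

Because the between-assessment mean square appears in the denominator, a systematic shift between the 2 assessments lowers this coefficient even when participants keep the same rank order.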

Bland-Altman plots were used to visually examine the agreement between 2 assessments of the 2 measures.³¹ The differences between the 2 assessments (y-axis) were plotted against the average scores of both assessments (x-axis) for each participant.³² The variations in the differences between 2 assessments were described using 95% limits of agreement (LOA), which were calculated as the averaged difference $\pm 1.96 \times SD_{\text{difference}}$. Bland-Altman plots were also used to visualize the heteroscedasticity, which represents a tendency of differences in repeated assessments to increase or decrease as the average scores of the assessment increase.³³ To determine whether heteroscedasticity existed, Pearson’s r between the absolute difference and the average in each pair of assessments was used. If r > 0.30, heteroscedasticity was considered to be present.³⁴
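The two quantities described above can be sketched as follows. This is a minimal illustration assuming the sample SD of the differences is used; the function names and paired scores are hypothetical and are not the study data.

```python
from math import sqrt
from statistics import mean, stdev

def limits_of_agreement(first, second):
    """Return (mean difference, lower LOA, upper LOA) for paired scores."""
    diffs = [b - a for a, b in zip(first, second)]
    d_bar = mean(diffs)
    sd_diff = stdev(diffs)   # sample SD of the differences (assumption)
    return d_bar, d_bar - 1.96 * sd_diff, d_bar + 1.96 * sd_diff

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def heteroscedasticity_r(first, second):
    """Correlate the absolute difference with the average of each pair."""
    abs_diffs = [abs(b - a) for a, b in zip(first, second)]
    averages = [(a + b) / 2 for a, b in zip(first, second)]
    return pearson_r(abs_diffs, averages)

# Hypothetical paired scores (not study data).
t1 = [20, 15, 27, 10, 24, 18]
t2 = [22, 16, 27, 13, 25, 17]
d_bar, lower, upper = limits_of_agreement(t1, t2)
r = heteroscedasticity_r(t1, t2)
heteroscedastic = r > 0.30   # criterion stated in the text
```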

The minimal detectable change (MDC) was calculated to examine the random measurement errors of the 2 measures.³⁵ The MDC is the smallest real difference in score that is not due to random measurement error at a certain confidence level, usually set at 95%.³⁶ The MDC was calculated from the standard error of measurement (SEM), using the following formulas:

$$MDC = z_{\text{level of confidence}} \times \sqrt{2} \times SEM$$

$$SEM = SD_{\text{all testing scores}} \times \sqrt{1 - ICC}$$

In these formulas, the z score corresponds to the chosen confidence level in the standard normal distribution (ie, 1.96 for the 95% confidence level in our study). The $\sqrt{2}$ indicates the additional uncertainty caused by the use of difference scores across the 2 time points. The SD was calculated based on all scores in the 2 assessments.
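A minimal sketch of these 2 formulas is given below. It assumes the sample SD of the scores pooled across both assessments; the function names, the pooled scores, and the ICC value passed in are hypothetical inputs chosen for illustration.

```python
from math import sqrt
from statistics import stdev

def sem(all_scores, icc):
    """Standard error of measurement: SD of all scores times sqrt(1 - ICC)."""
    return stdev(all_scores) * sqrt(1 - icc)

def mdc(sem_value, z=1.96):
    """Minimal detectable change; z = 1.96 gives the 95% confidence level."""
    return z * sqrt(2) * sem_value

# Hypothetical scores pooled across both assessments (not study data).
pooled = [20, 15, 27, 10, 24, 22, 16, 27, 13, 25]
sem_value = sem(pooled, icc=0.89)
mdc_95 = mdc(sem_value)
```

Since 1.96 × √2 is about 2.77, the MDC at the 95% level is always roughly 2.8 times the SEM.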

The MDC% was calculated to show the relative amount of random measurement error. The MDC% was the MDC divided by the total possible score of the respective measure and multiplied by 100%. An MDC% < 30% was considered acceptable, and <10%, excellent.³⁷
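The MDC% and its interpretation can be sketched as below. The function names are hypothetical, and the label "not acceptable" for values of 30% or more is an assumption added for completeness; the text defines only the acceptable and excellent thresholds.

```python
def mdc_percent(mdc_value, total_score):
    """MDC expressed as a percentage of the measure's total score."""
    return 100 * mdc_value / total_score

def rate_mdc_percent(value):
    """Label an MDC% using the thresholds quoted in the text."""
    if value < 10:
        return "excellent"
    if value < 30:
        return "acceptable"
    return "not acceptable"   # assumed label; not defined in the text

# Hypothetical MDC of 4.5 points on a 30-point measure.
pct = mdc_percent(4.5, 30)
label = rate_mdc_percent(pct)
```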

The paired t test and effect size (Cohen’s d) were used to evaluate the practice effect of the 2 measures. The paired t test was used to examine whether there was a significant difference between 2 assessments. The effect size was used to estimate the magnitude of the difference between 2 assessments. An effect size value ≥0.80 indicated a large practice effect; 0.50-0.79, medium; 0.20-0.49, small; and <0.20, trivial.³⁸
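The following sketch illustrates the 2 statistics. The text does not state which SD standardizes the mean difference in Cohen’s d, so the pooled SD of the 2 assessments is used here as an assumption. The function names and scores are hypothetical, and the P value (from a t distribution with n − 1 degrees of freedom) is left to a statistics package.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(first, second):
    """t statistic of the paired t test (second minus first), df = n - 1."""
    diffs = [b - a for a, b in zip(first, second)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def cohens_d(first, second):
    """Mean difference divided by the pooled SD of the 2 assessments.

    Assumption: the pooled SD is one common choice of standardizer;
    the text does not specify which SD was used.
    """
    pooled_sd = sqrt((stdev(first) ** 2 + stdev(second) ** 2) / 2)
    return (mean(second) - mean(first)) / pooled_sd

def rate_effect(d):
    """Label the magnitude of d using the thresholds quoted in the text."""
    size = abs(d)
    if size >= 0.80:
        return "large"
    if size >= 0.50:
        return "medium"
    if size >= 0.20:
        return "small"
    return "trivial"

# Hypothetical paired scores (not study data).
t1 = [20, 15, 27, 10, 24, 18]
t2 = [22, 16, 27, 13, 25, 17]
t_stat = paired_t(t1, t2)
d = cohens_d(t1, t2)
label = rate_effect(d)
```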

The correlation between the scores of the 2 cognitive measures and time since onset was investigated using Spearman’s ρ. This analysis was conducted to support the assumption for examining reliability and practice effects (ie, patients had no notable change in clinical characteristics with time since onset, so their cognitive function was assumed to be stable during the study periods).
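For illustration, Spearman’s ρ can be sketched as the Pearson correlation of the ranks, with tied values given the mean of their ranks. The function names and the example values (years since onset and scores) are hypothetical and are not the study data.

```python
from math import sqrt

def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1   # mean of the 1-based positions i..j
        for pos in range(i, j + 1):
            ranks[order[pos]] = shared
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / sqrt(sxx * syy)

# Hypothetical years since onset and cognitive scores (not study data).
years = [0.5, 1.0, 2.5, 4.0, 10.5]
scores = [21, 18, 25, 18, 22]
rho = spearman_rho(years, scores)
```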

Results

Sixty-five patients were eligible for the study, and 2 of them were excluded after preliminary screening based on the exclusion criteria. A total of 63 patients completed the 2 assessments (Figure 1). The demographic and clinical characteristics of the participants are shown in Table 1. The mean age of all participants was 64 years, with a range of 20 to 90, and 44 (about 70%) were male. In general, the patients were in the chronic stage (mean years between onset and the first assessment = 2.3, ranging from 0.5 to 10.5) and had moderate stroke (mean NIHSS score = 6, ranging from 1 to 20). The mean score of the MoCA at the first assessment was 20.6, indicating that the participants had mild cognitive impairment.

Figure 1.

The flowchart of the study.

Table 1.

Demographic and Clinical Characteristics of the Patients in This Study (n = 63).

Characteristics	Values
Gender, n (%)
Male	44 (69.8%)
Female	19 (30.2%)
Stroke type, n (%)
Hemorrhagic	25 (39.7%)
Ischemic	35 (55.6%)
Subarachnoid hemorrhage	1 (1.6%)
Others	2 (3.2%)
Side of lesion, n (%)
Left	36 (57.1%)
Right	27 (42.9%)
Age, mean (SD), year	64.4 (11.4)
Years between onset and 1^st assessment, mean (SD)	2.9 (2.2)
SPMSQ score at 1^st assessment, mean (SD)	0.9 (1.5)
MoCA score at 1^st assessment, mean (SD)	20.6 (6.0)
NIHSS Score, mean (1^st–3^rd quartile)	6.0 (4.0–7.5)

SD: standard deviation; SPMSQ: Short Portable Mental State Questionnaire; MoCA: Montreal Cognitive Assessment; NIHSS: National Institutes of Health Stroke Scale.

At both assessments, the correlations between the time since onset and the scores in SPMSQ and MoCA were negligible (SPMSQ: ρ = 0.00 and 0.09 at the first and second assessment, respectively; MoCA: ρ = −0.08 and −0.12, respectively). These results indicated that the patients’ cognitive decline was not related to the time since onset, supporting the assumption that their cognitive functions were stable during the study periods.

Table 2 shows the results of test-retest reliability and random measurement error of the SPMSQ and MoCA. The ICCs of the SPMSQ (0.87, 95% CI = 0.79-0.92) and MoCA (0.89, 95% CI = 0.62-0.95) were similar. The SPMSQ had a lower MDC% (14.8%) than the MoCA (19.6%). Both measures had acceptable MDC%s.

Table 2.

Test-Retest Reliability and Random Measurement Error of the SPMSQ and MoCA (n = 63).

Measure	1^st assessment, mean (SD)	2^nd assessment, mean (SD)	Difference, mean (SD)	ICC (95% CI)	SEM	MDC (MDC%)
SPMSQ	0.9 (1.5)	0.9 (1.4)	0.0 (0.8)	0.87 (0.79-0.92)	0.5	1.5 (14.8)
MoCA	20.6 (6.0)	22.5 (6.5)	1.9 (2.4)	0.89 (0.62-0.95)	2.1	5.9 (19.6)

SD: standard deviation; ICC: intraclass correlation coefficients; CI: confidence interval; SEM: standard error of measurement; MDC: minimal detectable change; SPMSQ: Short Portable Mental State Questionnaire; MoCA: Montreal Cognitive Assessment.

The LOA of the SPMSQ ranged from −1.5 to 1.5 (Figure 2), while the LOA of the MoCA ranged from −2.9 to 6.6 (Figure 3). The Pearson’s r values between the absolute difference and the average in each pair of assessments for the SPMSQ and MoCA were 0.30 (P = 0.019) and 0.03 (P = 0.812), respectively.

Figure 2.

Bland–Altman plot of the difference in scores against the mean scores of the SPMSQ (n = 63). The solid line represents the mean of the differences (0.0). The 2 dashed lines define the limits of agreement ($\bar{d} \pm 1.96 \times SD$ = −1.5 to 1.5).

Figure 3.

Bland–Altman plot of the difference in scores against the mean scores of the MoCA (n = 63). The solid line represents the mean of the differences (1.9). The 2 dashed lines define the limits of agreement ($\bar{d} \pm 1.96 \times SD$ = −2.9 to 6.6).

The results of practice effect are listed in Table 3. The scores of the MoCA were significantly different across the 2 assessments (P < 0.001). However, the scores of the SPMSQ were not significantly different. In addition, the practice effect of the MoCA was small (d = 0.30), while that of the SPMSQ was trivial (d = 0.00).

Table 3.

Practice Effects of the SPMSQ and MoCA (n = 63).

Measure	Difference, mean (SD)	t	P	Cohen's d
SPMSQ	0.0 (0.8)	0.0	1.000	0.00
MoCA	1.9 (2.4)	−6.2	<0.001	0.30

SD: standard deviation; SPMSQ: Short Portable Mental State Questionnaire; MoCA: Montreal Cognitive Assessment.

Discussion

We found that the SPMSQ and MoCA had high ICCs (0.87 and 0.89, respectively). The results indicate that both measures have good test-retest reliability in patients with stroke. The finding of the MoCA was similar to that found in a previous study (ICC = 0.85).²¹ Therefore, these measures appear to be reliable for the repeated assessment of cognitive decline in patients with stroke.

The MDC values for the SPMSQ and MoCA were 1.5 and 5.9, respectively. A change exceeding the MDC between repeated assessments can be considered a real change with a given degree of certainty (ie, 95% in this study). Notably, the MDC value for the MoCA in our study (5.9 points) was higher than that reported in prior research (3.8 points).²¹ A plausible explanation for this discrepancy is the larger variance of MoCA scores in our cohort (standard deviations = 6.0 and 6.5) compared with the previous study (standard deviations = 4.0 and 3.0).²¹ Because the SEM is proportional to the SD of the scores, this difference likely contributed to the larger SEM and consequently the higher MDC value in our study. Regardless, as the MDC represents the variations of differences caused by random measurement error at 2 consecutive assessments, this index can be used to determine whether change scores indicate real changes or just the consequence of random measurement error. For the SPMSQ, for example, the data indicate that a change of less than 2 points in a patient’s score does not typically signify a genuine improvement or deterioration, but could instead be attributed to random measurement error. Conversely, a change of 2 points or more is likely indicative of an actual change in status. This information enables users of these measures to interpret results from successive evaluations with a higher degree of confidence and make informed clinical decisions accordingly.

The MDC% values of both measures were under 30%. Moreover, the MDC% of the SPMSQ (14.8%) was slightly smaller than that of the MoCA (19.6%). These findings indicate that the scores provided by the SPMSQ may be less affected by random measurement error than those of the MoCA. The actual cause of the difference remains unclear and requires further investigation. However, based on the current findings, the SPMSQ may have yielded a more stable score than the MoCA. The SPMSQ seems to be an appropriate choice for reliably measuring patients’ cognitive decline.

In terms of heteroscedasticity, the correlation between the absolute value of the difference and the mean of the SPMSQ was 0.30, while that of the MoCA was 0.03. The result of the SPMSQ was close to the criterion of the presence of heteroscedasticity. This finding indicates that the amounts of random measurement error for the SPMSQ may vary according to the patient’s cognitive decline. According to suggestions from a previous study, if heteroscedasticity is present, a fixed value of MDC is not a proper threshold for determining a real change between repeated assessments.³⁹ If prospective users are concerned about such an issue, the MDC% is suggested for adjusting the random measurement error. Specifically, the MDC% of the SPMSQ can be multiplied by the patients’ initial score of the SPMSQ to achieve an adjusted MDC value. For instance, if a patient has an initial SPMSQ score of 5 points, a change of more than 0.74 (5 × 14.8%) is needed to indicate a real change. These adjustments will enable clinicians and researchers to more accurately interpret changes in SPMSQ scores for patients across repeated assessments. However, given that our study is the first to discover slight heteroscedasticity in the SPMSQ, further studies are warranted.
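The adjustment described above can be sketched as follows, using the worked values from the text (an initial SPMSQ score of 5 and an MDC% of 14.8%). The function names are hypothetical and the rule is shown only as an illustration of the arithmetic.

```python
def adjusted_mdc(initial_score, mdc_percent):
    """Score-dependent MDC: the initial score multiplied by the MDC%."""
    return initial_score * mdc_percent / 100

def exceeds_adjusted_mdc(initial_score, follow_up_score, mdc_percent):
    """True if the absolute change is larger than the adjusted MDC."""
    change = abs(follow_up_score - initial_score)
    return change > adjusted_mdc(initial_score, mdc_percent)

# Values from the text: initial SPMSQ score of 5 and MDC% of 14.8.
threshold = adjusted_mdc(5, 14.8)               # about 0.74 points
real_change = exceeds_adjusted_mdc(5, 6, 14.8)  # a 1-point change
```

Under this rule the threshold shrinks with the initial score and becomes 0 for an initial score of 0, so it may be of limited use for patients who make no errors at the first assessment.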

Small practice effects were observed in the MoCA (d = 0.30), but not in the SPMSQ (d = 0.00). Our results indicate that when patients’ MoCA scores increase at the second assessment, their real cognitive function might not have actually improved. In contrast, the scores of the SPMSQ may not be affected by the practice effect. This discrepancy may be explained as follows. First, the participants obtained low scores on the SPMSQ (mean score = 0.9), so their scores could not be further reduced by practice effects at the second assessment. Second, the participants might have developed familiarity with the items in the MoCA.²³ For example, in the short-term memory items, the participants might have remembered the 5 words, so they performed better in the repeated assessment. Using alternate forms may be a solution for reducing such effects (eg, adding more items),⁴⁰ which can be examined in future studies. Given that patients’ MoCA scores may increase because of repeated administration, users may overestimate improvement or underestimate deterioration of cognitive function in patients with stroke. Users should interpret the change scores of the MoCA conservatively in routine assessments, or consider expanding the interval between assessments to reduce practice effects. Alternatively, the SPMSQ appears to be a more suitable measure for assessing cognitive decline repeatedly in patients with stroke.

In light of the aforementioned findings, we recommend the SPMSQ as a more reliable measure for assessing cognitive decline in patients with stroke. This recommendation is grounded in the observed smaller random measurement error of the SPMSQ compared to the MoCA, coupled with the absence of a practice effect in the SPMSQ. Nevertheless, it is crucial to recognize that both the SPMSQ and MoCA may not comprehensively address all facets of cognitive assessment in stroke patients, particularly concerning language impairments (eg, aphasia) that can occur post-stroke.⁶ While recognizing these limitations, our findings provide a reference for those choosing between these 2 measures, suggesting that the SPMSQ could be a more appropriate option for obtaining stable scores in repeated assessments.

This study has 4 limitations. First, convenience sampling was used. The participants were recruited from only 1 medical center. This sampling protocol may have limited the generalizability of our findings to other institutions. Second, our participants, on average, had slightly impaired cognitive function (the mean score of the MoCA was 20.6 points at the first assessment). Thus, our results may not be directly generalized to patients with other levels of cognitive function. Future studies may be needed to investigate patients with varying levels of cognitive function to further confirm our findings. Third, the patients who had aphasia were excluded because of their inability to perform the verbal tasks in these 2 measures. Therefore, the current findings of our study may have been overestimated and may not be generalized to patients with aphasia. Fourth, the length of the time interval may have influenced the magnitude of the practice effect. As such, our results for the practice effect cannot be generalized to other time intervals. Further studies should examine the practice effect at different time intervals (eg, over 2 weeks).

Conclusion

When comparing the psychometric properties of the SPMSQ and MoCA as outcome measures, our findings indicate that the SPMSQ may be a more suitable choice for patients with stroke than the MoCA, because of its relatively lower random measurement error and absence of practice effect. Our findings also indicate that the random measurement error of the SPMSQ appears to differ along the continuum of the patients’ function. Thus, the change scores of the SPMSQ should be interpreted conservatively, or the adjusted MDC value should be used to determine a real change between repeated assessments.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by a research grant from the Changhua Christian Hospital (CSMU-CCH-110-04).

ORCID iD

Yi-Ching Wang

Data Availability Statement

The dataset used in this study remains confidential and is not available to the public because of ethical constraints. Based on the signed informed consent, the participants of this study did not agree to their data being shared publicly.

References

1. Sun, Tan. Post-stroke cognitive impairment: epidemiology, mechanisms and management. Ann Transl Med. 2014;2(8):80. doi:10.3978/j.issn.2305-5839.2014.08.05.

2. de Haan, Nys, Van Zandvoort. Cognitive function following stroke and vascular cognitive impairment. Curr Opin Neurol. 2006;19(6):559-564. doi:10.1097/01.wco.0000247612.21235.d9.

3. Nys, van Zandvoort, de Kort, et al. The prognostic value of domain-specific cognitive abilities in acute first-ever stroke. Neurology. 2005;64(5):821-827. doi:10.1212/01.WNL.0000152984.28420.5A.

4. Pfeiffer. A short portable mental status questionnaire for the assessment of organic brain deficit in elderly patients. J Am Geriatr Soc. 1975;23(10):433-441. doi:10.1111/j.1532-5415.1975.tb00927.x.

5. Nasreddine, Phillips, Bedirian, et al. The Montreal Cognitive Assessment, MoCA: a brief screening tool for mild cognitive impairment. J Am Geriatr Soc. 2005;53(4):695-699. doi:10.1111/j.1532-5415.2005.53221.x.

6. Chiti, Pantoni. Use of Montreal cognitive assessment in patients with stroke. Stroke. 2014;45(10):3135-3140. doi:10.1161/STROKEAHA.114.004590.

7. Schweizer, Al-Khindi, Macdonald. Mini-Mental State Examination versus Montreal Cognitive Assessment: rapid assessment tools for cognitive and functional outcome after aneurysmal subarachnoid hemorrhage. J Neurol Sci. 2012;316(1-2):137-140. doi:10.1016/j.jns.2012.01.003.

8. Pendlebury, Mariz, Bull, Mehta, Rothwell. MoCA, ACE-R, and MMSE versus the national institute of neurological disorders and stroke-Canadian stroke network vascular cognitive impairment harmonization standards neuropsychological battery after TIA and stroke. Stroke. 2012;43(2):464-469. doi:10.1161/STROKEAHA.111.633586.

9. Faria, Pinho, Bermudez IBS. A comparison of two personalization and adaptive cognitive rehabilitation approaches: a randomized controlled trial with chronic stroke patients. J NeuroEng Rehabil. 2020;17(1):78. doi:10.1186/s12984-020-00691-5.

10. Taravati, Capaci, Uzumcugil, Tanigor. Evaluation of an upper limb robotic rehabilitation program on motor functions, quality of life, cognition, and emotional status in patients with stroke: a randomized controlled study. Neurol Sci. 2022;43(2):1177-1188. doi:10.1007/s10072-021-05431-8.

11. Markle-Reid, Orridge, Weir, et al. Interprofessional stroke rehabilitation for stroke survivors using home care. Can J Neurol Sci. 2011;38(2):317-334. doi:10.1017/s0317167100011537.

12. Yin, Liu, Zhang, et al. Effects of rTMS treatment on cognitive impairment and resting-state brain activity in stroke patients: a randomized clinical trial. Front Neural Circuits. 2020;14:563777. doi:10.3389/fncir.2020.563777.

13. Cumming, Brodtmann, Darby, Bernhardt. Cutting a long story short: reaction times in acute stroke are associated with longer term cognitive outcomes. J Neurol Sci. 2012;322(1-2):102-106. doi:10.1016/j.jns.2012.07.004.

14. Marzolini, McIlroy, Brooks. The effects of an aerobic and resistance exercise training program on cognition following stroke. Neurorehabil Neural Repair. 2013;27(5):392-402. doi:10.1177/1545968312465192.

15. Aldridge, Dovey, Wade. Assessing test-retest reliability of psychological measures: persistent methodological problems. Eur Psychol. 2017;22(4):207-218. doi:10.1027/1016-9040/a000298.

16. Mokkink, Terwee, Patrick, et al. The COSMIN study reached international consensus on taxonomy, terminology, and definitions of measurement properties for health-related patient-reported outcomes. J Clin Epidemiol. 2010;63(7):737-745. doi:10.1016/j.jclinepi.2010.02.006.

17. Cooley, Heaps, Bolzenius, et al. Longitudinal change in performance on the Montreal cognitive assessment in older adults. Clin Neuropsychol. 2015;29(6):824-835. doi:10.1080/13854046.2015.1087596.

18. Bruijnen, Dijkstra BAG, Walvoort SJW, et al. Psychometric properties of the Montreal Cognitive Assessment (MoCA) in healthy participants aged 18-70. Int J Psychiatry Clin Pract. 2020;24(3):293-300. doi:10.1080/13651501.2020.1746348.

19. Lee, Lin, Chiu. A comparison of test-retest reliability of four cognitive screening tools in people with dementia. Disabil Rehabil. 2022;44(15):4090-4095. doi:10.1080/09638288.2021.1891466.

20. Hung, Lin, Chen, Tsay. Responsiveness, minimal clinically important difference, and validity of the MoCA in stroke rehabilitation. Occup Ther Int. 2019;2019:2517658. doi:10.1155/2019/2517658.

21. Lau, Lin, et al. Reliability of the Montreal cognitive assessment in people with stroke. Int J Rehabil Res. 2024;47(1):46-51. doi:10.1097/MRR.0000000000000612.

22. Bartels, Wegrzyn, Wiedl, Ackermann, Ehrenreich. Practice effects in healthy adults: a longitudinal study on frequent repetitive cognitive testing. BMC Neurosci. 2010;11:118. doi:10.1186/1471-2202-11-118.

23. Calamia, Markon, Tranel. Scoring higher the second time around: meta-analyses of practice effects in neuropsychological assessment. Clin Neuropsychol. 2012;26(4):543-570. doi:10.1080/13854046.2012.680913.

24. Aaronson, Alonso, Burnam, et al. Assessing health status and quality-of-life instruments: attributes and review criteria. Qual Life Res. 2002;11(3):193-205. doi:10.1023/a:1015291021312.

25. Bujang, Baharum. A simplified guide to determination of sample size requirements for estimating the value of intraclass correlation coefficient: a review. Archives of Orofacial Science. 2017;12(1):1-11.

26. Mokkink, Prinsen, Patrick, et al. COSMIN Study Design checklist for Patient-reported outcome measurement instruments. Amsterdam, The Netherlands. 2019;2019:1-32.

27. Kasner, Chalela, Luciano, et al. Reliability and validity of estimating the NIH stroke scale score from medical records. Stroke. 1999;30(8):1534-1537. doi:10.1161/01.str.30.8.1534.

28. Dewey, Donnan, Freeman, et al. Interrater reliability of the National Institutes of Health Stroke Scale: rating by neurologists and nurses in a community-based stroke incidence study. Cerebrovasc Dis. 1999;9(6):323-327. doi:10.1159/000016006.

29. Zandieh, Kahaki, Sadeghian, et al. The underlying factor structure of National Institutes of Health Stroke scale: an exploratory factor analysis. Int J Neurosci. 2012;122(3):140-144. doi:10.3109/00207454.2011.633721.

30. Koo. A guideline of selecting and reporting intraclass correlation coefficients for reliability research. J Chiropr Med. 2016;15(2):155-163. doi:10.1016/j.jcm.2016.02.012.

31. Bland, Altman. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet. 1986;1(8476):307-310. doi:10.1016/s0140-6736(86)90837-8.

32. Chhapola, Kanwal, Brar. Reporting standards for Bland-Altman agreement analysis in laboratory research: a cross-sectional survey of current practice. Ann Clin Biochem. 2015;52(Pt 3):382-386. doi:10.1177/0004563214553438.

33. Bland, Altman. Measuring agreement in method comparison studies. Stat Methods Med Res. 1999;8(2):135-160. doi:10.1177/096228029900800204.

34. Cohen. A power primer. Psychol Bull. 1992;112(1):155-159. doi:10.1037//0033-2909.112.1.155.

35. Schuck, Zwingmann. The ‘smallest real difference’ as a measure of sensitivity to change: a critical analysis. Int J Rehabil Res. 2003;26(2):85-91. doi:10.1097/00004356-200306000-00002.

36. Schreuders, Roebroeck, Goumans, van Nieuwenhuijzen, Stijnen, Stam. Measurement error in grip and pinch force measurements in patients with hand injuries. Phys Ther. 2003;83(9):806-815.

37. Smidt, van der Windt, Assendelft, et al. Interobserver reproducibility of the assessment of severity of complaints, grip strength, and pressure pain threshold in patients with lateral epicondylitis. Arch Phys Med Rehabil. 2002;83(8):1145-1150. doi:10.1053/apmr.2002.33728.

38. Cohen. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. New Jersey: Lawrence Erlbaum Associates; 1988.

39. Yang, Wang, Lee, Chen, Hsieh. A comparison of test-retest reliability and random measurement error of the Barthel Index and modified Barthel Index in patients with chronic stroke. Disabil Rehabil. 2022;44(10):2099-2103. doi:10.1080/09638288.2020.1814429.

40. Costa, Fimm, Friesen, et al. Alternate-form reliability of the Montreal cognitive assessment screening test in a clinical setting. Dement Geriatr Cogn Disord. 2012;33(6):379-384. doi:10.1159/000340006.