Abstract
Background
The Parkinson's Disease (PD) Home Diary (HD) is a common clinical outcome measure, but studies show only fair agreement between clinical observer and patient assessments, with no significant improvement after patient training.
Objectives
To investigate the agreement between a clinical observer and relatives of PD patients when assessing the patient's motor status in the HD. Agreement was also assessed for relative-patient and patient-observer pairs.
Methods
This observational study included 28 PD patients with motor fluctuations and their relatives. It involved a screening visit with structured training on motor fluctuations and one day of motor ratings, where the observer, relative, and patient independently assessed the patient's motor state in the HD half-hourly.
Results
Observer, patient, and relative triads completed 445 HD assessment sets. Temporal agreement was fair for observer-relatives (Cohen's κ = 0.250) and relatives-patients (κ = 0.230), but slight for patients-observer (κ = 0.120). For observer-relatives, agreement was highest for “On without dyskinesia” (71%), and lowest for “Off” (26%). Daily time distributions differed significantly between relatives and the clinical observer for “Off” (p = 0.006) and “On without dyskinesia” (p = 0.012), but not for “On with dyskinesia” (p = 1.000).
Conclusions
This study reports fair temporal agreement of motor state assessments between relatives-observer and relatives-patients, with slight agreement between patients-observer. Relatives’ assessments of daily time in different motor states showed significant differences from the clinical observer assessments. This further highlights the challenges in obtaining reliable motor status data and the need for further research into objective assessment methods.
Plain language summary
Purpose?
Parkinson's disease patients often develop motor fluctuations, switching between feeling well (“On”), feeling worse (“Off”), and having involuntary movements (“On with dyskinesia”). To track these fluctuations, patients often use a Home Diary to record their motor state every 30 min. The diary is widely used both in clinical care and in research. However, previous studies show that patients and healthcare professionals often disagree when simultaneously assessing the patient's motor state in the diary. This can lead to incorrect treatment decisions or misleading study results. Since relatives of Parkinson's patients may notice the patients’ symptoms differently, involving them might give a more accurate view of the patient's condition.
The study aimed to find out how well clinicians and relatives agree when rating the patient's motor state using the Home Diary. It also investigated how patients agree with clinicians and relatives.
How?
The study included 28 Parkinson's patients and their relatives. First, they attended a screening visit where they also received training on motor fluctuations. Next, they spent a full day (8:30 AM–4:00 PM) at the clinic. During this time, the patient, their relative, and the clinician independently recorded the patient's motor state every 30 min using the Home Diary.
Findings?
Clinicians and relatives agreed on the patient's motor state in 52% of the assessments. Agreement was highest for “On without dyskinesia” (71%), followed by “On with dyskinesia” (52%). However, they only agreed on the “Off” state in 26% of cases. Relatives and patients agreed 56% of the time, while patients and clinicians agreed 44% of the time.
Meaning?
The study shows that agreement on motor state ratings is low between clinicians, patients, and relatives. This highlights the difficulty of obtaining reliable information about Parkinson's patients’ motor fluctuations and the need for more reliable ways to assess them.
Introduction
Parkinson's disease (PD) is the most common movement disorder, 1 and after five years with the disease more than 50% of patients develop motor fluctuations. 2 These fluctuations have been shown to negatively affect PD patients’ quality of life.3,4 The PD Home Diary (HD) is a widely used tool for assessing motor fluctuations and dyskinesia. 5 The diary is filled in half-hourly by the patient and categorizes motor symptoms into four distinct states: “Off”, “On without dyskinesia”, “On with non-troublesome dyskinesia”, and “On with troublesome dyskinesia”. 6
Since its development in 2000, 5 the HD has been used as a central endpoint measure in many clinical trials on PD. 7 However, the gold standard for objectively assessing motor function in PD is still considered to be the evaluation by an experienced observer. Until recently, the HD had not been validated against this standard. To address this, the collaborative VALIDATE-PD project between Sweden and Germany was initiated to validate the HD by assessing the agreement between observer and HD ratings.8,9 Studies with similar designs were conducted in both countries, and both found that the agreement between observer assessments and patient HD ratings was only fair, with poor temporal agreement. An extension of the Swedish study investigated the effect of structured patient training on the agreement between observer and HD ratings, 10 demonstrating no significant improvement in the overall agreement, but a trend towards better detection of dyskinesias was observed.
Previous studies indicate that PD patients often have low awareness of dyskinesias,11,12 likely due to metacognitive deficits in the self-monitoring system. 11 Also, a 2024 literature review found that some data identifies salience and frontoparietal network regions as linked to motor state unawareness in PD patients. 13 Amanzio et al. 11 observed that awareness of hypo-bradykinesia appeared to be better preserved than dyskinesia awareness. However, the first two studies validating the HD found that only 60% of HD ratings when observed “Off” were in agreement with the simultaneous observer assessment, indicating that patients also have limited awareness of being in “Off”.8,9 Additionally, the study assessing the impact of structured training on patient-observer agreement in the HD found that while agreement for the “Off” state remained around 60% both before and after training, agreement for “On with dyskinesia” improved from 58% to 80% following the training, though this change was not statistically significant. 10 This suggests that limited awareness of dyskinesias may partly stem from a misunderstanding of what dyskinesias are.
Collecting reliable data on motor fluctuations in PD remains challenging. As patient diaries are widely used in both clinical practice and research,7,14 improving their accuracy or considering alternative assessment methods is essential. However, their validity relies on the patient's ability to recognize motor states, an ability often found to be limited.8–12 Clinical experience suggests that patients and their relatives often perceive motor fluctuations differently. Involving a relative in completing the HD may therefore provide a more accurate view of the patient's motor state.
The primary aim of this study is to investigate the agreement between a clinical observer and relatives of PD patients when assessing the patient's motor status in the HD. Secondarily, the study aims to investigate the agreement between relatives and patients as well as between patients and the observer.
Methods
This observational study is part of the VALIDATE-PD project, aiming to validate the HD for assessing motor fluctuations in PD patients.8–10,15 This study follows the same design and uses the same statistical tests as previous studies but also includes HD assessments completed by the patient's relative. The study was approved by the Swedish Ethical Review Authority (Dnr 2023-02986-01) and performed in line with the principles of the Declaration of Helsinki. Written informed consent was obtained from all participants (patients and relatives).
Participation criteria
Patients were eligible for study inclusion if they had a PD diagnosis per the Parkinson and Movement Disorder Society (MDS) Criteria, 16 experienced motor fluctuations according to a neurologist's assessment or the MDS-sponsored revision of the Unified Parkinson Disease Rating Scale (MDS-UPDRS) part IV, and were able to complete patient diaries and provide informed consent. In addition, patients had to have a close relative with whom they spent a significant amount of time (e.g., not a close family member with whom they mainly talked on the phone).
Exclusion criteria included secondary or atypical parkinsonian syndromes. The pair was excluded from the study if either of them showed signs of dementia (Montreal Cognitive Assessment [MoCA] < 21), psychotic symptoms, inability to complete diaries and questionnaires, or lack of cooperation. Additionally, any condition hindering the patient's or relative's ability to consent, participate, or undergo clinical assessment led to exclusion.
Participant selection
A list of PD patients with motor fluctuations, normal cognition, and a home clinic in Skåne County was obtained from the Swedish National Quality Registry for PD (www.neuroreg.se). Potential participants received study information by mail and were later contacted by phone. Interested individuals were invited for a screening visit to assess eligibility and provide informed consent.
Instruments and assessments
The MoCA was used to screen for cognitive impairment. 17 The MDS-UPDRS assessed PD symptoms, with higher scores reflecting more severe symptoms (max score: 260). 18 In the HD, the motor states available for selection by patients, relatives, and the observer were “Asleep”, “Off”, “On without dyskinesia”, “On with non-troublesome dyskinesia”, and “On with troublesome dyskinesia”. 6 The latter two categories were replaced by “On with dyskinesia” in the analyses, as the distinction between “troublesome” and “non-troublesome” is inherently subjective, and neither the observer nor the relative can reliably make that judgment without input from the patient.
Study design
All participants completed a screening visit, a structured training session on motor fluctuations, and one day of on-site ratings during office hours. The study design is depicted in Figure 1.

MoCA = Montreal Cognitive Assessment. MDS-UPDRS = Movement Disorder Society sponsored revision of the Unified Parkinson Disease Rating Scale.
Screening visit
The screening visit included cognitive assessment using the MoCA and clinical evaluation with the MDS-UPDRS. Both patients and relatives completed the MoCA. Additionally, demographic and clinical characteristics were collected.
Patient training
After study inclusion, participants (patients and their relative) received an approximately 50-min-long training on motor symptoms and fluctuations. This included a wordlist with explanations of key motor symptom terminology and an image illustrating how motor fluctuations relate to levodopa plasma concentrations (see Supplementary Material). Subsequently, they watched a training video on motor fluctuations and the use of on/off diaries, 19 with the spoken content translated into Swedish by the rater. Throughout the video, participants actively engaged by answering questions, such as identifying the motor states of the individuals shown. The training covered diverse aspects of the “off” state, including its motor manifestations (such as slowness, stiffness, tremor, hypomimia, hypophonia, and balance problems) and associated non-motor aspects. The session concluded with a discussion about the patient's own motor symptoms and fluctuations. This included discussing: 1) The unique clinical presentation and timing of the individual patient's “off” symptoms. 2) The presence, timing, and characteristics of any experienced dyskinesias.
Observation day after training
During the full day on-site (8:30 AM–4:00 PM), participants performed a 7-meter Timed Up and Go test (TUGT) every 30 min. 20 After each walk, the patient, their relative, and the observer independently assessed the motor state in the HD. Participants were instructed to base their motor state assessment not solely on gait function but to integrate all factors discussed during the training session. Observations encompassed the entire period, including moments when the patient was sitting before the TUGT, performing the test, and engaging in conversation. The observer spent time in the room in connection with the TUGT to ensure their rating incorporated this overall assessment. Since the observer was not present all the time, participants were instructed to base their motor state assessment only on observations made while the observer was present. The author CJ, a medical doctor and PhD student with four years of experience in clinical PD research, functioned as the trainer and observer. CJ completed the MDS-UPDRS training program prior to the study.
Statistical analysis
Values are provided as medians (interquartile range [IQR]). Pairwise exclusion was used for missing values. Levodopa equivalent daily doses (LEDD) were calculated according to Jost et al. 21 The agreement between the patient/relative and the observer, as well as between the patient and relative, was calculated using percentages and Cohen's kappa (κ). Weighted kappa (κw) was calculated to take different levels of disagreement into consideration. Agreement was interpreted as slight (0.00–0.20), fair (0.21–0.40), moderate (0.41–0.60), substantial (0.61–0.80) or almost perfect (0.81–1.00). 22 The McNemar-Bowker test with post-hoc McNemar tests with Bonferroni adjustment was used to test for symmetry of disagreements between the rating procedures. The Friedman test with post-hoc Wilcoxon signed-rank tests with Bonferroni adjustment was used to investigate differences in the daily time proportions (8:30 AM–4:00 PM) spent in different motor states between observer, relative and patient.
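As an illustration of the agreement statistics described above, the following is a minimal Python sketch of Cohen's κ and a linearly weighted κ, together with the interpretation bands used in this study. It is not the authors' SPSS procedure. The function names, the toy ratings, the choice of linear weights, and the ordering of the three motor states used for weighting are assumptions made for illustration only.

```python
# Order of states used for the (assumed) linear weights.
STATES = ["Off", "On without dyskinesia", "On with dyskinesia"]


def cohen_kappa(rater_a, rater_b, categories, weighted=False):
    """Cohen's kappa for two raters; linear weights if weighted=True."""
    k = len(categories)
    n = len(rater_a)
    index = {c: i for i, c in enumerate(categories)}

    # Cross-tabulate the paired ratings.
    counts = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        counts[index[a]][index[b]] += 1

    # Marginal proportions of each rater.
    row = [sum(counts[i]) / n for i in range(k)]
    col = [sum(counts[i][j] for i in range(k)) / n for j in range(k)]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, larger further away.
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    observed = sum(weight(i, j) * counts[i][j] / n
                   for i in range(k) for j in range(k))
    expected = sum(weight(i, j) * row[i] * col[j]
                   for i in range(k) for j in range(k))
    return 1.0 - observed / expected


def interpret_kappa(kappa):
    """Interpretation bands as listed in the statistical analysis section."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Toy ratings (hypothetical, not study data).
observer = ["Off", "Off", "On without dyskinesia", "On without dyskinesia",
            "On with dyskinesia", "On with dyskinesia"]
relative = ["Off", "On without dyskinesia", "On without dyskinesia",
            "On without dyskinesia", "On with dyskinesia",
            "On without dyskinesia"]

kappa = cohen_kappa(observer, relative, STATES)
kappa_w = cohen_kappa(observer, relative, STATES, weighted=True)
print(kappa, kappa_w, interpret_kappa(kappa))
```

In the toy data the raters agree on four of six ratings, so the raw agreement is 67%, while κ is lower because part of that agreement is expected by chance.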
Pearson's correlation test and intraclass correlation coefficient (ICC) estimation were used for correlations of daily times spent in the various motor states on the participant level. Pearson's correlation coefficient |r| < 0.3 was considered a weak, |r| = 0.3–0.59 a moderate and |r| ≥ 0.6 a strong agreement/correlation. ICC estimates and their 95% confidence intervals (95% CI) were calculated based on single-rating, absolute-agreement, 2-way mixed-effects models with two rating instruments across all participants. According to the guideline by Cicchetti, 23 we interpret ICC < 0.40 as poor, ICC = 0.40–0.59 as moderate, ICC = 0.60–0.74 as good and ICC = 0.75–1.00 as excellent reliability. p < 0.05 was considered statistically significant. IBM SPSS 29.0 and GraphPad Prism were used to perform statistical analyses and to build graphs. Biorender was used to create Figure 1.
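The single-rating, absolute-agreement ICC described above can be sketched from the two-way ANOVA mean squares. The Python sketch below is illustrative and is not the SPSS routine used in the study. The function names and the toy data are hypothetical, and it assumes that the point estimate takes the usual mean-squares form for a single-rating, absolute-agreement, two-way model. Confidence intervals and p-values are not computed.

```python
def icc_absolute_single(ratings):
    """Single-rating, absolute-agreement ICC from a subjects x raters table."""
    n = len(ratings)        # number of participants
    k = len(ratings[0])     # number of raters / rating instruments

    grand = sum(sum(r) for r in ratings) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(r[j] for r in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    # Systematic rater offsets (ms_cols) lower absolute agreement.
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def interpret_icc(icc):
    """Reliability bands as listed in the statistical analysis section."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "moderate"
    if icc < 0.75:
        return "good"
    return "excellent"


# Toy data (hypothetical): rater 2 is always one unit above rater 1.
offset_ratings = [[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]]
icc_offset = icc_absolute_single(offset_ratings)
print(icc_offset, interpret_icc(icc_offset))
```

The toy example shows why absolute agreement is a stricter criterion than correlation: the two raters are perfectly correlated, yet the constant offset between them pulls the ICC below 1.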
Results
Demographic and clinical data
Study information was sent to 82 patients, who were subsequently contacted by phone. Twenty-nine relative-patient pairs attended a screening visit, with one pair excluded due to cognitive impairment, leaving 28 pairs in the study. Demographic and clinical characteristics are presented in Table 1. The median LEDD was 900 mg (IQR: 600–1179), and the median Hoehn and Yahr stage was 2.5 (IQR: 2–3). Of the relatives, 25 were partners and 3 were children of the patients.
Table 1. Demographic and clinical characteristics.a
Presented as median (IQR, interquartile range) or percentages.
MDS-UPDRS, Movement Disorder Society sponsored revision of the Unified Parkinson Disease Rating Scale.
MoCA, Montreal Cognitive Assessment.
Normal cognition, MoCA > 25.
Mild cognitive impairment; MoCA 21–25.
Dementia, MoCA < 21.
Proportion of ratings and temporal agreement of motor state ratings
Out of the expected 448 sets of clinical observer, relative and patient HD ratings, 445 were completed (99%). Ratings were distributed between “Off”, “On without dyskinesia”, and “On with dyskinesia”. Overall analysis using the McNemar-Bowker test with Bonferroni adjustments showed significant differences in motor state rating distribution between all three pairs: relatives-observer (p < 0.001), patients-observer (p < 0.001), and relatives-patients (p = 0.021). Figure 2A shows the proportion of ratings in each motor state. As illustrated in the figure, the “Off” state was reported by relatives in 11% of ratings and by patients in 9%. In contrast, the observer identified this state in 24% of the ratings. The distribution of motor state ratings for both relatives and patients deviated significantly from the observer-rated distributions (p < 0.001) for the “Off” and “On without dyskinesia” states. However, proportions of “On with dyskinesia” ratings from relatives (p = 0.321) and patients (p = 0.612) did not differ significantly from the observer ratings. No significant differences in motor state distribution were observed between patients and relatives for “Off” (p = 0.423), “On without dyskinesia” (p = 1.000), or “On with dyskinesia” (p = 1.000).
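For readers unfamiliar with the McNemar-Bowker test of symmetry, the sketch below shows how its test statistic is formed from a square cross-tabulation of two raters. It is a simplified illustration with a hypothetical function name and hypothetical counts, not the study's data or its SPSS output. The p-value would be obtained by comparing the statistic with a chi-square distribution with the returned degrees of freedom, which is left to a statistics package here.

```python
def bowker_statistic(table):
    """Bowker symmetry statistic and degrees of freedom for a k x k table.

    table[i][j] = number of ratings where rater A chose state i
    and rater B chose state j.
    """
    k = len(table)
    statistic = 0.0
    df = 0
    for i in range(k):
        for j in range(i + 1, k):
            pair_total = table[i][j] + table[j][i]
            if pair_total == 0:
                continue  # this pair of cells carries no information
            statistic += (table[i][j] - table[j][i]) ** 2 / pair_total
            df += 1
    return statistic, df


# Hypothetical 3 x 3 cross-tabulation (rows: rater A, columns: rater B).
example_table = [
    [10, 5, 2],
    [1, 20, 4],
    [0, 2, 15],
]
stat, df = bowker_statistic(example_table)
print(stat, df)
```

Only the off-diagonal cells enter the statistic, so the test is sensitive to the direction of disagreements (for example, one rater choosing “Off” more often than the other) rather than to the overall amount of agreement.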

(A) Proportion of “Off”, “On without dyskinesia” and “On with dyskinesia” as assessed by relative, patient, and observer diaries. p-values are from McNemar-Bowker test with post-hoc McNemar test (Bonferroni-adjusted); significance: *p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001. (B) Temporal agreement with observer ratings as reference. (C) Participants’ diary entries by observed motor state.
Inter-rater agreement was assessed using Cohen's κ. There was a fair agreement between relatives and observer (κ = 0.250; κw = 0.230), and between relatives and patients (κ = 0.230; κw = 0.230), but a slight agreement between patients and observer (κ = 0.120; κw = 0.103). When examining agreement for specific motor states, relatives-observer agreement was fair across all motor states: “Off” (κ = 0.243), “On without dyskinesia” (κ = 0.282), and “On with dyskinesia” (κ = 0.221). Relatives-patients agreement was also fair for “Off” (κ = 0.295) and “On without dyskinesia” (κ = 0.277) but slight for “On with dyskinesia” (κ = 0.156). Patients-observer agreement was lowest, with a slight agreement for “Off” (κ = 0.142) and “On without dyskinesia” (κ = 0.145), and a non-significant κ = 0.080 for “On with dyskinesia”.
The temporal agreement between relatives, patients, and the observer (used as the gold standard) when assessing the patients’ motor state in the HD is presented in Figure 2B-C. The overall agreement was highest between relatives and patients (56%) and lowest between patients and the observer (44%), with the agreement between relatives and the observer being 52%. When examining agreement across the specific motor states (Figure 2B), the agreement was highest for observed “On without dyskinesia” (71% for relatives; 62% for patients). Conversely, agreement was lowest for observed “Off” (26% for relatives; 17% for patients). Temporal agreement between patients and relatives was 32% for “Off”, 65% for “On without dyskinesia”, and 50% for “On with dyskinesia”. When observed “Off”, relatives classified patients as “On with dyskinesia” in 39% of cases, while patients classified themselves as “On with dyskinesia” in 47% of the ratings (Figure 2C). The numbers of observations in each observed motor state were: “Off”, n = 106; “On without dyskinesia”, n = 138; and “On with dyskinesia”, n = 201.
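The overall relatives-observer agreement can be related to the state-specific figures as a weighted average, with the number of observations in each observed state as weights. The short Python sketch below reproduces this arithmetic from the rounded percentages reported in this article (26%, 71%, and the 52% for “On with dyskinesia” given in the plain language summary). Because the inputs are rounded, the result is only an approximate consistency check; the variable and function names are illustrative.

```python
# Number of rating sets per observer-rated motor state (from the text).
observed_counts = {
    "Off": 106,
    "On without dyskinesia": 138,
    "On with dyskinesia": 201,
}

# Rounded relatives-observer agreement per observed state (from the text).
relative_agreement = {
    "Off": 0.26,
    "On without dyskinesia": 0.71,
    "On with dyskinesia": 0.52,
}


def overall_agreement(counts, per_state_agreement):
    """Weighted average of per-state agreement, weighted by state counts."""
    total = sum(counts.values())
    agreeing = sum(counts[s] * per_state_agreement[s] for s in counts)
    return agreeing / total


total_sets = sum(observed_counts.values())
overall = overall_agreement(observed_counts, relative_agreement)
print(total_sets, round(overall * 100))
```

Because “Off” accounts for fewer rating sets than “On with dyskinesia”, its low state-specific agreement has a smaller effect on the overall figure than its clinical importance might suggest.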
Daily motor state times
Figure 3A shows the distribution of daily time spent in each motor state based on half-hourly diary ratings (8:30 AM–4:00 PM) from the relatives, patients, and observer. The Friedman test with Bonferroni adjustments showed significant differences in daily time distribution between relatives, patients, and the observer. Post-hoc Wilcoxon signed-rank tests with Bonferroni adjustments showed significant differences for “Off” and “On without dyskinesia” (p = 0.027, p = 0.012) between relatives and the observer but no significant differences for “On with dyskinesia” (p = 1.000). Similar differences were observed between patients and the observer for “Off” and “On without dyskinesia” (p = 0.006, p = 0.012), but no differences for “On with dyskinesia” (p = 1.000). No significant differences were found between relatives and patients in daily time distribution across motor states (“Off”: p = 1.000; “On without dyskinesia”: p = 1.000; “On with dyskinesia”: p = 1.000).
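Several adjusted p-values above are reported as exactly 1.000, which is a typical result of a Bonferroni adjustment. The sketch below assumes the common convention in which each unadjusted p-value is multiplied by the number of comparisons and capped at 1; the exact procedure applied by the software used in this study is not stated here. The function name and the input p-values are hypothetical and are not study results.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical unadjusted p-values for three pairwise comparisons.
raw_p = [0.01, 0.04, 0.5]
adjusted_p = bonferroni_adjust(raw_p)
print(adjusted_p)
```

Under this convention, any unadjusted p-value of at least 1/3 in a family of three comparisons is reported as 1.000 after adjustment.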

(A) Distribution of time proportions in different motor states from half-hourly diary ratings (8:30 AM–4:00 PM). Boxplots show medians, interquartile ranges, and extremes. p-values from Friedman tests with post-hoc Wilcoxon signed-rank tests (Bonferroni-adjusted); *p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001. (B-D) Correlation analyses of mean proportions of “Off” (B), “On without dyskinesia” (C) and “On with dyskinesia” (D). Solid lines show regression with 95% CI (dotted lines). Pearson correlation coefficients and p-values are reported.
Pearson correlation analyses of the individual times spent in the three motor states (Figure 3B-D) showed a moderate correlation between relatives and the observer for “On without dyskinesia” (r = 0.544, p < 0.001) and “On with dyskinesia” (r = 0.380, p = 0.023). No significant correlation was found for “Off” (r = 0.260, p = 0.091). No significant correlations were observed between patient and observer diary data for any motor state (“Off”: r = −0.069, p = 0.364; “On without dyskinesia”: r = 0.295, p = 0.064; “On with dyskinesia”: r = 0.192, p = 0.164). A moderate correlation was found between relative and patient diary data for “On without dyskinesia” (r = 0.521, p = 0.002), but no significant correlations were observed for “Off” or “On with dyskinesia” (r = 0.102, p = 0.303; r = 0.186, p = 0.172).
Reliability analysis using ICC (Table 2) showed moderate reliability for relative diary data for “On without dyskinesia” when compared with observer diary data [ICC = 0.45 (95% CI: 0.06 to 0.71), p = 0.001], and poor reliability for “On with dyskinesia” [ICC = 0.38 (95% CI: 0.01 to 0.66), p = 0.022]. No significant reliability was found for “Off” [ICC = 0.22 (95% CI: −0.10 to 0.52), p = 0.087]. Patient diary ratings showed no significant reliability for any motor state compared to observer ratings. There was a moderate reliability for patient diary data when compared with relative diary data for “On without dyskinesia” [ICC = 0.53 (95% CI: 0.19 to 0.75), p = 0.002]. No significant reliability was found for “Off” or “On with dyskinesia”.
Table 2. Reliability of the Parkinson's disease home diary according to intraclass correlation coefficient (ICC) calculation.a
ICC estimates and 95% CIs were calculated based on single-rating, absolute-agreement, 2-way mixed-effects models with two raters (patient/relative diary and observer diary) across 28 pairs of participants.
Discussion
The main finding of this study was that the temporal agreement between all raters when assessing motor status on the HD ranged from slight to fair. Agreement with the clinical observer was fair for relatives, with an overall agreement of 52%, but only slight for patients. While relatives-patients agreement was better for motor state distribution and daily time proportions in each motor state, their temporal agreement remained only fair. Crucially, percentage agreement was lowest for the “Off” state across all comparisons. Moreover, significant differences were found in the daily time distribution of “Off” and “On without dyskinesia” between relatives-observer and patients-observer. Relatives’ assessments of daily time in different motor states showed moderate reliability for “On without dyskinesia”, poor reliability for “On with dyskinesia”, and no significant reliability for “Off”.
The agreement was fair for relatives-observer and relatives-patients, but only slight for patients-observer. These results align with previous studies. However, a key difference from previous validation studies is the notably lower agreement for observed “Off” in this study.8–10 Both relatives and patients reported less time in “Off” and more time in “On without dyskinesia” than the observer. Although participants were instructed to assess motor status based only on observations made while the observer was present, there is a risk that ratings were influenced by the patient's motor status during unobserved time, potentially resulting in the low agreement for “Off”. However, such low agreement for “Off” has not been reported in previous Swedish validation studies using the same study design.9,10
Similar to post-training results in the structured training study, “Off” was most often misclassified as “On with dyskinesia”. 10 In contrast, validation studies without structured patient training consistently show “Off” being misclassified as “On without dyskinesia” instead.8,9 This shift suggests that while training may improve recognition of abnormal movements, participants had difficulty distinguishing them, potentially confusing tremors with dyskinesias. Also, the presence of diphasic dyskinesias in some patients may have complicated accurate reporting. Since patients with diphasic dyskinesias may transition directly between “Off” and “On with dyskinesia” without an intermediate “On without dyskinesia” phase, 24 participants may have misinterpreted the rapid shift, leading to the high misclassification of the “Off” state as dyskinetic.
Studies suggest that unawareness of motor states is more pronounced in PD patients with cognitive impairment, 25 while those with relatively intact cognition maintain better awareness of motor symptoms. 26 However, Amanzio et al. 11 found that impaired motor state awareness can also occur in patients with normal cognition. In this study, 54% of patients and 36% of relatives had mild cognitive impairment, similar to the 45–55% in prior validation studies.8–10 Hence, while cognitive impairment may have influenced agreement across studies, it does not explain the lower agreement for “Off” observed in this study. Another theory is that patients base their “Off” ratings on both motor and non-motor symptoms, while observers focus only on motor signs. However, a study using the German VALIDATE-PD data found that the limited validity of the HD could not be explained solely by the presence of concurrent non-motor symptoms. 27 This study's patients had higher overall and motor MDS-UPDRS scores and spent more time in “Off” than those in previous studies,9,10 indicating greater impairment. It is possible that patient and relative gradually shift their perception of what is normal as the patient spends more time in “Off”, reducing agreement for observed “Off”. Another possibility is that the observer sometimes misclassified “On without dyskinesias” as “Off” due to unfamiliarity with the patient, missing subtle signs that relatives and patients could detect.
When patients were observed as in “On with dyskinesia”, the most common error was misclassification as “On without dyskinesia”, a pattern consistent with previous validation studies. Prior patient-observer agreement for “On with dyskinesia” ranged from 36% to 58%, reaching 80% after structured patient training in motor fluctuations.8–10 Despite identical training in this study, the agreement was notably lower. The structured training study invited all patients from the initial Swedish validation study, but those who participated had the highest pre-training agreement with the observer. This selection bias may explain the lower agreement for “On with dyskinesia” in the current study. Moreover, while both studies reported similar daily time spent in “On with dyskinesia”, patients in this study may have less prominent dyskinesias, making detection harder. Additionally, our patients had a shorter median duration since dyskinesias began (46 months vs. 63 months). 10 A longer duration of dyskinesias may have given patients and relatives more time to recognize and understand the symptoms.
Daily time spent in different motor states is a key outcome when evaluating treatments for motor fluctuations. 7 While previous validation studies reported moderate to excellent reliability for daily times spent in the three motor states as aggregated HD data,8–10 our findings were substantially weaker. Specifically, relatives’ assessments of daily time in different motor states showed poor to moderate reliability, with no significant reliability for “Off”. Patient diary ratings showed no significant reliability for any motor state. Correlations between observer and participant daily time ratings were likewise weaker than in earlier studies,9,10 with only a moderate correlation between relatives and the observer for “On without dyskinesia” and “On with dyskinesia”. The low reliability and weak correlation severely limit the utility of HD data in this cohort for quantifying daily motor state time. These results may stem from variations in the study populations, such as low prior knowledge of motor fluctuations or less pronounced motor state variations, which make accurate reporting difficult. A larger sample size might also have produced results more in line with previous studies.8–10 While patients and relatives agreed better with each other than with the observer on the distribution of motor states and the daily time spent in different motor states, their temporal agreement was only fair. Overlapping confidence intervals for kappa values (data not shown) 28 indicated no superior temporal agreement between any of the rater pairs.
The poor temporal agreement, the insufficient reliability of daily motor state time, and the lack of significant correlations between the observer and relatives for daily time spent in “Off”, and between the observer and patients for any motor state, are concerning. Patient diaries are the current “gold standard” for primary outcome measures in clinical trials evaluating therapies for motor fluctuations. Typically, the primary readout involves the total daily duration within each motor state. 7 Thus, the discrepancies between patients/relatives and clinical observers regarding the distribution of daily motor state times both in this study and in the German VALIDATE-PD study introduce a risk of erroneous conclusions from clinical trials. 8 This is a critical concern for drug development and highlights the need for more consistent and reliable methods to evaluate motor fluctuations. Moreover, the deficient temporal agreement between patients/relatives and the observer raises concerns that relying on their reports about motor fluctuations could result in suboptimal individualized treatment.8–10 In clinical scenarios, the patient's subjective well-being remains the primary objective. Consequently, it may not be inherently problematic if an observer perceives a patient as being in “Off”, while the patient feels they are in “On without dyskinesia”. However, a risk of undertreatment exists if patients habituate to “Off” states without conscious recognition. In such instances, the failure to identify “Off” periods precludes the optimization of treatment and well-being. Moreover, effective treatment often requires accurate distinction between symptoms. For example, if a patient reports troublesome tremors, which are characteristic of the “Off” state, the typical response is to increase the levodopa dose. If that tremor were instead a dyskinesia, which is characteristic of the “On with dyskinesia” state, the increased levodopa dose would likely worsen the patient's dyskinesia instead of improving their motor control.
This study underscores the need for cautious data interpretation when using the HD, as the results suggest that it is highly subjective. It highlights the importance of developing and validating objective ways to monitor motor fluctuations to ensure reliable results from clinical studies and to improve individualized patient care. Several technologies are being investigated. Löhle et al. 15 examined whether a wearable accelerometer-based digital Parkinson's motor diary, using the Parkinson's KinetiGraph (PKG), could accurately detect motor status when compared to observer assessments in the HD. They found moderate validity for daily time in “Off” and “On with dyskinesia”, but poor temporal agreement. The PKG also struggled to detect unpredictable “Off” episodes. PKG data, however, aligned more closely with observer than patient diaries. 15 Another study found moderate to high concordance between PKG data and patient diaries for daily time spent in different motor states but a limited temporal agreement. 29 Moreover, a study reported a moderate ICC between patient HD data and Holter data. 30 Additionally, some studies found high validity for wearables when compared with patient HD.31,32 Advanced monitoring systems, such as the PD Monitor, may potentially provide more reliable data, but this remains to be demonstrated. 33 Another ongoing study by Ymeri et al. 34 is investigating the feasibility of measuring motor status and complications using sensor data collected from smartphones and a wrist-worn wearable device while participants perform specific tasks (such as TUGT, finger tapping, and drawing) remotely in their home environment. Future research must continue to validate digital technologies and wearables against clinical observer-rated diaries, to ensure that the generated metrics carry clear clinical significance and direct relevance to patients. 
Until these digital tools are fully validated and integrated into clinical research and routine practice, the subjective nature of the HD requires cautious interpretation.
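The distinction made above between agreement on daily time per motor state and temporal (slot-by-slot) agreement can be illustrated with a small sketch. It is only an illustration under stated assumptions: the diary entries are invented and are not study data, the short labels "Off", "On", and "Dys" and the function names are hypothetical, and the kappa shown is a plain unweighted Cohen's kappa, which may differ in detail from the analysis used in this study.

```python
from collections import Counter

# Hypothetical short labels for the three HD motor states.
STATES = ("Off", "On", "Dys")


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length sequences of labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of time slots with identical labels.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: derived from each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(count_a[s] * count_b[s] for s in count_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both raters use one single label")
    return (p_observed - p_chance) / (1 - p_chance)


def hours_per_state(diary, slot_hours=0.5):
    """Daily time per motor state, assuming half-hourly diary slots."""
    counts = Counter(diary)
    return {state: counts[state] * slot_hours for state in STATES}


# Invented example: eight half-hour slots rated by two people.
observer = ["Off", "Off", "On", "On", "On", "On", "Dys", "Dys"]
relative = ["On", "On", "Off", "Off", "Dys", "Dys", "On", "On"]

print(hours_per_state(observer))  # same daily totals for both raters
print(hours_per_state(relative))
print(cohen_kappa(observer, relative))  # yet no slot-by-slot agreement
```

In this invented example both diaries report identical daily time in each state, while not a single half-hour slot is rated the same, so kappa is negative. This is one way in which a device or rater can show acceptable validity for daily time in a state and still show poor temporal agreement.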
This study has some limitations that should be acknowledged. Ideally, the clinical observer would have conducted more detailed assessments, such as performing the motor part of the MDS-UPDRS, to better differentiate “On without dyskinesia” from “Off”. However, to maintain comparability with previous studies, the same study design was used. This also ensured that both the relative and the clinical observer had the same information before assessing the patient's motor state, facilitating comparisons. Preferably, data would have been collected over several days rather than a single day. This would have provided better insight into the patients’ motor states and allowed a larger amount of data to be gathered. Moreover, while patients and relatives were instructed not to discuss their assessments, ensuring complete avoidance of discussions was difficult, which may have influenced the results. It is also challenging to accurately assess motor status during transitional states (e.g., when a patient is wearing off but not fully “Off”, or transitioning to “On” but not fully there). Participants were instructed to select the state that most closely represented the patient's true status. However, it is important to be aware of the subjective nature of rating these transitions. In addition, a larger sample size would have increased statistical power and enhanced the reliability of the findings, but recruiting eligible participants proved challenging. Furthermore, the low participation rate (29/82 contacted patients) introduces a risk of selection bias. Participants likely represent a highly motivated group, which could inflate agreement results compared with the general PD population. As many non-participants cited illness or fatigue, the cohort may be biased toward individuals with less severe PD, potentially limiting the generalizability of our findings.
Moreover, as in previous validation studies, the observer was not a Movement Disorder Specialist and could thus be considered less accurate than the gold standard. However, the rater had several years of experience with PD, and consistency was maintained by using the same rater for all assessments, which is a strength.
In conclusion, this study suggests that while the HD reflects the subjective experience of motor fluctuations reported by patients and their relatives, its objective reliability and temporal agreement with clinical observations are insufficient, especially for the “Off” state. Given the widespread use of the HD as a primary outcome in clinical trials, this apparent unreliability is highly concerning, as it implies a risk of incorrect conclusions in treatment development and of suboptimal individualized patient care. Further research is needed to develop and validate technological solutions for more objective assessment of motor fluctuations in PD patients.
Supplemental Material
Supplemental material, sj-docx-1-pkn-10.1177_1877718X261461073 for Agreement between relatives of Parkinson's patients and clinical observer in home diary assessments by Carin Janz, Jonathan Timpka, Alexander Storch, Gesine Paul and Per Odin in Journal of Parkinson's Disease
Footnotes
Acknowledgements
The Restorative Parkinson Unit, led by PO, thanks the Medical Faculty at Lund University, Multipark, the Swedish Parkinson Academy, the Olle Engkvist Foundation, the southern Swedish Health Care Region, the Åhlens Foundation, the Swedish Parkinson Foundation, and the Skåne University Hospital Foundation and Donations. All co-authors have been substantially involved in the study and/or preparation of the manuscript. No undisclosed groups or personnel have had a primary role in the study and/or in the manuscript preparation. All co-authors have read and approved the submitted manuscript. The listed authors have authorized the submission of their manuscript via a third party and approved the statements and declarations. No ghostwriting has been conducted by anyone not listed as an author. No editorial assistance was used in writing this manuscript.
Ethical considerations
The study was approved by the Swedish Ethical Review Authority (Dnr 2023-02986-01) and performed in line with the principles of the Declaration of Helsinki.
Consent to participate
Written informed consent was obtained from all participating patients and their relatives. Each participant received a signed copy of the consent form, while the original was securely stored at the study site.
Consent for publication
Not applicable
Authors' contributions
1. Research project: A. Conception, B. Organization, C. Execution;
2. Statistical Analysis: A. Design, B. Execution, C. Review and Critique;
3. Manuscript Preparation: A. Writing of the first draft, B. Review and Critique;
CJ: 1A, 1B, 1C, 2A, 2B, 3A, 3B
JT: 1A, 1B, 2A, 2C, 3B
AS: 2A, 2C, 3B
GP: 1A, 3B
PO: 1A, 1B, 2A, 2C, 3B
Funding
CJ received funding from the Elsa Schmitz Foundation and The Uppsala Faculty of Medicine Foundation for Psychiatric and Neurological Research. JT received funding from the Elsa Schmitz Foundation and the Swedish Parkinson Foundation. JT has received compensation for consultancies from AbbVie, TransPerfect and the Swedish National Board of Health and Welfare, as well as royalties from UNI-MED Verlag. AS has received funding from the Deutsche Forschungsgemeinschaft (German Research Foundation) and the Helmholtz-Association outside the present study. He has received honoraria for presentations/advisory boards/consultations from Global Kinetics Corporation, Esteve, Desitin, Lobsor Pharmaceuticals, STADA, Bial, RG Gesellschaft, Zambon, NovoNordisk and AbbVie outside the present study. He has received royalties from Kohlhammer Verlag and Elsevier Press. He serves as an editorial board member of Stem Cells International. GP serves as scientific advisor for NovoNordisk A/S. GP received funding from the Swedish Parkinson Foundation, the Southern Swedish Health Care Region, the Skåne University Hospital Foundation and Donations and SRA Multipark. PO received funding from the Swedish Parkinson Foundation, the Åhlens Foundation, the Southern Swedish Health Care Region, the Skåne University Hospital Foundation and Donations, SRA Multipark and the Medical Faculty of Lund University. PO has received honoraria for lectures and expert advice from AbbVie, Bial, Britannia, Convatec, Ever Pharma, Global Kinetics, Insightech, Merz, Navamedic, Nordic Infucare, Stada, and Zambon.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability
The data supporting the findings of this study are available on request from the corresponding author. The data is not publicly available due to privacy or ethical restrictions.
Supplemental material
Supplemental material for this article is available online.
References
