Abstract
It is common for researchers in spinal cord injury (SCI) to treat limbs from the same individual as independent to increase sample size in limb-based analyses. However, this approach may violate independence assumptions required for traditional statistical tests. This study investigated the dependence of bilateral upper-limb motor and neurophysiological outcomes through a retrospective analysis of participants from the European Multicenter Study about Spinal Cord Injury. Data from 118 participants with acute cervical SCI (236 limbs) were analyzed, including ulnar compound muscle action potentials (CMAPs) recorded in the abductor digiti minimi and manual muscle scores for the flexor digitorum profundus, bilaterally assessed at 12 and 48 weeks postinjury. Our analysis stratified participants by injury severity into two groups: motor complete (n = 55) and motor incomplete (n = 63). Interlimb dependence was assessed using Spearman’s rank correlations. To illustrate the implications of such dependence for inference, bootstrap analyses of motor recovery (12 vs. 48-week CMAPs) compared three analytical approaches using a linear mixed-effects model framework: randomly selecting outcomes from one limb per participant, including both limbs but ignoring intraparticipant correlation, and including both limbs but accounting for intraparticipant correlation. Bootstrap resampling was performed across sample sizes ranging from n = 10 to 50. Moderate-to-strong correlations (Spearman’s rho = 0.44–0.91) were observed between bilateral CMAPs and bilateral strength scores in both groups at 12 and 48 weeks postinjury. All models yielded similar CMAP recovery estimates (∼1.6 to 1.8 mV), but the precision of these estimates and the empirical power, reflecting the probability of detecting a significant recovery effect given the observed effect size and variability, varied across methods. Treating limbs as independent overestimated precision, resulting in the highest power estimates, whereas including only one limb reduced statistical power, particularly at lower sample sizes. Accounting for intraparticipant dependence yielded intermediate estimates of precision and power. These findings demonstrate that bilateral upper-limb outcomes in SCI are statistically dependent and highlight the need for statistical procedures that account for intraparticipant correlation in SCI research to ensure valid study conclusions.
Keywords
Introduction
Spinal cord injury (SCI) is a devastating condition that disrupts motor, sensory, and autonomic functions.1,2 Recovery of upper-limb function, whether spontaneous or intervention-related, is typically assessed in each limb, as impairments and recovery trajectories may differ between sides. However, studies evaluating spontaneous recovery3,4 or limb-specific interventions (e.g., nerve transfer surgery to restore upper-limb function),5–8 frequently analyze limb-level outcomes as if limbs from the same individual were independent observations. This practice can increase study power when limbs, rather than participants, serve as the unit of analysis, but it may violate the independence assumptions required for many traditional statistical methods. Conversely, excluding bilateral data and analyzing only one limb per participant can sacrifice potentially valuable information and reduce statistical power, an important consideration for low-prevalence conditions such as SCI.9,10 Consequently, careful attention to how bilateral data are handled is essential, as improper treatment of interlimb dependence can substantially influence study findings.11–13
The assumption of independence in observations implies that the value of one observation does not influence the value of another and that any errors in a statistical model are unrelated. Nonetheless, researchers in several medical fields, including ophthalmology14–17 and orthopedics,18–21 have been shown to erroneously combine data from paired structures (such as two eyes or limbs) within the same individual and treat them as independent observations. Failing to account for intraindividual dependence can lead to underestimating variability, inflating the risk of false positives (type I error), and overstating the effects of treatment or recovery.
Prior studies in healthy individuals have discussed the methodological implications of treating data from bilateral limbs as independent.22,23 However, it remains unclear to what extent this issue applies in SCI research. Because SCI can vary in lesion level, severity, and laterality, bilateral outcomes may not always be strongly correlated.24,25 This variability underscores the need to empirically quantify the degree of correlation between limbs rather than presuming independence or dependence a priori. Given the low prevalence of SCI and its heterogeneous presentation, maximizing available observations while properly accounting for potential interlimb dependence is essential.
We examined the statistical dependence of bilateral upper-limb motor and neurophysiological outcomes through a retrospective analysis of participants in the European Multicenter Study about Spinal Cord Injury (EMSCI) database. Our objective was to quantify the degree of interlimb correlation in upper-limb motor and neurophysiological outcomes during spontaneous recovery. We focused on hand muscle recovery because of its critical importance to individuals with cervical SCI, who often experience profound loss of hand and arm function. 26
Methods
Participants
EMSCI is a prospective multicenter registry in which clinical and electrophysiological measurements are performed at scheduled intervals for up to one year after traumatic or ischemic injury (http://www.emsci.org; NCT01571531). Study approval was obtained from the institutional review board at all centers participating in EMSCI. Individuals are assessed within the first 2 weeks after injury, as well as at 4, 12, 24, and 48 weeks after injury. Participants from the EMSCI database were selected for the current analysis if they had (1) traumatic SCI (motor complete or incomplete, American Spinal Injury Association Impairment Scale [AIS] A through D), (2) neurological level of injury C1–C8 measured 12 weeks after SCI, and (3) ulnar compound muscle action potentials (CMAPs) and C8 key muscle (finger flexors) motor scores derived from the International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI) 27 measured bilaterally at 12 weeks and 48 weeks postinjury. The C8 segment primarily innervates muscles essential for prehension and grasp (including the long finger/thumb flexors and extensors), functions that are impaired in individuals with SCI whose neurological level of injury is C7 or above. The 12-week time point was selected as Wallerian degeneration has occurred, making it an ideal time within the recovery window to neurophysiologically assess lower motor neuron lesions.28,29 The 48-week time point was selected because a neurological plateau is expected in individuals with motor complete injuries, with minimal likelihood of spontaneous upper-limb motor recovery occurring beyond one year postinjury.30–32 Participants were stratified by injury severity (motor complete [AIS A/B] and motor incomplete [AIS C/D]), based on AIS grades at the 12-week time point.
Outcomes
Ulnar CMAPs recorded over the abductor digiti minimi (ADM) muscle were collected from nerve conduction studies performed at participating EMSCI centers using standardized electrode placement and stimulation parameters (please refer to the EMSCI protocol guide available at http://www.emsci.org). Although ISNCSCI uses ADM to represent T1 motor neuron function, ADM is predominantly innervated by the C8 myotome, with some innervation from T1. 33 Therefore, the baseline-to-peak amplitude of the ulnar CMAPs to ADM was used as a proxy for lower motor neuron integrity at the C8–T1 spinal segments. Ulnar CMAP amplitude predicts hand strength in SCI 4 and changes in AIS grade after cervical SCI. 34 Motor strength was graded using the Medical Research Council (MRC) scale for muscle strength, derived from the ISNCSCI. The ISNCSCI evaluates five key upper-limb muscle groups on each side, with each muscle graded on a 0–5 scale based on manual muscle testing. 27 MRC scores for the flexor digitorum profundus muscle were used to represent the strength of C8-innervated muscles.
Analysis
Spearman’s rank correlation coefficients (ρ) with bootstrapped 95% confidence intervals were calculated to assess the relationship between left and right limb outcomes at 12 and 48 weeks postinjury for both groups. We selected a non-parametric test for both outcomes because ulnar CMAP amplitudes are non-normally distributed and MRC scores are ordinal. Statistical significance was set at p < 0.05.
To evaluate how the choice of modeling strategy influences the interpretation of study results when examining upper-limb motor recovery in individuals with SCI, we conducted a participant-level cluster bootstrap resampling analysis. This approach creates many new “pseudo-datasets” by repeatedly sampling from the original data, thereby mimicking the natural variability that might be observed if the study were repeated multiple times. 35 Participants (not individual limbs) were resampled with replacement (i.e., some participants may be selected more than once while others may not be selected at all), and all their limb and time-point data were carried together into each bootstrap sample. This preserves the natural correlation structure of the data while simulating repeated experiments, enabling an empirical assessment of the stability and variability of model estimates. For simplicity, this analysis focused solely on CMAP amplitudes; we chose not to include MRC scores in the bootstrap analysis because their ordinal scale, combined with a high frequency of zero values, would require more complex statistical approaches (e.g., ordinal mixed-effects models or generalized estimating equations for ordinal data). While these methods are possible, they complicate interpretation and were beyond the scope of the present study.
We tested five theoretical sample sizes: 10 (20 limbs), 20 (40 limbs), 30 (60 limbs), 40 (80 limbs), and 50 (100 limbs) participants. For each bootstrap sample, three different modeling strategies were applied to estimate motor recovery (i.e., the difference in CMAP amplitude between 12 and 48 weeks postinjury), separately for motor complete and motor incomplete participants. All models were implemented using a linear mixed-effects (LME) framework, which includes (1) fixed effects representing the variables of interest (e.g., time, group, condition), and (2) random effects accounting for variability across clusters (e.g., participants, experimental trials, study centers) and non-independent observations.36,37 Models were estimated using maximum likelihood and assumed Gaussian residuals with constant variance. Time (12 vs. 48 weeks) was included as a categorical fixed effect for all models. Random effects were assumed to follow a normal distribution with mean zero and estimated variance components. The general model form was:
We compared three different statistical approaches to handling the data from the two limbs of each participant:
Unilateral model: For each participant, one limb (left or right) was randomly selected. This approach avoids the need to adjust for within-participant dependency between limbs but reduces the available data. The LME included a random intercept for each participant to account for within-participant correlation across time:
where
Naïve bilateral model: Both limbs were included for each participant, but the analysis treated limbs as independent observational units, effectively doubling the number of “participants.” The LME included a random intercept for each limb, which accounts for the repeated measure across time within a limb (pairing each limb’s 12-week vs. 48-week CMAP) but ignores the within-participant correlation between limbs. This mimics an analysis approach where one might include all limb observations in a model, without any term linking the two limbs of the same individual:
where
Adjusted bilateral model: Both limbs were included for each participant, and the dependency between limbs was modeled by including a random intercept for each participant. This approach recognizes that the four observations from each participant (left and right limbs at 12 and 48 weeks) are not independent, adjusting for within-participant correlation without treating each limb-time point as a separate individual:
where
Each resampling and modeling procedure was repeated 200 times per sample size. From each model, we extracted the beta coefficient (β; the expected change in CMAP amplitude from 12 to 48 weeks) and its standard error (SE; a measure of the precision of β, where smaller values indicate greater precision). β coefficient and SE estimates were summarized as the mean ± 95% confidence interval (CI, 2.5th–97.5th percentiles) across 200 bootstrap iterations. We also quantified empirical power as the proportion of bootstrap iterations in which the 12- and 48-week ulnar CMAPs across limbs/participants differed significantly (α = 0.05, two-sided Wald test). This reflects the probability of detecting a time effect given the observed effect size and variability in the study dataset, rather than prospective power under a prespecified true effect. Empirical power was calculated across 200 bootstrap iterations, and the 95% CIs were derived by resampling the binary rejection outcomes (0 = not rejected, 1 = rejected) 2000 times and taking the 2.5th–97.5th percentiles of the resampled means. Outcomes were calculated separately for motor complete and motor incomplete groups at each sample size.
Although LME models are generally robust to violations of distributional assumptions, fitting them to data that violate these assumptions requires careful evaluation. 38 To ensure these features did not influence the interpretation of our results, we performed a supplementary analysis that involved fitting the three LME models to log-transformed CMAP amplitudes (+0.5 mV offset) to account for the positive skew (with a high frequency of 0 mV data points) and boundary effects (i.e., CMAP amplitudes cannot be < 0 mV). While the model effect estimates on the original (untransformed) scale represent an absolute difference in CMAP amplitudes between 12 and 48 weeks, the effect estimates on the log-transformed scale represent the ratio of geometric means between time points. All analyses were conducted in R software version 4.1.3, with LME models fit using the lme4 package. A generalized version of the R analysis for the adjusted bilateral model this study is available in Supplemental Data.
Results
Our initial query of the EMSCI database included individuals (n = 5651) with traumatic SCI and a date of injury between January 1, 2004, and December 31, 2021. Our first round of screening identified 195 individuals with a cervical neurological level of injury within the first 2 weeks postinjury and ulnar CMAPs recorded in at least one limb. Of these 195 individuals, 118 (90 males, 28 females, mean ± standard deviation age = 46.4 ± 19.1 years) had bilateral data at 12 and 48 weeks postinjury and were included in the analysis (236 upper limbs). Figure 1 illustrates the participant selection process and data availability. There were 55 participants in the motor complete group and 63 participants in the motor incomplete group (see Table 1).

Flowchart of participant inclusion in the main study analysis. AIS, American Spinal Injury Association Impairment Scale; CMAP, compound muscle action potential; EMSCI, European Multicenter Study about Spinal Cord Injury; MRC, Medical Research Council.
Participant Characteristics
Age was determined at 2 weeks postinjury. AIS grades and the NLI were determined from International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI) examinations performed at 12 weeks postinjury.
AIS, American Spinal Injury Association Impairment Scale; NLI, neurological level of injury; SD, standard deviation.
Bilateral outcomes and corresponding Spearman’s rank coefficients for the motor complete and motor incomplete groups are shown in Figure 2. Spearman’s rank correlations revealed moderate-to-strong positive correlations between bilateral limb outcomes for both groups, all of which were statistically significant (p < 0.001). In general, bilateral correlations were stronger in the motor complete group (ρ range: 0.65–0.91) than in the motor incomplete group (ρ range: 0.44–0.62). Furthermore, correlation strength between bilateral ulnar CMAPs and bilateral C8 MRC scores increased from 12 to 48 weeks in both groups.

Individual participant outcome measures and Spearman’s rho (ρ) values for the left and right limbs. (1) Scatterplots of bilateral ulnar CMAPs are shown for
The β coefficient reflects the model-estimated change in ulnar CMAP amplitude between 12 and 48 weeks postinjury. β coefficients were consistent across models and sample sizes within each group (Fig. 3A, B). For motor complete injuries, the β coefficients ranged from 1.55 mV (geometric mean ratio = 1.38) to 1.62 mV (geometric mean ratio = 1.41) across the models. For motor incomplete injuries, the corresponding estimates ranged from 1.77 mV (geometric mean ratio = 1.37) to 1.91 mV (geometric mean ratio = 1.39). Thus, ulnar CMAP amplitudes increased by approximately 37–41% from 12 to 48 weeks in both groups. Because the bilateral naïve and adjusted models were fit to the same data, their β coefficients were identical across all sample sizes. The unilateral model yielded similar recovery estimates, indicating that analyses based on a single limb produced comparable point estimates even at smaller sample sizes.

Linear mixed-effects model analysis results. (1) Regression coefficients (β) for
However, the SEs, representing the precision of the recovery estimates, varied across models and sample sizes (Fig. 3C, D). Small SEs indicate that the β coefficients are stable and precise, whereas large SEs indicate that the β coefficients are uncertain. As expected, SEs decreased with increasing sample size in all models. Across all sample sizes, the unilateral model consistently produced the largest SEs. In contrast, the naïve bilateral model, which treats limbs as independent, produced the smallest SEs, particularly in smaller samples, reflecting an underestimation of true variability. The adjusted bilateral model, which accounts for within-participant correlation, produced SEs that fell between the other two approaches.
Conversely, empirical power increased with theoretical sample size across all models and both groups (Fig. 3E, F). Empirical power represents the probability of detecting a true recovery effect under the assumed model. Using only one limb reduced the likelihood of detecting a statistically significant effect; statistically significant time effects were observed in only slightly more than half of the bootstrap samples with n = 10. The naïve bilateral model exhibited the highest power across all sample sizes, whereas the adjusted bilateral model produced intermediate values. However, differences between models diminished as sample size increased, and both bilateral approaches approached 100% power when theoretical sample sizes reached approximately 40–50 participants. This convergence does not necessarily imply that limbs can be treated as independent at higher sample sizes; rather, it reflects that with sufficient power, both models will consistently detect the underlying recovery effect. At smaller sample sizes, however, the naïve bilateral model’s artificially inflated precision increased the probability of detecting a statistically significant effect, highlighting the risk of double-counting limbs in underpowered studies.
Discussion
Outcomes from the two sides of the body are expected to show some degree of similarity because both sides share intrinsic biological and physiological traits of the individual from whom they are measured (e.g., age, sex, and body mass index), and both sides are controlled by the same central motor output.22,23 However, SCI may disrupt this bilateral symmetry, resulting in asymmetries in motor function, sensation, and overall limb performance.25,39,40 In the EMSCI cohort, we observed moderate-to-strong correlations between left and right upper-limb neurophysiological (ulnar CMAP) and motor (C8 MRC) outcomes in both motor complete and motor incomplete participants. Correlations between limbs were stronger among motor complete participants, likely reflecting more symmetrical patterns of neural impairment. In contrast, residual descending input and variable reinnervation in motor incomplete participants may introduce greater asymmetry, resulting in weaker interlimb correlations. Overall, these findings indicate that upper-limb neurophysiological and motor outcomes in SCI are not independent and that performance in one limb is statistically related to that in the other.
To examine how this interlimb dependence affected statistical inference, we used a participant-level bootstrap resampling approach to compare three strategies for analyzing bilateral data. The unilateral model, while simple to implement, consistently produced larger SEs and lower statistical power due to the exclusion of half the available data. This loss of precision is particularly problematic in SCI studies, where recruitment challenges often limit sample sizes. 41 The naïve bilateral model yielded smaller SEs; however, this was achieved artificially by treating correlated limbs as independent, thereby underestimating true variability and increasing the risk of false-positive findings. Improperly modeling correlated data can therefore inflate apparent evidence for an effect, leading to the publication of findings that are driven more by statistical artifact than biological reality. The accumulation of such false-positive findings has been recognized as a contributing factor to the broader reproducibility crisis in research. 42 Our findings illustrate how this risk can be mitigated through appropriate statistical modeling. The adjusted bilateral model, which accounted for correlations between limbs and across time, produced intermediate SEs that more accurately reflected the true amount of independent information, yielding valid and efficient estimates without discarding data. These findings align with statistical expectations and underscore the importance of accounting for intraindividual dependence to ensure valid, interpretable analyses in SCI research.
These findings have important implications for both natural recovery and interventional SCI studies. Our results do not suggest that limb-level outcomes should be ignored; rather, they show that limb-level data must be analyzed with methods that account for the fact that both limbs belong to the same participant. Failing to do so can inflate apparent precision and obscure true biological change. Our results also demonstrate that including both limbs does not double the effective sample size; instead, the added information increases precision only to the extent that the limbs provide independent information. In our example, 20 limbs offered similar precision whether they came from 10 participants in the adjusted bilateral model or from 20 participants in unilateral models (Fig. 3C, D). By appropriately accounting for interlimb dependence, studies can achieve adequate statistical power with fewer participants than would be required under the assumption of independent limbs. 43 Finally, although bilateral outcomes were moderately to strongly correlated, recovery of ulnar CMAPs was not always consistent between limbs (Supplementary Fig. S1). This suggests that the contralateral limb cannot serve as a valid control in SCI studies 22 ; instead, both limbs should be analyzed jointly to account for shared variance while acknowledging side-specific differences in recovery potential and functional use.
LME models offer a flexible framework for handling correlated data, enabling the simultaneous estimation of both population-level effects and individual variability. 36 However, they rely on assumptions that are not easily met by outcomes with limited resolution, such as MRC scores. For this reason, and to simplify interpretation, we did not model bilateral MRC scores; appropriately analyzing such outcomes would require more advanced statistical approaches that account for both the MRC scale’s properties and within-participant dependence. Another limitation is the absence of a “ground truth” benchmark against which to evaluate model accuracy. Although the bootstrap approach provided an empirical estimate of variability and uncertainty across models, it cannot determine which analytic strategy yields the most unbiased or accurate recovery estimates. Future work could use simulated datasets with known parameters to quantify model bias and false-positive error rates, thereby guiding the selection of optimal models for bilateral data. Finally, this study evaluated natural rather than intervention-related recovery, focusing on spontaneous improvements between 12 and 48 weeks postinjury. While the need to account for within-participant correlation also applies to interventional trials, interventional designs often include additional sources of clustering, such as repeated measurements within sessions or multiple follow-up time points. These introduce additional layers of correlation that would require extended mixed-effects or generalized estimating equation frameworks to model appropriately.44,45 More traditional approaches that account for repeated measurements, such as repeated measures analysis of variance, address intraindividual dependence but rely on restrictive assumptions about covariance structure and balanced designs, whereas LME models provide greater flexibility in accommodating missing or unbalanced data. As with bilateral limb data, failure to account for intraindividual temporal dependence can lead to an underestimation of uncertainty and an overestimation of statistical significance.
In summary, this study emphasizes the importance of accurately modeling the statistical dependence between bilateral upper-limb outcomes in SCI research. Because interlimb correlations are present, analyses must account for within-participant clustering to ensure accurate inference, prevent inflated significance, and maximize the utility of available data. Mixed-effects models are well-suited for this purpose, as they explicitly account for the hierarchical structure of bilateral data. Alternative methods, such as generalized estimating equations, may also be appropriate for accommodating within-participant correlation in non-normally distributed or ordinal outcomes. Researchers should carefully consider the degree of bilateral correlation in their sample, the expected symmetry of recovery, and the implications of these factors for study design and statistical analysis. Adopting these practices will strengthen the validity, reliability, and interpretability of future clinical trials and observational studies in SCI rehabilitation.
Transparency, Rigor, and Reproducibility Summary
This study used data from the EMSCI, a registered prospective cohort (ClinicalTrials.gov: NCT01571531) that systematically collects neurological, functional, and neurophysiological assessments after traumatic SCI using standardized protocols across multiple centers. All data were acquired by trained EMSCI personnel and underwent established quality control procedures.
The present work is a retrospective secondary analysis examining how statistical approaches for handling bilateral limb data influence inference in SCI research. No new data were collected, and the analysis was not preregistered. All analytic procedures were prespecified by the lead analyst and are fully documented in the Methods section. Because the study relied on existing EMSCI data, no prospective sample-size calculation was performed. All participants with complete bilateral ulnar CMAP measurements at 12 and 48 weeks were included. Precision, bias, and type I error were evaluated through extensive bootstrap resampling to assess how analytic choices affect statistical inference.
Participant inclusions and exclusions are detailed in the Methods section and were limited to missing or invalid neurophysiological recordings. As a retrospective analysis of de-identified data, blinding during acquisition was not applicable; analyses were conducted using coded identifiers. All analyses were performed in R using established packages (e.g., lme4). Clinical and neurophysiological measures, including ISNCSCI and CMAP assessments, are established and validated standards in SCI research.
The analysis explicitly addressed nonindependence of bilateral limb measurements using random-effects modeling and cluster-aware resampling. Missing data were handled through listwise exclusion for analyses requiring complete bilateral measurements. Effect sizes and confidence intervals are reported throughout. Because this study evaluates methodological considerations rather than multiple independent clinical hypotheses, correction for multiple comparisons was not required.
EMSCI data are available through a controlled-access repository with approval from the EMSCI Steering Committee. Custom analytic code is available from the corresponding author and will be deposited in a public repository upon publication. The authors elect to publish the article under the Journal of Neurotrauma’s Open Access option.
Authors’ Contributions
K.J.M.: Conceptualization, methodology, software, formal analysis, writing—original draft, writing—review and editing, and visualization. J.J.C.: Methodology, formal analysis, and writing—review and editing. J.D.: Conceptualization and writing—review and editing. I.K.F.: Resources, data curation, and writing—review and editing. A.C.: Resources, data curation, and writing—review and editing. M.S.: Resources, data curation, and writing—review and editing. N.W.: Resources, data curation, and writing—review and editing. M.J.B.: Conceptualization, methodology, writing—review and editing, supervision, and project administration.
Footnotes
Author Disclosure Statement
The authors have no competing interests to disclose.
Funding Information
This work was supported by the Office of the Assistant Secretary of Defense for Health Affairs through the Congressionally Directed Medical Research Programs’ Spinal Cord Injury Research Program (SCIRP) under Award No. W81XWH-22-1-0908/-0909 entitled “Expanding Knowledge and Information Delivery around Improving Upper Extremity Function after Cervical Spinal Cord Injury.”
Supplemental Material
Supplemental Material
References
Supplementary Material
Please find the following supplemental material available below.
For Open Access articles published under a Creative Commons License, all supplemental material carries the same license as the article it is associated with.
For non-Open Access articles published, all supplemental material carries a non-exclusive license, and permission requests for re-use of supplemental material or any part of supplemental material shall be sent directly to the copyright owner as specified in the copyright notice associated with the article.
