Joint analysis of longitudinal and recurrent event data: A functional regression approach with autoregressive frailty

Abstract

Recurrent health events often involve complex inter-relationships between longitudinal biomarkers and time-to-event outcomes, further complicated by sparse, irregular data collection and time-dependent correlations among events. Traditional statistical methods frequently struggle with these complexities, resulting in biased estimates and suboptimal modeling performance. To address these challenges, we propose the Functional Regression with AutoregressIve fraiLTY (FRAILTY) method, a novel framework designed to jointly model longitudinal measurements and recurrent events, accommodating both scalar and functional covariates while capturing time-dependent correlations among events. The FRAILTY method employs a two-step estimation procedure. First, functional principal component analysis through conditional expectation (PACE) is applied to extract key temporal features from sparse and irregular longitudinal data. Second, the obtained scores are incorporated into a dynamic recurrent frailty model with an autoregressive structure to account for within-subject correlations across recurrent events. Simulation studies demonstrated that the FRAILTY method outperformed existing methods, such as those relying on B-spline basis functions and Bayesian joint modeling, by achieving lower integrated mean squared errors, higher concordance indices, and greater statistical power in detecting functional parameters. Its practical utility was further validated through applications to two datasets: the Systolic Blood Pressure Intervention Trial study and the Multicenter Collaboration to Study Treatment Outcomes in Nephrolithiasis Evaluation cohort.

Keywords

dynamic prediction; functional principal component analysis; autoregressive frailty model; recurrent events; gap time model; sparse longitudinal data analysis

1. Introduction

Understanding disease progression often involves analyzing two inter-related outcomes: repeated measurements of biomarkers or health indicators (i.e. longitudinal data), and the occurrence of major health events over time (i.e. time-to-event outcomes). Recurrent events, in particular, provide valuable insights by capturing patterns, timing, and dependencies across multiple occurrences. These relationships are essential for predicting and managing conditions such as kidney stone recurrence, cardiovascular events, and cancer progression. Joint modeling approaches are commonly employed to analyze longitudinal biomarkers and recurrent events, typically using latent class joint models¹ or shared frailty models.^2–4 However, these approaches often depend on rigid parametric assumptions that oversimplify real-world data by imposing predefined trajectory structures, such as linear mixed-effects models.^1–3 Such models frequently fail to capture the complex and heterogeneous patterns observed in longitudinal data, especially when those patterns diverge from assumed parametric forms. Consequently, dynamic disease processes are simplified into static metrics, such as current values or slopes, resulting in biased or incomplete inferences.

Functional data analysis (FDA) provides a more flexible, nonparametric alternative by treating longitudinal data as continuous functions over time rather than discrete observations. Unlike linear mixed models, which impose parametric assumptions on trajectories, FDA accommodates complex temporal trends without imposing strong model constraints. In particular, functional principal component analysis (FPCA) derives smooth, low-dimensional representations of individual trajectories, allowing for deeper insights into underlying disease dynamics. Several methods have been developed to adapt FDA for longitudinal data,^5–7 including approaches designed to handle sparse or irregular time observations.^8–12

Beyond longitudinal modeling alone, FDA has been extended to link longitudinal data with single-event outcomes. For example, Yao et al.⁸ proposed a nonparametric approach that jointly models longitudinal trajectories and time-to-event data by using flexible basis functions and incorporating functional principal component (FPC) scores into a Cox regression model to characterize disease progression. Applications of FDA span diverse areas, including high-dimensional gene expression data,¹³ dynamic disease progression prediction,¹⁴ and functional imaging analysis,¹⁵ among others. More recently, Yang et al.¹⁶ introduced a weighted functional linear Cox regression model that employs subject-specific inverse probability of censoring weights. Similarly, Cui et al.¹⁷ developed an additive functional Cox regression model to quantify the association between functional covariates and time-to-event outcomes, applying transformations to better capture the inherent complexity of functional covariates.

While significant progress has been made in applying FDA to time-to-event data, most existing methods have focused on single-event settings, and extensions to recurrent event modeling remain relatively limited. Recurrent events, such as kidney stone recurrence, cardiovascular events, or hospital readmission, offer a more detailed perspective on disease evolution by accounting for both the timing and sequence of occurrences. However, modeling such events poses unique challenges, including time-dependent correlations among occurrences and evolving biomarker-event relationships. Classic recurrent event models, such as the Andersen–Gill model,¹⁸ the Prentice–Williams–Peterson model,¹⁹ the Wei–Lin–Weissfeld model,²⁰ as well as frailty models,²¹ rarely integrate longitudinal biomarker trajectories, limiting their capacity to capture dynamic interactions between biomarkers and events. Although recent work by Hong et al.²² proposed a joint frailty model using FPCA for recurrent and terminal events, their focus was primarily on dynamic prediction, rather than inference on time-varying functional covariate effects.

We propose the Functional Regression with AutoregressIve fraiLTY (FRAILTY) method, which combines FDA with a dynamic frailty framework to model evolving dependencies in recurrent event settings. The FRAILTY method makes several key contributions: First, it extends analysis beyond single-event outcomes to recurrent events by leveraging a dynamic frailty framework with a first-order autoregressive (AR(1)) correlation structure. This approach captures intra-subject correlations among recurrent events, providing a more comprehensive view of their inter-relationships. Second, it provides a robust, nonparametric approach for handling sparse and asynchronous longitudinal data, even when both recurrent events and time-varying covariates are collected intermittently. Third, it incorporates window-specific FPC scores to dynamically link longitudinal patterns to time intervals between recurrent events, enabling a precise characterization of disease progression dynamics. Furthermore, the FRAILTY method enables dynamic prediction of future events by accounting for evolving biomarker-event relationships. Finally, it supports hypothesis testing to assess the effects of functional covariates through a Wald-type test, yielding meaningful inferences about their impact on event occurrences.

This paper is organized as follows. Section 2 presents an overview of FPCA and the formulation of the FRAILTY method for analyzing recurrent events. Section 3 evaluates performance of the FRAILTY method through simulation studies, comparing its effectiveness against existing methods. Section 4 demonstrates its practical utility through applications to two datasets: the Systolic Blood Pressure Intervention Trial study and the Multicenter Collaboration to Study Treatment Outcomes in Nephrolithiasis Evaluation cohort. Finally, Section 5 concludes with a discussion of the findings, their implications, and potential directions for future research.

2. Methodology

2.1. Overview of functional principal component analysis (FPCA)

Consider a sample of $M$ subjects, each associated with an underlying longitudinal covariate $X_{i} (t)$ defined on a compact time domain $T$. In practice, the covariate $X_{i} (t)$ is not observed continuously. Instead, it is measured sparsely and irregularly at subject-specific visit times $t_{i l}$, $i = 1, \dots, M$ and $l = 1, \dots, m_{i}$. The observed measurements are contaminated with random noise: ${\tilde{X}}_{i} (t_{i l}) = X_{i} (t_{i l}) + ϵ_{i l}$, where $ϵ_{i l}$ are independent measurement errors with mean zero and variance $σ_{ϵ}^{2}$. We assume that $\{X_{i} (t) : t \in T\}$ is a square-integrable stochastic process in the Hilbert space $L^{2} (T)$, equipped with the inner product $⟨ f, g ⟩ = \int_{T} f (t) g (t) d t$ for all $f, g \in L^{2} (T)$. Let $μ (t) = E [X_{i} (t)]$ and $Σ (t, t^{'}) = \operatorname{Cov} \{X_{i} (t), X_{i} (t^{'})\}$ denote the mean and covariance functions of the longitudinal covariate process. The mean function $μ (t)$ is estimated by smoothing the pooled noisy observations across all subjects, yielding the estimator $\hat{μ} (t)$. For subject $i$, collect the observed longitudinal measurements into the vector ${\tilde{X}}_{i} = ({\tilde{X}}_{i} (t_{i 1}), \dots, {\tilde{X}}_{i} (t_{i m_{i}}))^{⊤}$, and define the corresponding estimated mean vector ${\hat{μ}}_{i} = (\hat{μ} (t_{i 1}), \dots, \hat{μ} (t_{i m_{i}}))^{⊤}$.

Assuming that the covariance function $Σ (t, t^{'})$ is positive semi-definite, it admits the spectral decomposition $Σ (t, t^{'}) = \sum_{k = 1}^{\infty} λ_{k} ϕ_{k} (t) ϕ_{k} (t^{'})$, where $\{λ_{k}\}$ are non-negative eigenvalues and $\{ϕ_{k}\}$ are orthonormal eigenfunctions in $L^{2} (T)$. By the Karhunen–Loève theorem, the longitudinal covariate process can be represented as $X_{i} (t) = μ (t) + \sum_{k = 1}^{\infty} ξ_{i k} ϕ_{k} (t)$, $t \in T$, where the FPC scores are defined as $ξ_{i k} = ⟨ X_{i} - μ, ϕ_{k} ⟩ = \int_{T} \{X_{i} (t) - μ (t)\} ϕ_{k} (t) d t$ with $E (ξ_{i k}) = 0$ and $\operatorname{Var} (ξ_{i k}) = λ_{k}$.

For densely observed data, the FPC scores $ξ_{i k}$ can be computed via numerical integration. However, when longitudinal data are sparse and irregular, direct estimation is infeasible. To address this challenge, we adopt the principal analysis by conditional expectation (PACE) approach of Yao et al.⁶ Let ${\hat{Σ}}_{{\tilde{X}}_{i}}$ denote the estimated covariance matrix of ${\tilde{X}}_{i}$, with entries ${\hat{Σ}}_{{\tilde{X}}_{i}} (l, l^{'}) = \hat{Σ} (t_{i l}, t_{i l^{'}}) + {\hat{σ}}_{ϵ}^{2} I (l = l^{'})$, $l, l^{'} = 1, \dots, m_{i}$. Under the assumption of joint Gaussianity of latent FPC scores and measurement errors, the PACE estimator of $ξ_{i k}$ is

$${\hat{ξ}}_{i k} = \hat{E} [ξ_{i k} \mid {\tilde{X}}_{i}] = {\hat{λ}}_{k} {\hat{ϕ}}_{i k}^{⊤} {\hat{Σ}}_{{\tilde{X}}_{i}}^{- 1} ({\tilde{X}}_{i} - {\hat{μ}}_{i}),$$

where ${\hat{ϕ}}_{i k} = ({\hat{ϕ}}_{k} (t_{i 1}), \dots, {\hat{ϕ}}_{k} (t_{i m_{i}}))^{⊤}$ is the vector obtained by evaluating the $k$-th estimated FPCA eigenfunction ${\hat{ϕ}}_{k} (\cdot)$ at subject $i$'s visit times. Finally, the predicted trajectory of the longitudinal covariate is approximated by the truncated expansion ${\hat{X}}_{i} (t) = \hat{μ} (t) + \sum_{k = 1}^{K_{x}} {\hat{ξ}}_{i k} {\hat{ϕ}}_{k} (t)$, where $K_{x}$ is the number of retained components, which is selected using a predefined percentage of variance explained (PVE) threshold, the Akaike information criterion (AIC), or the Bayesian information criterion (BIC).
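As a minimal sketch of the PACE score formula above, the following Python code computes the estimated scores for one subject from quantities that are assumed to be available already (eigenvalues, eigenfunctions evaluated at the visit times, mean values, and noise variance). The function names `solve` and `pace_scores` are hypothetical, and the covariance matrix is rebuilt here from the truncated eigen-expansion rather than from a smoothed covariance surface, which is a simplifying assumption.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]  # augmented matrix
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def pace_scores(x_obs, mu_obs, phi_obs, lam, sigma2_eps):
    """PACE-type scores lam_k * phi_ik^T Sigma^{-1} (x - mu) for one subject.

    x_obs, mu_obs : observed values and mean values at the m visit times
    phi_obs       : K lists of length m (eigenfunction k at the visit times)
    lam           : K eigenvalues
    sigma2_eps    : measurement error variance
    """
    m, K = len(x_obs), len(lam)
    # Assumption: Sigma(t, t') is approximated by the truncated expansion.
    Sigma = [[sum(lam[k] * phi_obs[k][a] * phi_obs[k][b] for k in range(K))
              + (sigma2_eps if a == b else 0.0)
              for b in range(m)] for a in range(m)]
    resid = [x - mu for x, mu in zip(x_obs, mu_obs)]
    w = solve(Sigma, resid)  # Sigma^{-1} (x - mu)
    return [lam[k] * sum(phi_obs[k][l] * w[l] for l in range(m))
            for k in range(K)]


# Toy example: one component, two visits, constant eigenfunction values.
scores = pace_scores([1.0, 1.0], [0.0, 0.0], [[1.0, 1.0]], [1.0], 1.0)
```

In this toy example the score is shrunk toward zero relative to the raw observations, which reflects the conditional-expectation (best linear predictor) nature of the estimator when few noisy measurements are available.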

FPCA provides several advantages in modeling longitudinal trajectories. It decomposes a complex stochastic process into orthogonal eigenfunctions weighted by uncorrelated FPC scores, capturing dominant patterns in the data while reducing dimensionality. This basis truncation enables regularization and emphasizes interpretable signal components. Moreover, the nonparametric estimation of the mean and covariance functions supports a data-driven modeling approach that avoids restrictive assumptions.

2.2. FRAILTY: Functional Regression with AutoregressIve fraiLTY method

2.2.1. Notation

In recurrent event settings, a subject may experience the same event multiple times during follow-up. To characterize the association between the recurrent event process and longitudinal covariate trajectories, we formulate the model in terms of functional covariates. For simplicity, we present the method using a single functional covariate; however, the proposed method can accommodate multiple longitudinal covariates (see Section 2.4). For subject $i$, let $T_{i j}$ denote the random time of the $j$-th event, where $j = 1, \dots, n_{i}$, and let $C_{i}$ represent the censoring time. The observed event time, accounting for right censoring, is defined as ${\tilde{T}}_{i j} = \min (T_{i j}, C_{i})$, with event indicator $δ_{i j} = I (T_{i j} \leq C_{i})$. We use lowercase $t_{i j}$ to denote the realized value of the observed time ${\tilde{T}}_{i j}$. Define the gap time for subject $i$ at the $j$-th event as $g_{i j} = t_{i j} - t_{i, j - 1}$, with $t_{i 0} = 0$. Let $n_{i}$ denote the total number of observed event records (including censored records) for subject $i$, and define $N = \sum_{i = 1}^{M} n_{i}$ as the total number of stacked event records across all subjects. The complete observed data for subject $i$ are given by $O_{i} = \{\{{\tilde{T}}_{i j}, δ_{i j}\}_{j = 1, \dots, n_{i}}, Z_{i}, \{{\tilde{X}}_{i} (t_{i l})\}_{l = 1, \dots, m_{i}}\}$, where $Z_{i} \in \mathbb{R}^{p}$ is a $p \times 1$ vector of time-invariant covariates, and ${\tilde{X}}_{i} (t_{i l})$ denotes the observed longitudinal covariate measured at time $t_{i l}$.

2.2.2. Model formulation

We model the recurrent event process using a gap-time proportional hazards framework that incorporates both scalar and functional covariates, together with subject-specific frailty terms. We assume a non-informative censoring mechanism; specifically, conditional on the covariates and frailty terms, the recurrent event process is independent of the censoring time. Extending the correlated frailty framework of Yau et al.,²³ we specify the hazard function for subject $i$ during the $j$-th inter-event interval $[t_{i, j - 1}, t_{i j})$ as follows. Let $t$ denote calendar time. Conditional on the event history up to time $t$, the hazard function is defined as

$$h_{i} (t; Z_{i}, X_{i} (\cdot)) = h_{0} (t - t_{i, j - 1}) \exp \{η_{i j} (t)\}, \quad t \geq t_{i, j - 1}, \tag{1}$$

where $h_{0} (\cdot)$ is an unspecified baseline hazard function defined on the gap-time scale, and $t_{i, j - 1}$ denotes the observed time of the previous event (with $t_{i 0} = 0$). Here, $η_{i j} (t)$ denotes the time-dependent linear predictor, given by $η_{i j} (t) = α^{⊤} Z_{i} + \int_{t_{i, j - 1}}^{t} X_{i} (s) β (s) d s + v_{i j}$. The vector $Z_{i} \in \mathbb{R}^{p}$ contains the time-invariant covariates, with corresponding regression coefficients $α \in \mathbb{R}^{p}$. The functional covariate process $X_{i} (t)$ is defined in Section 2.1, and $β (t)$ is an unknown square-integrable coefficient function representing its time-varying effect on recurrence risk. The integral term $\int_{t_{i, j - 1}}^{t} X_{i} (s) β (s) d s$ captures the cumulative contribution of the longitudinal trajectories over the current inter-event interval, thereby establishing a dynamic association between the longitudinal process and the hazard of recurrence.

The term $v_{i j}$ represents a random frailty capturing unobserved heterogeneity associated with the $j$-th event for subject $i$. Let $v_{i} = (v_{i 1}, v_{i 2}, \dots, v_{i n_{i}})^{⊤}$ collect the frailty terms for subject $i$, and define $V = (v_{1}^{⊤}, v_{2}^{⊤}, \dots, v_{M}^{⊤})^{⊤}$ as the stacked frailty vector across all subjects. We assume $V \sim MVN (0, σ^{2} Σ_{v})$, where $σ^{2}$ represents the variance component and $Σ_{v}$ is an $N \times N$ block-diagonal matrix with $Σ_{v_{i}}$ as its diagonal blocks. To capture serial dependence among recurrent events within a subject, we adopt a first-order autoregressive (AR(1)) structure with correlation parameter $ρ$ such that

$$Σ_{v_{i}} (ρ) = \frac{1}{1 - ρ^{2}} \begin{bmatrix} 1 & ρ & ρ^{2} & \dots & ρ^{n_{i} - 1} \\ ρ & 1 & ρ & \dots & ρ^{n_{i} - 2} \\ ρ^{2} & ρ & 1 & \dots & ρ^{n_{i} - 3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ ρ^{n_{i} - 1} & ρ^{n_{i} - 2} & ρ^{n_{i} - 3} & \dots & 1 \end{bmatrix} .$$

This autoregressive specification induces serial correlation among frailty terms, such that events closer in time exhibit stronger dependencies. This formulation enables the model to capture evolving, within-subject risk patterns for recurrent events and adds flexibility compared to models assuming independent or exchangeable frailty.
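To make the AR(1) structure concrete, the sketch below builds the subject-level matrix $Σ_{v_{i}} (ρ)$ and draws one frailty path from a stationary AR(1) recursion whose covariance is $σ^{2} Σ_{v_{i}} (ρ)$. The function names `ar1_cov` and `simulate_frailty` and the numerical values are hypothetical and chosen only for illustration.

```python
import math
import random


def ar1_cov(n, rho):
    """Return the n x n matrix with entries rho^|a-b| / (1 - rho^2)."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    scale = 1.0 / (1.0 - rho ** 2)
    return [[scale * rho ** abs(a - b) for b in range(n)] for a in range(n)]


def simulate_frailty(n, rho, sigma2, rng):
    """Draw (v_1, ..., v_n) with covariance sigma2 * ar1_cov(n, rho).

    v_1 is drawn from the stationary distribution and subsequent terms
    follow v_j = rho * v_{j-1} + e_j with e_j ~ N(0, sigma2).
    """
    sd = math.sqrt(sigma2)
    v = [rng.gauss(0.0, sd / math.sqrt(1.0 - rho ** 2))]
    for _ in range(1, n):
        v.append(rho * v[-1] + rng.gauss(0.0, sd))
    return v


cov = ar1_cov(3, 0.5)  # hypothetical subject with three event records
path = simulate_frailty(5, 0.5, 0.25, random.Random(1))
```

The off-diagonal entries decay geometrically in the lag between event records, so adjacent records share more of their unobserved risk than records that are far apart in the sequence.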

2.2.3. Likelihood function

Let $ℓ_{joint} (Θ; V)$ denote the joint log-likelihood of the recurrent event process conditional on the frailty vector $V$, corresponding to the hazard model defined in Equation (1). The parameter set is defined as $Θ = \{α, β, σ^{2}, ρ\}$. Combining contributions across all subjects yields the joint log-likelihood conditional on $V$:

$$\begin{aligned} ℓ_{joint} (Θ; V) = {} & \log \Big[\prod_{i = 1}^{M} \prod_{j = 1}^{n_{i}} \big\{h_{0} (t_{i j} - t_{i, j - 1}) \exp \{η_{i j} (t_{i j})\}\big\}^{δ_{i j}} S_{0} (t_{i j} - t_{i, j - 1})^{\exp \{η_{i j} (t_{i j})\}}\Big] \\ & - \frac{1}{2} \Big[N \log (2 π σ^{2}) + \log | Σ_{v} | + \frac{1}{σ^{2}} V^{⊤} Σ_{v}^{- 1} V\Big], \end{aligned} \tag{2}$$

where $S_{0} (\cdot)$ denotes the baseline survival function evaluated at the observed gap time. In Equation (2), the first term represents the conditional log-likelihood of the recurrent event times given the frailty vector $V$, and the second term corresponds to the log density of the multivariate normal frailty distribution with covariance matrix $σ^{2} Σ_{v}$.

To eliminate the unspecified baseline hazard function $h_{0} (\cdot)$, estimation proceeds via a Cox-type partial likelihood. Conditional on the frailty vector $V$, the likelihood contribution of record $(i, j)$ at gap time $g_{i j}$ is proportional to $h_{0} (g_{i j}) \exp \{η_{i j} (t_{i j})\}$. Let $R (g_{i j})$ denote the set of event records that remain under observation and are at risk at gap time $g_{i j}$. Conditioning on exactly one failure occurring at gap time $g_{i j}$, the conditional probability that record $(i, j)$ fails is $\frac{\exp \{η_{i j} (t_{i j})\}}{\sum_{(k, l) \in R (g_{i j})} \exp \{η_{k l} (t_{k l})\}}$. Because the baseline hazard $h_{0} (g_{i j})$ is common to all records in the risk set, it cancels from the numerator and denominator. The resulting likelihood therefore depends only on the linear predictors and is independent of the baseline hazard function. For notational convenience, we stack all recurrent-event records across subjects and index them by $s = 1, \dots, N$. The resulting partial log-likelihood is

$$\begin{aligned} ℓ_{partial} (Θ) = {} & \sum_{s = 1}^{N} δ_{s} \Big[η_{s} - \log \sum_{l \in R (g_{s})} \exp (η_{l})\Big] \\ & - \frac{1}{2} \Big[N \log (2 π σ^{2}) + \log | Σ_{v} | + \frac{1}{σ^{2}} V^{⊤} Σ_{v}^{- 1} V\Big] . \end{aligned} \tag{3}$$

Here, $δ_{s} = 1$ if the $s$-th record corresponds to an observed event and $δ_{s} = 0$ otherwise. The quantity $η_{s}$ denotes the linear predictor evaluated at the observed event time corresponding to record $s$; similarly, $η_{l}$ denotes the linear predictor at the observed event time for a generic record $l$ in the risk set $R (g_{s})$. The risk set $R (g_{s})$ consists of all event records that remain under observation and are at risk at the same gap time as record $s$. Thus, the partial likelihood in Equation (3) is formulated at the event-record level and is independent of the baseline hazard function.
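The event part of Equation (3), that is, its first sum, can be sketched as follows for stacked records. This is only an illustration under two stated assumptions: the risk set $R (g_{s})$ is taken to be all records whose gap time is at least $g_{s}$, and the Gaussian frailty penalty (the second bracket) is omitted. The function name `partial_loglik_events` is hypothetical.

```python
import math


def partial_loglik_events(gaps, delta, eta):
    """Sum over observed events of eta_s - log(sum_{l in R(g_s)} exp(eta_l)).

    gaps  : gap time of each stacked record
    delta : 1 if the record ends in an observed event, 0 if censored
    eta   : linear predictor of each record (frailty term included)
    """
    n = len(gaps)
    total = 0.0
    for s in range(n):
        if delta[s] == 1:
            # Assumed risk set: records still at risk at gap time g_s.
            denom = sum(math.exp(eta[l]) for l in range(n)
                        if gaps[l] >= gaps[s])
            total += eta[s] - math.log(denom)
    return total


# Three records with equal linear predictors; the last one is censored.
value = partial_loglik_events([1.0, 2.0, 3.0], [1, 1, 0], [0.0, 0.0, 0.0])
```

With equal linear predictors, each observed event contributes minus the log of its risk-set size, so the example evaluates to $- \log 3 - \log 2 = - \log 6$.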

2.3. Estimation and testing for functional parameters

2.3.1. Two-step estimation procedure for the FRAILTY method

The proposed FRAILTY method employs a two-step estimation strategy. In the first step, FPCA is applied to (sparse) longitudinal data to extract principal components and compute event-specific scores. In the second step, a recurrent event model with dynamic frailty is fitted using the partial likelihood derived in Section 2.2.3. Below, we detail each step of the procedure, which is summarized in Algorithm 1.

Step 1: Functional principal component analysis for (sparse) longitudinal data. Building on the work of Yao et al.,⁶ we apply FPCA to extract dominant modes of variation from sparse and irregular longitudinal data. For sparse functional data, where subjects have few observations, the PACE algorithm is used under the assumption of jointly Gaussian measurement errors and latent components. Without loss of generality, we assume that the functional covariate $X_{i} (t)$ has been mean-centered so that $μ (t) = 0$. As described in Section 2.1, the covariance function of $X_{i} (t)$ admits the spectral decomposition $Σ (t, t^{'}) = \sum_{k = 1}^{\infty} λ_{k} ϕ_{k} (t) ϕ_{k} (t^{'})$, where $\{ϕ_{k}\}$ are orthonormal eigenfunctions in $L^{2} (T)$. The longitudinal process is approximated by the truncated Karhunen–Loève expansion

$$X_{i} (t) \approx \sum_{k = 1}^{K_{x}} ξ_{i k} ϕ_{k} (t), \quad ξ_{i k} = ⟨ X_{i}, ϕ_{k} ⟩,$$

where $ξ_{i k}$ are the FPC scores estimated using the PACE approach.

We assume that the coefficient function $β (t)$ in the linear predictor lies in the same truncated subspace spanned by the leading FPCA eigenfunctions $\{ϕ_{k}\}_{k = 1}^{K_{x}}$. Since these eigenfunctions form an orthonormal basis for this subspace of $L^{2} (T)$, any function in the subspace admits a representation in terms of this basis. Accordingly, the coefficient function $β (t)$ can be approximated via orthogonal projection onto this subspace as

$$β (t) \approx \sum_{k = 1}^{K_{x}} β_{k} ϕ_{k} (t), \quad β_{k} = ⟨ β, ϕ_{k} ⟩ .$$

Let $β = (β_{1}, \dots, β_{K_{x}})^{⊤}$ denote the corresponding finite-dimensional parameter vector. The coefficients $β_{k}$ represent the projection coordinates of $β (\cdot)$ in the FPCA eigenbasis and should be distinguished from the subject-specific FPC scores $ξ_{i k}$ of the covariate process.

Substituting the truncated expansions of $X_{i} (\cdot)$ and $β (\cdot)$ into the cumulative functional effect over the $j$-th gap interval yields

$$\int_{t_{i, j - 1}}^{t} X_{i} (s) β (s) d s \approx \sum_{k = 1}^{K_{x}} \sum_{ℓ = 1}^{K_{x}} ξ_{i k} β_{ℓ} \int_{t_{i, j - 1}}^{t} ϕ_{k} (s) ϕ_{ℓ} (s) d s .$$

Define the $K_{x} \times K_{x}$ matrix $G_{i j}$ with entries $[G_{i j}]_{k ℓ} = \int_{t_{i, j - 1}}^{t_{i j}} ϕ_{k} (s) ϕ_{ℓ} (s) d s$, $k, ℓ = 1, \dots, K_{x}$, and let $ξ_{i} = (ξ_{i 1}, \dots, ξ_{i K_{x}})^{⊤}$ denote the subject-level FPC score vector. Let the window-specific score vector be $ξ_{i j} = G_{i j}^{⊤} ξ_{i}$. Then the cumulative functional effect over the $j$-th inter-event interval can be written as

$$\int_{t_{i, j - 1}}^{t_{i j}} X_{i} (s) β (s) d s \approx ξ_{i}^{⊤} G_{i j} β = ξ_{i j}^{⊤} β .$$

These window-specific scores serve as covariates in the recurrent event model.
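The construction of the window-specific scores can be sketched numerically as follows. The code approximates the matrix $G_{i j}$ with a trapezoidal rule and then forms $G_{i j}^{⊤} ξ_{i}$. The function names `gram_matrix` and `window_scores`, the grid size, and the two example eigenfunctions on $[0, 1]$ are assumptions made for illustration and are not taken from the paper.

```python
import math


def gram_matrix(phis, a, b, n_grid=2001):
    """Trapezoidal approximation of [G]_{kl} = int_a^b phi_k(s) phi_l(s) ds."""
    h = (b - a) / (n_grid - 1)
    grid = [a + h * r for r in range(n_grid)]
    vals = [[phi(s) for s in grid] for phi in phis]
    K = len(phis)
    G = [[0.0] * K for _ in range(K)]
    for k in range(K):
        for l in range(k, K):
            prod = [vals[k][r] * vals[l][r] for r in range(n_grid)]
            integral = h * (sum(prod) - 0.5 * (prod[0] + prod[-1]))
            G[k][l] = G[l][k] = integral  # G is symmetric
    return G


def window_scores(G, xi):
    """Return G^T xi, the window-specific score vector."""
    K = len(xi)
    return [sum(G[k][l] * xi[k] for k in range(K)) for l in range(K)]


# Hypothetical orthonormal eigenfunctions on [0, 1].
phis = [lambda s: 1.0, lambda s: math.sqrt(2.0) * math.cos(math.pi * s)]
xi_i = [0.8, -0.3]                      # hypothetical subject-level scores

G_full = gram_matrix(phis, 0.0, 1.0)    # whole domain: close to identity
G_half = gram_matrix(phis, 0.0, 0.5)    # one inter-event window
xi_window = window_scores(G_half, xi_i)
```

Over the whole domain the matrix is approximately the identity because the eigenfunctions are orthonormal, so the window scores reduce to the original scores. Over a shorter window the eigenfunctions are no longer orthogonal, which is why the full matrix rather than only its diagonal is needed.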

Step 2: Dynamic functional autoregressive frailty model for recurrent events. We integrate the functional covariate scores obtained in Step 1 with a dynamic frailty model to account for the recurrent nature of event data. Estimation proceeds via a restricted maximum likelihood (REML) approach, utilizing a log-likelihood based on the Best Linear Unbiased Predictor (BLUP) within a generalized linear mixed model (GLMM) framework.^24,25

Let $Θ = (α, β, σ^{2}, ρ)$ denote the complete parameter set, and let $β^{*} = (α, β)$ represent the regression parameters. The Newton–Raphson algorithm is used to iteratively update parameter estimates:

$$\begin{bmatrix} {\hat{β}}^{* (m + 1)} \\ {\hat{V}}^{(m + 1)} \end{bmatrix} = \begin{bmatrix} {\hat{β}}^{* (m)} \\ {\hat{V}}^{(m)} \end{bmatrix} + (ℓ_{Θ}^{″})^{- 1} \begin{bmatrix} [Z \; Ξ]^{⊤} \\ R^{⊤} \end{bmatrix} ℓ_{η}^{'} - (ℓ_{Θ}^{″})^{- 1} \begin{bmatrix} 0 \\ {\hat{σ}}^{- 2 (m)} {\hat{Σ}}_{v}^{- 1} {\hat{V}}^{(m)} \end{bmatrix},$$

where ${\hat{β}}^{*}$ and $\hat{V}$ are the estimates of $β^{*}$ and $V$, $Z$ is the design matrix for scalar covariates, $Ξ$ is the matrix of window-specific FPC scores, and $R$ is the design matrix for the frailty terms. The term $ℓ_{Θ}^{″}$ refers to the negative Hessian of $ℓ_{partial}$ with respect to $Θ$, while $ℓ_{η}^{'}$ and $ℓ_{η}^{″}$ are the first and second derivatives of $ℓ_{partial}$ with respect to $η$, where

$$ℓ_{Θ}^{″} = - \frac{\partial^{2} ℓ_{partial}}{\partial Θ \partial Θ^{⊤}} = \begin{bmatrix} [Z \; Ξ]^{⊤} ℓ_{η}^{″} [Z \; Ξ] & [Z \; Ξ]^{⊤} ℓ_{η}^{″} R \\ R^{⊤} ℓ_{η}^{″} [Z \; Ξ] & R^{⊤} ℓ_{η}^{″} R + {\hat{σ}}^{- 2} {\hat{Σ}}_{v}^{- 1} \end{bmatrix} \quad \text{and} \quad (ℓ_{Θ}^{″})^{- 1} = \begin{bmatrix} H_{β} & H_{β, v} \\ H_{v, β} & H_{v} \end{bmatrix} .$$

Further derivations are provided in Appendix A of the Supplementary Material. The baseline survival function is estimated on the gap-time scale using the Breslow estimator. Let $g_{(1)} < \dots < g_{(K)}$ denote the ordered distinct observed gap times across all stacked event records. Let $d_{k}$ denote the number of failures at gap time $g_{(k)}$, and let $R (g_{(k)})$ denote the set of stacked event records that remain under observation and are at risk at gap time $g_{(k)}$. The estimator is

$${\hat{S}}_{0} (u) = \exp \Big\{- \sum_{g_{(k)} \leq u} \frac{d_{k}}{\sum_{l \in R (g_{(k)})} \exp ({\hat{η}}_{l})}\Big\},$$

where ${\hat{η}}_{l}$ denotes the estimated linear predictor for record $l$, and $u$ denotes a generic gap time. The proposed FRAILTY method has been implemented in the R package “FunSurv”, which is available at https://github.com/zifangkong/FunSurv.
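For illustration only, and independently of the FunSurv package, the Breslow-type estimator of the baseline survival function on the gap-time scale can be sketched in Python as follows. The function name `breslow_survival` is hypothetical, and the risk set at $g_{(k)}$ is assumed to contain every stacked record whose gap time is at least $g_{(k)}$.

```python
import math


def breslow_survival(gaps, delta, eta_hat, u):
    """Baseline survival S0(u) = exp(-sum_{g_(k) <= u} d_k / risk_sum_k).

    gaps    : gap time of each stacked record
    delta   : 1 for an observed event, 0 for a censored record
    eta_hat : estimated linear predictor of each record
    u       : gap time at which the baseline survival is evaluated
    """
    # Only gap times with at least one failure contribute (d_k > 0).
    event_gaps = sorted({g for g, d in zip(gaps, delta) if d == 1})
    cum_hazard = 0.0
    for gk in event_gaps:
        if gk > u:
            break
        d_k = sum(1 for g, d in zip(gaps, delta) if d == 1 and g == gk)
        # Assumed risk set: records with gap time >= g_(k).
        risk_sum = sum(math.exp(e) for g, e in zip(gaps, eta_hat) if g >= gk)
        cum_hazard += d_k / risk_sum
    return math.exp(-cum_hazard)


# Three records with zero linear predictors; the last one is censored.
s0_at_2 = breslow_survival([1.0, 2.0, 3.0], [1, 1, 0], [0.0, 0.0, 0.0], 2.0)
```

In the example, the cumulative baseline hazard at $u = 2$ is $1/3 + 1/2 = 5/6$, so the estimated baseline survival is $\exp (- 5/6)$.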

2.3.2. Hypothesis testing for functional parameters

To assess the association between the functional covariates and the recurrent event outcome, we consider the following null and alternative hypotheses

$$H_{0} : β (t) = 0 \quad \text{vs.} \quad H_{A} : β (t) \neq 0,$$

where the null hypothesis states that the functional covariate $X (t)$ has no effect on the hazard function. The coefficient function $β (t)$ is assumed to be identifiable within the eigenspace of $X (t)$. Since $β (t)$ is an infinite-dimensional parameter, its estimation relies on a finite-dimensional approximation obtained via FPCA. As described in Section 2.3.1, we approximate

$$β (t) \approx \sum_{k = 1}^{K_{x}} β_{k} ϕ_{k} (t) .$$

This leads to the equivalent finite-dimensional hypothesis:

$$H_{0} : β_{1} = β_{2} = \dots = β_{K_{x}} = 0 \quad \text{vs.} \quad H_{A} : β_{j} \neq 0 \text{ for at least one } j, \; 1 \leq j \leq K_{x} .$$

To test these hypotheses, we use a Wald-type statistic

$$T_{W} = {\hat{β}}^{⊤} \{V (\hat{β})\}^{- 1} \hat{β},$$

where $\hat{β}$ is the partial likelihood estimate of $β = (β_{1}, \dots, β_{K_{x}})^{⊤}$, and $V (\hat{β}) = H_{β}$ is its estimated covariance matrix from the corresponding block of the inverse Hessian matrix $(ℓ_{Θ}^{″})^{- 1}$. Under $H_{0}$, the test statistic $T_{W}$ asymptotically follows a $χ^{2}$ distribution with $K_{x}$ degrees of freedom.
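A minimal sketch of the Wald-type test is given below. It computes $T_{W}$ from a coefficient vector and its covariance matrix and obtains the $χ^{2}$ tail probability from the standard recurrence for the regularized upper incomplete gamma function with integer degrees of freedom. The helper names (`solve`, `wald_statistic`, `chi2_sf`) and the numerical inputs are hypothetical and serve only as an example.

```python
import math


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[r]] for r, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def wald_statistic(beta_hat, cov_beta):
    """T_W = beta_hat^T cov_beta^{-1} beta_hat."""
    w = solve(cov_beta, beta_hat)
    return sum(b * x for b, x in zip(beta_hat, w))


def chi2_sf(x, df):
    """Upper tail P(chi2_df > x) for a positive integer df.

    Uses Q(a + 1, y) = Q(a, y) + y^a exp(-y) / Gamma(a + 1) with y = x / 2,
    starting from Q(1, y) = exp(-y) or Q(1/2, y) = erfc(sqrt(y)).
    """
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


# Hypothetical estimates with K_x = 2 retained components.
t_w = wald_statistic([0.4, -0.2], [[0.04, 0.0], [0.0, 0.01]])
p_value = chi2_sf(t_w, df=2)
```

In this example the covariance matrix is diagonal, so $T_{W}$ is the sum of two squared z-scores, $4 + 4 = 8$, and the tail probability with two degrees of freedom is $\exp (- 4)$.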

As the sample size $M$ increases, selecting a larger $K_{x}$ provides a better approximation of the functional space and reduces bias. However, an excessively large $K_{x}$ introduces high variability in parameter estimates, which can distort the asymptotic distribution of the test statistic. This testing procedure, along with its properties, has been extensively studied in the context of functional linear regression.²⁶ The choice of the number of FPCs can also influence the statistical power of the Wald-type test. Prior studies^26,27 have shown that the Wald test statistic retains its asymptotic $χ^{2}$ distribution with $K_{x}$ degrees of freedom as long as $K_{x}$ grows at a controlled rate, meaning there exists an upper bound on its rate of divergence. If $K_{x}$ increases too rapidly, the problem becomes high-dimensional, and the standard asymptotic results may no longer hold. Further exploration of statistical testing in the context of functional Cox regression may yield valuable insights, particularly on optimal FPC selection to enhance test performance.

2.4. mFRAILTY: Extension to multivariate functional covariates

In many clinical scenarios, multiple functional processes may simultaneously influence the time-to-event outcome, such as several longitudinal biomarkers measured over time. These functional covariates are often correlated, and ignoring such dependence may lead to suboptimal modeling and loss of efficiency. To accommodate this setting, we extend the FRAILTY framework to incorporate multivariate functional covariates. Let $X_{i} (t) = (X_{i}^{(1)} (t), \dots, X_{i}^{(Q)} (t))^{⊤}$, $t \in T$, denote a $Q$-dimensional vector of square-integrable stochastic processes for subject $i$. Correspondingly, let $β^{(q)} (t)$ denote the coefficient function associated with $X_{i}^{(q)} (t)$. Extending the linear predictor defined in Equation (1), the cumulative functional effect over the $j$-th inter-event interval $[t_{i, j - 1}, t]$ becomes $\sum_{q = 1}^{Q} \int_{t_{i, j - 1}}^{t} X_{i}^{(q)} (s) β^{(q)} (s) d s$. The hazard function retains the same structure as that in Section 2.2.2, with the linear predictor $η_{i j} (t) = α^{⊤} Z_{i} + \sum_{q = 1}^{Q} \int_{t_{i, j - 1}}^{t} X_{i}^{(q)} (s) β^{(q)} (s) d s + v_{i j}$, where $v_{i j}$ follows the AR(1) frailty structure defined in Section 2.2.2. To model dependence among multiple functional covariates, we employ multivariate FPCA (MFPCA) as proposed by Happ et al.,²⁸ which captures the joint variability while accounting for correlations among functional covariates. Unlike applying univariate FPCA separately to each process, MFPCA models dependence through the joint covariance structure of FPC scores. The mFRAILTY procedure involves the following steps:

Apply univariate FPCA to each functional covariate $X_{i}^{(q)} (t)$ . For each $q = 1, \dots, Q$ , let $μ_{q} (t) = E {X_{i}^{(q)} (t)}$ denote the mean function of the $q$ -th functional covariate. Following the FPCA/PACE procedure described in Section 2.1, we estimate the mean function by pooling noisy observation across subjects and smoothing over time, yielding the estimator ${\hat{μ}}_{q} (t)$ . The longitudinal trajectories are centered using ${\hat{μ}}_{q} (t)$ prior to decomposition. We then estimate the covariance function of $X_{i}^{(q)} (t)$ , perform eigen-decomposition, and obtain the estimated eigenfunctions ${{\hat{ϕ}}_{k}^{(q)} (t)}_{k = 1}^{K_{q}}$ and subject-specific FPC scores ${\hat{ξ}}_{i}^{(q)} = ({\hat{ξ}}_{i 1}^{(q)}, \dots, {\hat{ξ}}_{i K_{q}}^{(q)})^{⊤}, i = 1, \dots, M .$ Here, $K_{q}$ is the number of retained FPC components for the $q$ -th functional covariate.

Combine the estimated FPC scores across all functional covariates into a single matrix. Let $K_{+} = \sum_{q = 1}^{Q} K_{q}$ be the total number of retained univariate FPC components. Define the stacked score vector ${\hat{ξ}}_{i}^{⊤} = ({\hat{ξ}}_{i}^{(1) ⊤}, \dots, {\hat{ξ}}_{i}^{(Q) ⊤}) \in R^{K_{+}}$ , where ${\hat{ξ}}_{i}^{(q)} = ({\hat{ξ}}_{i 1}^{(q)}, \dots, {\hat{ξ}}_{i K_{q}}^{(q)})^{⊤}$ . Construct the $(M \times K_{+})$ matrix $D = ({\hat{ξ}}_{1}^{⊤}, \dots, {\hat{ξ}}_{M}^{⊤})^{⊤}$ . Compute the $(K_{+} \times K_{+})$ covariance matrix $H = (M - 1)^{- 1} D^{⊤} D$ , which captures the joint variability across all functional covariates.

Perform eigen-decomposition of $H = \hat{C} \hat{Λ} {\hat{C}}^{⊤}$ , where $\hat{Λ} = diag ({\hat{λ}}_{1}, \dots, {\hat{λ}}_{K_{+}})$ and $\hat{C} = [{\hat{c}}_{1}, \dots, {\hat{c}}_{K +}]$ has orthonormal columns. Partition each eigenvector into covariate-specific blocks: ${\hat{c}}_{m} = ({\hat{c}}_{m}^{(1) ⊤}, \dots, {\hat{c}}_{m}^{(Q) ⊤})^{⊤}$ , ${\hat{c}}_{m}^{(q)} \in R^{K_{q}}$ . The truncation point $K_{m}$ , representing the number of retained multivariate FPC components, is selected via cross-validation or PVE.

Use eigenvectors ${\hat{c}}_{m}$ to calculate: (i) for $m = 1, \dots, K_{m}$ , the estimated multivariate eigenfunctions as ${\hat{ϕ}}_{m}^{* (q)} (t) = \sum_{k = 1}^{K_{q}} {\hat{c}}_{m, k}^{(q)} {\hat{ϕ}}_{k}^{(q)} (t)$ ; and (ii) the estimated multivariate FPC scores ${\hat{ξ}}_{i m}^{*} = \sum_{q = 1}^{Q} \sum_{k = 1}^{K_{q}} {\hat{c}}_{m, k}^{(q)} {\hat{ξ}}_{i k}^{(q)}$ . The longitudinal trajectory of the $q$ -th covariate is approximated as $X_{i}^{(q)} (t) \approx {\hat{μ}}_{q} (t) + \sum_{m = 1}^{K_{m}} {\hat{ξ}}_{i m}^{*} {\hat{ϕ}}_{m}^{* (q)} (t)$ .

For each inter-event interval $[t_{i, j - 1}, t_{i j}]$ , define the $K_{m} \times K_{m}$ matrix $[G_{i j}^{*}]_{m l} = \int_{t_{i, j - 1}}^{t_{i j}} {\hat{ϕ}}_{m}^{*} (s) {\hat{ϕ}}_{l}^{*} (s) d s,$ $m, l = 1, \dots, K_{m}$ . Let ${\hat{ξ}}_{i}^{*} = ({\hat{ξ}}_{i 1}^{*}, \dots, {\hat{ξ}}_{i K_{m}}^{*})^{⊤}$ denote the subject-level multivariate FPC score vector. The window-specific multivariate score vector for record $(i, j)$ is then defined as ${\hat{ξ}}_{i j}^{*} = G_{i j}^{* ⊤} {\hat{ξ}}_{i}^{*} .$ These window-specific multivariate scores serve as time-dependent covariates in the recurrent event model.
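The score-combination steps above (stacking the univariate scores, forming $H$ , eigen-decomposing it, and projecting) can be illustrated with a short NumPy sketch. This is a minimal illustration rather than the authors' implementation: the function names, the `pve` argument, and the use of a PVE rule to pick $K_{m}$ are assumptions made for the example, and the univariate scores are taken as already estimated and centered.

```python
import numpy as np


def multivariate_fpc_scores(score_blocks, pve=0.90):
    """Combine univariate FPC scores into multivariate FPC scores.

    score_blocks: list of Q arrays; the q-th has shape (M, K_q) and holds the
    estimated univariate FPC scores of the q-th functional covariate.
    pve: hypothetical threshold used here to pick the truncation point K_m.
    """
    D = np.hstack(score_blocks)                  # (M, K_plus) stacked score matrix
    M = D.shape[0]
    H = D.T @ D / (M - 1)                        # (K_plus, K_plus) covariance of scores

    eigvals, eigvecs = np.linalg.eigh(H)         # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]            # reorder to descending
    eigvals = np.clip(eigvals[order], 0.0, None)
    eigvecs = eigvecs[:, order]

    cum_pve = np.cumsum(eigvals) / eigvals.sum()
    K_m = min(int(np.searchsorted(cum_pve, pve)) + 1, len(eigvals))

    C = eigvecs[:, :K_m]                         # columns are c_1, ..., c_{K_m}
    scores = D @ C                               # (M, K_m) multivariate FPC scores
    sizes = [block.shape[1] for block in score_blocks]
    c_blocks = np.split(C, np.cumsum(sizes)[:-1], axis=0)  # c_m^{(q)} for each covariate
    return scores, c_blocks, eigvals[:K_m]


def multivariate_eigenfunctions(c_block, phi_q):
    """Map univariate eigenfunctions phi_q, shape (K_q, T), to shape (K_m, T)."""
    return c_block.T @ phi_q
```

Because the columns of $\hat{C}$ are orthonormal eigenvectors of $H$ , the sample covariance of the returned scores is diagonal with the retained eigenvalues on the diagonal, which gives a quick sanity check on any implementation of these steps.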

3. Simulation

3.1. Data-generating mechanisms

To evaluate the performance of the proposed FRAILTY method, we conducted a simulation study following the setup described in Goldsmith et al.²⁹ We generated a single observed functional covariate ${\tilde{X}}_{i} (t)$ as ${\tilde{X}}_{i} (t) = X_{i} (t) + ϵ_{i}$ , where $ϵ_{i} \sim N (0, σ_{ϵ}^{2})$ with measurement error variance $σ_{ϵ} = 0.5$ . The underlying true covariate was constructed as $X_{i} (t) = u_{i 0} + u_{i 1} t + \sum_{k = 1}^{10} \{u_{i k 1} \sin (\frac{π k}{3} t) + u_{i k 2} \cos (\frac{π k}{3} t)\}$ , where $u_{i 0}, u_{i 1} \sim N (0, 0.2)$ and $u_{i k 1}, u_{i k 2} \sim N (0, 1 / k^{2})$ . We examined four scenarios for the functional coefficient $β (t)$ to reflect a range of dynamic relationships between the functional covariate and the outcome:

(i) Constant effect: $β (t) = 0.3$ , representing a time-invariant relationship.
(ii) Smooth periodic oscillation effect: $β (t) = \frac{1}{2} \{\sin (\frac{π}{3} t) + \cos (\frac{π}{3} t)\}$ , modeling a smooth oscillatory effect over time.
(iii) Cyclic variation: $β (t) = \frac{1}{4} \{\sin (4 t) / 5 - \cos (4 t) + 1\}$ , simulating periodic fluctuations.
(iv) Irregular fluctuations: $β (t) = - 0.1 p (t | 0.5, 0.1) + 0.4 p (t | 1.5, 0.2) + 0.2 p (t | 2.5, 0.3)$ , where $p (t | μ, σ)$ denotes the normal density with mean $μ$ and standard deviation $σ$ , representing localized time-dependent effects.
These scenarios encompass a spectrum of dynamic behaviors, from simple constant effects to more complex time-dependent patterns. In addition to the functional covariate, we incorporated a binary scalar covariate drawn from a Bernoulli distribution with a success probability of $0.5$ and coefficient of $α = - 3$ . Recurrent event data were simulated from the frailty model in Equation (1), allowing each subject to experience up to six recurrent events. Gap times between successive events followed a Weibull baseline hazard with scale = $1$ and shape = $2$ . Non-informative censoring times were independently drawn from a uniform $U (0, 3)$ .
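The covariate-generating mechanism and the four coefficient functions can be written down as a short NumPy sketch. It is illustrative only and rests on several assumptions that the text does not settle: the second arguments of $N (0, 0.2)$ and $N (0, 1 / k^{2})$ are read as variances, $σ_{ϵ} = 0.5$ is read as a standard deviation, noise is drawn independently at each grid point, and the time grid is taken as $[0, 3]$ to match the censoring window. The names `BETAS` and `simulate_design` are hypothetical, and the recurrent event times from the frailty model in Equation (1) are not simulated here.

```python
import numpy as np


def normal_pdf(t, mu, sigma):
    """Normal density p(t | mu, sigma) with standard deviation sigma."""
    return np.exp(-0.5 * ((t - mu) / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


# The four functional-coefficient scenarios (i)-(iv).
BETAS = {
    "constant": lambda t: np.full_like(t, 0.3, dtype=float),
    "oscillation": lambda t: 0.5 * (np.sin(np.pi * t / 3) + np.cos(np.pi * t / 3)),
    "cyclic": lambda t: 0.25 * (np.sin(4 * t) / 5 - np.cos(4 * t) + 1),
    "irregular": lambda t: (-0.1 * normal_pdf(t, 0.5, 0.1)
                            + 0.4 * normal_pdf(t, 1.5, 0.2)
                            + 0.2 * normal_pdf(t, 2.5, 0.3)),
}


def simulate_design(M, grid, rng, sd_eps=0.5):
    """Draw true and noisy functional covariates, a binary covariate, and censoring times."""
    u0 = rng.normal(0.0, np.sqrt(0.2), size=(M, 1))      # assumption: 0.2 is a variance
    u1 = rng.normal(0.0, np.sqrt(0.2), size=(M, 1))
    X = u0 + u1 * grid                                    # (M, T) by broadcasting
    for k in range(1, 11):
        uk1 = rng.normal(0.0, 1.0 / k, size=(M, 1))       # variance 1 / k^2
        uk2 = rng.normal(0.0, 1.0 / k, size=(M, 1))
        X = X + uk1 * np.sin(np.pi * k * grid / 3) + uk2 * np.cos(np.pi * k * grid / 3)
    X_obs = X + rng.normal(0.0, sd_eps, size=X.shape)     # assumption: pointwise noise
    Z = rng.binomial(1, 0.5, size=M)                      # Bernoulli(0.5) scalar covariate
    censor = rng.uniform(0.0, 3.0, size=M)                # U(0, 3) censoring times
    return X, X_obs, Z, censor
```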

Simulations were conducted for sample sizes $M \in \{100, 300, 500\}$ , with the number of retained functional principal components ( $K_{x}$ ) determined based on PVE. Each configuration was evaluated using $1, 000$ independently generated datasets. The proposed FRAILTY method, which utilizes an eigenbasis with $90 %$ and $95 %$ PVE for constructing $β (t)$ , was compared against two existing approaches. The first approach was the cubic B-spline basis model, constructed from piecewise cubic polynomials with knot selection via AIC.³⁰ The second approach was the Bayesian joint model (JMbayes), implemented via the R package “JMbayes2”,³¹ which assumes dependence between longitudinal biomarkers and recurrent events through shared random effects. In this approach, the longitudinal trajectory is modeled using a mixed-effects regression, while the recurrent event process is governed by a proportional hazards model, with parameter estimation performed under a Bayesian framework.

Finally, to assess statistical power of FRAILTY under the cyclic variation scenario, we tested $H_{0} : β (t) = 0$ versus $H_{A} : β (t) \neq 0$ using the Wald-type test statistic $T_{W}$ . Sample size was fixed at $M = 300$ , and the effect size was varied as $β (t) = \frac{1}{4} κ \{\frac{\sin (4 t)}{5} - \cos (4 t) + 1\}$ , $κ \geq 0$ , where $κ$ is a scaling constant that determines the magnitude of departure from the null.

3.2. Model evaluation

We assessed the performance of the proposed FRAILTY method in estimating the functional coefficient $β (t)$ using three key metrics as described by Yang et al.:¹⁶ (i) Integrated bias (denoted as BIAS $^{2}$ ): $\int_{T} \{E [\hat{β} (t)] - β_{0} (t)\}^{2} d t$ , which quantifies the squared systematic deviation of the estimator’s expectation from the true coefficient over time; (ii) Integrated variance (denoted as VAR): $\int_{T} E \{\hat{β} (t) - E [\hat{β} (t)]\}^{2} d t$ , which measures the average variability of the estimator $\hat{β} (t)$ around its expected value; and (iii) Integrated mean squared error (denoted as MSE): $\int_{T} E \{\hat{β} (t) - β_{0} (t)\}^{2} d t$ , which combines both bias and variance to capture the total estimation error. For the scalar parameter $α$ , analogous metrics were computed, including squared bias $\{E (\hat{α}) - α_{0}\}^{2}$ , variance $E \{\hat{α} - E (\hat{α})\}^{2}$ , and mean squared error $E (\hat{α} - α_{0})^{2}$ .
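The three integrated metrics can be approximated from simulation replicates by replacing expectations with averages over replicates and integrals with a trapezoidal rule. The sketch below assumes each replicate's estimate $\hat{β} (t)$ is evaluated on a common time grid; the function names are hypothetical.

```python
import numpy as np


def trapezoid(y, x):
    """Trapezoidal rule along the last axis of y over grid x."""
    return np.sum((y[..., 1:] + y[..., :-1]) * np.diff(x) / 2.0, axis=-1)


def integrated_metrics(beta_hat, beta_true, grid):
    """Monte Carlo versions of integrated BIAS^2, VAR, and MSE.

    beta_hat: (R, T) array, one estimated coefficient curve per replicate.
    beta_true: (T,) true coefficient on the same grid.
    """
    mean_hat = beta_hat.mean(axis=0)                               # estimate of E[beta_hat(t)]
    bias2 = trapezoid((mean_hat - beta_true) ** 2, grid)
    var = trapezoid(((beta_hat - mean_hat) ** 2).mean(axis=0), grid)
    mse = trapezoid(((beta_hat - beta_true) ** 2).mean(axis=0), grid)
    return bias2, var, mse
```

With expectations estimated this way, the identity MSE = BIAS $^{2}$ + VAR holds exactly at every grid point and therefore for the integrated quantities, which is a convenient check on tabulated results.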

We further evaluated the model’s discriminative ability using the time-dependent concordance index (C-index) of Kim et al.,³² designed for recurrent event data. It is defined as

$$C (t) = \Pr \{M_{i} (t) > M_{j} (t) \mid N_{i} (t) > N_{j} (t)\}, \quad t \in [0, T],$$

where $M_{i} (t)$ is the predicted risk score and $N_{i} (t)$ is the observed number of recurrent events for subject $i$ up to time $t$ . This measure reflects the probability that a subject with more observed events has a higher predicted risk, thus capturing the model’s ability to correctly rank individuals over time.
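At a fixed time $t$ , the definition above suggests a simple plug-in estimate: among all ordered pairs in which subject $i$ has more observed events than subject $j$ , count the fraction in which $i$ also has the higher predicted risk. The sketch below implements only this naive empirical proportion. It ignores censoring and any weighting used by the estimator of Kim et al., so it should be read as an illustration of the definition rather than as that estimator; the function name is hypothetical.

```python
def concordance_index(risk, counts):
    """Empirical Pr{M_i > M_j | N_i > N_j} at a fixed time point.

    risk: predicted risk scores M_i(t), one per subject.
    counts: observed numbers of recurrent events N_i(t), one per subject.
    """
    concordant = 0
    comparable = 0
    n = len(risk)
    for i in range(n):
        for j in range(n):
            if counts[i] > counts[j]:          # pair is comparable
                comparable += 1
                if risk[i] > risk[j]:          # ranking agrees with event counts
                    concordant += 1
    if comparable == 0:
        return float("nan")                    # no subject has more events than another
    return concordant / comparable
```

For example, with risk scores `[0.2, 0.8, 0.5]` and event counts `[0, 1, 2]` there are three comparable pairs, two of which are ranked correctly, giving a value of 2/3.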

3.3. Simulation results

Table 1 presents the performance of the proposed FRAILTY method using an eigenbasis (with 90% and 95% PVE), the B-spline basis method, and the JMbayes method in estimating $β (t)$ and $α$ across $1, 000$ simulated datasets. The evaluation metrics include integrated bias (BIAS $^{2}$ ), integrated variance (VAR), and integrated mean squared error (MSE) to assess the trade-offs between bias, variance, and overall estimation error. In the first scenario, where $β (t)$ remains constant, JMbayes consistently yields the smallest MSE across all sample sizes, with minimal bias and variance. This occurs because JMbayes estimates $β (t)$ as a single point, implicitly assuming a constant effect, which results in stable and precise estimates. The FRAILTY method at $90 %$ PVE performs competitively, with a moderate MSE, while the $95 %$ PVE setting slightly reduces bias but increases variance, leading to a modest rise in MSE. The B-spline basis method performs the worst, yielding the highest MSE across all sample sizes, particularly in smaller samples where its variance is substantial. In the second scenario, where $β (t)$ exhibits a smooth periodic oscillatory pattern over time, the performance differences among methods become more pronounced. JMbayes achieves the smallest variance but suffers from the highest bias, producing the largest MSE for $M = 300$ and $500$ (at $M = 100$ , only the B-spline basis method has a larger MSE). This outcome is expected because the single-point estimate from JMbayes cannot capture the time-varying, oscillatory behavior of $β (t)$ . In contrast, the FRAILTY method at $90 %$ PVE achieves the best trade-off between bias and variance, resulting in the lowest MSE. The FRAILTY method with $95 %$ PVE exhibits slightly higher bias and variance, leading to a moderate increase in MSE. The B-spline basis method again suffers from high variance, especially in smaller samples, which reduces its reliability relative to FRAILTY.

Table 1.
Summary of $1, 000$ simulated datasets comparing four methods for estimating $β (t)$ and $α$ : the FRAILTY method with an eigenbasis at $90 %$ and $95 %$ PVE, the B-spline basis method, and the JMbayes method.

			Scenario 1: Constant effect			Scenario 2: Oscillation effect			Scenario 3: Cyclic variation			Scenario 4: Irregular fluctuations		
n	Method	Parameter	BIAS $^{2}$	VAR	MSE	BIAS $^{2}$	VAR	MSE	BIAS $^{2}$	VAR	MSE	BIAS $^{2}$	VAR	MSE
100	FRAILTY 90% PVE	$β (t)$	0.064	0.115	0.179	0.060	0.047	0.107	0.119	0.361	0.480	0.131	0.329	0.460
	FRAILTY 90% PVE	$α$	0.013	0.121	0.134	0.002	0.127	0.129	0.009	0.121	0.130	0.053	0.203	0.256
	FRAILTY 95% PVE	$β (t)$	0.030	0.479	0.509	0.061	0.139	0.200	0.076	0.565	0.641	0.120	0.538	0.658
	FRAILTY 95% PVE	$α$	0.020	0.125	0.145	0.005	0.131	0.136	0.013	0.123	0.135	0.061	0.206	0.268
	B-spline basis	$β (t)$	0.205	33.956	34.161	0.108	1.508	1.616	0.149	13.427	13.576	0.527	12.763	13.290
	B-spline basis	$α$	0.029	0.127	0.156	0.019	0.141	0.160	0.035	0.130	0.165	0.100	0.218	0.318
	JMbayes	$β (t)$	0.056	0.030	0.086	0.504	0.033	0.537	0.138	0.031	0.168	0.234	0.033	0.260
	JMbayes	$α$	0.046	0.117	0.163	0.243	0.098	0.342	0.078	0.108	0.187	0.082	0.171	0.253
300	FRAILTY 90% PVE	$β (t)$	0.069	0.037	0.106	0.069	0.014	0.083	0.135	0.096	0.231	0.138	0.087	0.225
	FRAILTY 90% PVE	$α$	0.003	0.038	0.041	0.001	0.037	0.037	0.000	0.037	0.038	0.016	0.056	0.073
	FRAILTY 95% PVE	$β (t)$	0.029	0.141	0.170	0.076	0.035	0.111	0.089	0.212	0.301	0.122	0.174	0.296
	FRAILTY 95% PVE	$α$	0.004	0.038	0.042	0.000	0.037	0.037	0.001	0.037	0.038	0.018	0.057	0.076
	B-spline basis	$β (t)$	0.042	6.953	6.995	0.119	0.390	0.509	0.037	3.804	3.841	0.363	3.303	3.666
	B-spline basis	$α$	0.004	0.038	0.041	0.000	0.037	0.037	0.003	0.038	0.041	0.025	0.059	0.083
	JMbayes	$β (t)$	0.062	0.009	0.070	0.509	0.009	0.518	0.142	0.009	0.150	0.236	0.012	0.248
	JMbayes	$α$	0.103	0.032	0.136	0.337	0.025	0.362	0.145	0.031	0.176	0.176	0.048	0.225
500	FRAILTY 90% PVE	$β (t)$	0.068	0.021	0.089	0.070	0.008	0.078	0.135	0.062	0.197	0.137	0.047	0.184
	FRAILTY 90% PVE	$α$	0.002	0.022	0.023	0.002	0.022	0.024	0.000	0.021	0.021	0.011	0.031	0.042
	FRAILTY 95% PVE	$β (t)$	0.022	0.087	0.109	0.081	0.021	0.102	0.084	0.132	0.216	0.122	0.099	0.221
	FRAILTY 95% PVE	$α$	0.002	0.022	0.024	0.001	0.022	0.023	0.000	0.021	0.021	0.012	0.031	0.043
	B-spline basis	$β (t)$	0.065	3.679	3.744	0.129	0.217	0.346	0.033	1.813	1.846	0.210	1.808	2.018
	B-spline basis	$α$	0.001	0.021	0.023	0.001	0.022	0.023	0.001	0.021	0.022	0.015	0.031	0.046
	JMbayes	$β (t)$	0.059	0.005	0.064	0.504	0.006	0.510	0.139	0.005	0.144	0.232	0.008	0.240
	JMbayes	$α$	0.119	0.017	0.136	0.366	0.015	0.380	0.161	0.017	0.177	0.194	0.026	0.220

Performance was evaluated using the integrated bias, the integrated variance, and the integrated mean squared error across four functional parameter scenarios: Scenario 1: $β (t) = 0.3$ , Scenario 2: $β (t) = \frac{1}{2} \{\sin (\frac{π}{3} t) + \cos (\frac{π}{3} t)\}$ , Scenario 3: $β (t) = \frac{1}{4} \{\sin (4 t) / 5 - \cos (4 t) + 1\}$ , and Scenario 4: $β (t) = - 0.1 p (t | 0.5, 0.1) + 0.4 p (t | 1.5, 0.2) + 0.2 p (t | 2.5, 0.3)$ . Sample sizes of $M = 100, 300$ , and $500$ were considered.

In the third scenario, where $β (t)$ follows a regular cyclic pattern, the FRAILTY method at $90 %$ PVE demonstrates the best overall performance, with a better bias-variance trade-off and lower MSE than the $95 %$ PVE version, which has higher variance. The B-spline basis method, which is inherently suited to capturing cyclic effects, achieves the smallest bias for larger sample sizes ( $M = 300$ and $500$ ). However, its higher variance reduces overall stability, especially for smaller samples. As in Scenario 2, JMbayes fails to capture the cyclic behavior, resulting in large bias despite low variance. These findings underscore the value of more flexible approaches when $β (t)$ exhibits periodic fluctuations. In the fourth scenario, where $β (t)$ displays irregular fluctuations, the FRAILTY method at $90 %$ PVE again provides the best balance between bias and variance, yielding the lowest MSE. While the FRAILTY method at $95 %$ PVE achieves slightly lower bias, the gain is offset by a noticeable increase in variance, leading to a slightly higher MSE. The B-spline basis method remains unreliable due to its excessive variance, while JMbayes, although maintaining lower variance, consistently fails to capture irregular temporal variation, resulting in high bias.

The FRAILTY method performs well across all scenarios but achieves relatively better performance in the second scenario. This can be attributed to the fact that the true $β (t)$ is constructed using the same basis functions as the functional covariate $X (t)$ , aligning with FRAILTY’s underlying assumption that $β (t)$ and $X (t)$ lie in the same functional space. Even in scenarios where this assumption is not met, the FRAILTY method maintains good performance, producing stable estimates with a favorable bias-variance trade-off.

Table 1 also reports results for estimating the scalar parameter $α$ . The FRAILTY method at both $90 %$ and $95 %$ PVE consistently outperforms the other methods, achieving the smallest MSE in all scenarios and sample sizes. The B-spline basis method tends to have slightly higher bias and variance, leading to higher MSE. In contrast, JMbayes performs the worst in Scenarios 2, 3, and 4, where it exhibits substantially higher bias and MSE even as the sample size increases. The difference in performance between the FRAILTY method at $90 %$ and $95 %$ PVE is minimal, suggesting that increasing the PVE threshold beyond $90 %$ offers little practical benefit in reducing estimation error.

In summary, the FRAILTY method, particularly when using the $90 %$ PVE threshold, consistently outperforms the B-spline basis method across all scenarios. It achieves superior accuracy by maintaining an effective balance between bias and variance, resulting in lower MSE for both simple and complex functional forms, especially as sample sizes increase. In contrast, the B-spline basis method struggles in settings with complex or irregular fluctuations, where it exhibits high variance and instability. The JMbayes method performs well when $β (t)$ is constant but lacks the flexibility to capture time-varying effects, leading to substantial bias in dynamic scenarios. Overall, these findings highlight the robustness of the FRAILTY method for estimating functional parameters under diverse modeling conditions. As shown in Tables S1-S4 of Appendix B of the Supplementary Material, selecting two or three components, which correspond to $90 %$ to $97 %$ of the total variability, consistently optimizes model performance by minimizing both bias and MSE.

Figure 1 displays the C-index values, evaluated at a fixed time point $t = 2.5$ , for the four methods across different sample sizes and functional coefficient structures. The FRAILTY method with an eigenbasis at the $90 %$ PVE threshold achieves slightly higher C-index values than its $95 %$ PVE counterpart, with both variants consistently outperforming the B-spline basis method in terms of predictive discrimination and stability. In contrast, the JMbayes method yields lower C-index values than the other methods, indicating its limited capability to capture the true discriminative risk scores in these settings. As the sample size increases, the variability in the estimated C-index values decreases, reflecting improved stability of the estimates. Across functional coefficient structures, the C-index remains relatively stable, although a modest increase is observed under the proposed method as the complexity of $β (t)$ increases. Overall, these findings highlight the robustness of the FRAILTY method and its capacity to deliver reliable risk predictions across diverse modeling scenarios.

Figure 1.

Simulation results comparing the C-index at time point $t = 2.5$ for the four methods: the FRAILTY method at $90 %$ PVE, the FRAILTY method at $95 %$ PVE, the B-spline basis method, and the JMbayes method. These methods were evaluated under four different true underlying functional coefficients scenarios: constant effect, smooth periodic oscillation effect, cyclic variation, and irregular fluctuations, with varying sample sizes.

Figure 2 illustrates the results of the statistical power analysis for three methods: the FRAILTY method at $90 %$ PVE, the FRAILTY method at $95 %$ PVE, and the B-spline basis method, evaluated under varying effect sizes. In the left panel, all three methods exhibit increased statistical power as the effect size $κ$ increases. Among them, the FRAILTY method at $90 %$ PVE consistently attains the highest power, followed closely by the FRAILTY method at $95 %$ PVE. Although the B-spline basis method also demonstrates effectiveness, its power remains generally lower than that of the FRAILTY method. At $κ = 1$ , all three methods exceed $80 %$ power, indicating adequate sensitivity to detect moderate effect sizes. The right panel of Figure 2 examines the relationship between the number of retained FPCs and statistical power within the proposed FRAILTY method. The first one through five FPCs cumulatively explain $53.0 %$ , $91.2 %$ , $96.8 %$ , $98.8 %$ , and $99.7 %$ of the total variability, respectively. The graph shows that adding more FPCs does not necessarily enhance power. The highest power across different $κ$ values is achieved when using only two FPCs, which account for $91.2 %$ of the total variability. Beyond this point, power declines – particularly for larger $κ$ values. For example, when the top five FPCs (explaining $99.7 %$ of the total variability) are included, power decreases for $κ$ values greater than $1$ . These findings highlight the importance of parsimonious modeling: selecting an optimal number of FPCs improves detection of both scale and functional effects, whereas including excessive components may lead to overfitting and reduce efficiency.

Figure 2.

Simulation results for statistical power analysis with a sample size of $M = 300$ under the true functional coefficient $β (t) = \frac{1}{4} κ \{\sin (4 t) / 5 - \cos (4 t) + 1\}$ , with $κ$ varying from $0$ to $2$ . The left panel compares statistical power across three methods, including the FRAILTY method at $90 %$ PVE, the FRAILTY method at $95 %$ PVE, and the B-spline basis method. The right panel compares statistical power for the FRAILTY method using different numbers of functional principal components (npc; from $1$ to $5$ ), with the corresponding PVE of $53.0 %$ , $91.2 %$ , $96.8 %$ , $98.8 %$ , and $99.7 %$ .
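The PVE rule used to fix the number of retained components can be made concrete with a small helper that returns the smallest number of leading components whose cumulative PVE reaches a threshold. This is a sketch of the generic rule, with a hypothetical function name, and not necessarily the exact selection code used in the study.

```python
def n_components_for_pve(cumulative_pve, threshold):
    """Smallest number of leading FPCs whose cumulative PVE reaches the threshold.

    cumulative_pve: non-decreasing sequence of cumulative proportions in [0, 1].
    threshold: target proportion of variance explained, e.g. 0.90 or 0.95.
    """
    for k, pve in enumerate(cumulative_pve, start=1):
        if pve >= threshold:
            return k
    raise ValueError("threshold not reached by the supplied components")


# Cumulative PVE values reported for the power analysis (npc = 1, ..., 5).
simulation_pve = [0.530, 0.912, 0.968, 0.988, 0.997]
```

With these values, a 90% threshold retains two components and a 95% threshold retains three.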

4. Data application

4.1. Application to the systolic blood pressure intervention trial (SPRINT) study

The SPRINT study is a multicenter randomized controlled trial designed to determine whether intensive systolic blood pressure (SBP) management reduces the risk of cardiovascular disease (CVD) more effectively than the standard treatment.³³ Specifically, the trial compared an intensive SBP target of $< 120$ mmHg with the standard target of $< 140$ mmHg. A total of $9, 361$ participants aged $50$ years or older, with baseline SBP of $\geq 130$ mmHg and evidence of CVD or elevated CVD risk, were enrolled. The primary outcome was a composite time-to-event measure, including myocardial infarction (MI; confirmed by electrocardiogram or hospitalization), stroke, heart failure, non-MI acute coronary syndrome, or CVD-related death.³³ Longitudinal SBP measurements were collected on a structured schedule – at baseline, 1 month, 3 months, and every 3 months thereafter. This structured data collection produced relatively uniform and regularly spaced longitudinal measurements, enabling consistent tracking of SBP changes over time. Although high SBP is a well-established CVD risk factor,^34,35 modeling its dynamic association with CVD risk remains challenging.

From the original cohort, $1, 874$ participants with a baseline history of CVD and complete SBP measurements were selected for analysis. Figure 3(a) shows the distribution of recurrent CVD events, while Figure 3(b) depicts SBP trajectories for a random sample of $100$ participants. Each line represents an individual’s SBP measurements over time, with the vertical axis indicating SBP (mmHg) and the horizontal axis representing time in months. These trajectories reveal substantial heterogeneity in SBP patterns, highlighting individualized SBP fluctuations. Figure 3(c) illustrates the first five FPCs, capturing the primary modes of variation in SBP trajectories. The $y$ -axis represents the magnitude of each component, while the $x$ -axis represents time in months. The first component (FPC1) accounts for $78 %$ of the total variability in SBP trajectories. Subsequent components account for additional variability, with cumulative explained variance reaching $87 %$ , $91 %$ , $94 %$ , and $96 %$ for the second, third, fourth, and fifth components, respectively.

Figure 3.

(a) Visualization of individual patient data, illustrating follow-up time and frequency of recurrent events throughout the SPRINT study. (b) Raw systolic blood pressure (mmHg) trajectories for $100$ participants who experienced recurrent CVD events in the SPRINT study. (c) Illustration of the leading five FPCs, capturing the primary modes of variation in systolic blood pressure values. (d) Estimation of $β (t)$ , which quantifies the time-varying effect of SBP on CVD risk, while accounting for age and treatment assignment. The analysis compares the FRAILTY method with an eigenbasis at $90 %$ and $95 %$ PVE to the B-spline basis method. The shaded regions represent the 95% pointwise confidence bands.

Figure 3(d) illustrates the estimated time-varying effect of SBP on CVD risk, adjusted for age and treatment assignment. The estimated $β (t)$ fluctuates around $0$ throughout follow-up, indicating temporal variability in the association between SBP and CVD risk. During the first $15$ months, higher SBP is associated with increased CVD risk. Beyond this period, the association diminishes, likely reflecting the cumulative effect of sustained antihypertensive treatment. As treatment stabilizes SBP over time, the risk attributable to elevated SBP appears to decline, leading to a reduction in the adjusted hazard ratio (HR). The B-spline basis method yields $β (t)$ estimates distinct from the FRAILTY method, yet both approaches capture similar overall trends. At month $30$ , the C-index values for the FRAILTY method using the top three and top five FPCs are $0.971$ and $0.956$ , respectively – both exceeding the C-index of $0.944$ for the B-spline basis model. These results indicate superior predictive performance of the proposed FRAILTY method.

4.2. Application to the multicenter collaboration to study treatment outcomes in nephrolithiasis evaluation (MSTONE) cohort

The MSTONE study is a retrospective cohort study that compiles data from four tertiary care stone centers in four cities of the United States (Dallas, TX; Madison, WI; Iowa City, IA; and Nashville, TN). A thorough chart review identified $396$ patients with a minimum follow-up of three months. The dataset encompasses a broad range of clinical information, including patient demographics, laboratory results, prescribed medications, imaging studies, and longitudinal clinical measurements. Additionally, multiple 24-hour urine collections were conducted to evaluate metabolic abnormalities. Prior analyses of 24-hour urine collections have revealed intricate links between urinary composition and kidney stone risk. As reported by Ferraro et al.,³⁶ elevated urinary calcium, oxalate, and sodium levels increase the likelihood of stone formation, whereas higher urine volume, uric acid, citrate, potassium, and magnesium are protective. Moreover, higher urine pH has been linked to an increased risk of stone recurrence.³⁷ However, analyzing MSTONE data is challenging due to the recurrent nature of kidney stone events and the irregular timing of 24-hour urine collections. Traditional models such as Cox regression struggle with event dependence, intra-subject correlations, and data sparsity, limiting their utility.³⁷

The proposed FRAILTY method was applied to predict the risk of stone recurrence using irregular and sparse longitudinal urine pH data, and its results were compared to those obtained using the B-spline basis method. Figure 4(a) visualizes the distribution of recurrent kidney stone events among all subjects, while Figure 4(b) presents individual urine pH trajectories. Each line represents a patient’s urine pH measurements over time, revealing substantial heterogeneity. Some patients maintain relatively stable urine pH values, whereas others display pronounced fluctuations. The irregular spacing of points reflects the sporadic nature of data collection, often due to missed clinical visits. Figure 4(c) illustrates the first two FPCs derived from the urine pH data, which capture the dominant modes of variation in these trajectories. The first component explains $76 %$ of the total variability, representing overall shifts in urine pH over time. When combined with the second component, the explained variance reaches $98 %$ , with the second component capturing wave-like periodic fluctuations. These components together reflect both baseline levels and dynamic changes in urine pH trajectories.

Figure 4.

(a) Visualization of individual patient data, displaying follow-up time and the frequency of recurrent stone events throughout the MSTONE study. (b) Raw urine pH trajectories for $396$ patients from the MSTONE study. (c) Illustration of the leading two FPCs, which represent the primary modes of variation in urine pH values. (d) Estimation of $β (t)$ to evaluate the effect of urine pH on stone recurrence, adjusted for age through three methods: the FRAILTY method with an eigenbasis, applied with the first FPC alone and the first two FPCs combined, as well as the B-spline basis method. The shaded regions represent the 95% pointwise confidence bands.

Figure 4(d) presents the estimated $β (t)$ , which quantifies the effect of urine pH on kidney stone recurrence over time after adjusting for age. Three models were compared: the FRAILTY method with one FPC (explaining $76 %$ of the total variability), the FRAILTY method with two FPCs (explaining $98 %$ of the total variability), and the B-spline basis method. Results show that as $β (t)$ increases over time, urine pH becomes a stronger predictor of recurrence risk. For example, if a stone event occurred at year 2, the FRAILTY method with one FPC estimates the integral of $\hat{β} (t)$ from year 2 to year 5 as $0.35$ , corresponding to an adjusted HR of $\exp (0.35) = 1.42$ . This implies that, at year 5, patients with higher urine pH have a greater likelihood of stone recurrence than those with lower urine pH, after controlling for age. We note that the FRAILTY method, especially the one-FPC model, provides smooth and stable estimates, indicating a consistent association between urine pH and stone recurrence risk. The two-FPC model captures additional variation while preserving a similar trend. By contrast, the B-spline basis method yields more pronounced fluctuations in $β (t)$ estimates, which could signify potential localized changes in the association between urine pH and stone recurrence but could also indicate overfitting to noise. In terms of predictive performance, the FRAILTY method achieved C-index values of $0.888$ for the one-FPC model and $0.889$ for the two-FPC model, both exceeding that of the B-spline basis method ( $0.876$ ). Overall, the FRAILTY method, especially with one FPC, demonstrates higher predictive accuracy for modeling the time-varying effect of urine pH on stone recurrence. The application of mFRAILTY is presented in Appendix C of the Supplementary Material.

5. Discussion

This paper introduces the FRAILTY method, an approach designed to address the challenges of jointly analyzing longitudinal and recurrent event data. By leveraging FPCA and incorporating an AR(1) frailty term, the proposed FRAILTY method accommodates the complexities of time-varying effects, sparse and irregular measurements, and within-subject correlations across recurrent events. Its flexibility to integrate scalar and functional covariates, coupled with dynamic prediction capabilities, offers an alternative for analyzing recurrent event data. Across a wide range of simulation scenarios, the FRAILTY method consistently outperforms the B-spline basis method and the JMbayes method when $β (t)$ is non-constant. In particular, the FRAILTY method with an eigenbasis at $90 %$ PVE provides an optimal balance between bias and variance, yielding the lowest MSE and the highest C-index values. These results validate the method’s ability to adapt to a wide range of functional parameter settings, from simple constant effects to complex cyclic and irregular patterns.

Applied to the SPRINT and MSTONE datasets, the FRAILTY model provided clinically relevant insights. In the SPRINT study, it revealed temporal changes in the effect of SBP on CVD risk, accounting for the cumulative influence of antihypertensive treatment – insights not available from traditional Cox regression. In the MSTONE cohort, the FRAILTY method captured the dynamic association between longitudinal urine pH and stone recurrence risk, generating smooth and interpretable hazard ratio estimates. The multivariate extension, mFRAILTY, which jointly incorporates urine citrate and uric acid, further demonstrated scalability and the ability to uncover nuanced predictor interactions.

Despite these strengths, the FRAILTY method has limitations that warrant further exploration. First, the current two-step estimation procedure, while computationally efficient, may introduce bias due to the separation of longitudinal and survival processes. Specifically, misspecification in the longitudinal model can propagate to the survival component, compromising association and prediction accuracy. Future work could explore fully integrated, one-step estimation approaches to mitigate this risk and improve parameter estimation.

Second, the assumption of an AR(1) frailty structure may not adequately represent more complex or nonlinear correlation structures among recurrent events. Higher-order autoregressive terms or time-varying frailty structures could better account for dependencies among recurrent events in scenarios with intricate temporal dynamics. Incorporating such extensions could significantly enhance the model’s flexibility and applicability across a broader range of real-world datasets.
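To make the AR(1) assumption concrete, the following minimal sketch (not the authors' implementation) builds the implied covariance of a subject's event-specific frailties and simulates one frailty path. It assumes a stationary Gaussian AR(1) process parameterized by a marginal variance `sigma2` and a lag-one correlation `rho`; the function names and this parameterization are illustrative and may differ from the paper's definition.

```python
import numpy as np


def ar1_frailty_cov(n_events: int, rho: float, sigma2: float) -> np.ndarray:
    """Covariance of a stationary AR(1) sequence: sigma2 * rho**|j - k|."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    idx = np.arange(n_events)
    lags = np.abs(idx[:, None] - idx[None, :])  # matrix of |j - k|
    return sigma2 * rho ** lags


def simulate_ar1_frailty(n_events: int, rho: float, sigma2: float,
                         rng: np.random.Generator) -> np.ndarray:
    """Draw one path: b_1 ~ N(0, sigma2), b_j = rho * b_{j-1} + e_j."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    # The innovation variance is chosen so that every b_j keeps variance sigma2.
    innov_sd = np.sqrt(sigma2 * (1.0 - rho ** 2))
    b = np.empty(n_events)
    b[0] = rng.normal(0.0, np.sqrt(sigma2))
    for j in range(1, n_events):
        b[j] = rho * b[j - 1] + rng.normal(0.0, innov_sd)
    return b


# Hypothetical values, chosen only for illustration.
cov = ar1_frailty_cov(n_events=4, rho=0.5, sigma2=1.0)
path = simulate_ar1_frailty(n_events=4, rho=0.5, sigma2=1.0,
                            rng=np.random.default_rng(2024))
print(cov)
print(path)
```

Under this structure the correlation between the frailties of events $j$ and $k$ decays geometrically as `rho ** abs(j - k)`. This single-parameter decay is the restriction that higher-order or time-varying frailty structures would relax.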

Third, selecting the optimal number of FPCs remains a practical challenge. Although retaining the components that explain most of the variance in $X (t)$ is standard practice, the leading components need not be those most relevant to $β (t)$. Criteria such as AIC or PVE provide practical guidance, but they may overlook subtle yet important patterns in higher-order components, leading to an incomplete representation of $β (t)$. Adaptive methods such as functional partial least squares or penalized regression could be explored to refine the selection and balance model parsimony against completeness.
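The PVE rule discussed here can be sketched as follows. This is an illustrative helper rather than the authors' code: the function name `n_components_for_pve` and the example eigenvalues are hypothetical, and the eigenvalues are assumed to come from an already estimated covariance of $X (t)$.

```python
import numpy as np


def n_components_for_pve(eigenvalues, threshold: float) -> int:
    """Smallest number of leading components whose cumulative proportion
    of variance explained (PVE) reaches `threshold`."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # descending
    if lam.size == 0 or np.any(lam < 0) or lam.sum() <= 0:
        raise ValueError("eigenvalues must be non-negative with a positive sum")
    if not 0.0 < threshold <= 1.0:
        raise ValueError("threshold must be in (0, 1]")
    pve = np.cumsum(lam) / lam.sum()
    # First index at which the cumulative PVE is >= threshold, as a count.
    k = int(np.searchsorted(pve, threshold, side="left")) + 1
    return min(k, lam.size)  # guard against rounding when threshold is 1


# Hypothetical eigenvalues, chosen only for illustration.
eigenvalues = [6.0, 2.0, 1.2, 0.5, 0.3]
print(n_components_for_pve(eigenvalues, 0.90))
print(n_components_for_pve(eigenvalues, 0.95))
```

Because the rule looks only at the variance of $X (t)$, raising the threshold from 90% to 95% adds components whether or not they carry signal for $β (t)$. This is the gap that outcome-aware selection methods aim to close.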

Additionally, both model formulation and statistical estimation require exact event times to be observed. However, in practical applications, event times may only be available within intervals between clinical visits, posing challenges related to interval censoring. Another concern is the treatment of study withdrawals as censoring. Dropout-related missingness is often non-ignorable, as it may depend on both covariates and outcomes, violating the assumption of independent censoring. A particularly important source of dropout is terminal events, such as death, which frequently occur in longitudinal studies and introduce additional complexity to recurrent event modeling. Incorporating terminal events into joint models with longitudinal functional data could improve time-to-event modeling, providing a more comprehensive understanding of how functional covariates relate to clinical outcomes.

In conclusion, the FRAILTY method’s integration of FPCA and dynamic frailty modeling provides a robust and versatile framework for analyzing longitudinal and recurrent event data. Its demonstrated effectiveness in both simulation and real-world applications underscores its promise as a valuable tool for advancing research and clinical practice in health-related fields.

Supplemental Material

Supplemental material, sj-pdf-1-smm-10.1177_09622802261457281 for Joint analysis of longitudinal and recurrent event data: A functional regression approach with autoregressive frailty by Zifang Kong, Sy Han Chiou, Naim M Maalouf and Yu-Lun Liu in Statistical Methods in Medical Research

Footnotes

Acknowledgments

The authors sincerely thank the Editor and the anonymous reviewers for their thoughtful evaluation and constructive comments, which have helped improve the clarity of this work. They are also grateful to Dr. Daniel F. Heitjan and Dr. Chul Moon for their valuable insights and helpful feedback. In addition, they thank Dr. Brett Johnson for generously providing access to the MSTONE dataset. This manuscript was prepared using SPRINT Research Materials obtained from the NHLBI Biologic Specimen and Data Repository Information Coordinating Center (BioLINCC). The content of this publication does not necessarily reflect the opinions or views of the SPRINT Research Group or the NHLBI.

ORCID iD

Yu-Lun Liu

Author's Note

Zifang Kong is now affiliated with the Mathematics Department, Milwaukee School of Engineering, Milwaukee, WI, USA.

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Institutes of Health—NIDDK grant: R01DK128237 (ZK, NM, and YL).

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this paper.

Supplemental material

Supplemental material for this article is available online.

References

1. Han, Slate, Peña. Parametric latent class joint model for a longitudinal biomarker and recurrent events. Stat Med 2007; 26: 5285–5302.
2. Liu, Huang. Joint analysis of correlated repeated measures and recurrent events processes in the presence of death, with application to a study on acquired immune deficiency syndrome. J R Stat Soc Ser C: Appl Stat 2009; 58: 65–81.
3. Kim, Zeng, Chambless, et al. Joint models of longitudinal data and recurrent events with informative terminal event. Stat Biosci 2012; 4: 262–281.
4. Cai, Wang, Chan KCG. Joint modeling of longitudinal, recurrent events and failure time data for survivor’s population. Biometrics 2017; 73: 1150–1160.
5. Ramsay, Silverman. Functional Data Analysis. 2nd ed. New York: Springer, 2005, pp.147–172.
6. Yao, Müller, Wang. Functional linear regression analysis for longitudinal data. Ann Stat 2005; 33: 2873–2903.
7. Greven, Crainiceanu, Caffo, et al. Longitudinal functional principal component analysis. In: Recent advances in functional data analysis and related topics, 2005, pp.149–154. Springer.
8. Yao. Functional principal component analysis for longitudinal and survival data. Stat Sin 2007; 17: 965–983.
9. Hall, Müller, Wang. Properties of principal component methods for functional and longitudinal data analysis. Ann Stat 2006; 34: 1493–1517.
10. Hall, Müller, Yao. Modelling sparse generalized longitudinal observations with latent Gaussian processes. J R Stat Soc Ser B: Stat Methodol 2008; 70: 703–723.
11. Xiao, Luo. Fast covariance estimation for multivariate sparse functional data. Stat 2020; 9: e245.
12. Boente, Salibián-Barrera. Robust functional principal components for sparse longitudinal data. Metron 2021; 79: 159–188.
13. Chen, Müller, et al. Stringing high-dimensional data for functional analysis. J Am Stat Assoc 2011; 106: 275–284.
14. Yan, Lin, Huang. Dynamic prediction of disease progression for leukemia patients by functional principal component analysis of longitudinal expression levels of an oncogene. Ann Appl Stat 2017; 11: 1649–1670.
15. Kong, Ibrahim, Lee, et al. FLCRM: Functional linear Cox regression model. Biometrics 2018; 74: 109–117.
16. Yang, Zhu, Ahn, et al. Weighted functional linear Cox regression model. Stat Methods Med Res 2021; 30: 1917–1931.
17. Cui, Crainiceanu, Leroux. Additive functional Cox model. J Comput Graph Stat 2021; 30: 780–793.
18. Andersen, Gill. Cox’s regression model for counting processes: a large sample study. Ann Stat 1982; 10: 1100–1120.
19. Prentice, Williams, Peterson. On the regression analysis of multivariate failure time data. Biometrika 1981; 68: 373–379.
20. Wei, Lin, Weissfeld. Regression analysis of multivariate incomplete failure time data by modeling marginal distributions. J Am Stat Assoc 1989; 84: 1065–1073.
21. Therneau, Grambsch. The Cox model. New York, NY: Springer New York, 2000.
22. Hong, Song, et al. Dynamic prediction of disease processes based on recurrent history and functional principal component analysis of longitudinal biomarkers: application for ovarian epithelial cancer. Stat Med 2021; 40: 2006–2023.
23. Yau, McGilchrist. ML and REML estimation in survival analysis with time dependent correlated frailty. Stat Med 1998; 17: 1201–1213.
24. McGilchrist. Estimation in generalized mixed models. J R Stat Soc Ser B: Stat Methodol 1994; 56: 61–69.
25. McGilchrist, Yau. The derivation of BLUP, ML, REML estimation methods for generalised linear mixed models. Commun Stat-Theory Methods 1995; 24: 2963–2980.
26. Kong, Staicu, Maity. Classical testing in functional linear models. J Nonparametr Stat 2016; 28: 813–838.
27. Hsu. Hypothesis testing in functional linear models. Biometrics 2017; 73: 551–561.
28. Happ, Greven. Multivariate functional principal component analysis for data observed on different (dimensional) domains. J Am Stat Assoc 2018; 113: 649–659.
29. Goldsmith, Bobb, Crainiceanu, et al. Penalized functional regression. J Comput Graph Stat 2011; 20: 830–851.
30. Gellar, Colantuoni, Needham, et al. Cox regression models with functional covariates for survival data. Stat Modell 2015; 15: 256–278.
31. Rizopoulos, Afonso, Papageorgiou. JMbayes2: Extended joint models for longitudinal and time-to-event data. http://CRAN.R-project.org/package=JMbayes2, R package version 0.4-5, 2023.
32. Kim, Schaubel, McCullough. A C-index for recurrent event data: application to hospitalizations among dialysis patients. Biometrics 2018; 74: 734–743.
33. Ambrosius, Sink, Foy, et al. The design and rationale of a multicenter clinical trial comparing two strategies for control of systolic blood pressure: the Systolic Blood Pressure Intervention Trial (SPRINT). Clinical Trials 2014; 11: 532–546.
34. Kannel. Role of blood pressure in cardiovascular morbidity and mortality. Prog Cardiovasc Dis 1974; 17: 5–24.
35. Brunström, Carlberg. Association of blood pressure lowering with mortality and cardiovascular disease across blood pressure levels: a systematic review and meta-analysis. JAMA Intern Med 2018; 178: 28–36.
36. Ferraro, Taylor, Curhan. 24-hour urinary chemistries and kidney stone risk. Am J Kidney Dis 2024; 84: 164–169.
37. Kong, Johnson, Maalouf, et al. Predicting urinary stone recurrence: a joint model analysis of repeated 24-hour urine collections from the MSTONE database. Urolithiasis 2024; 52: 156.

			Scenario 1: Constant effect			Scenario 2: Oscillation effect			Scenario 3: Cyclic variation			Scenario 4: Irregular fluctuations
n	Method	Parameter	BIAS$^{2}$	VAR	MSE	BIAS$^{2}$	VAR	MSE	BIAS$^{2}$	VAR	MSE	BIAS$^{2}$	VAR	MSE
100	FRAILTY 90% PVE	$β (t)$	0.064	0.115	0.179	0.060	0.047	0.107	0.119	0.361	0.480	0.131	0.329	0.460
	FRAILTY 90% PVE	$α$	0.013	0.121	0.134	0.002	0.127	0.129	0.009	0.121	0.130	0.053	0.203	0.256
	FRAILTY 95% PVE	$β (t)$	0.030	0.479	0.509	0.061	0.139	0.200	0.076	0.565	0.641	0.120	0.538	0.658
	FRAILTY 95% PVE	$α$	0.020	0.125	0.145	0.005	0.131	0.136	0.013	0.123	0.135	0.061	0.206	0.268
	B-spline basis	$β (t)$	0.205	33.956	34.161	0.108	1.508	1.616	0.149	13.427	13.576	0.527	12.763	13.290
	B-spline basis	$α$	0.029	0.127	0.156	0.019	0.141	0.160	0.035	0.130	0.165	0.100	0.218	0.318
	JMbayes	$β (t)$	0.056	0.030	0.086	0.504	0.033	0.537	0.138	0.031	0.168	0.234	0.033	0.260
	JMbayes	$α$	0.046	0.117	0.163	0.243	0.098	0.342	0.078	0.108	0.187	0.082	0.171	0.253
300	FRAILTY 90% PVE	$β (t)$	0.069	0.037	0.106	0.069	0.014	0.083	0.135	0.096	0.231	0.138	0.087	0.225
	FRAILTY 90% PVE	$α$	0.003	0.038	0.041	0.001	0.037	0.037	0.000	0.037	0.038	0.016	0.056	0.073
	FRAILTY 95% PVE	$β (t)$	0.029	0.141	0.170	0.076	0.035	0.111	0.089	0.212	0.301	0.122	0.174	0.296
	FRAILTY 95% PVE	$α$	0.004	0.038	0.042	0.000	0.037	0.037	0.001	0.037	0.038	0.018	0.057	0.076
	B-spline basis	$β (t)$	0.042	6.953	6.995	0.119	0.390	0.509	0.037	3.804	3.841	0.363	3.303	3.666
	B-spline basis	$α$	0.004	0.038	0.041	0.000	0.037	0.037	0.003	0.038	0.041	0.025	0.059	0.083
	JMbayes	$β (t)$	0.062	0.009	0.070	0.509	0.009	0.518	0.142	0.009	0.150	0.236	0.012	0.248
	JMbayes	$α$	0.103	0.032	0.136	0.337	0.025	0.362	0.145	0.031	0.176	0.176	0.048	0.225
500	FRAILTY 90% PVE	$β (t)$	0.068	0.021	0.089	0.070	0.008	0.078	0.135	0.062	0.197	0.137	0.047	0.184
	FRAILTY 90% PVE	$α$	0.002	0.022	0.023	0.002	0.022	0.024	0.000	0.021	0.021	0.011	0.031	0.042
	FRAILTY 95% PVE	$β (t)$	0.022	0.087	0.109	0.081	0.021	0.102	0.084	0.132	0.216	0.122	0.099	0.221
	FRAILTY 95% PVE	$α$	0.002	0.022	0.024	0.001	0.022	0.023	0.000	0.021	0.021	0.012	0.031	0.043
	B-spline basis	$β (t)$	0.065	3.679	3.744	0.129	0.217	0.346	0.033	1.813	1.846	0.210	1.808	2.018
	B-spline basis	$α$	0.001	0.021	0.023	0.001	0.022	0.023	0.001	0.021	0.022	0.015	0.031	0.046
	JMbayes	$β (t)$	0.059	0.005	0.064	0.504	0.006	0.510	0.139	0.005	0.144	0.232	0.008	0.240
	JMbayes	$α$	0.119	0.017	0.136	0.366	0.015	0.380	0.161	0.017	0.177	0.194	0.026	0.220