Abstract
Censored data make survival analysis more complicated because exact event times are not observed. Statistical methodology developed to account for censored observations assumes that patients’ withdrawal from a study is independent of the event of interest. However, in practice, some covariates might be associated with both the lifetime and the censoring mechanism, inducing dependent censoring. In this case, standard survival techniques, such as the Kaplan–Meier estimator, give biased results. The inverse probability censoring weighted estimator was developed to correct for bias due to dependent censoring. In this article, we explore the use of inverse probability censoring weighting methodology and describe why it is effective in removing the bias. Since implementing this method is highly time consuming and requires programming and mathematical skills, we propose a user-friendly algorithm in
Keywords
1 Introduction
Survival analysis is the study of the distribution of lifetimes, i.e. the times from a prespecified initiating event (e.g. birth, diagnosis, start of treatment) to some terminal event of interest (e.g. death, relapse, remission). It is most prominently (but not exclusively) used in the biomedical sciences. A special feature of survival studies is that it takes time to observe the event of interest. As a result, for a number of subjects, the event is not observed during follow-up, and the only available information is that the event of interest has not yet taken place at the last observation time. This phenomenon is called censoring. Methods have been developed to include censored subjects in the analysis.
Standard methods used to analyze data with censored observations assume that censoring is non-informative. This means that censoring carries no prognostic information about the survival experience. Therefore, individuals who are censored at a specific time point should be as likely to experience an event as those subjects who remain in the study, i.e. the probability of being censored is the same for all subjects at risk. Informative censoring may occur when time to event and time to censoring are dependent, either directly or through covariates. In the latter case, dependence between event and censoring times is induced through covariates associated with both time to event and time to censoring. This type of informative censoring is called dependent censoring and it is the focus of this article. For example, young patients may be more likely to quit a treatment than older patients. This implies that the event is observed more often in older than in young patients, leading to biased estimates of survival probabilities, because the observed event times are not representative of the event times of the whole population.
Since dependent censoring can cause bias in the results, it is crucial to consider this aspect in the analysis. Inverse probability censoring weighting (IPCW) was proposed [1–3] to correct for the presence of dependent censoring. This method is based on the idea of compensating for censored subjects by giving extra weight to subjects who are not censored. More specifically, IPCW assigns extra weight to subjects with characteristics similar to those of the censored subjects.
Implementing IPCW is very time consuming. Programming and mathematical skills are needed to apply the IPCW procedure, and careful bookkeeping is required. In this article, a user-friendly implementation of IPCW in
As an illustration, IPCW is applied to a toy data set and to a real data set from the Department of Psychiatry of the Leiden University Medical Center (LUMC) in The Netherlands. A simulation study is performed to compare IPCW with the traditional methodology when dependent censoring is present in the data. Several scenarios were generated with different sample sizes, percentages of censoring, and strengths of the censoring mechanism. Simulations show that correcting for the presence of dependent censoring by using IPCW is crucial and reduces bias in the estimation results.
This article is organized as follows. Notation, a short introduction to survival analysis techniques, and details about the concept of dependent censoring are given in section 2. The IPCW method is illustrated and applied to a real data set in section 3. In section 4, a simulation study is presented in which the performance of IPCW is investigated. Details concerning the proposed user-friendly implementation of the IPCW method in
2 Dependent censoring in survival analysis
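2.1 Definition of dependent censoring
The aim of survival analysis is to estimate the time to a certain event of interest, e.g. death or recovery. However, event times are often incompletely observed, i.e. the time to the occurrence of the event of interest is not known for some subjects. This phenomenon is called censoring. Survival data for a subject j are represented as triplets (Tj, δj, Zj), where Tj = min(Xj, Cj) is the observed time (the smaller of the event time Xj and the censoring time Cj), δj is the event indicator (1 if the event was observed, 0 if the subject was censored) and Zj is the vector of covariates.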
The Kaplan–Meier (KM) estimator assumes that censoring times are non-informative, i.e. that there is no dependence between lifetime X and censoring mechanism C. Hence, the hazard of X among subjects at risk is the marginal hazard of X, i.e.
A time-dependent Cox proportional hazards model for censoring can be used to test this condition. If equation (4) holds, event time X and censoring time C are dependent through covariates, a phenomenon called dependent censoring, and, as a consequence, equations (2) and (3) do not hold [3]. In this case, estimators based on the independent censoring assumption, such as the KM estimator, should not be used [5]. A proper methodology that accounts for dependent censoring should then be employed to avoid biased results.
To illustrate the possible bias when dependent censoring is present in the data under investigation, simulated data in which covariates are associated with both time to event and time to censoring are used to estimate the survival function. Results are given in Figure 1, where the traditional KM estimator and the real survival curve (which is simply the KM estimator when all event times are observed) are shown together. For details concerning the simulations, see Supplementary Material B.
Figure 1. Real survival curve and Kaplan–Meier estimate when dependent censoring is present in the data.
As expected, the KM estimator overestimates the real survival probabilities. This indicates that correcting for the presence of dependent censoring is important in order to obtain an accurate estimate. More examples are shown in the simulation study described in section 4.
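The direction of this bias can be reproduced on a tiny, fully deterministic example. The sketch below is only an illustration: the six event and censoring times are hypothetical (they are not the simulated data behind Figure 1), and the helper names `kaplan_meier` and `survival_at` are made up for this example. Subjects with short event times drop out early, so their events are never observed.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier curve as a list of (time, survival) pairs, one per event time."""
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    surv, curve = 1.0, []
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, d in zip(times, events) if u == t and d == 1)
        surv *= 1.0 - failed / at_risk
        curve.append((t, surv))
    return curve


def survival_at(curve, t):
    """Value of the step function at time t (1.0 before the first event)."""
    value = 1.0
    for u, s in curve:
        if u <= t:
            value = s
    return value


# Hypothetical data: three frail subjects (short event times) followed by
# three robust subjects. Two of the frail subjects drop out at time 1.
event_time = [2, 3, 4, 6, 8, 10]
censor_time = [1, 1, 20, 20, 20, 20]

observed = [min(x, c) for x, c in zip(event_time, censor_time)]
delta = [1 if x <= c else 0 for x, c in zip(event_time, censor_time)]

real = kaplan_meier(event_time, [1] * len(event_time))  # all events observed
km = kaplan_meier(observed, delta)                      # with censoring

real_at_5 = survival_at(real, 5)
km_at_5 = survival_at(km, 5)
```

At time 5, half of the subjects have truly experienced the event, whereas the KM estimate computed from the censored data is 0.75, because the two subjects who dropped out are treated as if they were as healthy as those who stayed.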
2.2 IPCW estimator
The IPCW estimator [1–3] was developed to correct for dependent censoring. It corrects for censored subjects by giving extra weight to subjects who are not censored. Typically, these weights are chosen in such a way that individuals who best match the censored subjects receive more weight. At every observed time point t, each subject j is given a weight which is inversely proportional to the estimated probability of having remained uncensored until time t. The estimated probability is based on the fit of a Cox model for censoring with risk factors for failure and censoring. When a subject is censored at time tc, individuals who remain at risk should be given extra weight from tc onward. Hence, IPCW weights have to be recalculated for each subject at risk at each censoring time. This procedure can be summarized in the four steps below.
Step 1: Fit a model for the censoring mechanism that incorporates covariates associated with event and censoring time.
Step 2: Estimate the probability of remaining uncensored at each observed time point t for all subjects at risk at that time point.
Step 3: Compute the IPCW weights as the inverse of these estimated probabilities.
Step 4: Estimate the survival probabilities with a KM estimator in which each subject at risk is weighted by its IPCW weight.

IPCW is based on the assumption that, given the measured covariates, time to event and time to censoring are independent (equation (5)). This means that all covariates that might be associated with event or censoring time must be measured, i.e. there should be no unmeasured confounders. If this assumption holds, in theory, IPCW can fully correct for bias due to dependent censoring. Therefore, researchers should always gather enough prognostic factors that are expected to influence censoring time, such that equation (5) may be approximately true [3].

Even if all prognostic factors are included in the model, a crucial step in the IPCW method is to have good estimates for the parameters in the censoring model. Small sample sizes, measurement errors, and other causes of a bad fit for the censoring model may reduce the precision of the estimated weights. However, when the fitted censoring model is accurate enough, the IPCW method will reduce the bias due to dependent censoring.

Table 1. Toy survival data for six subjects. Note: For each subject, the observed time point t, event indicator δ and time-independent covariate Z are given.
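The four steps of the procedure can also be written compactly in formulas. The notation below is an assumption made for illustration and is not necessarily the article's own: T_j is the observed time, \delta_j the event indicator, Z_j a time-fixed covariate vector, and \hat{\phi} and \hat{\Lambda}_0 the coefficients and cumulative baseline hazard of the fitted Cox model for censoring.

```latex
% Assumed notation, see the paragraph above; time-fixed covariates only.
\begin{align*}
\text{Steps 1--2:}\quad
  & \hat{K}_j(t) = \exp\!\left\{ -\hat{\Lambda}_0(t)\, e^{\hat{\phi}^{\top} Z_j} \right\} \\
\text{Step 3:}\quad
  & \hat{W}_j(t) = \frac{1}{\hat{K}_j(t)} \\
\text{Step 4:}\quad
  & \hat{S}(t) = \prod_{t_i \le t}
    \left( 1 - \frac{\sum_{j} \hat{W}_j(t_i)\, \delta_j\, I(T_j = t_i)}
                    {\sum_{j} \hat{W}_j(t_i)\, I(T_j \ge t_i)} \right)
\end{align*}
```

When all weights are equal, the ratio in Step 4 reduces to the number of events divided by the number at risk, so the expression becomes the ordinary KM estimator.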
2.2.1 Step 1: Fit censoring model
To assess the influence of covariates
Fitting the Cox model for time to censoring on the toy data gives a hazard ratio of 0.281 for the covariate Z, with 95% confidence interval [0.031, 2.538]. This effect is not statistically significant (the confidence interval contains 1), which is expected because no dependence between the covariate and the censoring times was built into the toy data.
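The non-significance of this hazard ratio can be checked directly from the reported numbers. The sketch below assumes the interval is a Wald-type 95% interval, i.e. symmetric on the log scale; that is an assumption about how the interval was computed and is not stated in the text.

```python
import math

# Values reported in the text for covariate Z in the censoring model.
hazard_ratio, lower, upper = 0.281, 0.031, 2.538

# Assumption: Wald-type 95% interval, symmetric around log(HR), so the
# standard error of log(HR) can be recovered from the interval width.
se_log_hr = (math.log(upper) - math.log(lower)) / (2 * 1.96)
z_score = math.log(hazard_ratio) / se_log_hr

# Two-sided p-value from the standard normal distribution.
p_value = math.erfc(abs(z_score) / math.sqrt(2))

# A hazard ratio of 1 means no effect; the effect is significant at the
# 5% level only if the interval excludes 1.
interval_contains_one = lower <= 1.0 <= upper
```

Under this assumption the p-value is roughly 0.26, in line with the interval containing 1.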
2.2.2 Step 2: Estimate probabilities of remaining uncensored
The estimated hazard for censoring,
Table 2. Estimated IPCW weights for the toy example.
2.2.3 Step 3: Compute IPCW weights
The weights for each subject j are computed as the inverse of the estimated probability of remaining uncensored until time t.
At the end of the study time, when most subjects have experienced the event or have been censored, the probabilities of remaining uncensored become very small. Hence, the IPCW estimator weights will become large and unstable. Weights can be stabilized by dividing the marginal probability of remaining uncensored by the estimated probability of remaining uncensored (
2.2.4 Step 4: Estimate the IPCW version of the survival curve
In Steps 1–3, it was described how to compute weights for all subjects at risk at each observed time point. By weighting the subjects, a model for time to event in the absence of censoring can be fitted. KM estimator for time to event is adjusted to include the weighted subjects as follows
Survival probabilities for the toy data set estimated with standard Kaplan–Meier (
Figure 1 showed the poor performance of the traditional KM estimator in the presence of dependent censoring. To correct for dependent censoring, the IPCW version of the survival curve was estimated on the same data set (denoted by KMIPCW). The results are shown in Figure 2, where the classical KM is plotted along with KMIPCW and the real survival curve. From this figure, we can conclude that IPCW reduced the bias caused by dependent censoring. More simulation results will be shown in section 4.
Figure 2. Real survival curve and its Kaplan–Meier estimates with (KMIPCW) and without (KM) IPCW.
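Steps 1 to 4 can be sketched end to end in a few lines. This is a simplified illustration, not the implementation proposed in this article: instead of a Cox model, the censoring model is a separate KM curve for time to censoring within each level of a single categorical covariate, which is a swapped-in nonparametric stand-in. The function names and the six-subject data set are hypothetical.

```python
def km_curve(times, indicators):
    """Kaplan-Meier step function as (time, survival) pairs for indicator == 1."""
    jump_times = sorted({t for t, d in zip(times, indicators) if d == 1})
    surv, curve = 1.0, []
    for t in jump_times:
        at_risk = sum(1 for u in times if u >= t)
        jumps = sum(1 for u, d in zip(times, indicators) if u == t and d == 1)
        surv *= 1.0 - jumps / at_risk
        curve.append((t, surv))
    return curve


def value_before(curve, t):
    """Left limit of the step function: its value just before time t."""
    value = 1.0
    for u, s in curve:
        if u < t:
            value = s
    return value


def ipcw_kaplan_meier(times, events, groups):
    # Steps 1-2: censoring model. Assumption: one KM curve for time to
    # censoring per covariate group, used here in place of a Cox model.
    censoring = {}
    for g in set(groups):
        members = [i for i, h in enumerate(groups) if h == g]
        censoring[g] = km_curve([times[i] for i in members],
                                [1 - events[i] for i in members])

    surv, curve = 1.0, []
    for t in sorted({u for u, d in zip(times, events) if d == 1}):
        weighted_at_risk = 0.0
        weighted_events = 0.0
        for u, d, g in zip(times, events, groups):
            if u < t:
                continue  # subject is no longer at risk at time t
            # Step 3: inverse probability of being uncensored just before t.
            weight = 1.0 / value_before(censoring[g], t)
            weighted_at_risk += weight
            if u == t and d == 1:
                weighted_events += weight
        # Step 4: weighted Kaplan-Meier factor.
        surv *= 1.0 - weighted_events / weighted_at_risk
        curve.append((t, surv))
    return curve


# Hypothetical data: the first three subjects are frail (group 1) and two of
# them are censored at time 1; the three robust subjects are never censored.
times = [1, 1, 4, 6, 8, 10]
events = [0, 0, 1, 1, 1, 1]
frail = [1, 1, 1, 0, 0, 0]

plain = km_curve(times, events)
weighted = ipcw_kaplan_meier(times, events, frail)
```

In this example the remaining frail subject receives weight 3 at time 4, because only one third of the frail group is still uncensored. The weighted estimate after the first event is 0.5, compared with 0.75 for the unweighted KM estimate: the frail subject who stayed in the study now also represents the two who were censored.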
3 Application
3.1 Routine outcome monitoring data
IPCW is applied to a data set from the Department of Psychiatry at the LUMC and Rivierduinen, a local healthcare provider, in The Netherlands. All patients diagnosed with a Diagnostic and Statistical Manual of Mental Disorders, fourth edition, text revision (DSM-IV-TR) mood and/or anxiety disorder and with suicidal ideation were included in the data set. Suicidal ideation was defined as a score of 2 or higher on item 10 of the Montgomery–Åsberg Depression Rating Scale (MADRS) [9]. Data were collected through a procedure known as routine outcome monitoring (ROM) [10], which is used to gather information concerning treatment progress by repeatedly measuring symptom severity. The goal is to diagnose patients and to inform clinicians and patients about the treatment progress. The data set was used to identify baseline predictors for remission of suicidal ideation. Remission of suicidal ideation was defined as a score below 2 on item 10 of the MADRS. Socio-demographic variables and functional and clinical scores on several other psychometric instruments were considered as possible predictors for remission. All patients were followed from diagnosis until remission or loss to follow-up, with a maximum of two years.
Table 4. Hazard ratios for remission of suicidal ideation.
MADRS: Montgomery–Åsberg depression rating scale; DAPP-SF: Dutch dimensional assessment of personality pathology-short form.
Self-harm is the most significant covariate. To compare survival curves for patients with different levels of self-harm, clinicians defined three groups for this variable: low, medium, and high. In Figure 3, the estimated KM curves are shown for these three groups. Recall that the event of interest is remission; hence, a low survival probability is favorable for patients, since it indicates a high probability of remission.
Figure 3. Kaplan–Meier curves for remission from suicidal ideation.
3.2 Correcting for dependent censoring in ROM data
In this study, researchers were only interested in time to remission within the first two years after diagnosis. Therefore, each patient who did not experience the event of interest within two years was censored at two years, as is shown in Figure 4. This is called administrative censoring and was applied to all patients who were still at risk two years after baseline measurements (5.2% of the included patients). Censoring at two years was independent of patient characteristics. Many patients in the ROM data set (18.2%) were censored during the first two years after baseline measurements. These patients are represented by the steps in the inverse KM curve for time to censoring (Figure 4) during the first two years.
Figure 4. Inverse Kaplan–Meier curve for time to censoring representing the probability of being censored at each time point since diagnosis, given that the subject was not censored until that time point.
Clinicians believe that patients’ withdrawal is likely to be related to their health status. As ROM sessions took place approximately every three months and a final measurement session was not obligatory, it is conceivable that patients who achieved remission, and therefore ended the treatment, did not have a final ROM measurement reflecting their improvement. This suggests a dependence between time to event and time to censoring through health status, i.e. the presence of dependent censoring. If true, this would result in overestimation of survival probabilities, since patients who are more likely to experience the event (remission) are also more likely to be censored. Therefore, fewer events will be observed, and the survival probabilities will be overestimated. IPCW was applied to this data set to try to correct for dependent censoring. Unfortunately, there is no covariate that directly represents the health status of patients. Instead, several covariates related to health status were used to fit the censoring model. In this example, we rely on the assumption that patients who do not return for a ROM assessment due to their remission were close to remission at their last assessment. If this assumption does not hold, it is impossible to observe whether patients who are almost in remission are more likely to be censored than patients with a bad health status.
Since the administrative censoring is independent of patient characteristics, patients who were censored at two years after diagnosis should not be included in the Cox model for the censoring mechanism. Therefore, the IPCW method described in this section is based on a censoring model for time to non-administrative censoring.
Table 5. Cox model for time to censoring.
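One way to set up the outcome for such a censoring model is sketched below. It reflects one possible reading of the paragraph above, namely that loss to follow-up before two years is the "event" of the censoring model, while administrative censoring at two years and observed remissions are treated as censored observations of the time to censoring. The function name and the example values are hypothetical.

```python
def censoring_model_outcome(times, events, admin_time=2.0):
    """Outcome indicator for a model of time to non-administrative censoring.

    times      -- follow-up time in years
    events     -- 1 if remission was observed, 0 if the patient was censored
    admin_time -- time of administrative censoring (two years in this study)

    Returns 1 for patients lost to follow-up before admin_time and 0 for
    patients who had the event or were censored administratively.
    """
    return [1 if d == 0 and t < admin_time else 0
            for t, d in zip(times, events)]


# Hypothetical follow-up data for five patients.
follow_up = [0.5, 1.2, 2.0, 2.0, 1.7]
remission = [1, 0, 0, 0, 0]

dropout = censoring_model_outcome(follow_up, remission)
```

Here only the second and fifth patients count as dropouts; the two patients followed for the full two years do not contribute a censoring event.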
MADRS: Montgomery–Åsberg depression rating scale; DAPP-SF: Dutch dimensional assessment of personality pathology-short form.
The censoring model to be fitted in step 1 of the IPCW algorithm includes all covariates given in Table 5, and those that are significantly associated with time to event (Table 4). This censoring model was used to estimate the conditional probabilities of remaining uncensored (step 2) and the IPCW weights (step 3). The resulting IPCW survival curve (step 4) is almost identical to the original KM curve estimated without applying IPCW (not shown). This suggests that the dependent censoring has hardly any influence on the estimated survival probabilities at population level. However, by looking at the individual level difference between the prediction for the model with and without IPCW
Figure 5. Survival curves estimated with and without the IPCW method for the subject for whom
4 Simulation study
A simulation study was performed to investigate the behavior of IPCW under different scenarios [11]. In the simulation process, different sample sizes, strengths of the censoring model, and percentages of censored individuals were chosen. In this section, all steps in the simulation process are illustrated and part of the results of a large simulation study is discussed.
4.1 Steps in the simulation process
Step 1: For each subject j, generate the covariates Z1 and Z2.
Step 2: Determine the hazards for event and censoring for each individual in the data set. Hazards depend on the covariates Z1 and Z2.
Step 3: Sample the event times xj and censoring times cj from an exponential distribution, with rates equal to these hazards.
Step 4: For each individual, compute the observed time point tj = min(xj, cj) and the event indicator δj.
Step 5: Estimate the true survival curve (KM on xj), the standard KM result (KM on the observed data) and the IPCW result.

All details concerning the method developed to generate survival data with dependent censoring and different scenarios are outlined in Supplementary Material B.
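The data-generating part of these steps can be sketched as follows. Several ingredients are assumptions made for illustration only: the log-linear form of the hazards (baseline hazard times exp of a linear combination of Z1 and Z2), the baseline hazard values, the success probability of 0.5 for Z2, and the values of φ. The value β = (0.5, 1.5) and the distributions of Z1 and Z2 are those mentioned in section 4.3; the authors' own generator is described in Supplementary Material B.

```python
import math
import random


def simulate(n, beta, phi, base_event=0.1, base_cens=0.05, seed=1):
    """Generate n subjects with covariate-dependent event and censoring times.

    Returns a list of tuples (z1, z2, x, c, t, delta).
    Assumed hazards: base_event * exp(beta . z) and base_cens * exp(phi . z).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)            # standard normal covariate
        z2 = 1 if rng.random() < 0.5 else 0  # Bernoulli covariate (p assumed)
        rate_event = base_event * math.exp(beta[0] * z1 + beta[1] * z2)
        rate_cens = base_cens * math.exp(phi[0] * z1 + phi[1] * z2)
        x = rng.expovariate(rate_event)      # latent event time
        c = rng.expovariate(rate_cens)       # latent censoring time
        t = min(x, c)                        # observed time
        delta = 1 if x <= c else 0           # 1 = event observed
        rows.append((z1, z2, x, c, t, delta))
    return rows


# Hypothetical scenario: beta as in section 4.3, phi chosen for illustration.
sample = simulate(100, beta=(0.5, 1.5), phi=(1.0, 1.0))
censored_fraction = 1.0 - sum(row[5] for row in sample) / len(sample)
```

Because both hazards increase with Z1 and Z2 in this scenario, subjects with short event times also tend to have short censoring times, which is exactly the dependence through covariates that IPCW is meant to correct. Tuning the baseline censoring hazard is one way to reach a target censoring percentage.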
4.2 Varying the sample size
To investigate the effect of sample size n on IPCW results, simulations were performed with different numbers of subjects. Three different sample sizes n equal to 100, 250, and 500 were chosen. The other variables, like strength of the censoring model (
Figure 6. True survival probabilities (Real) and the survival curves estimated with standard Kaplan–Meier (KM) and IPCW (KMIPCW) corresponding to different sample sizes, n equal to 100 (figure a), 250 (figure b), and 500 (figure c).
4.3 Varying the strength of the censoring model
The strength of the censoring model is defined by the parameter
In Figure 7, results corresponding to the different values of the censoring mechanism are shown. In these scenarios β is equal to (0.5, 1.5), the sample size n is equal to 100 and 35% of the subjects were censored. The two covariates Z1 and Z2 were generated from a standard normal distribution and a Bernoulli distribution, respectively. The first may, for example, represent the ages of subjects in the study and the second one may represent gender (e.g. 0 = male and 1 = female). In this case, the choice of
Figure 7. True survival probabilities (Real) and the survival curves estimated with standard Kaplan–Meier (KM) and IPCW (KMIPCW) corresponding to different parameters φ in the censoring model. The combinations used are
When considering these strong and weak censoring models, subjects who have a higher probability of experiencing the event also have a higher chance of being censored. Therefore, fewer events are observed and survival probabilities are overestimated by both standard KM and IPCW (Figure 7(b) and 7(a)). The stronger the censoring mechanism, the worse the fit for both methods. However, the IPCW estimator is less biased than standard KM in both cases. In the absence of dependent censoring (
4.4 Varying the percentage of censored subjects
Varying the percentage of censored subjects, while keeping the censoring model constant (
Figure 8. True survival probabilities (Real) and the survival curves estimated with standard Kaplan–Meier (KM) and IPCW (KMIPCW) corresponding to different censoring percentages, namely 35% (figure a), 50% (figure b), and 65% (figure c).
5 Discussion
Standard survival analysis techniques assume independence between time to event and censoring. This assumption is violated when covariates are associated with both event and censoring time. In case of dependent censoring, traditional methods, such as the KM estimator, may give biased estimates of survival probabilities. IPCW can be used in these situations; this method corrects for censored subjects by giving extra weight to those who remain at risk, assigning more weight to subjects who are more similar to the censored ones.
IPCW was applied to a clinical data set of patients who suffer from suicidal ideation. Time to remission was the event of interest. The IPCW method did not seem to have an effect on estimated survival curves at the population level, but it did have an impact at the individual level. This result does not necessarily imply that dependent censoring is not present in the data. There could be unmeasured covariates that influence both time to event and time to censoring, or there could be a mechanism that causes event and censoring times to be directly dependent, i.e. not only through covariates. In these cases, the censoring model estimated with IPCW does not fully describe the censoring mechanism. Therefore, the IPCW results do not completely correct for the censoring mechanism.
A simulation study was carried out to study the performance of IPCW in case of dependent censoring. Dependent censoring was induced in the generated survival data by simulating two time-independent covariates that influence both time to event and time to censoring. Several different scenarios were generated by varying the sample size, the strength of the censoring mechanism, and the percentage of censored subjects. The simulation study showed that in each scenario, IPCW performs better than the standard survival technique. The better the fit of the censoring model, the more accurate the IPCW result. In this simulation study, event and censoring times were generated from an exponential distribution, i.e. constant hazard rates were assumed for both models. In practice, hazards may not be constant, but may vary over time. Therefore, an additional simulation study was done in which hazards are not constant. Both time to event and censoring time were generated from a Gompertz distribution [12]. In this situation too, results (not shown in this article) showed that IPCW gives better survival estimates than the KM estimator.
The corrected group prognosis method (CGP) [13] might be used as an alternative to IPCW in case of dependent censoring. Here, the Cox model is fit to the whole data set. Then, survival curves, conditional on the observed covariates, are estimated for each subject in the data set. The marginal survival curve is then obtained by averaging over the covariate-specific curves. Since dependent censoring does not cause bias in the Cox model, this method will give an unbiased result for the marginal survival curve. The simulation study described in this article was repeated to compare the performance of CGP with the performance of IPCW. Simulation results suggest that both methods perform well in case of dependent censoring. However, while IPCW can also deal with time-varying covariates (covariates whose values may change over time), CGP cannot. In these situations, CGP will not perform well, since it does not incorporate time-varying covariates in the estimation of individual survival curves for each subject. Time-varying covariates can be included in the Cox model for time to censoring, and can therefore be incorporated in the IPCW methodology.
In the analysis of survival data, attention is mainly given to the survival times and prognostic factors, but almost no attention is given to the censoring mechanism. If the probability of being censored is not the same for each individual at risk, standard survival analysis techniques may give biased results. Further investigation of the censoring mechanism may be needed, and IPCW, which adjusts for dependent censoring, can be applied.
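The averaging step of CGP can be sketched as follows, assuming a Cox model with time-fixed covariates has already been fitted. The baseline survival values and the linear predictors are taken as given inputs; the function name and all numbers are hypothetical, and the model fitting itself is not shown. Under a Cox model, the survival curve of subject j is the baseline survival raised to the power exp(linear predictor of j).

```python
import math


def cgp_marginal_survival(baseline_survival, linear_predictors):
    """Average the covariate-specific Cox survival curves over all subjects.

    baseline_survival -- baseline survival S0(t) on a grid of time points
    linear_predictors -- fitted linear predictor (coefficients times
                         covariates) for each subject in the data set
    """
    n = len(linear_predictors)
    return [
        sum(s0 ** math.exp(lp) for lp in linear_predictors) / n
        for s0 in baseline_survival
    ]


# Hypothetical fitted quantities: baseline survival at three time points and
# two subjects, one at baseline risk and one with a doubled hazard.
s0 = [0.9, 0.5, 0.2]
lp = [0.0, math.log(2.0)]

marginal = cgp_marginal_survival(s0, lp)
```

At the second time point, for example, the two subject-specific survival probabilities are 0.5 and 0.25, so the marginal estimate is their average, 0.375.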
Footnotes
Acknowledgement
The Department of Psychiatry of Leiden University Medical Center (LUMC) in The Netherlands is gratefully acknowledged for providing the data.
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
Supplement Material
Supplementary material is available for this article online.
References
