Comparison with Published Systems of a New Staging System for Papillary and Follicular Thyroid Carcinoma

Abstract

Background:

Several staging systems exist to estimate the prognosis for patients with thyroid carcinoma. Our goal was to develop a new staging system to predict cancer-specific survival (CSS) and evaluate it against published systems.

Methods:

The Cedars-Sinai Medical Center (CSMC) staging system was derived from an adjusted analysis of 1622 patients with differentiated thyroid carcinomas (DTCs) from the CSMC Thyroid Cancer Center. Mean follow-up time was 11.8 years. There were 1180 female and 442 male patients with a mean age of 46 years. The staging systems reviewed were University of Alabama (Birmingham) and M.D. Anderson Cancer Center (UAB-MDACC); the Tumor–Node–Metastasis (TNM) 5th and 7th editions; Memorial Sloan-Kettering (MSK); the National Thyroid Cancer Treatment Cooperative Study (NTCTCS); Ohio State; Clinical Class; Metastases, Age, Completeness of resection, Invasion, and tumor Size (MACIS); Noguchi; and the Yildirim model for predicting outcomes. The proportion of variance explained (PVE) and the C-index were computed to rank and compare each staging system's ability to predict CSS in this patient population.

Results:

Adjusted hazard ratios revealed that age at surgery of >45 years, the presence of distant metastases, capsular invasion, and vascular invasion were the most significant predictors of CSS in this patient population. The final CSMC risk score consists of low-, moderate-, and high-risk groups. Among the well-differentiated thyroid carcinoma staging systems, the CSMC and NTCTCS ranked highest with PVE values of 5% and 4.3%, respectively, while the NTCTCS and CSMC staging systems were reversed using the C-index (0.77 and 0.76, respectively).

Conclusion:

The PVE and C-index values were relatively low across all applicable staging systems and varied in each study reviewed. This suggests that no one staging system has been shown to be superior to another across different patient populations with DTC. In the future, additional factors, such as biological markers, added to the clinical and pathological characteristics may lead to the development of superior staging systems.

Introduction

Numerous staging systems have been developed for patients with differentiated thyroid carcinoma (DTC) (1–31). As pointed out by Sherman et al. (5), there are three main goals for accurately staging thyroid cancer: (i) to estimate the prognosis and plan therapy for a given patient; (ii) to facilitate communication among physicians and institutions about individual patients or cohorts of patients using a common set of descriptors; and (iii) to allow appropriate stratification for the design and analysis of clinical trials or retrospective clinical studies. All of the existing staging systems achieve these goals in the population of patients used to generate the system. However, they do not all work as well when applied to other populations of patients with DTC. These discrepancies can be due to a number of factors, including differences in the pathologic interpretation of tumor histology, referral and treatment biases inherent in data that emanate from a single institution, and biases introduced by changes in the diagnosis and management of patients with DTC that have taken place over time. Among the latter are changes in surgical management, the use of radioactive iodine (RAI) and external beam radiation therapy, the advent of targeted therapies, and, perhaps most significantly, the increasing incidence of incidental papillary microcarcinomas due, in part, to increased detection secondary to the widespread use of imaging technology (32).

In 2007, Cedars-Sinai Medical Center (CSMC) established a thyroid cancer database for patients receiving treatment at the institution. The goals of this study were to develop a staging system using this database that would allow us to predict cancer-specific survival (CSS) for patients with DTC and to compare our system with many of the previously published systems.

Methods

The study population consisted of 1622 patients treated for papillary (PTC) and follicular thyroid carcinomas (FTCs) at the CSMC Thyroid Cancer Center between 1950 and 2011. This database includes a retrospective review of data derived from the medical records of local endocrinologists and surgeons and the Department of Pathology and Laboratory Medicine. All of the operative and pathology reports were reviewed in detail. In particular, the majority of the pathology reports provided full and detailed descriptions of the tumors, which allowed for appropriate pathologic interpretation and classification of each tumor based on current standards. As surgical treatment of the thyroid gland has evolved throughout this time period, particularly during the last decade for prophylactic versus therapeutic central lymph node dissection (level VI), patients over the age of 45 may have been slightly upstaged from the time of primary diagnosis. The presence of distant metastatic disease was staged as M1 if the disease was discovered within 6 months of surgery on follow-up scans, including post-therapy RAI, computed tomography (CT), positron emission tomography (PET), or magnetic resonance imaging (MRI) scans. Patient deaths were confirmed by data from the CSMC cancer registry, the California Department of Public Health's Office of Vital Records, or the Social Security Death Index. The cause of death was ascertained from the treating physician or review of death certificates. This study was reviewed and approved by CSMC's Institutional Review Board.

Ten staging systems were identified from a comprehensive literature review and applied to the sample population specifically for cancer-specific mortality. For DTC, these included University of Alabama (Birmingham) and M.D. Anderson Cancer Center (UAB-MDACC) (2,4,7); the American Joint Committee on Cancer's Tumor–Node–Metastasis (TNM), both 5th and 7th editions (2,4,6,10,12,15,17,22,31); Memorial Sloan-Kettering (MSK) (2,4,6,8,9,19,20,28,29); the National Thyroid Cancer Treatment Cooperative Study (NTCTCS) (2,5,12,15); Ohio State (2–4,6,12,15,16); Clinical Class (2–4,6,11,12,17); and the Yildirim model for predicting outcomes (27). Additionally, for PTC specifically, the Metastases, Age, Completeness of resection, Invasion, and tumor Size (MACIS) (1–4,6,12,15,17–22) and Noguchi (2,6,23) systems were included in the analysis.

Seven other staging systems were considered but were not applicable to the CSMC data. The European Organization for Research and Treatment of Cancer (EORTC) (1–4,6,12,25); the Age, Metastases, Extrathyroidal extension, and Size (AMES) (1–4,6,12–15,17,19,22); the University of Münster (2,4,24); and the Virgen de la Arrixaca University Hospital at Murcia (Spain) (2,26) staging systems are models of all-cause mortality, whereas our goal was to model CSS. Table 1 gives a brief summary of the variables used in these staging systems. The Age, histologic Grade, Extrathyroidal extent, and tumor Size (AGES) (1,6,12,15); the DNA ploidy, Age, Metastases, Extent, and tumor Size (DAMES) (2,30); and the Sex, Age, and Grade (SAG) (6) systems require data not captured in our data set.

Table 1.

Overview of Thyroid Cancer Staging Systems Reviewed

	AMES	Clinical Class	EORTC	MACIS	MSK	Munster	Noguchi	NTCTCS	Ohio State	Spain	TNM ^a	UAB-MDACC	Yildirim
Number of risk levels	2	4	5	4	3	2	3	4	4	3	4	3	4
Applicable pathologies	PTC FTC	PTC FTC	All	PTC	PTC FTC	PTC FTC	PTC	All	PTC FTC	PTC	All	PTC FTC	PTC FTC
Outcome
All cause	X		X			X				X
Disease-specific		X		X	X		X	X	X		X	X	X
Variables
Age	X		X	X	X		X	X		X	X	X	X
Sex	X		X				X
Primary tumor size	X			X	X		X	X	X	X	X		X
Primary tumor grade										X
Capsular involvement	X	X	X	X	X			X	X	X	X
T1 stage		X			X	X		X			X
N1 stage		X					X	X	X		X
M1 stage	X	X	X	X	X	X		X	X		X	X	X
# Metastatic sites			X
Type of resection				X
Pathology subtype					X			X		X
Multifocal									X
Vascular invasion													X

^a The Tumor–Node–Metastasis Staging System (TNM) is one of the most commonly used staging systems. This system was developed and is maintained by the AJCC and the UICC as a tool for doctors to stage different types of cancer based on the extent of the tumor (T), the extent of spread to the lymph nodes (N), and the presence of metastasis (M).

AJCC, American Joint Committee on Cancer; AMES, Age, Metastasis, Extrathyroidal extension, and Size; EORTC, European Organization for Research and Treatment of Cancer; FTC, follicular thyroid carcinoma; MACIS, Metastasis, Age, Completeness of resection, Invasion, and tumor Size; MSK, Memorial Sloan Kettering; NTCTCS, National Thyroid Cancer Treatment Cooperative Study; PTC, papillary thyroid carcinoma; UAB-MDACC, University of Alabama (Birmingham) and M.D. Anderson Cancer Center; UICC, Union for International Cancer Control.

Cox proportional hazard modeling (33) was used to relate risk factors to CSS times after confirming that the proportional hazards assumption was met. Risk factors considered in both unadjusted and adjusted Cox models included sex, age at surgery, tumor size, local invasion, distant metastasis, vascular invasion, nodal involvement, multifocality, the presence of metastatic disease, and RAI treatment. The initial variable selection for the fully adjusted model was performed as described by Collett (34) and then validated by means of bootstrap analysis (35,36). In the bootstrap procedure, 500 random samples were drawn with replacement from the 1622 cases. The model was then fitted to every bootstrap sample to estimate the coefficients of the Cox survival model. The means of the estimated coefficients were then tested with a t-test. Variables whose coefficients had a p-value >0.05 were removed from the model, and the bootstrap validation process was repeated.
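The resampling loop described above can be sketched as follows. This is a minimal illustration in Python, not the SAS procedure used in the study. The names `bootstrap_coefficients`, `summarize`, and `toy_fit` are hypothetical, the data are invented, and `toy_fit` is a simple difference in means standing in for a Cox model fit. The text does not state how the t-test was formed, so the sketch assumes the ratio of the bootstrap mean to the bootstrap standard deviation.

```python
import math
import random
import statistics


def bootstrap_coefficients(rows, fit, n_boot=500, seed=0):
    """Refit `fit` on n_boot resamples of `rows` drawn with replacement.

    `fit` is a placeholder for any model-fitting routine (for example a Cox
    regression from a survival library); it must return {name: coefficient}.
    """
    rng = random.Random(seed)
    n = len(rows)
    draws = {}
    for _ in range(n_boot):
        # Each resample has the same size as the original data set.
        sample = [rows[rng.randrange(n)] for _ in range(n)]
        for name, beta in fit(sample).items():
            draws.setdefault(name, []).append(beta)
    return draws


def summarize(draws):
    """Mean, bootstrap SD, and mean/SD ratio for each coefficient (assumed test statistic)."""
    out = {}
    for name, betas in draws.items():
        mean = statistics.fmean(betas)
        sd = statistics.stdev(betas)
        out[name] = (mean, sd, mean / sd if sd > 0 else math.inf)
    return out


def toy_fit(sample):
    """Stand-in estimator (NOT a Cox model): outcome difference by exposure."""
    exposed = [y for x, y in sample if x == 1]
    unexposed = [y for x, y in sample if x == 0]
    return {"exposure": statistics.fmean(exposed) - statistics.fmean(unexposed)}


# Invented data: 200 rows of (exposure flag, outcome).
rows = [(i % 2, 2.0 * (i % 2) + ((7 * i) % 10) / 10.0) for i in range(200)]
draws = bootstrap_coefficients(rows, toy_fit, n_boot=500)
summary = summarize(draws)

# Keep variables whose ratio exceeds roughly 1.96 in absolute value (p < 0.05).
kept = [name for name, (_, _, ratio) in summary.items() if abs(ratio) > 1.96]
print(summary, kept)
```

In practice `toy_fit` would be replaced by a real survival-model fit, and the loop would be rerun after each round of variable removal, as the text describes.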

The final set of variables selected was then fitted back to the original full data set in a Cox hazard model of CSS time to generate estimates of regression coefficients and hazard ratios. The final risk function was translated into risk scores (37) and then stratified into three risk categories (low, moderate, and high) to define the CSMC staging system for DTC. To rank and compare each staging system's ability to predict CSS in this patient population, Kaplan–Meier curves were generated to visualize the data, and the proportion of variance explained (PVE) for each staging classification (37) was calculated from the following formula: $\mathrm{PVE} = R_M^2 = 1 - (L_R / L_U)^{2/N}$, where $L_R$ and $L_U$ are the restricted and unrestricted maximum likelihoods, respectively, for a model in which stage classification is the only parameter; $N$ is the number of patients whose data are used in the model; and $R_M^2$ is the proportion of total uncertainty attributed to the model determined from a contingency table analysis of frequency of residual-disease–free versus disease stage. A higher PVE value for one model over another indicates superiority in predicting survival times in a Cox regression model (6,38).
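As a worked illustration of the PVE formula, the sketch below evaluates it from log-likelihoods, reading the exponent as 2/N (the usual form of this likelihood-ratio measure). The function name `pve` and the two log-likelihood values are hypothetical; they were chosen only so that the result lands near 5% for N=1622 and are not taken from the study.

```python
import math


def pve(loglik_restricted, loglik_unrestricted, n):
    """PVE = 1 - (L_R / L_U)^(2/N), computed on the log scale for stability."""
    return 1.0 - math.exp(2.0 * (loglik_restricted - loglik_unrestricted) / n)


# Hypothetical log-likelihoods: null model vs. model with stage as the only covariate.
example = pve(loglik_restricted=-150.0, loglik_unrestricted=-108.4, n=1622)
print(f"PVE = {example:.3f}")
```

Working on the log scale avoids underflow, since the likelihoods themselves are extremely small numbers for data sets of this size. The sketch also shows why PVE tends to be small in large cohorts: the same log-likelihood gain is divided by a larger N.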

Additionally, as a measure of concordance, the C-statistic for each model was computed (39,40). The C-statistic in survival-time modeling is analogous to the area under a receiver-operating characteristic curve and estimates the probability that a model correctly ranks cases at higher risk of mortality above lower-risk cases. The C-statistic ranges from 0.5 (indicating complete randomness and thus poor performance) to 1.0 (perfect prediction and perfect fit). In general, a C-statistic above 0.7 indicates a good overall model fit (41).
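The concordance idea can be made concrete with a small sketch of a Harrell-type C-index for censored data. This is an illustrative implementation under simple assumptions (tied risk scores count as half, tied survival times are ignored), not the SAS routine used in the study. The function name and the five-patient data set are invented.

```python
def concordance_index(times, events, risk):
    """Share of usable pairs in which the higher-risk patient dies first.

    A pair (i, j) is usable when patient i has an observed death (events[i] == 1)
    at a time earlier than patient j's death or censoring time.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # ties in predicted risk count as half
    return concordant / usable


# Invented example: follow-up in years, 1 = cancer death, 0 = censored, stage as risk.
times = [2, 5, 7, 9, 12]
events = [1, 1, 0, 1, 0]
stage = [3, 3, 2, 1, 1]
c_index = concordance_index(times, events, stage)
print(c_index)  # 0.875
```

Because a staging system assigns only a few distinct risk levels, many pairs are tied on predicted risk, which pulls the C-index toward 0.5 regardless of how well the stages are ordered.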

Data are presented as means and standard deviations (SDs) or counts and percentages. Statistical significance was set at p<0.05. All statistical analysis was performed using SAS v.9.3 (Cary, NC).

Results

The study population included 1180 women and 442 men, with a mean follow-up time of 11.8 years (SD=10.4). Table 2 gives basic demographics of the CSMC DTC patient population as well as the TNM stage distribution. Over a third of the study population was over 50 years of age at the time of surgery (n=630); the mean age was 46 years (SD=15.6). PTC accounted for 93% of DTC cases.

Table 2.

Characteristics of Cases Included in Study

Time followed, years	11.8±10.4
Age at surgery, years	46.1±15.6
Primary tumor size, cm	1.22±1.34
Pathology subtype
PTC	1511 (93%)
FTC	111 (7%)
Cause of death	122 (7.5%)
Thyroid cancer	21 (1.3%)
Other cancers	37 (2.3%)
Other causes	64 (4.0%)
Sex
Male	442 (27.3%)
Female	1180 (72.8%)
Capsular invasion/extension	341 (21.0%)
Nearby tissues (T3)	225 (13.9%)
Local extension (T4a)	83 (5.1%)
Distant extension (T4b)	11 (0.7%)
Node positive	532 (32.8%)
Limited to central nodes	278 (17.1%)
Extended to lateral nodes	249 (15.4%)
Vascular invasion	202 (12.5%)
Multifocal	747 (46.1%)
Metastatic disease	30 (1.9%)
TNM stage
I	1227 (75.7%)
II	89 (5.5%)
III	185 (11.4%)
IVA	96 (5.9%)
IVB	9 (0.6%)
IVC	15 (0.9%)

Data are reported as mean±standard deviation or as number (percent).

The unadjusted and adjusted hazard ratios for CSS are presented in Table 3. In both the unadjusted and adjusted analysis, histology, tumor size, lymph node involvement, and multifocality were not found to be significant factors in CSS. While the male sex was significant as an independent variable, it was not significant in the adjusted analysis, and thus not included in the final model.

Table 3.

Cox Proportional Hazard Modeling of Cancer-Specific Survival Times

	Unadjusted			Adjusted
	p-Value	HR	[95% CI]	p-Value	HR	[95% CI]
Age at surgery >45 years	<0.0001	7.03	[3.70, 13.38]	<0.0001	8.35	[4.29, 16.24]
Metastatic disease—distant	<0.0001	10.07	[4.56, 22.23]	<0.0001	10.26	[4.21, 25.03]
Capsular extension (T3 or T4)	<0.0001	3.59	[1.93, 6.67]	0.0037	2.57	[1.36, 4.85]
Vascular invasion	0.0130	2.30	[1.19, 4.45]	0.0368	2.13	[1.05, 4.32]
Histology (PTC vs. FTC)	0.0975	1.88	[0.89, 3.96]
Male	0.0054	2.09	[1.24, 3.52]
Tumor size, cm	0.2522	1.12	[0.93, 1.35]
Positive lymph node(s)	0.4578	1.22	[0.72, 2.07]
Multifocality	0.5395	0.85	[0.50, 1.44]
Radiation treatment	0.5561	1.17	[0.69, 1.98]

HR, hazard ratio; CI, confidence interval.
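The hazard ratios and intervals in Table 3 are related to the underlying Cox coefficients by HR = exp(β) and 95% CI = exp(β ± 1.96·SE). The sketch below back-calculates β and SE from the reported adjusted values for age at surgery >45 years. The function name is hypothetical, and the calculation assumes a symmetric Wald-type interval on the log scale, which the text does not confirm.

```python
import math


def coef_from_hr(hr, ci_low, ci_high, z=1.96):
    """Recover the log-hazard coefficient and its standard error from an HR and its CI."""
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z)
    return beta, se


# Adjusted values for age at surgery >45 years, taken from Table 3.
beta_age, se_age = coef_from_hr(8.35, 4.29, 16.24)
print(f"beta = {beta_age:.3f}, SE = {se_age:.3f}")

# Round trip: the lower confidence limit should be close to the reported 4.29.
lower = math.exp(beta_age - 1.96 * se_age)
```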

Adjusted hazard ratios for this population revealed that age at surgery of >45 years, presence of distant metastatic disease, capsular invasion (T3/T4), and vascular invasion were the most significant predictors of CSS in DTC. Although RAI therapy was considered in the modeling with 53% of the DTC population receiving treatment (n=862), it was not found to be a significant predictor of CSS in this population. Additionally, we found that differentiating between N1a and N1b did not improve our model, as there was no statistical difference between the two. The final CSMC risk scoring method for CSS in DTC cases derived from the adjusted analysis is provided in Table 4. The presence of distant metastatic disease (12 points) and being over the age of 45 at the time of surgery (11 points) were found to be the largest risk factors for cancer-specific mortality, while capsular and vascular invasion status were less so, with 5 and 4 points, respectively. Based on our risk assessment criteria, patients who scored less than 10 points had the lowest overall risk of cancer-specific death, while patients with more than 19 points had the highest overall risk of cancer-specific death (Table 5). Within our risk stratification staging system, all three risk strata (low, moderate, and high) were each statistically significantly different from one another in predicting CSS time with p<0.001.

Table 4.

Cedars-Sinai Medical Center Risk Scoring

	Points
Age at surgery >45 years	11
Metastatic disease—distant	12
Capsular invasion (T3 or T4)	5
Vascular invasion	4
Total score	(range 0–32)

Table 5.

Cancer-Specific Survival Rates Based on Total Risk Score

		5 year		10 year		15 year
Total score	Risk group	Rate	[95% CI]	Rate	[95% CI]	Rate	[95% CI]
<10	Low	99.7%	[99.4%, 99.9%]	99.4%	[98.8%, 99.7%]	98.5%	[97.3%, 99.2%]
10–19	Moderate	98.4%	[97.3%, 99.0%]	96.6%	[94.7%, 97.9%]	92.2%	[88.7%, 94.7%]
>19	High	69.3%	[49.0%, 82.9%]	46.2%	[22.2%, 67.3%]	16.0%	[2.5%, 40.3%]
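The scoring rule in Tables 4 and 5 can be written as a short function. The point values and cut-offs come from those tables; the function and argument names are illustrative only.

```python
# Points per risk factor, from Table 4.
CSMC_POINTS = {
    "age_over_45": 11,
    "distant_metastasis": 12,
    "capsular_invasion": 5,   # T3 or T4
    "vascular_invasion": 4,
}


def csmc_score(age_over_45=False, distant_metastasis=False,
               capsular_invasion=False, vascular_invasion=False):
    """Sum the points for every risk factor that is present (range 0 to 32)."""
    flags = {
        "age_over_45": age_over_45,
        "distant_metastasis": distant_metastasis,
        "capsular_invasion": capsular_invasion,
        "vascular_invasion": vascular_invasion,
    }
    return sum(CSMC_POINTS[name] for name, present in flags.items() if present)


def csmc_risk_group(score):
    """Map a total score to the risk groups of Table 5."""
    if score < 10:
        return "low"
    if score <= 19:
        return "moderate"
    return "high"


score = csmc_score(age_over_45=True, capsular_invasion=True, vascular_invasion=True)
print(score, csmc_risk_group(score))  # 20 high
```

With these point values, a patient older than 45 years with both capsular and vascular invasion reaches 20 points and falls in the high-risk group even without distant metastases, whereas either age >45 or distant metastasis alone gives a moderate-risk score.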

Using the clinicopathological data within our dataset, each case was staged according to previously published staging criteria for comparison. Kaplan–Meier curves of CSS times for the applicable staging systems are presented in Figure 1 for DTC and in Figure 2 for staging systems applicable to PTC specifically. From these figures, it is apparent that the high-risk group has a poor survival rate, which requires more aggressive care. Furthermore, our low- and moderate-risk groups have significant differences in CSS times as early as 5 years after diagnosis (Table 5). The stratified risk assessment rankings by PVE and C-index are summarized in Table 6. As higher PVE values indicate a better fit of the staging criteria, our new CSMC risk stratification system and the NTCTCS system performed best for DTC cases (PVE of 5% and 4.3%, respectively). For PTC cases only, the CSMC and NTCTCS systems had the highest PVE values at 4.2% and 3.5%, respectively (Table 6).

FIG. 1.

Kaplan-Meier curves of cancer-specific survival (CSS) times for differentiated thyroid carcinomas (DTC) by various staging criteria. p<0.001 for all staging criteria.

FIG. 2.

Kaplan-Meier curves of CSS times by various staging criteria specific only to PTC pathology. p<0.001 for all staging criteria.
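For readers unfamiliar with how the curves in Figures 1 and 2 are built, the product-limit (Kaplan–Meier) estimate can be sketched in a few lines. This is a generic illustration with invented data, not the SAS code used to produce the figures. In a CSS analysis, deaths from other causes would be treated as censored observations.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one observed death."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored cases both leave the risk set
    return curve


# Invented follow-up times (years); 1 = cancer-specific death, 0 = censored.
km_curve = kaplan_meier([2, 3, 3, 5, 8, 10], [1, 0, 1, 1, 0, 0])
for t, s in km_curve:
    print(t, round(s, 3))
```

One curve is computed per stage or risk group, and the groups are then compared, for example with a log-rank test.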

Table 6.

Proportion of Variance Explained and C-Index Scores

	PTC cases only (n=1512)			All DTC cases (n=1622)
Method	PVE	C-index	[95% CI]	PVE	C-index	[95% CI]
CSMC	4.2%	0.74	[0.67, 0.81]	5.0%	0.76	[0.69, 0.82]
NTCTCS	3.5%	0.75	[0.67, 0.82]	4.3%	0.77	[0.70, 0.84]
TNM 5th ed.	3.3%	0.74	[0.67, 0.82]	4.0%	0.76	[0.70, 0.83]
MACIS^a	3.1%	0.72	[0.63, 0.80]
Noguchi^a	2.8%	0.72	[0.64, 0.80]
TNM 7th ed.	2.7%	0.72	[0.64, 0.81]	3.5%	0.75	[0.67, 0.82]
UAB-MDACC	2.7%	0.73	[0.66, 0.81]	3.4%	0.75	[0.69, 0.82]
MSK	2.4%	0.73	[0.65, 0.81]	3.1%	0.75	[0.68, 0.82]
Clinical Class	1.3%	0.63	[0.53, 0.72]	1.8%	0.65	[0.56, 0.74]
Yildirim	1.4%	0.66	[0.58, 0.74]	1.6%	0.67	[0.60, 0.74]
Ohio State	1.0%	0.66	[0.58, 0.75]	1.6%	0.66	[0.57, 0.75]

^a MACIS and Noguchi methods were scored only in PTC cases.

CSMC, Cedars-Sinai Medical Center; DTC, differentiated thyroid carcinoma; PVE, proportion of variance explained.

The PVE levels are all very low because of the limited variation in the original dataset. The C-statistic was also explored: for DTC, the NTCTCS system had a C-index of 0.77, with the CSMC and TNM 5th edition systems following closely at 0.76. For PTC cases specifically, the NTCTCS system again ranked highest with a score of 0.75, while the CSMC and TNM 5th edition systems both scored 0.74. Overall, the PVE and the C-index produced similar rankings of the staging systems.

Discussion

The CSMC Staging System developed from the adjusted Cox proportional hazard model utilizes age, the presence of distant metastasis, capsular invasion, and vascular invasion to risk stratify the patients. Our model did not include sex, tumor size, regional lymph node metastasis, multifocality, or DTC histology (papillary or follicular), as these factors were not significant independent predictors of CSS in our population. To compare the different staging systems, cases from the CSMC database were classified according to the staging criteria listed in Table 1 using systems, including our own, developed to model CSS in DTC. The ability of each staging classification to predict CSS was compared with that of other staging classification schemes by computation of the PVE (38) and the C-index (39,40).

Although the TNM system is the most widely used and universally accepted, its PVE value for this patient population is below that of the CSMC and the NTCTCS systems. However, given the relatively low PVE and C-index scores across all of the applicable staging systems, it is apparent that all staging systems used in this study are less than ideal in predicting survival in this dataset. A review of multiple studies that compare staging systems also shows relatively low PVE scores (Table 7), which suggests that while there are many staging systems to choose from, no single staging system stands out in predicting thyroid carcinoma survival rates (1–6,22,27,42–45). It is important to note that each staging system mentioned was developed using a specific database from each institution's patient population (or group of institutions, as in the NTCTCS system). While there are many common variables used in each model, each system provides a poor overall fit when applied to other patient groups. This is most apparent in the Yildirim study (27), whose proposed mathematical model is nearly identical to the CSMC staging system. However, when that model was applied to the CSMC DTC population, its PVE and C-index scores were among the lowest ranked. While the two patient populations were similar in many respects, including sex, age, cervical lymph node involvement, and multifocality, the Yildirim DTC population had a much higher mortality rate and more patients presenting with distant metastases. The Yildirim study also had a higher percentage of FTC patients and of capsular and vascular invasion cases. As a result, their relatively small dataset had much more variation and greater mortality, both of which contributed to a high PVE value in their study. When applied to the CSMC dataset, where a majority of patients are in the low-risk category with low mortality rates, the Yildirim model had low PVE values.

Table 7.

Summary of Proportion of Variance Explained Tables

DTC
Method	%
Yildirim, 2005 (27)	(n=347)
Mathematical^a	23.4%
TNM 6th ed.	21.6%
AMES	21.6%
MACIS	21.3%
EORTC	18.7%
MSK	18.2%
Passler et al., 2003 (4)	(n=440)
MACIS	16.9%
EORTC	16.3%
TNM 5th ed.	14.0%
AMES	13.2%
Ohio State	12.1%
Clinical Class	11.7%
Munster	10.9%
MSK	10.3%
UAB-MDACC	10.1%
Sherman et al., 1998 (5)	(n=1607)
TNM 4th ed.	18.0%
NTCTCS	17.0%
Ohio State	15.0%
AMES	13.0%
EORTC	12.0%
Brierley et al., 1997 (6)	(n=382)
AGES	31.5%
TNM 4th ed.	28.3%
AMES	28.1%
EORTC	28.0%
MACIS	27.3%
Ohio State	22.9%
MSK	19.2%
Clinical Class	18.5%

PTC
Method	%
Lang et al., 2007 (2)	(n=589)
MACIS	18.7%
TNM 6th ed.	17.9%
EORTC	16.6%
Noguchi	14.3%
UAB-MDACC	14.0%
NTCTCS	13.6%
CIH	12.6%
Murcia	11.4%
AMES	10.5%
MSK	10.0%
Clinical Class	9.6%
Ohio State	7.7%
Munster	6.1%
Ankara	6.0%
Passler et al., 2003 (4)	(n=293)
MACIS	15.1%
EORTC	12.9%
TNM 5th ed.	10.2%
AMES	9.7%
MSK	9.2%
Clinical Class	7.6%
Ohio State	7.4%
Munster	6.7%
UAB-MDACC	6.0%
Voutilainen et al., 2003 (22)	(n=495)
MACIS	30.0%
TNM 5th ed.	28.5%
AMES	16.3%
Sherman et al., 1998 (5)	(n=281)
MACIS	15.0%
Clinical Class	14.0%
TNM 4th ed.	13.0%
NTCTCS	12.0%
Ohio State	12.0%
AMES	10.0%
EORTC	9.0%

FTC
Method	%
Lang et al., 2007 (3)	(n=171)
TNM 6th ed.	22.4%
Clinical Class	21.2%
MACIS^b	20.4%
AIM	20.0%
EORTC	18.9%
UAB-MDACC	18.7%
AMES	18.4%
NTCTCS	18.4%
Munster	17.8%
Ankara	14.3%
Ohio State	12.7%
MSK	11.2%
Noguchi^b	11.0%
CIH^b	0.7%
D'Avanzo et al., 2004 (1)	(n=86)
MACIS^b	48.0%
AGES	46.0%
EORTC	44.0%
AMES	40.0%
TNM 4th ed.	33.0%
Passler et al., 2003 (4)	(n=147)
EORTC	17.0%
TNM 5th ed.	16.7%
Clinical Class	15.8%
Munster	15.8%
MACIS^b	15.7%
Ohio State	14.3%
AMES	13.0%
UAB-MDACC	11.2%
MSK	4.8%
Brierley et al., 1997 (6)	(n=NA^c)
TNM 4th ed.	23.3%
AGES	23.1%
MACIS	20.6%
AMES	19.9%
EORTC	18.9%
Clinical Class	15.7%
MSK	15.2%
Ohio State	14.8%

^a Mathematical model for predicting outcomes.

^b Staging systems originally applicable to PTC only.

^c Data not available (communication with author).

AGES, Age, histologic Grade, Extrathyroidal extent, and tumor Size; AIM, Mayo Clinic—Age, Invasion to blood vessels, Metastases; CIH, Cancer Institute Hospital (Tokyo, Japan).

Additionally, the PVE value will naturally be higher for any staging criteria with more classification categories because of the increase in the degrees of freedom, thus suggesting a more predictive model (1–6,22,27). It is also interesting that although the PVE has been used to evaluate staging criteria in other studies, it has not conclusively proved its utility in determining which staging system is best across patient populations (1–6,18,19). The studies that applied the PVE reported dramatically different PVE values, and no one staging system was consistently recommended across studies. Consequently, the PVE may not be an adequate method for ranking these staging systems. We therefore also computed the C-index as another comparison measure across staging systems. It is worth noting that the C-index was slightly higher for the NTCTCS system than for the CSMC system in both the DTC (0.77 and 0.76, respectively) and PTC (0.75 and 0.74, respectively) patient populations. However, even with the C-index, we were unable to demonstrate that any one staging system was significantly better than another. Nevertheless, many of the C-index scores were >0.7, which indicates a good overall model fit (41), whereas our PVE values were quite low, indicating that these staging systems are a poor fit for our data. We conclude that these staging systems are perhaps not as predictive of survival time as one would hope (1–6,27,42–45). Moreover, because DTC has a high survival rate and a majority of patients are in the low-risk category, the vast majority of observed cases are censored observations, making it difficult to mathematically model death. As a result, data sets with a low number of patients and higher mortality rates will generate higher PVE values than larger data sets with relatively few deaths, as was seen in our study.

Given that many of these staging systems were also developed many years ago using similar combinations of risk factors with comparable outcomes, the systems still encounter the same limitations and challenges. In the future, the addition of other prognostic factors may result in staging systems that will better differentiate between low- and moderate-risk populations (44,45). For example, the addition of biological markers, which are found in more aggressive DTC variants such as DNA ploidy (30,46–50), BRAF mutations (48,49), CA 19-9 (50), and proliferating cell nuclear antigen (21,48,51), may aid in survival prediction.

This study has some limitations, including a selection bias, where a majority of the population are patients of our Thyroid Cancer Center, where physicians are more likely to screen for nodules, goiters, and other thyroid abnormalities. As a result, patients with thyroid microcarcinomas (≤1 cm) are more often diagnosed, leading this population to have a higher survival rate. The treatment of our patients has generally been consistent with the recommendations published by the American Thyroid Association (ATA) (52). In recent years, we have treated fewer patients with low-risk DTC with RAI than was done in the past (53). Although we did not re-examine the histology of each case, we did review the surgical and pathology reports in detail to classify patients into PTC or FTC categories.

In addition to the ATA guidelines, there are multiple other guidelines on DTC management, with some differences in recommendations (54–58). Thus, there can be much variation between institutions (59,60). As a result, a proficiency bias also exists in comparing PVE values across studies, since it is not possible to review how each institution treated its patients (61,62). This can result in differing survival rates depending on how each institution treats its DTC patients. While it is important to consider the stage of thyroid carcinoma patients in predicting survival rates and to facilitate management of the disease, it is difficult to treat patients based solely on their stage because of varying factors, such as extent of surgery, surgical expertise, and existing comorbidities, as well as patient preference regarding the treatment plan. Therefore, it is of utmost importance to work with a multidisciplinary team of experts to determine the most successful way to treat each individual patient to optimize disease-free survival.

Footnotes

Disclosure Statement

The authors declare that no competing financial interests exist.

References

1. D'Avanzo, Ituarte, Treseler, Kebebew, Wu, Wong, Duh, Siperstein, Clark. 2004. Prognostic scoring systems in patients with follicular thyroid cancer: a comparison of different staging systems in predicting the patient outcome. Thyroid, 14:453–458.

2. Lang, Lo, Chan, Lam, Wan. 2007. Staging systems for papillary thyroid carcinoma, a review and comparison. Ann Surg, 245:366–378.

3. Lang, Lo, Chan, Lam, Wan. 2007. Staging systems for follicular thyroid carcinoma: application to 171 consecutive patients treated in a tertiary referral center. Endocr Relat Cancer, 14:29–42.

4. Passler, Prager, Scheuba, Kaserer, Zettinig, Niederle. 2003. Application of staging systems for differentiated thyroid carcinoma in an endemic goiter region with iodine substitution. Ann Surg, 237:227–234.

5. Sherman, Brierley, Sperling, Ain, Bigos, Cooper, Haugen, Ho, Klein, Ladenson, Robbins, Ross, Specker, Taylor, Maxon. 1998. Prospective multicenter study of thyroid carcinoma treatment, initial analysis of staging and outcome. Cancer, 83:1012–1021.

6. Brierley, Panzarella, Tsang, Gospodarowicz, O'Sullivan. 1997. A comparison of different staging systems predictability of patient outcome, thyroid carcinoma as an example. Cancer, 79:2414–2423.

7. Beenken, Roye, Weiss, Sellers, Urist, Diethelm, Goepfert. 2000. Extent of surgery for intermediate-risk well-differentiated thyroid cancer. Am J Surg, 179:51–56.

8. Shaha, Shah, Loree. 1996. Risk group stratification and prognostic factors in papillary carcinoma of thyroid. Ann Surg Oncol, 3:534–538.

9. Shaha, Loree, Shah. 1994. Intermediate-risk group for differentiated carcinoma of thyroid. Surgery, 116:1036–1041.

10. Shaha. 2007. TNM classification of thyroid carcinoma. World J Surg, 31:879–887.

11. DeGroot, Kaplan, McCormick, Straus. 1990. Natural history, treatment, and course of papillary thyroid carcinoma. J Clin Endocrinol Metab, 71:414–424.

12. Sherman. 1999. Toward a standard clinicopathologic staging approach for differentiated thyroid carcinoma. Semin Surg Oncol, 16:12–15.

13. Haigh, Urbach, Rotstein. 2007. AMES prognostic index and extent of thyroidectomy for well-differentiated thyroid cancer in the US. Surgery, 136:609–616.

14. Cady, Rossi. 1998. An expanded view of risk-group definition in differentiated thyroid carcinoma. Surgery, 104:947–953.

15. Wartofsky, Van Nostrand. 2006. Staging of thyroid cancer. In: Wartofsky, Van Nostrand (eds). Thyroid Cancer: A Comprehensive Guide to Clinical Management, second edition. Humana Press: Totowa, NJ, 87–95.

16. Mazzaferri, Jhiang. 1994. Long-term impact of initial surgical and medical therapy on papillary and follicular thyroid cancer. Am J Med, 97:418–428.

17. Lo, Chan, Lam, Wan. 2005. Follicular thyroid carcinoma: the role of histology and staging systems in predicting survival. Ann Surg, 242:708–715.

18. Hay, Bergstralh, Goellner, Ebersold, Grant. 1993. Predicting outcome in papillary thyroid carcinoma: development of a reliable prognostic scoring system in a cohort of 1779 patients surgically treated at one institution during 1940 through 1989. Surgery, 114:1050–1058.

19. Chaplin, O'Brien, McNeil, Haghighi. 1999. Application of prognostic scoring systems in differentiated thyroid carcinoma. Aust NZ J Surg, 69:625–628.

20. Shaha, Shah, Loree. 1998. Patterns of failure in differentiated carcinoma of the thyroid based on risk groups. Head Neck, 20:26–30.

21. Leite KRM, de Araujo, Meirelles MIR, Costa ALL, Camara-Lopes. 1999. No relationship between proliferative activity and the MACIS prognostic scoring system in papillary thyroid carcinoma. Head Neck, 21:602–605.

22. Voutilainen, Siironen, Franssila, Sivula, Haapiainen, Haglund. 2003. AMES, MACIS and TNM prognostic classifications in papillary thyroid carcinoma. Anticancer Res, 23:4282–4288.

23. Noguchi, Murakami, Kawamoto. 1994. Classification of papillary cancer of the thyroid based on prognosis. World J Surg, 18:552–558.

24. Lerch, Schober, Kuwert, Hans-Bernard. 1997. Survival of differentiated thyroid carcinoma studied in 500 patients. J Clin Oncol, 15:2067–2075.

25. Byar, Green, Dor, Williams, Colon, Van Gilse, Mayer, Sylvester, Van Glabbeke. 1979. A prognostic index for thyroid carcinoma, a study of the EORTC thyroid cancer cooperative group. Eur J Cancer, 15:1033–1041.

26. Sebastian, Gonzalez JMR, Paricio, Perez, Flores, Madrona, Romero, Tebar. 2000. Papillary thyroid carcinoma, prognostic index for survival including the histological variety. Arch Surg, 135:272–277.

27. Yildirim. 2005. A model for predicting outcomes in patients with differentiated thyroid cancer and model performance in comparison with other classification systems. J Am Coll Surg, 200:378–392.

28. Shah, Loree, Dharker, Strong, Begg, Vlamis. 1992. Prognostic factors in differentiated carcinoma of the thyroid gland. Am J Surg, 164:658–661.

29. Shaha. 2004. Implications of prognostic factors and risk groups in the management of differentiated thyroid cancer. Laryngoscope, 114:393–402.

30. Pasieka, Zedenius, Auer, Grimelius, Hoog, Lundell, Wallin, Backdahl. 1992. Addition of nuclear DNA content to the AMES risk-group classification for papillary thyroid cancer. Surgery, 112:1154–1159.

31. Edge, Byrd, Carducci, Compton, Fritz, Green, Trotti. 2009. AJCC Cancer Staging Manual, seventh edition. Springer: New York.

32.

Davies

, Welch

. 2006. Increasing incidence of thyroid cancer in the United States, 1973–2002. JAMA, 295:2164–2167.

33.

Cox

. 1972. Regression models and life-tables. J Roy Stat Soc Ser B, 34:187–220.

34.

Collett

. 2003. Modeling survival data. Collett

. Modelling Survival Data in Medical Research, Second. Chapman & Hall/CRC Press: Boca Raton, FL, 54–106.

35.

Effron

. 1979. Bootstrap methods: another look at the jackknife. Ann Stat, 7:1–26.

36.

Brunelli

, Rocco

. 2006. Internal validation of risk models in lung resection surgery: Bootstrap versus training-and-test sampling. J Thorac Cardiovasc Surg, 131:1243–1247.

37.

Sullivan

, Massaro

, D'Agostino BD

. 2004. Presentation of multivariate data for clinical use: the Framingham Study risk score. Stat Med, 23:1631–1660.

38.

Schemper

. 1993. The relative importance of prognostic factors in studies of survival. Stat Med, 12:2377–2382.

39.

Kremers

. 2007. Concordance for survival time data: fixed and time-dependent covariates and possible ties in predictor and time. Department of Health Sciences Research and the William J. von Liegib Transplant Center, Mayo Clinic: Rochester, MN.

40.

Harrell

, Lee

, Mark

. 1996. Tutorial in biostatistics, multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors. Stat Med, 15:361–387.

41.

Hosmer

, Lemenshow

. 2000. Applied Logistic Regression, second. John Wiley and Sons: New York.

42.

Steinmuller

, Klupp

, Rayes

, Ulrich

, Jonas

, Graf

, Neuhas

. 2000. Prognostic factors in patients with differentiated thyroid carcinoma. Eur J Surg, 166:29–33.

43.

Kingma

, Van den Bergen

, De Vries

. 1991. Prognostic scoring systems in differentiated thyroid carcinoma: which is the best? Neth J Surg, 43:63–66.

44.

Harness

, McLeod

, Thompson

, Noble

, Burney

. 1998. Deaths due to differentiated thyroid cancer: a 46 year perspective. World J Surg, 12:623–629.

45.

Zafon

. 2012. Papillary thyroid microcarcinoma: do classical staging systems need to be changed? Ward

. Thyroid and Parathyroid Diseases: New Insights into Some Old and Some New Issues. www.intechopen.com/books/thyroid-and-parathyroid-diseases-new-insights-into-some-old-and-some-new-issues/papillary-thyroid-microcarcinoma-do-classical-staging-systems-need-to-be-changed. 2012 August 6.

46.

Nishida

, Nakao

, Hamaji

, Nakahara

, Tsujimoto

. 1996. Overexpression of p53 protein and DNA content are important biologic prognostic factors for thyroid cancer. Surgery, 119:568–575.

47.

Cohn

, Backdahl

, Forsslung

, Auer

, Zetterberg

, Lundell

, Granberg

, Lowhagen

, Willems

, Cady

. 1984. Biological considerations and operative strategy in papillary thyroid carcinoma: arguments against the routine performance of total thyroidectomy. Surgery, 96:957–971.

48.

Grogan

, Mitmaker

, Clark

. 2010. The evolution of biomarkers in thyroid cancer-from mass screening to a personalized biosignature. Cancers, 2:885–912.

49.

Capri

, Mechanick

, Saussex

, Nicolini

. 2010. Thyroid tumor marker genomics and proteomics: diagnostic and clinical implications. J Cell Physiol, 224:612–619.

50.

Hashimoto

, Matsubara

, Mizukami

, Miyazaki

, Michigishi

, Yaniahara

. 1990. Tumor markers and oncogene expression in thyroid cancer using biochemical and immunohistochemical studies. Endocrinol Jpn, 37:247–254.

51.

Liang

, Zhong

, Luo

, Huang

, Lin

, Zhan

, Xie

, Li

. 2011. Diagnostic value of 16 cellular tumor markers for metastatic thyroid cancer: an immunohistochemical study. Anticancer Res, 31:3433–3440.

52.

Cooper

, Doherty

, Haugen

, Kloos

, Lee

, Mandel

, Mazzaferri

, McIver

, Pacini

, Schlumberger

, Sherman

, Steward

, Tuttle

. 2009. Revised American Thyroid Association management guidelines for patients with thyroid nodules and differentiated thyroid cancer. Thyroid, 19:1167–1214.

53.

Sacks

, Fung

, Chang

, Waxman

, Braunstein

. 2010. The effectiveness of radioactive iodine treatment of low risk thyroid cancer: a systematic analysis of the peer reviewed literature from 1966 to April 2008. Thyroid, 20:1235–1245.

54.

American Association of Clinical Endocrinologists. 2001. AACE/AAES medical/surgical guidelines for clinical practice management of thyroid carcinoma. Endocr Pract, 7:202–220.

55.

British Thyroid Association and Royal College of Physicians. 2007. Guidelines for the management of thyroid cancer. second www.british-thyroid-associations.org. 2012 October 1.

56.

National Comprehensive Cancer Network 2009 Thyroid carcinoma. www.nccn.org/professionals/physicians_gls/PDF/thyroid.pdf. 2012 October 8.

57.

Pacini

, Schlumberger

, Dralle

, Elisei

, Smit

, Wiersinga

. 2006. European consensus for the management of patients with differentiated thyroid carcinoma of the follicular epithelium. Eur J Endocrinol, 154:787–803.

58.

Luster

, Clarke

, Dietlein

, Lassmann

, Lind

, Oyen

, Tennvall

, Bombardieri

. 2008. Guidelines for radioiodine therapy of differentiated thyroid cancer. Eur J Nucl Med Mol Imaging, 35:1941–1959.

59.

Famakinwa

, Sanziana

, Wang

, Sosa

. 2010. ATA practice guidelines for the treatment of differentiated thyroid cancer: were they followed in the United States? Am J Surg, 199:189–198.

60.

Gulcelik

, Gulcelik

, Kuru

, Camlibel

, Alagol

. 2007. Prognostic factors determining survival in differentiated thyroid cancer. J Surg Oncol, 96:598–604.

61.

Krisha

, Maithreya

, Surapaneni

. 2010. Research bias: a review for medical students. J Clin Diag Res, 4:2320–2324.

62.

Johnston

. 2002. Moving forward by looking back: retrospective clinical studies. J Orthod, 29:221–226.