Abstract
Background:
Although the importance of tumor size in papillary thyroid cancer (PTC) is well established, there is no research investigating whether age modifies the impact of tumor size, and there is conflicting evidence regarding optimal size thresholds for prognostic discrimination. We aimed to verify that tumor size is an independent prognostic factor in PTC, investigate the impact of patient age, and identify optimal size cutoffs for risk stratification using objective measures of model performance.
Methods:
We performed a retrospective analysis of 574 patients with PTC, using multivariate Cox regression models to test the impact of tumor size on recurrence-free survival (RFS). Subgroup analyses were performed in patients aged <55 and ≥55 years. Exploratory analyses to identify optimal size cutoffs for prognostic discrimination were performed using the proportion of variation explained (PVE) and Harrell's C-index.
Results:
Tumor size predicted RFS on multivariate analysis in the overall study cohort (hazard ratio [HR] 1.16; [95% confidence interval (CI) 1.01–1.34]; p = 0.038). In subgroup analysis, there was no association between tumor size and RFS in patients aged <55 years (HR 1.11; [CI 0.89–1.38]; p = 0.362). In contrast, size was an independent predictor of RFS in patients aged ≥55 years (HR 1.52; [CI 1.11–2.07]; p = 0.009). In this subgroup, an optimal size threshold of >2 cm versus ≤2 cm (HR 5.24; [CI 2.30–11.92]; p < 0.001; PVE: 36%; C-index: 0.66) provided the greatest prognostic discrimination. There was no incremental improvement in prognostic value by further stratification of size.
Conclusion:
In our PTC cohort, the impact of tumor size on RFS was limited to patients aged ≥55 years. A single size threshold of 2 cm maximized prognostic discrimination, with tumors >2 cm associated with a five times higher risk of recurrence than those ≤2 cm. These findings need to be validated in independent large cohorts and the potential management and staging implications further studied.
Introduction
The eighth edition of AJCC staging utilizes size thresholds of 1, 2, and 4 cm, stratifying tumors into T1a, T1b, T2, and T3a categories, respectively, based on size to predict survival. In contrast, the revised ATA Initial Stratification System for risk of recurrence in 2015 only incorporated the 1 cm cutoff, reflecting the excellent prognosis of papillary microcarcinomas. The 2 and 4 cm cutoffs may have been based, at least in part, on convention in head and neck cancer, which includes mucosal squamous cell carcinomas that exhibit vastly different disease biology and outcomes compared with PTC. There is limited research specifically addressing the prognostic performance of the T staging system in PTC to validate the presently used size cutoffs. Within the literature, evidence regarding optimal size thresholds for prognostic discrimination in PTC is conflicted, with different studies selecting thresholds of 1, 1.5, 2, 3, and 4 cm (2,7,11–15).
In addition, while the association between age and survival is recognized in both TNM staging and other risk stratification systems, to our knowledge no studies have investigated whether age modifies the prognostic impact of tumor size. With respect to recurrence, size plays a minor role in the ATA Initial Risk Stratification system and age is not included. In contrast, multiple studies have suggested that the presence of nodal metastases portends worse survival and recurrence outcomes in older patients with well-differentiated thyroid cancer (WDTC) (16–18). Therefore, it is reasonable to question whether age may also modify the impact of tumor size on recurrence.
The primary aim of this study was to determine if tumor size is an independent predictor of recurrence in PTC. Our secondary aims were (1) to establish whether this prognostic impact depends on patient age, and (2) to identify optimal size cutoffs for stratifying recurrence risk using objective measures of model performance.
Methods
Study population
The study population was identified from a retrospective database review of all patients undergoing surgery for PTC from 1987 through 2016 at Liverpool Hospital, Sydney, Australia. The study was approved by the Area Health Service Ethics Review Board. Tumor size was defined by the greatest diameter on direct measurement at histopathology. Additional demographic, pathological, treatment, and follow-up data were collected for all patients.
Surgical treatment and adjuvant therapy
At our institution, total thyroidectomy was performed for most malignancies, with selected papillary microcarcinomas (≤1 cm in size) treated with hemithyroidectomy at the discretion of the treating surgeon. During the early study period, central neck dissection was performed only when suspicious nodes were identified on preoperative imaging or intraoperative assessment. During the later study period, prophylactic central neck dissection was generally performed in patients with tumors ≥1 cm in size. Dissection of the lateral neck was only performed in patients with confirmed lateral neck nodal metastases based on either preoperative fine needle aspiration or intraoperative frozen section, and generally included compartments II–V. Patients were selected for adjuvant radioactive iodine (RAI) therapy based on tumor size (≥1 cm), gross or microscopic extrathyroidal extension (ETE), involved surgical margins, nodal metastases, and aggressive histologic variants of PTC. Thyrotropin suppression was implemented for most patients based on risk stratification and medical contraindications as per surgeon discretion. Patients were reevaluated every 6 months in the first 2 years posttreatment and every 12 months thereafter. Follow-up consisted of clinical history, examination, thyroglobulin (Tg) monitoring, and functional and structural imaging using diagnostic RAI scans and neck ultrasound. Computed tomography and positron emission tomography were utilized selectively if clinically indicated.
Statistical analysis
Data were collated using Microsoft Excel (Microsoft, Redmond, WA) and statistical analysis was performed using Stata version 12.0 SE (Stata Corporation, College Station, TX). All statistics were two sided, and a p value <0.05 was considered statistically significant. The clinical endpoint was recurrence-free survival (RFS) calculated from the date of surgery to the date of first disease recurrence. Disease recurrence was defined as (i) disease confirmation on pathologic analysis of a suspicious structural lesion by cytology or excisional biopsy, (ii) grossly abnormal imaging findings in the presence of rising Tg or Tg-antibodies, or (iii) persistent or rising Tg values that prompted additional RAI therapy (biochemical recurrence).
The association between median tumor size and other adverse prognostic factors was assessed using the Kruskal–Wallis test. Univariate Cox proportional hazards regression was used to test the association between tumor size and RFS. Other potential covariates including age at diagnosis (<55, ≥55 years), sex, ETE (nil, microscopic, macroscopic), multifocality, PTC subtype, pathological nodal stage (pN0, pN1a, pN1b), and extranodal extension (ENE) were also tested for prognostic significance. Statistically significant and clinically important covariates were used to develop a multivariate Cox proportional hazards regression model to assess the independent prognostic significance of tumor size. Survival curves were generated by the Kaplan–Meier method when appropriate.
Exploratory analyses were then performed to identify optimal cutoff points for tumor size with an a priori decision to dichotomize size based on 1 mm increments. Prognostic performance was evaluated using the proportion of variation explained (PVE), Harrell's concordance index (C-index), and visual inspection of Kaplan–Meier curves for stratification into distinct prognostic categories. The relative effectiveness of one risk prediction model compared with another is most commonly determined by calculating the PVE. The PVE ranges from 0% to 100%, with a higher number indicating superiority in predicting RFS (19). The C-index provides a measure of model discrimination, with a value of 1 indicating perfect prediction, while 0.5 is equivalent to the toss of a coin (20). Discrimination is the ability of a model to distinguish individuals who experience the outcome from those who remain event free. For a prognostic model, the C-index is the chance that, given two individuals, one who will develop the event of interest and one who will remain event free, the prediction model will assign a higher probability of the event to the former. Since size data are combined with ETE, pN, and age for staging purposes, sensitivity analyses were performed to determine if optimal size thresholds differ after adjustment for these factors.
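To make the C-index definition above concrete, the following is a minimal sketch in Python of a pairwise concordance calculation for right-censored data. It is an illustration only and not the Stata routine used in this study; the function name `harrell_c_index` and the toy values are hypothetical, and pairs with tied follow-up times are simply skipped as a simplifying assumption.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Pairwise concordance for right-censored data (illustrative sketch).

    times       -- follow-up time for each patient
    events      -- 1 if recurrence was observed, 0 if censored
    risk_scores -- model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] <= times[j] else (j, i)
        # A pair is usable only if the shorter time is an observed event.
        # Simplifying assumption: pairs with tied times are skipped.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # earlier event had the higher predicted risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data (not study data): five patients.
toy_times = [1.0, 2.0, 3.0, 4.0, 5.0]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.5, 0.3, 0.1]

c = harrell_c_index(toy_times, toy_events, toy_risk)
print(c)  # 0.875 (7 of 8 comparable pairs are concordant)
```

In this toy example a constant risk score for every patient would return 0.5, matching the "toss of a coin" interpretation given above.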
Results
Patient demographics
The study population consisted of 574 patients with PTC, including 455 women and 119 men, with a median age of 47.7 years (range: 14.9–86.9 years) and median follow-up of 3.3 years (mean 4.6 years). There were 534 total thyroidectomies (including 59 completion thyroidectomies), 38 hemithyroidectomies, and 2 isthmusectomies. Central neck dissection was performed in 230 patients (40.1%), generally as a prophylactic procedure, and lateral neck dissection was performed in 83 patients (14.5%), all of which were for clinically or radiologically evident disease. Postoperative RAI was administered in 393 (68.5%) patients. In terms of histological subtype, classic PTC was diagnosed in 431 (75.1%) patients, follicular variant in 120 (20.9%), oncocytic variant in 13 (2.3%), diffuse sclerosing variant in 6 (1.1%), and tall cell variant in 4 (0.7%) patients. A summary of relevant demographic and clinicopathological data is provided in Table 1.
T stage is based on the American Joint Committee on Cancer Staging Manual, eighth edition. pT, pathological T stage; pN, pathological nodal (N) stage.
Primary tumor size
The mean and median maximum tumor size were 1.6 cm and 1.2 cm, respectively (range 0.3–11.5 cm). There were statistically significant associations between increasing tumor size and lymphovascular invasion (p < 0.001), nodal metastases (p < 0.001) and ETE (p < 0.001). A trend was also noted for ENE (p = 0.075). As expected, patients with larger tumors were more likely to undergo total thyroidectomy (p < 0.001) and prophylactic central neck dissection (p < 0.001), and receive RAI (p < 0.001).
Tumor size and recurrence-free survival
There were 67 recurrences, including 40 locoregional, 7 with distant metastases and 20 biochemical recurrences. The median time to recurrence was 1.6 years, with the latest recurrence occurring 14 years after surgery. On univariate analysis, tumor size was significantly associated with RFS (p < 0.001). As shown in Table 2, other factors associated with reduced RFS were age ≥55 years (p = 0.026), male sex (p = 0.004), microscopic ETE (p = 0.004), macroscopic ETE (p = 0.002), central and lateral neck nodal metastases (pN1a, p < 0.001; pN1b p < 0.001), and ENE (p < 0.001). Statistically significant covariates were used to develop a multivariate model, in which maximum tumor size remained a significant predictor of RFS (hazard ratio, HR, 1.16; [confidence interval, CI, 1.01–1.34]; p = 0.038) after adjusting for age and nodal metastases. This translates to a 16% increased risk of disease recurrence with every 1 cm increase in size of the primary tumor. Since the distribution of tumor size data was right skewed, we repeated the analysis after logarithmic transformation and found consistent results on both univariate (p = 0.001) and multivariate (p = 0.025) analyses. Similarly, the analysis was repeated with size as a categorical variable using the current 2 and 4 cm thresholds, and results remained consistent.
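The per-centimeter interpretation above follows from the log-linear form of the Cox model. As a brief worked sketch (the symbols $h_0$, $\beta$, $x$, and $k$ are notation introduced here for illustration, with $x$ denoting tumor size in cm and $k$ a size difference in cm):

```latex
h(t \mid x) = h_0(t)\, e^{\beta x}
\qquad\Longrightarrow\qquad
\mathrm{HR}_{\text{per 1 cm}} = e^{\beta} = 1.16,
\quad \beta = \ln 1.16 \approx 0.148

\mathrm{HR}_{\text{per } k \text{ cm}} = e^{k\beta} = 1.16^{k},
\qquad \text{e.g. } k = 2:\; 1.16^{2} \approx 1.35
```

Under this model the effect is multiplicative rather than additive, so a 2 cm difference corresponds to roughly a 35% higher hazard rather than 32%. This extrapolation assumes that the log-hazard is linear in size across the observed range.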
CI, 95% confidence interval; ETE, extrathyroidal extension; HR, hazard ratio; NS, not significant.
Interaction between tumor size and age
We then performed subgroup analysis of tumor size in patients aged <55 and ≥55 years. In patients <55 years old, we found no association between tumor size and RFS on multivariate analysis (HR 1.11; [CI 0.89–1.38]; p = 0.362). In contrast, in patients aged ≥55 years, tumor size was a statistically significant predictor of RFS on multivariate analysis after adjusting for sex, ETE, nodal metastases and ENE (HR 1.52; [CI 1.11–2.07]; p = 0.009). In this subgroup, the clinical impact was also higher, with a 52% increase in risk of recurrence with each 1 cm increase in tumor size. Figure 1 demonstrates the difference in relative hazard of recurrence with increasing size based on patient age, after adjusting for ETE and pathological nodal stage. Again, these results were consistent when size was analyzed as a log-transformed or categorical variable. The Kaplan–Meier curves for RFS based on the existing AJCC size categories of 2 and 4 cm are shown in Figure 2. In our population, the risks of recurrence in the 2–4 cm (HR 2.16, [CI 1.28–3.66]; p = 0.004) and ≥4 cm groups (HR 2.28, [CI 0.96–5.41]; p = 0.061) were similar. When directly compared, there was no statistically significant difference in RFS between tumors >4 cm versus those 2–4 cm in size (p = 0.906).

Figure 1. Plot of relative hazard of recurrence based on increasing tumor size (after adjustment for extrathyroidal extension and nodal metastases) in patients <55 vs. ≥55 years of age, showing that the adverse prognostic impact of increasing tumor size occurs predominantly in older patients.

Figure 2. Recurrence-free survival stratified by current American Joint Committee on Cancer size categories of 2 and 4 cm, showing similar survival between patients with tumor size 2–4 cm and ≥4 cm.
Optimal size thresholds
We proceeded to perform exploratory analyses to identify optimal size categories for prognostic discrimination by dichotomizing tumor size based on 1 mm increments in patients aged ≥55 years. Based on the PVE and C-index, we found an optimal cutoff of 2 cm (>2 cm versus ≤2 cm: HR 5.24, [CI 2.30–11.92]; p < 0.001; PVE: 36%; C-index: 0.66). The optimal cutoff remained at 2 cm after adjustment for ETE and the presence of nodal metastases (HR 4.70, [CI 1.70–12.97]; p = 0.003; PVE: 49%; C-index: 0.77). The analysis was repeated within the ≤2 cm and >2 cm groups; however, we failed to identify further tumor size thresholds that provided additional prognostic discrimination within each size category. In support of our results, we found a statistically significant interaction between age and tumor size (p = 0.024), supporting the finding that the impact of tumor size on recurrence risk is limited to the older patient subgroup. This is depicted in the form of a Kaplan–Meier curve in Figure 3. The reduced RFS in the group with tumors >2 cm in size and aged ≥55 years was apparent despite this group being treated more aggressively than those with tumors ≤2 cm, with higher rates of total thyroidectomy (100% versus 90%, p = 0.059), prophylactic central neck dissection (48% versus 14%, p < 0.001) and RAI administration (100% versus 50%, p < 0.001).
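The threshold search described above can be sketched as a simple scan over candidate cutoffs in 1 mm steps. The Python below is a minimal illustration under stated assumptions, not the analysis code used in this study (which was run in Stata): it ranks cutoffs by the C-index only and does not compute the PVE, the function names `c_index_binary` and `scan_cutoffs` are hypothetical, and the ten-patient dataset is invented purely to show the mechanics.

```python
def c_index_binary(times, events, high_risk):
    """Concordance when the only predictor is a 0/1 group flag."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for a in range(n):
        if not events[a]:
            continue  # a censored patient cannot anchor a comparable pair
        for b in range(n):
            if times[b] > times[a]:  # b was still recurrence free at a's event
                comparable += 1
                if high_risk[a] > high_risk[b]:
                    concordant += 1.0
                elif high_risk[a] == high_risk[b]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")


def scan_cutoffs(sizes_mm, times, events, candidates_mm):
    """Return (cutoff_mm, c_index) for the cutoff with the highest C-index.

    Patients with size > cutoff form the high-risk group. Ties between
    cutoffs are resolved in favour of the first (smallest) candidate.
    """
    best = None
    for cut in candidates_mm:
        flags = [1 if s > cut else 0 for s in sizes_mm]
        if sum(flags) in (0, len(flags)):
            continue  # one group is empty, so there is nothing to compare
        c = c_index_binary(times, events, flags)
        if best is None or c > best[1]:
            best = (cut, c)
    return best


# Hypothetical toy data (not study data): size in mm, follow-up in years.
toy_sizes = [5, 8, 12, 15, 20, 21, 25, 30, 35, 45]
toy_times = [9.0, 8.0, 7.5, 3.5, 7.0, 2.0, 2.5, 1.5, 3.0, 1.0]
toy_events = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]

best_cut, best_c = scan_cutoffs(toy_sizes, toy_times, toy_events, range(1, 50))
print(best_cut, round(best_c, 3))  # 20 0.821
```

Because many cutoffs are compared on the same data, a threshold selected in this way is data driven and tends to look better than it would in new patients, which is one reason external validation of any selected cutoff is needed.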

Figure 3. Recurrence-free survival stratified based on age (<55 vs. ≥55 years) and primary tumor size (≤2 cm vs. >2 cm) in the overall study cohort, demonstrating that the prognostic impact of tumor size is limited to the older patient subgroup.
Discussion
This study confirms the well-accepted relationship between primary tumor size and recurrence in an Australian cohort of 574 patients with papillary thyroid carcinoma. Since the adverse prognostic impact of nodal metastases in PTC has been shown to be predominantly in older patients, we hypothesized that a similar relationship might exist for primary tumor size (16–18). This was confirmed in subgroup analyses, which showed that the association between tumor size and RFS is weak in younger (<55 years) patients and not statistically significant on multivariate analysis (p = 0.362). In contrast, tumor size was a significant predictor of RFS on multivariate analysis in those aged ≥55 years. Assuming a linear relationship, this translated to a 52% increase in relative risk of recurrence per centimeter increase in tumor size. To our knowledge, this is the first study to show that the prognostic impact of tumor size in PTC appears to be limited to older patients.
Exploratory analyses in patients aged ≥55 years identified a single cutoff at 2 cm that maximized prognostic discrimination, with tumors >2 cm in size associated with a 5-fold increase in risk of recurrence compared with tumors ≤2 cm. This was despite the fact that these patients received more aggressive treatment; hence, our results may underestimate the true adverse prognostic impact of increasing tumor size. This result is consistent with the 2 cm size threshold recommended by some authors and used in the AJCC T staging system. Most notably, Ito et al. demonstrated that tumor size >2 cm was a strong predictor of lymph node recurrence in their review of 3219 patients (13). Similarly, in their study of 500 patients, Machens et al. demonstrated that tumor size >2 cm was associated with a significantly greater burden of extrathyroidal extension, lymph node metastases and distant metastases (14).
As shown in Figure 1, in patients aged ≥55 years, there appeared to be a continuum of increased risk of recurrence with increasing primary tumor size. Despite this, we found no improvement in prognostic prediction by sub-stratifying tumors >2 cm in size. However, this may reflect limited statistical power since only 13.4% of patients aged ≥55 had tumors 2–4 cm in size and 6.7% were >4 cm. In support of this study's findings, Wang et al. demonstrated similar disease-specific survival between patients with tumor size <1 cm and 1–2 cm in their study of 1522 patients of all ages (21). Similarly, in a large, population-level study utilizing the National Cancer Data Base and the Surveillance, Epidemiology, and End Results program, Anderson et al. demonstrated similar overall survival between T1a and T1b patients of all ages (22). However, our results are in contrast to the American National Registry study by Bilimoria and colleagues. Although not the primary outcome examined in the study, the authors demonstrated a significant reduction in recurrence for tumors <1 cm compared to tumors 1–2 cm in size, with tumor size examined independent of age (11).
Our finding that there is no additional prognostic value in stratifying tumors >2 cm in size contradicts several studies. In the multi-center study of 1077 patients by Mazzaferri et al., analysis of size did not reach significance in tumors <1.5 cm but shared a linear relationship with recurrence in tumors >1.5 cm (2). Simpson et al. found that tumors >4 cm in size had worse disease-specific survival than those in the <1 and 1–4 cm groups in their review of 1074 patients (7). When compared to tumors <3 cm in size, Shah and colleagues found that tumors 3–5 cm and >5 cm conferred a 40% and 170% relative increase in risk of disease-specific death on multivariate analysis, respectively, in 931 patients (5). Finally, Shaha et al. also found that tumors >4 cm independently predicted worse disease-specific survival (6). It is important to note that most of these studies also included follicular thyroid carcinomas in their analysis, which may have influenced the results since follicular carcinomas tend to be larger at presentation (14), are diagnosed in older patients, and are associated with worse outcomes (2,23). Furthermore, these older studies may not accurately reflect current clinical practice, particularly given the increasing number of small papillary thyroid cancers being diagnosed and the changes in diagnosis of recurrence with routine use of high-resolution ultrasound and sensitive thyroglobulin assays.
Although previous authors have recommended a range of optimal size cutoffs for PTC, it is important to note that our analysis is limited to the older subset of patients in whom size appeared to be an important prognostic factor, and this may also, in part, account for differences in our study. To our knowledge, there are no previous studies analyzing the impact of tumor size in the context of patient age, despite the long-standing recognition of age as an important prognostic factor in WDTC and its critical role in the AJCC TNM staging system. Furthermore, most adverse risk features in papillary thyroid carcinoma presumably represent a spectrum of risk from a disease biology perspective, whether this be patient age, the extent of extrathyroidal extension, the volume of nodal metastatic disease, or tumor size. However, sensible categorization of variables allows clinical utility in terms of informing clinical decision making, incorporation into prognostic systems, research studies, and communication between clinicians.
Our findings may have a number of potential management implications, although the current retrospective analysis was not designed to investigate these. Even within the framework of the ATA Management Guidelines, there are wide variations in practice internationally between centers. Many clinicians base decisions regarding extent of surgery, including suitability for lobectomy versus total thyroidectomy and the role of prophylactic central neck dissection, largely on tumor size without consideration of patient age. Similarly, the administration and dosing of adjuvant radioactive iodine may need to be reconsidered in the context of the patient's age if the main adverse feature is primary tumor size. Finally, our results may also help provide guidance on the intensity of surveillance post treatment by more accurate risk stratification. Clearly these issues need to be addressed carefully in future studies if our results are validated by other institutions.
There are a number of limitations in this study that warrant consideration. Firstly, the mean follow-up of 4.6 years means that a proportion of late recurrences may not have been captured in our dataset. Secondly, the study is underpowered to use the endpoint of overall survival, which the TNM system is designed to predict, as well as disease-specific survival and local recurrence, which would be of interest. Limitation in power may also contribute to the finding that size has no prognostic value for patients aged <55 years, and to our failure to find useful size thresholds beyond 2 cm in patients aged ≥55 years. Thirdly, due to the retrospective study design, patients with larger tumors received more aggressive treatment, and hence this study may underestimate the true extent of clinical impact from increasing tumor size.
Conclusion
To our knowledge, this is the first study to show that the impact of tumor size on risk of recurrence in PTC appears to be limited to patients aged ≥55 years. In this group, we found a single size threshold of 2 cm provided maximum prognostic discrimination, with tumors >2 cm associated with a five times higher risk of recurrence than those ≤2 cm. There was no incremental improvement in prognostic discrimination by further stratification of size. Tumor size did not appear to impact RFS in younger patients. These findings need to be validated in independent large cohorts and the potential management and risk stratification implications further studied.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
