Abstract
Background:
This study used a propensity score analysis to assess the roles of core-needle biopsy (CNB) and fine-needle aspiration (FNA) in the evaluation of thyroid incidentalomas detected on 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT).
Methods:
The study population was obtained from a historical cohort who underwent 18F-FDG PET/CT between October 2008 and September 2015. Patients who underwent ultrasound-guided CNB or FNA for incidental focal uptake of 18F-FDG in the thyroid gland on PET/CT were included. The primary study outcome was the inconclusive result rate in the CNB and FNA groups. The secondary outcome measures included the non-diagnostic result rate and the diagnostic performance for neoplasms. Multivariate analysis, propensity score matching, and inverse probability weighting were conducted.
Results:
A total of 1360 nodules from 1338 patients were included in this study: 859 nodules from 850 patients underwent FNA, and 501 nodules from 488 patients underwent CNB. Compared to FNA, CNB demonstrated a significantly lower inconclusive result rate in the pooled cohort (23.8% vs. 35.4%; p < 0.001), propensity score-matched cohorts (22.9% vs. 36.6%; p < 0.001), and with inverse probability weighting (22.4% vs. 35.2%; p < 0.001). Non-diagnostic result rates were also significantly lower in CNB than in FNA. The diagnostic performance of the two groups in the pooled and matched cohorts was similar, with no significant differences found.
Conclusions:
The significantly lower inconclusive result rates in CNB than in FNA were consistent within the propensity score-matched cohorts. Therefore, CNB appears to be a promising diagnostic tool for patients with thyroid incidentalomas detected on 18F-FDG PET/CT.
Introduction
Core-needle biopsy (CNB) has been suggested as an alternative approach for collecting samples for thyroid nodule evaluation, as it is safe, well tolerated, and associated with a low incidence of complications (13–18). A recent meta-analysis demonstrated that CNB resulted in a significantly lower rate of inconclusive results (8.0%) than did FNA (40.2%) (19). Additionally, when thyroid nodules with previous non-diagnostic (11,14) or atypia of undetermined significance/follicular lesion of undetermined significance (AUS/FLUS) (14,15) FNA cytology results were re-examined, CNB yielded a lower rate of inconclusive results than did FNA. Given this advantage of CNB for thyroid nodules, it was hypothesized that CNB may reduce the inconclusive result rate in the diagnosis of thyroid incidentalomas detected on 18F-FDG PET/CT. To the authors' knowledge, there are no studies comparing CNB and FNA for diagnosing thyroid incidentalomas detected on 18F-FDG PET/CT. This study therefore assessed the role of CNB and FNA in the evaluation of thyroid incidentalomas detected on 18F-FDG PET/CT using propensity score analysis.
Materials and Methods
This observational study was approved by the Institutional Review Board of the Asan Medical Center, and the requirement for informed consent for inclusion in the study was waived. Written informed consent for thyroid US and US-guided biopsy procedures was obtained from all patients before each US examination. The methods and reporting of results are in accordance with the Strengthening the Reporting of Observational Studies in Epidemiology statement (20). There was no external financial support for this study.
Study population
The study population was obtained from a historical cohort of 96,942 consecutive patients who underwent 18F-FDG PET/CT at the Asan Medical Center (a 2700-bed academic tertiary referral hospital in Seoul, Republic of Korea) between October 2008 and September 2015. Patients were included if they (i) underwent 18F-FDG PET/CT for workup of non-thyroidal malignancies or a health checkup, (ii) had an incidentally detected focal uptake in the thyroid gland on 18F-FDG PET/CT, and (iii) underwent US-guided CNB or FNA. Patients were excluded from the study population if they (i) had known thyroid disease or (ii) did not undergo US-guided CNB or FNA (Fig. 1). Finally, 1338 patients were included in the study. A malignant diagnosis was made when malignancy was confirmed on the surgical specimen, CNB histology, or FNA cytology. A diagnosis of a benign nodule was made when either of the following criteria was met: (i) confirmation from a surgical specimen, or (ii) benign CNB histology or FNA cytology findings.

Fig. 1. Patient flow in the study. CNB, core-needle biopsy; FNA, fine-needle aspiration.
18F-FDG PET/CT acquisition and image analysis
All patients fasted for at least 6 h before their 18F-FDG PET/CT examination. Patients were injected with 370–555 MBq of 18F-FDG. Venous blood glucose levels were controlled to ensure that they were <150 mg/dL. Patients were allowed to rest in a sitting or supine position for 60 min prior to scanning. Thereafter, PET/CTs were obtained using a Biograph Sensation 16, Biograph Truepoint 40 (Siemens Healthcare, Erlangen, Germany), Discovery STe 8, Discovery 690, Discovery 690 Elite, or Discovery 710 (GE Healthcare, Waukesha, WI) scanner. CT scans were performed first, followed by a PET acquisition from the upper thighs to the skull base. CT scan parameters were as follows: 120 kVp, Care dose 4D, and a slice thickness of 5 mm for the Biograph Sensation 16 and Biograph Truepoint 40 scanners, and 140 kVp, auto mA, and a slice thickness of 3.75 mm for the Discovery STe 8, Discovery 690, Discovery 690 Elite, and Discovery 710 scanners. No intravenous contrast agent was used. PET images were reconstructed by an iterative algorithm with attenuation correction. Voxel sizes were 3.91 mm × 3.91 mm × 5.0 mm for the Biograph Sensation 16, 2.98 mm × 2.98 mm × 5.0 mm for the Biograph Truepoint 40, 3.91 mm × 3.91 mm × 3.75 mm for the Discovery STe 8, and 2.60 mm × 2.60 mm × 3.75 mm for the Discovery 690, Discovery 690 Elite, and Discovery 710. CT images, attenuation-corrected PET images, and combined PET/CT images were evaluated visually by experienced board-certified nuclear medicine physicians (>10 years of clinical experience) using a dedicated workstation (True D for Siemens; AW for GE Healthcare).
US-guided FNA and CNB procedures
Thyroid US was performed on one of four US systems equipped with a high-frequency linear transducer: an iU22, an HDI-5000 (Philips Healthcare, Bothell, WA), an EUB-7500 (Hitachi Medical Systems, Tokyo, Japan), or an Aixplorer (SuperSonic Imagine, Aix-en-Provence, France). US-guided procedures were performed by radiologists under the supervision of three faculty radiologists (J.H.B., J.H.L., and Y.J.C., with 19, 14, and 7 years, respectively, of clinical experience in performing and evaluating thyroid US).
US-guided CNB and FNA procedures were conducted as reported in current practice guidelines (18,21). CNB procedures were performed using a disposable 1.1 or 1.6 cm excursion 18-gauge double-action spring-activated needle (TSK Ace-cut; Create Medic, Yokohama, Japan) under local anesthesia with 1% lidocaine (14,15,22,23). FNA procedures were normally performed using a 23-gauge needle. The sufficiency of the CNB or FNA procedure was monitored using real-time US, and the adequacy of the tissue core was assessed by visual inspection (23). An additional CNB or FNA was recommended in the case of insufficient biopsy material on visual assessment.
Histopathologic analysis of CNB and cytopathologic analysis of FNA
All CNB specimens and FNA cytology were interpreted by a thyroid cytopathologist (D.E.S., with 12 years of clinical experience in thyroid cytopathology). FNA cytology results were classified into six categories according to The Bethesda System for Reporting Thyroid Cytopathology: non-diagnostic, benign, AUS/FLUS, follicular neoplasm or suspicious for a follicular neoplasm, suspicious for malignancy, and malignant (24). Although standardized CNB diagnostic criteria for thyroid nodules had yet to be established, the histopathologic results of CNB were classified into six categories similar to those used in the Bethesda System (14,15,22,24,25).
Analysis of the US findings
To evaluate the US characteristics of the nodules, the US images were reviewed separately by two radiologists (C.H.S. and Y.J.C., with 5 and 7 years, respectively, of clinical experience in performing and evaluating thyroid US). Reviewers were blinded to any clinical or radiologic information. Discrepancies between the US findings of the two reviewers were resolved by a third reviewer (J.H.B., with 19 years of clinical experience in performing and evaluating thyroid US). The US characteristics of the thyroid nodules were assessed according to the following features: nodule size, composition (solid, predominantly solid, predominantly cystic, or cystic), presence of spongiform alterations, shape (ovoid to round, taller than wide, or irregular), margins (smooth, spiculated, or ill-defined), echogenicity (isoechoic, hypoechoic, markedly hypoechoic, or hyperechoic), and calcifications (none, microcalcifications, macrocalcifications, or rim calcifications) (23,26–29).
Outcome measures
The primary study outcomes included comparison of the inconclusive result rates between CNB and FNA groups. An inconclusive diagnosis was defined as FNA or CNB results showing non-diagnostic and AUS/FLUS results (15,22,23). The secondary outcome measures included the non-diagnostic result rate and the diagnostic performance for neoplasms. The diagnostic criteria for neoplasms were defined as follicular neoplasm or suspicious for a follicular neoplasm, suspicious for malignancy, and malignant (15,22,23).
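The outcome definitions above can be expressed as a simple mapping from diagnostic category to outcome flags. The following is a minimal sketch, not the authors' code: the category labels, the function names, and the toy data are assumptions chosen for illustration, and only the grouping of categories comes from the definitions in the text.

```python
# Categories counted as inconclusive and as neoplasm, per the definitions above.
# The string labels themselves are illustrative assumptions.
INCONCLUSIVE = {"non-diagnostic", "AUS/FLUS"}
NEOPLASM = {"FN/SFN", "suspicious for malignancy", "malignant"}


def outcome_flags(category):
    """Return the study-outcome flags for one CNB or FNA result."""
    return {
        "inconclusive": category in INCONCLUSIVE,
        "non_diagnostic": category == "non-diagnostic",
        "neoplasm": category in NEOPLASM,
    }


def rate(categories, flag):
    """Proportion of results for which the given flag is true."""
    flags = [outcome_flags(c)[flag] for c in categories]
    return sum(flags) / len(flags)


# Hypothetical toy data, not taken from the study.
toy = ["benign", "AUS/FLUS", "malignant", "non-diagnostic", "benign"]
print(rate(toy, "inconclusive"))    # 2 of 5 results are inconclusive
print(rate(toy, "non_diagnostic"))  # 1 of 5 results is non-diagnostic
```

Note that a non-diagnostic result counts toward both the inconclusive and the non-diagnostic rate, so the two rates are nested rather than mutually exclusive.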
Statistical analysis
All patients who satisfied the eligibility criteria at the baseline were included in this observational study. The baseline characteristics of the two groups were compared using chi-square or Fisher's exact tests for categorical variables, and Student's t-tests or Mann–Whitney tests for continuous variables. To determine the risk factors associated with inconclusive results, a multivariate logistic regression analysis was performed using generalized estimating equations that accounted for the clustering of the same subjects, and that were fitted using a backward elimination approach. Potential adjustment variables were age, sex, FNA, CNB, and US characteristics, including the nodule size, composition, spongiform alterations, shape, margin, echogenicity, and calcifications. Variables included in the final analysis were CNB, the nodule size, composition, shape, margin, echogenicity, and calcifications.
It was anticipated that the CNB and FNA groups may differ in their baseline characteristics. Therefore, baseline variables from the CNB and FNA groups were obtained to aid in adjusting the comparisons. To reduce selection bias and the effect of potential confounding variables, adjustments for significant differences in the baseline characteristics of patients were conducted using propensity score matching and inverse probability weighting (30–32). Propensity scores based on patient characteristics were developed using a logistic regression model with the adjustment of between-group differences. After propensity score matching was conducted, the baseline variables were compared using a marginal homogeneity test for categorical variables and a paired t-test for continuous variables. The distribution of propensity scores was assessed to confirm sufficient overlap between the groups to ensure comparability (31). The Hosmer–Lemeshow goodness-of-fit statistic and the c-statistic were calculated to evaluate the calibration and discrimination of the propensity score model. Inverse probability weighting, which can be used to compensate for imbalances between the groups, was also conducted (32). Statistical analyses were performed using SAS v9.4 (SAS Institute, Cary, NC) and IBM SPSS Statistics for Windows v21.0 (IBM, Armonk, NY), with statistical significance being defined as p < 0.05.
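The two adjustment techniques named above can be illustrated with a short sketch. It is not the authors' implementation, which was run in SAS and SPSS: the text does not state the matching algorithm, the caliper, or the form of the weights, so the greedy nearest-neighbor matching, the caliper of 0.05, and the unstabilized weights used here are all assumptions, as are the function names and toy values. The propensity scores are taken as given (in the study they came from a logistic regression model).

```python
def greedy_match(treated_scores, control_scores, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching without replacement.

    Returns (treated_index, control_index) pairs whose propensity scores
    differ by at most `caliper` (an assumed value, not from the study).
    """
    available = set(range(len(control_scores)))
    pairs = []
    for i, ps in enumerate(treated_scores):
        if not available:
            break
        # Closest still-unmatched control for this treated unit.
        j = min(available, key=lambda k: abs(control_scores[k] - ps))
        if abs(control_scores[j] - ps) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


def ipw_weights(treated, scores):
    """Unstabilized inverse probability weights:
    1/ps for treated units and 1/(1 - ps) for control units."""
    return [1.0 / p if t else 1.0 / (1.0 - p) for t, p in zip(treated, scores)]


def weighted_rate(outcomes, weights):
    """Weighted proportion of a binary outcome (1 = event, 0 = no event)."""
    return sum(w * y for y, w in zip(outcomes, weights)) / sum(weights)


# Hypothetical propensity scores, with CNB as the "treated" group.
pairs = greedy_match([0.30, 0.62, 0.90], [0.28, 0.60, 0.35])
print(pairs)  # the third treated unit has no control within the caliper

weights = ipw_weights([1, 0], [0.5, 0.75])
print(weights)
```

Matching discards units without a close counterpart, which is consistent with fewer CNB patients appearing in the matched cohort than in the pooled cohort, whereas weighting retains every nodule and rebalances the groups through the weights.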
Results
Characteristics of the pooled cohort
A total of 1360 nodules from 1338 patients were included in this study: 859 nodules from 850 patients were subjected to FNA (FNA group), and 501 nodules from 488 patients were subjected to CNB (CNB group; Fig. 1). A final diagnosis was made for 945 of these nodules, with 523 receiving a final diagnosis of malignancy (479 classic-type papillary thyroid carcinomas, 14 follicular variant papillary thyroid carcinomas, 11 follicular carcinomas, 11 metastases, 5 medullary thyroid carcinomas, and 3 others). The estimated risk of malignancy was therefore between 38.5% (523/1360) and 55.3% (523/945), depending on whether the denominator was the total number of nodules (n = 1360) or the number of nodules that received a final diagnosis (n = 945). Table 1 shows the baseline characteristics of the pooled cohort and demonstrates that more than half of the baseline characteristics in the FNA and CNB groups were comparable, including age, final malignant nodules, composition, spongiform alterations, margin, echogenicity, and calcifications. The mean ± standard deviation time interval between the 18F-FDG PET/CT and the CNB or FNA procedure was 2.5 ± 1.7 months. There were no major complications in either the FNA or CNB group, and no patients required hospital admission or intervention.
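The two bounds on the malignancy risk follow directly from the counts above. The sketch below reproduces the arithmetic; the variable names are illustrative. The lower bound implicitly treats every nodule without a final diagnosis as non-malignant, while the upper bound ignores those nodules altogether.

```python
malignant = 523
all_nodules = 1360
with_final_diagnosis = 945

# Lower bound: nodules without a final diagnosis counted as non-malignant.
lower = malignant / all_nodules
# Upper bound: only nodules with a final diagnosis in the denominator.
upper = malignant / with_final_diagnosis

print(f"{lower:.1%} to {upper:.1%}")  # prints "38.5% to 55.3%"
```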
CNB, core-needle biopsy; FNA, fine-needle aspiration; US, ultrasound.
Risk factors for inconclusive results in the pooled patient cohort
The inconclusive result rate was significantly lower in the CNB group (23.8%; 119/501 nodules) than in the FNA group (35.4%; 304/859 nodules; p < 0.001). The multivariate logistic regression analysis demonstrated that CNB (odds ratio 0.53 [confidence interval (CI) 0.41–0.70]), a spiculated margin (0.39 [CI 0.25–0.60]), hypoechoic echogenicity (0.55 [CI 0.41–0.72]), markedly hypoechoic echogenicity (0.31 [CI 0.19–0.53]), and macrocalcifications (0.43 [CI 0.27–0.71]) were associated with a reduced risk of inconclusive results (Table 2).
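As a rough check on the crude comparison, the counts above can be placed in a 2 × 2 table. The sketch below computes a Pearson chi-square test and an unadjusted odds ratio. It is an illustration under stated assumptions, not the authors' analysis: it treats nodules as independent, whereas the study's regression used generalized estimating equations to account for multiple nodules in the same patient, and the function and variable names are hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for [[a, b], [c, d]].

    Returns the statistic and its p-value on 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-square distribution with 1 df.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


cnb_inconclusive, cnb_total = 119, 501
fna_inconclusive, fna_total = 304, 859
cnb_conclusive = cnb_total - cnb_inconclusive
fna_conclusive = fna_total - fna_inconclusive

chi2, p = chi_square_2x2(cnb_inconclusive, cnb_conclusive,
                         fna_inconclusive, fna_conclusive)
# Unadjusted odds of an inconclusive result, CNB relative to FNA.
crude_or = (cnb_inconclusive * fna_conclusive) / (cnb_conclusive * fna_inconclusive)

print(f"chi2 = {chi2:.2f}, p = {p:.2g}, crude OR = {crude_or:.2f}")
```

The crude odds ratio is not adjusted for nodule characteristics, so it need not equal the adjusted estimate of 0.53 from the multivariate model.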
Characteristics of the matched cohort
Using propensity score matching conducted on the whole patient population, 467 patients who underwent CNB were matched 1:1 with patients who underwent FNA. The logistic regression model including the nine variables (age, sex, nodule size, composition, spongiform alterations, shape, margin, echogenicity, and calcifications) yielded a c-statistic of 0.661. The Hosmer–Lemeshow goodness-of-fit test showed that the propensity score model demonstrated an adequate level of calibration (p = 0.6281). In the matched cohorts, there were no longer significant differences between the CNB and FNA groups for any of the variables (Table 3).
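The c-statistic reported above measures how well the propensity score model separates the two groups: it is the probability that a randomly chosen CNB nodule has a higher propensity score than a randomly chosen FNA nodule. The following is a minimal sketch of that definition using hypothetical scores; the function name and the toy values are assumptions, and the study's value of 0.661 was obtained from its own fitted model.

```python
def c_statistic(treated_scores, control_scores):
    """Fraction of treated/control pairs in which the treated unit has the
    higher propensity score; ties count as one half."""
    concordant = 0.0
    for t in treated_scores:
        for c in control_scores:
            if t > c:
                concordant += 1.0
            elif t == c:
                concordant += 0.5
    return concordant / (len(treated_scores) * len(control_scores))


# Hypothetical propensity scores, not taken from the study.
example = c_statistic([0.7, 0.5, 0.4], [0.6, 0.3])
print(round(example, 3))  # 4 of the 6 pairs are concordant
```

A value of 0.5 would mean the measured covariates do not distinguish the groups at all, and a value of 1.0 would mean they separate the groups completely, leaving no overlap for matching.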
Study outcomes for the pooled and matched cohorts
Table 4 compares the diagnostic results obtained using CNB and FNA for thyroid incidentalomas detected by 18F-FDG PET/CT in the pooled cohort. Including all nodules, CNB demonstrated a consistently lower inconclusive result rate than FNA in the pooled cohort (23.8% vs. 35.4%; p < 0.001), the propensity score-matched cohorts (22.9% vs. 36.6%; p < 0.001), and with inverse probability weighting (22.4% vs. 35.2%; p < 0.001) (Table 5). The non-diagnostic result rates for all nodules were also significantly lower in the CNB group than in the FNA group in the pooled cohort (3.4% vs. 9.5%; p < 0.001), propensity score-matched cohorts (3.0% vs. 11.6%; p < 0.001), and with inverse probability weighting (3.2% vs. 9.5%; p < 0.001). The diagnostic performance for neoplasms did not show any significant difference between CNB and FNA in either the pooled or matched cohorts (Table 6). For nodules ≥1 cm in maximum diameter, CNB demonstrated significantly lower inconclusive and non-diagnostic result rates in the pooled cohort, propensity score-matched cohorts, and with inverse probability weighting (Table 5).
Data are number of nodules, with percentages in parentheses. Percentages do not add up to 100% because of rounding.
18F-FDG PET/CT, 18F-fluorodeoxyglucose positron emission tomography/computed tomography; AUS, atypia of undetermined significance; FLUS, follicular lesion of undetermined significance; FN, follicular neoplasm; SFN, suspicious for a follicular neoplasm.
Data are percentages of nodules, with number in parentheses.
Discussion
This study compared the role of US-guided CNB and FNA in the assessment of thyroid incidentalomas detected on 18F-FDG PET/CT. It demonstrates that CNB showed consistently lower inconclusive result rates in the pooled cohorts, multivariate analysis, propensity score matching, and with inverse probability weighting. Additionally, CNB showed lower non-diagnostic results across all analyses compared to FNA. The diagnostic performance for neoplasms was similar between CNB and FNA. Therefore, CNB appears to be a useful diagnostic approach for patients with thyroid incidentalomas detected on 18F-FDG PET/CT.
Thyroid incidentalomas detected on 18F-FDG PET/CT are considered to be of potential clinical relevance, as the increased 18F-FDG uptake alone implies the possibility of malignancy (33–36). In the present study, the estimated risk of malignancy was between 38.5% (523/1360) and 55.3% (523/945), depending on whether the denominator was the total number of nodules (n = 1360) or the number of nodules that received a final diagnosis (n = 945). This high prevalence of malignant disease suggests that focal 18F-FDG thyroid incidentalomas should be subjected to further diagnostic workup (6). Several previous studies focused on use of the standardized uptake value (SUV), US findings, or FNA for the diagnostic workup of thyroid incidentalomas detected on 18F-FDG PET/CT (7–10). Yet, to the best of the authors' knowledge, no study has evaluated the role of CNB in the assessment of thyroid incidentalomas detected on 18F-FDG PET/CT.
CNB is an effective and safe diagnostic modality for thyroid nodules. The present observational study used robust methodologies, including univariate and multivariate logistic regression analyses, propensity score matching, and inverse probability weighting, to reduce the effects of potential confounding variables and selection bias. The study results demonstrate that inconclusive result rates were consistently lower with CNB than with FNA in the pooled, propensity score-matched, and inverse probability weighting analyses. This was the case both for all nodules and for nodules ≥1 cm in diameter. The low incidence of inconclusive results for CNB may be explained by several factors. First, CNB provides larger tissue samples, which may facilitate a more precise histological diagnosis than that available from FNA, and may also permit additional immunohistochemical staining for the differential diagnosis (12,37). Second, CNB provides more information about the histological architecture and the relationship between the nodule and adjacent thyroid tissue (14,17). Future prospective studies are needed to validate the role of CNB in the diagnosis of thyroid incidentalomas detected on 18F-FDG PET/CT. In this context, performing CNB for patients with thyroid incidentalomas detected on 18F-FDG PET/CT is cautiously recommended. However, less-experienced operators may have difficulty locating the needle tip under US guidance, which can increase the possibility of complications (18), and the risk of malignancy of thyroid nodules varies according to the US characteristics and previous biopsy results (6,27,29,38). Therefore, future research is needed to determine the management flow of thyroid incidentalomas detected on 18F-FDG PET/CT in relation to experience with CNB, US characteristics, and previous biopsy results.
This study has several limitations. First, the study is based on observational data. To reduce the inherent limitations of the observational design, multivariate regression and propensity score analyses were applied. Second, because this is a single-center study, the generalizability of these results may be limited, and a further prospective multicenter study may be needed. Third, the study did not evaluate the maximum SUV, as the data had been acquired on a variety of PET/CT scanners. The data in this study originated from six different scanners over a period of around seven years. These scanners had different manufacturers, manufacturing dates, and specifications, and, accordingly, different imaging protocols. The SUV is a semi-quantitative value affected by several factors (39). In this study, different voxel sizes and resolutions could have led to differences in SUV. Differences in the CT voltage for attenuation correction could also have affected SUV (40). SUV is highly dependent on the size of the region of interest (41), which implies that SUV could be less accurate for smaller lesions. As most thyroid nodules in this study were around 1.5 cm, the SUV of thyroid lesions should be interpreted with caution. Additionally, the differentiation of benign and malignant lesions according to SUV is controversial because of the overlap in SUV between benign and malignant states (42). The 2015 American Thyroid Association (ATA) guidelines recommend workup for incidental 18F-FDG uptake in the thyroid, without providing any specific SUV cutoff. In this context, visual evaluations were mainly performed, as they were straightforward and time-saving in clinical practice, especially for the purpose of screening. In this study, SUVs of thyroid nodules were >2.0 in all except two nodules. As the SUV was calculated based on the lean body weight, this value may correspond to 2.5–3.0 if the SUV were calculated based on body weight.
In conclusion, the observations of significantly lower inconclusive result rates for CNB than for FNA were overall consistent in the propensity score matched cohorts, and therefore CNB appears to be a promising diagnostic tool for patients with thyroid incidentalomas detected on 18F-FDG PET/CT.
Footnotes
Author Disclosure Statement
No competing financial interests exist.
