Abstract
Background
Radiomic feature extraction from cone-beam computed tomography (CBCT) images in radiotherapy has potential for predicting tumor control and treatment-related toxicity. However, the reliability of CBCT-based radiomics is limited by variations in scatter intensity associated with differences in patient size.
Objective
To evaluate the impact of patient size on CBCT-derived radiomic features and investigate whether a novel quantitative CBCT technique can reduce patient size–induced radiomic feature variability.
Methods
Phantoms representing small and large body habitus were scanned using a linac-mounted CBCT incorporating a two-dimensional antiscatter grid (CBCT-2DASG) and standard clinical protocols. Ninety-three intensity and texture features were extracted from CBCT and multidetector CT (MDCT) images, and robust features were identified by analyzing the concordance correlation between small and large phantom images.
Results
Using MDCT as the reference, 53% (24 of 45) of CBCT-2DASG features were robust to patient size variation, versus 18% (8 of 45) of standard CBCT features. CBCT-2DASG reduced size-dependent feature variation approximately fourfold on average for intensity-based features, with improvement varying across feature types.
Conclusions
CBCT-2DASG with effective scatter suppression reduces patient size–dependent radiomic feature variability in phantom experiments, improving the robustness of CBCT-derived features to patient size variation. This represents a promising step toward more reliable CBCT-based radiomics in radiotherapy.
Introduction
Extraction and analysis of radiomic features from CBCT images acquired during radiotherapy have the potential to capture underlying biological processes, disease characteristics, and treatment responses.1,2 This strategy offers a means of assessing initial tumor response to radiotherapy and predicting toxicity, thereby supporting personalized radiation treatment decisions. It has been investigated across various radiotherapy sites, such as lung,3 prostate,4,5 and head and neck.6
However, the accuracy of radiomics-based models for monitoring treatment response is significantly influenced by artifacts and noise in CBCT images, which result from a combination of factors including patient motion, scatter, beam hardening, reconstruction methods, imaging protocol variations, and anatomical site characteristics. 7 The repeatability and reproducibility of extracted radiomic features are further affected by variations in imaging equipment, patient positioning, and inter-institutional differences in acquisition and reconstruction parameters. Addressing all of these sources of variability will ultimately be necessary to achieve clinically robust CBCT-based radiomics.8,9
In CBCT imaging, one of the primary factors that compromise feature repeatability and reproducibility is patient size, because the amounts of scatter and primary X-ray fluence reaching the detector depend on it. A larger tissue volume within the field of view increases scatter intensity and attenuates primary X-rays, which degrades CT number accuracy, exacerbates image artifacts, reduces contrast, and increases noise. These effects cause variability in image features among patients of different sizes, potentially introducing patient size dependency into radiomics-based prediction models. 10 While stratifying radiomics data based on anatomical characteristics or body habitus is one possible solution, it has two major drawbacks. 11 First, it reduces the data available in each subgroup, potentially compromising model accuracy. Second, even within subgroups, CT numbers, noise, and artifacts may still differ among patients, thereby decreasing the accuracy of the extracted radiomic features. Improving the robustness of radiomic features to patient size variation is therefore an essential step, and one of several challenges that must be collectively addressed to enable reliable CBCT-based radiomics in clinical practice. The present study focuses specifically on scatter, one of the most fundamental and dominant sources of image quality degradation in linac-mounted CBCT; organ motion, beam hardening, protocol variability, and anatomical site differences each contribute independently to radiomic feature instability and will require dedicated investigation in future work.
To date, the impact of patient size on CBCT radiomic feature variability has not been thoroughly investigated, nor have effective strategies been developed to mitigate these effects. This study therefore addresses one well-defined and physically fundamental contributor to radiomic instability in CBCT, with the explicit understanding that a complete solution will require parallel efforts addressing the broader set of factors that affect image quality and feature reproducibility in clinical CBCT imaging. Specifically, it investigates how patient size influences radiomic features in CBCT images acquired with a C-arm linac-mounted system, and it assesses the effectiveness of a novel scatter-suppression method designed to improve the quantitative accuracy of CBCT and reduce variability in texture and intensity features.
In previous studies, a quantitative CBCT technique incorporating a two-dimensional (2D) antiscatter grid was developed and demonstrated to enhance image quality by reducing scatter, image lag, and beam hardening.12,13 In this study, a similar CBCT imaging approach utilizing a 2D antiscatter grid (CBCT-2DASG) was applied to investigate its impact on the robustness of radiomic features, in comparison with multidetector computed tomography (MDCT) and standard clinical CBCT images. Patient size effects were simulated using two phantoms of different sizes, fabricated from identical materials.
Prior studies on CBCT scatter correction have primarily evaluated image quality metrics such as CT number accuracy, contrast-to-noise ratio, and artifact reduction, without assessing the downstream impact on radiomic feature stability. Conversely, studies on radiomic reproducibility have examined the effects of reconstruction algorithms, scanner variability, and protocol differences, but have not systematically isolated the role of patient size–dependent scatter as a source of radiomic variability in linac-mounted CBCT. The present study bridges these two areas for the first time, providing a direct quantitative link between scatter physics and radiomic feature instability, and demonstrating that hardware-based scatter suppression can improve radiomic robustness. Thus, the novelty of this work is twofold: (1) the first systematic investigation of scatter-induced radiomic feature variability in linac-mounted CBCT, and (2) the demonstration that hardware-based scatter suppression via a 2D antiscatter grid reduces this variability in phantom experiments, an advance that complements ongoing methodological developments in radiomics analysis.
Methods
Acquisition of CBCT and MDCT images
To simulate variations in patient size within the imaging field of view, this study utilized images of two Gammex electron density phantoms (Sun Nuclear Corporation, FL) with identical material compositions but different sizes. One phantom was a small, head-sized cylindrical model with a diameter of 20 cm, while the other was a large, pelvis-sized phantom with a lateral dimension of 40 cm. CBCT images were acquired using the clinically available Pelvis protocol (125 kVp, 1080 mAs) on a Varian TrueBeam onboard CBCT system (Varian Medical Systems, Palo Alto, CA). For each phantom, two CBCT scan sets were obtained using the same setup and acquisition parameters: one with the system's standard clinical image reconstruction software, and the other incorporating a prototype 2D antiscatter grid (ASG) installed on the flat panel detector. The CBCT images acquired with the 2D ASG underwent additional preprocessing to correct residual scatter, beam hardening, and image lag, using dedicated research software developed specifically for the 2D ASG technique. The clinical CBCT scans were reconstructed using both the standard filtered backprojection method 14 and the Iterative Reconstruction (IR) method 15 available within the clinical CBCT system, which were labelled as Standard and iCBCT options, respectively, in the clinical software. The Standard and iCBCT protocols employed different scatter-correction techniques: the former used scatter kernel superposition–based correction, whereas the latter utilized a three-dimensional object model and numerical solution of the Boltzmann transport equation for scatter estimation and correction.15,16
Raw CBCT projections acquired with the 2D ASG were exported and processed using the research implementation of the CBCT-2DASG technique, and reconstructed once using standard filtered backprojection and once using iterative reconstruction to reduce image noise. Consequently, a total of four CBCT image datasets were generated per phantom, two reconstructed with the clinical CBCT software and two with the quantitative CBCT-2DASG method. All CBCT images were reconstructed with a voxel size of 0.9 × 0.9 × 2 mm³. Additionally, the phantoms were scanned using a Philips Brilliance Big Bore 16-slice MDCT system (Philips Healthcare, Cleveland, OH) employed for radiotherapy simulation, which served as the reference (“gold standard”) imaging modality. MDCT images were acquired with a voxel size of 0.9 × 0.9 × 3 mm³ in helical acquisition mode at 120 kVp. 17 All CBCT and MDCT images were rigidly registered, and the MDCT images were resampled to a voxel size of 0.9 × 0.9 × 2 mm³ for consistency.
Briefly, the CBCT-2DASG approach incorporates multiple image fidelity improvement methods. 18 The key component of this approach is a focused 2D antiscatter grid. 19 Its key properties are described in a prior study. 20 In addition, remaining scatter, image lag, and beam hardening were corrected. 21 Images were reconstructed using the standard Feldkamp-Davis-Kress (FDK) algorithm and an iterative reconstruction method based on adaptive steepest descent projection onto convex sets (ASD-POCS), modified for offset detector scanning geometry.12,22,23 These images are designated CBCT-2DASG FDK and CBCT-2DASG IR, respectively, throughout the text. The physical phantoms and the corresponding CBCT and MDCT image datasets generated for this study are shown in Figure 1.

(a) Photograph of the electron density phantom used in imaging experiments. (b) CBCT and MDCT images of the head-sized phantom and of the central section of the pelvis-sized electron density phantom. Clinical CBCT standard reconstruction employs kernel superposition-based scatter correction and the FDK reconstruction method. Clinical CBCT advanced reconstruction employs a model-based scatter correction method and iterative reconstruction. (c) Absolute differences between the head- and pelvis-sized phantom images, shown as a color map.
Selection of robust radiomics features in MDCT images
Ideally, radiomic feature values should remain consistent regardless of the size of the imaged object. However, this is not always achievable for certain texture and intensity features, even in MDCT images, due to changes in noise texture and potential variations in Hounsfield Unit (HU) values related to patient size. Therefore, in this study, a linear correlation model was used to relate the feature values extracted from the small- and large-sized phantom images. To identify radiomic features least affected by phantom size variations, features exhibiting high Concordance Correlation Coefficients (CCC) were selected. The CCC quantifies the deviation of measured feature values from the ideal correlation described above and is formulated as follows:
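CCC = ρ × Cb
CCC reflects both the degree to which the best-fit line deviates from the ideal 45° line (the bias correction factor, Cb) and the dispersion of the data points around the best-fit line (the Pearson correlation coefficient, ρ).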
However, radiomic feature pairs extracted from small and large phantoms may exhibit strong linear correlation yet have dissimilar absolute values, thereby lowering the CCC. This effect is illustrated using two hypothetical feature pairs in Figure 2. Figure 2(a) shows a hypothetical feature pair whose fitted line has a slope similar to the ideal 45° line, giving a Cb close to one, while the dispersion of data points around the fitted line reduces ρ. In contrast, Figure 2(b) presents a second hypothetical feature pair that demonstrates a higher degree of linear correlation but larger differences between paired values, which reduce Cb. The fitted line in this case has a slope that deviates substantially from the ideal 45° line.

Visual representation of a linear fit for a pair of features when (a) the bias correction factor is close to 1 but the Pearson correlation coefficient is less than 1 and (b) the Pearson correlation coefficient is close to 1 but the bias correction factor is less than 1. In (c), blue circles indicate the measured radiomic feature values in MDCT images and the red line is the linear regression model fitted to them. Predicted radiomic feature values are indicated by yellow circles. Measured radiomic features from high- and low-fidelity CBCT image sets (blue circles) are shown in (d) and (e), respectively. Compared with the linear regression model derived from MDCT images, measured feature values in high-fidelity CBCT images deviate less from the values predicted by the model.
Such inherent dissimilarities in feature values extracted from small and large phantom images make the CCC a less reliable metric for identifying radiomic features that are robust to phantom size variations. Therefore, in this study, the Pearson correlation coefficient (ρ) was prioritized over the CCC as the primary measure for selecting radiomic features robust against phantom size variations.
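The distinction between ρ and Cb drawn above can be illustrated numerically. The following is a minimal sketch that assumes Lin's standard definition of the concordance correlation coefficient, in which CCC is the product of ρ and a bias correction factor computed from the means and variances of the paired values. The function name and the two toy feature pairs are hypothetical; they only loosely mimic the situations in Figure 2(a) and 2(b) and are not data from this study.

```python
from math import sqrt
from statistics import fmean


def concordance_components(x, y):
    """Return (rho, c_b, ccc) for paired values, using population (1/n) moments."""
    mx, my = fmean(x), fmean(y)
    vx = fmean([(a - mx) ** 2 for a in x])
    vy = fmean([(b - my) ** 2 for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    rho = cov / sqrt(vx * vy)                             # Pearson correlation
    c_b = 2 * sqrt(vx * vy) / (vx + vy + (mx - my) ** 2)  # bias correction factor
    return rho, c_b, rho * c_b


# Hypothetical feature values from the small phantom (arbitrary units).
small = [1.0, 2.0, 3.0, 4.0, 5.0]

# Case (a): values scattered around the 45-degree line (Cb near 1, rho below 1).
scattered = [1.5, 1.5, 3.5, 3.5, 5.0]

# Case (b): perfectly linear but scaled and shifted (rho equal to 1, Cb well below 1).
shifted = [2.0 * v + 5.0 for v in small]

rho_a, cb_a, ccc_a = concordance_components(small, scattered)
rho_b, cb_b, ccc_b = concordance_components(small, shifted)

print(f"(a) rho={rho_a:.3f}  Cb={cb_a:.3f}  CCC={ccc_a:.3f}")
print(f"(b) rho={rho_b:.3f}  Cb={cb_b:.3f}  CCC={ccc_b:.3f}")
```

In case (b) the paired values are perfectly predictable from one another, yet the CCC is low because the absolute values differ. This is the situation in which a ρ-based criterion retains a feature that a CCC-based criterion would discard.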
To select such robust features, 23 Regions of Interest (ROIs) were placed in the water-equivalent central section of each small and large electron density phantom MDCT image. Bone-equivalent inserts were also placed in the phantom, which helped to mimic image artifacts induced in soft tissues by bony anatomy. Ninety-three texture and intensity features 25 were extracted from each ROI using the 3D Slicer software. 26
It is important to emphasize that this selection criterion was not intended to identify features with equivalent absolute values across phantom sizes, but rather features whose size-dependent behavior is sufficiently consistent and linear to support the regression-based modeling described in Section 2.3.
Modeling the phantom size dependence of CBCT radiomics feature values
After selecting reliable radiomic features as described in Section 2.2, feature values were extracted from the MDCT images of the small and large phantoms. For each radiomic feature, the correlation between feature values of the two phantoms was modeled using linear regression. Ideally, a comparable size-dependent scaling behavior would be preserved in CBCT images. However, image texture and intensity features may differ between MDCT and CBCT due to variations in reconstruction methods, image noise, and resolution characteristics of the imaging systems. The reduced image quality of CBCT further causes CBCT-derived feature values to deviate from the MDCT-based model. Therefore, the linear correlations between the small- and large-phantom radiomic features extracted from MDCT and CBCT images were evaluated in this study. We hypothesized that artifact-free CBCT images would exhibit smaller deviations between MDCT model-predicted and measured CBCT feature values, indicating greater robustness of CBCT radiomic features to variations in object size.
This approach is illustrated in Figure 2. For a given radiomic feature, feature values were extracted from regions of interest (ROIs) placed in the small and large phantom MDCT images (blue circles in Figure 2(c)), and a linear regression model was fitted. Subsequently, the same regression model was applied to radiomic feature values from high-fidelity CBCT images (Figure 2(d)), where the measured and predicted feature values were in close agreement. In contrast, larger residual errors were observed between the measured and predicted feature values for low-fidelity CBCT images (Figure 2(e)). These findings indicate that high-fidelity CBCT exhibits greater feature robustness with respect to phantom size compared with low-fidelity CBCT.
The linear regression model that correlates radiomics feature values in small and large phantom MDCT images could be formulated as:
Finally, distributions of
A direct numerical comparison of radiomic feature values between MDCT and CBCT is not appropriate here, as the two modalities differ in noise characteristics, scatter properties, reconstruction algorithms, and detector geometry, all of which influence feature values independently of patient size. Rather than comparing absolute feature values across modalities, we therefore assess whether each CBCT method preserves the size-dependent scaling behavior established in MDCT. This approach isolates patient size effects from modality-specific imaging differences, which is the relevant comparison for evaluating size-induced radiomic variability.
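The two-step procedure described in this section can be sketched as follows. This is a minimal illustration, not the implementation used in the study, and it rests on several assumptions that are not stated in the text: the small-phantom value is treated as the predictor and the large-phantom value as the response, the error is normalized by the magnitude of the model-predicted value, and all numbers are hypothetical. The function names are likewise illustrative.

```python
from statistics import fmean


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def normalized_errors(model, small, large):
    """Percent deviation of measured large-phantom values from the model.

    Assumption: the error is normalized by the predicted value.
    """
    slope, intercept = model
    errors = []
    for s, measured in zip(small, large):
        predicted = slope * s + intercept
        errors.append(100.0 * abs(measured - predicted) / abs(predicted))
    return errors


# Step 1: fit the reference model on (hypothetical) MDCT feature values per ROI.
mdct_small = [10.0, 12.0, 14.0, 16.0]
mdct_large = [17.0, 20.0, 23.0, 26.0]
model = fit_line(mdct_small, mdct_large)

# Step 2: apply the MDCT-derived model to (hypothetical) CBCT feature values.
cbct_small = [10.0, 12.0, 14.0, 16.0]
high_fidelity_large = [17.2, 19.8, 23.3, 25.9]
low_fidelity_large = [20.0, 18.0, 27.0, 22.0]

err_high = normalized_errors(model, cbct_small, high_fidelity_large)
err_low = normalized_errors(model, cbct_small, low_fidelity_large)

print(f"model: slope={model[0]:.2f}, intercept={model[1]:.2f}")
print(f"mean normalized error, high fidelity: {fmean(err_high):.1f}%")
print(f"mean normalized error, low fidelity:  {fmean(err_low):.1f}%")
```

Under these assumptions, the method whose feature values stay closer to the MDCT-derived line yields the smaller mean normalized error, which is the ranking criterion used throughout the Results.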
Results
Selection of robust radiomics features in MDCT images
Among the 18 intensity and 75 texture features analyzed, only 5 intensity features and 1 texture feature had a
Bias correction factor and Pearson's correlation coefficient for selected features, extracted from small and large phantom MDCT images, ◉ represents low or no correlation (less than
Second,
Robustness of CBCT radiomics features
Figure 3 compares measured and predicted feature pairs across imaging protocols for the First Order Mean Absolute Deviation feature. MDCT exhibits smaller deviations between measured and predicted feature values than the CBCT methods, and CBCT-2DASG IR has the smallest errors among the CBCT methods. Clinical CBCT image features exhibited larger deviations than those extracted from CBCT-2DASG images. This observation could be generalized to other features as well.

Demonstration of actual feature values and predicted feature values for first order mean absolute deviation (FOMAD) feature for all imaging methods investigated. Predicted linear model is a linear fit to the feature values extracted from the MDCT images. It represents the reference, or predicted, linear correlation between feature values extracted from the large and small phantoms. Ideally, a similar linear correlation is expected in CBCT images, if the texture and intensity characteristics of CBCT images are comparable to MDCT. Among all four CBCT methods investigated, the residual error between the measured values and the linear model is smallest for the CBCT-2DASG IR method.
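For readers unfamiliar with the feature shown in Figure 3, the first order mean absolute deviation is commonly defined as the mean absolute distance of the voxel intensities in an ROI from the ROI mean. The sketch below assumes that this common definition matches the one used by the extraction software in this study; the voxel values are hypothetical.

```python
from statistics import fmean


def mean_absolute_deviation(voxels):
    """Mean absolute distance of voxel intensities from their mean."""
    roi_mean = fmean(voxels)
    return fmean([abs(v - roi_mean) for v in voxels])


# Hypothetical HU values in a small water-equivalent ROI.
roi = [0.0, 4.0, -4.0, 8.0, -8.0]
fomad = mean_absolute_deviation(roi)
print(f"FOMAD = {fomad:.1f} HU")
```

Because the feature depends only on the spread of intensities about the ROI mean, it is sensitive to the noise and shading changes that accompany increased scatter in larger objects.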
The boxplot representation of normalized feature errors for all 45 selected features across the five imaging modalities is shown in Figure 4. For intensity-based features, the MDCT images exhibited the lowest normalized feature error, with an average of 1.2 ± 1.1%. The CBCT-2DASG IR and CBCT-2DASG FDK reconstructions yielded normalized feature errors of 2.0 ± 1.9% and 2.9 ± 2.1%, respectively, whereas the clinical CBCT IR and FDK images showed higher errors of 8.0 ± 5.8% and 26.5 ± 20.4%, respectively. This degradation in intensity-based features was likely related to differences in artifacts and CT number accuracy between the clinical FDK and IR images of the small and large phantoms. Consequently, larger deviations were expected between the measured CBCT feature values and those predicted by the linear regression model derived from MDCT images.

Boxplots of normalized feature errors in different imaging modalities for (a) intensity-based features only, (b) texture-based features only, and (c) both intensity- and texture-based features. The median, interquartile range (25th–75th percentiles), and minimum–maximum values are represented by the horizontal line, box, and whiskers, respectively.
For texture-based features, the MDCT images had a normalized feature error of 6.1 ± 12.1%. The CBCT-2DASG FDK and IR images produced normalized feature errors of 4.5 ± 7.4% and 8.6 ± 6.5%, respectively. In comparison, the clinical IR and FDK images showed higher normalized feature errors of 11.2 ± 14.9% and 24.7 ± 21.1%, respectively. Overall, both CBCT-2DASG FDK and IR methods demonstrated smaller normalized feature errors compared with the clinical CBCT IR and FDK reconstructions.
The degree of improvement varied across feature classes, with intensity-based features showing the largest reductions in size-dependent error and texture-based features showing more modest improvements, reflecting the known sensitivity of higher-order texture features to residual image artifacts beyond scatter alone.
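The relative improvement can be made explicit by taking ratios of the mean normalized errors reported above. The sketch below uses only those reported means. Pairing each clinical reconstruction with the CBCT-2DASG reconstruction of the same type (IR with IR, FDK with FDK) is an assumption made here for illustration, and ratios of means ignore the large standard deviations reported alongside them.

```python
# Mean normalized feature errors (%) as reported in the text.
reported = {
    "intensity": {"clinical IR": 8.0, "2DASG IR": 2.0,
                  "clinical FDK": 26.5, "2DASG FDK": 2.9},
    "texture": {"clinical IR": 11.2, "2DASG IR": 8.6,
                "clinical FDK": 24.7, "2DASG FDK": 4.5},
}


def reduction_factors(errors):
    """Ratio of clinical to CBCT-2DASG mean error for matched reconstructions."""
    return {
        "IR": errors["clinical IR"] / errors["2DASG IR"],
        "FDK": errors["clinical FDK"] / errors["2DASG FDK"],
    }


factors = {cls: reduction_factors(vals) for cls, vals in reported.items()}

for cls, ratio in factors.items():
    print(f"{cls}: IR x{ratio['IR']:.1f}, FDK x{ratio['FDK']:.1f}")
```

For matched reconstructions, the ratio is larger for intensity-based features than for texture-based features, consistent with the pattern described above.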
Statistical significance of radiomics feature errors between CBCT methods and MDCT
Table 2 presents the p-values calculated for the normalized error distributions of each CBCT type compared with the gold-standard MDCT. For 8 of the 10 intensity-based features analyzed, no statistically significant difference was detected in radiomic feature errors between CBCT-2DASG IR and MDCT images. In contrast, the CBCT-2DASG FDK, clinical IR, and clinical FDK reconstructions each yielded only one intensity-based feature for which no statistically significant difference relative to MDCT was detected.
The statistical significance of differences in radiomics feature modeling errors between MDCT and CBCT methods for selected features. A p-value of 0.05 or greater indicates that no statistically significant difference was detected between the mean error distributions of MDCT and the respective CBCT method. This should not be interpreted as evidence of equivalence between the two methods.
Similarly, no statistically significant difference in feature errors was detected for 16 of the 35 texture-based features when comparing CBCT-2DASG IR with MDCT. The CBCT-2DASG FDK, clinical IR, and clinical FDK images had 11, 7, and 7 features, respectively, with phantom-size–dependent errors that were not significantly different from those in MDCT.
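The comparison of error distributions relies on one-way ANOVA. As a minimal sketch of the underlying test statistic, the code below computes the F ratio and its degrees of freedom for two groups of normalized errors. The error values are hypothetical, the function name is illustrative, and the grouping of data used in the study is not reproduced here. The p-value is the upper-tail probability of the F distribution with the returned degrees of freedom, which a statistics package would normally supply.

```python
from statistics import fmean


def one_way_anova_f(*groups):
    """Return (F statistic, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = fmean([v for g in groups for v in g])
    # Variability of group means about the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variability of observations about their own group mean.
    ss_within = sum((v - fmean(g)) ** 2 for g in groups for v in g)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical normalized errors (%) for one feature in two modalities.
mdct_errors = [1.0, 2.0, 3.0]
cbct_errors = [3.0, 4.0, 5.0]

f_stat, df1, df2 = one_way_anova_f(mdct_errors, cbct_errors)
print(f"F({df1}, {df2}) = {f_stat:.2f}")
```

A small F statistic, and hence a large p-value, indicates only that the group means could not be distinguished with the available data; with few ROIs per group, it is not evidence that the two error distributions are equivalent.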
Discussion
Consistency in CT image quality across patient cohorts and CT systems plays a crucial role in ensuring the reliability of radiomic features and in developing robust radiomics-based treatment response models in radiotherapy. Variations in scan protocol parameters, reconstruction methods, denoising filters, and imaging systems can introduce scan-to-scan variability in image quality, thereby reducing the consistency of radiomic features across scans.9,31–34
A key consideration in this study is the use of the Pearson correlation coefficient (ρ) as the primary criterion for selecting radiomic features robust to phantom size variation, rather than the Concordance Correlation Coefficient (CCC) or a direct measure of value equivalence. The goal of the feature selection step was not to identify features whose absolute values are identical between small and large phantoms; such equivalence is not expected, even in MDCT, given that object size influences noise texture, CT number accuracy, and image characteristics in ways that affect feature values systematically. As shown in Table 1 and discussed in Section 2.2, the average bias correction factor across all features was only 0.3 ± 0.3 in MDCT images, confirming that absolute value equivalence between small and large phantom features is the exception rather than the rule, even in the gold standard modality. Requiring equivalence as a selection criterion would therefore discard the vast majority of features not because they are unstable, but simply because they respond predictably to changes in object size.
Instead, the ρ-based selection was designed to identify features whose size-dependent behavior is sufficiently consistent and predictable to be modeled: specifically, features for which the relationship between small and large phantom values follows a stable linear pattern. Such features are candidates for size normalization or correction using the regression framework described in Section 2.3, which is what the downstream analysis exploits. A feature that scales predictably with object size is not clinically unusable; it is correctable, provided the size-dependent relationship is well characterized. The prediction error analysis in Section 2.3 then provides the more stringent evaluation: it quantifies how well each CBCT method preserves the size-dependent scaling established in MDCT, capturing both systematic bias and variability in absolute feature values relative to the expected behavior. The two steps are therefore complementary by design, and the ρ threshold should be understood as a filter for predictability of size-dependent behavior, not a claim of value equivalence. We acknowledge that formal equivalence testing would represent a more rigorous standard for features intended for direct cross-size comparison without correction, and we recommend this as a direction for future work in which size-correction models are prospectively validated in patient cohorts.
As demonstrated in this study, both intensity- and texture-based radiomic features are also influenced by the size of the imaged object in CBCT images used for image-guided radiotherapy. Feature values extracted from small and large phantoms differed significantly, reflecting changes in image characteristics, such as noise texture, CT numbers, and artifact patterns, that are strongly dependent on object size.
Although patient size–dependent variations in radiomic features are also observed in MDCT images, as shown in this study, the problem is more pronounced in clinically utilized CBCT images due to their relatively lower image quality. Patient size strongly affects the scatter content in CBCT projections, thereby increasing the inconsistency of image features compared with MDCT images. While advanced scatter correction algorithms and noise reduction using iterative reconstruction improve the stability and consistency of radiomic features, substantial variations in CBCT feature values were still observed with changes in phantom size.
A more quantitative CBCT imaging technique, CBCT-2DASG, improved the robustness of radiomic features to variations in object size compared with standard and advanced CBCT protocols on a linac-mounted CBCT system. For example, no statistically significant difference in phantom size–dependent feature error was detected for 8 of 10 (80%) intensity-based features and 16 of 35 (46%) texture-based features when comparing iteratively reconstructed CBCT-2DASG images with MDCT. In contrast, for CBCT images acquired using standard clinical protocols, no more than 20% of intensity- or texture-based features (1 of 10 and 7 of 35, respectively) showed no statistically significant difference in size-dependent errors relative to MDCT. The higher stability of radiomic features in CBCT-2DASG images can be attributed to more effective scatter mitigation enabled by the implementation of a 2D-ASG.
It is important to note that the absence of a statistically significant difference between CBCT-2DASG IR and MDCT error distributions, as reported in Table 2, may reflect the limited statistical power of the present study and does not demonstrate equivalence between the two methods. A non-significant p-value from a one-way ANOVA indicates only that the data did not provide sufficient evidence to reject the null hypothesis of equal means; it does not confirm that the distributions are equivalent. As visible in Figure 4, the error distributions of CBCT-2DASG IR, while substantially narrower than those of standard clinical CBCT, remain broader than those of MDCT, indicating that residual size-dependent feature variability persists even with the most effective scatter suppression evaluated in this study. This residual variability likely reflects contributions from sources other than scatter alone, including beam hardening, reconstruction algorithm differences, and spatially variant noise characteristics.
A key challenge in evaluating the effect of patient size on the robustness of CBCT-derived radiomic features is the absence of a gold-standard reference for size-dependent feature variability. In this study, MDCT-derived features served as the reference because MDCT provides artifact-free images with high CT number accuracy and is the clinical standard for radiotherapy treatment planning. However, the MDCT-derived linear regression model is not expected to hold perfectly in CBCT. This is explicitly not an assumption of our framework. MDCT and CBCT are different imaging technologies with distinct noise properties, scatter characteristics, and reconstruction algorithms, all of which influence radiomic feature values independently of object size. We neither expect nor require the CBCT feature values to lie on the MDCT-derived regression line. Rather, the MDCT model serves as a stable, reproducible benchmark that captures how features scale with object size in a high-fidelity, artifact-free reference modality. Deviations of CBCT feature values from this benchmark, quantified as prediction errors in Eq. 3, reflect the combined effect of modality differences and scatter-induced image degradation. Critically, the comparison in this study is not between CBCT and MDCT feature values in absolute terms, but between different CBCT methods in terms of how closely each preserves the size-dependent scaling behavior established in MDCT. A CBCT method that produces smaller prediction errors is one whose size-dependent feature behavior more closely resembles that of a high-fidelity reference modality, regardless of whether the absolute feature values are identical. This is the relevant metric for evaluating the impact of scatter suppression on radiomic feature robustness, and it does not require the assumption that the linear relationship transfers perfectly across modalities. 
The validity of this approach is further supported by the observation that MDCT itself produces non-zero prediction errors in our framework, as the small residual variability in MDCT feature values across ROIs means that even the reference modality does not perfectly satisfy its own model. This confirms that the framework is not designed to enforce perfect agreement, but to rank imaging methods by the consistency of their size-dependent feature behavior relative to a well-characterized reference.
To assess size-related effects, a two-step approach was implemented. First, MDCT features showing a linear correlation between values from small and large phantoms were identified. These features were then extracted from CBCT images to test whether the same correlation model held. This method, adopted in the absence of an established reference for size-dependent feature variations, used MDCT-based linear models as a surrogate standard. Future work may focus on developing advanced models that remove the reliance on MDCT, enabling direct comparison of radiomic features across patients with varying body habitus. The use of MDCT as a reference standard follows established practice in CBCT image quality evaluation and was adopted here as the most clinically relevant surrogate for artifact-free imaging. A linear model was chosen as a parsimonious first approximation of size-dependent feature behavior. While non-linear relationships between feature values and object size were not systematically investigated, such relationships may exist for certain feature classes, particularly those sensitive to non-linear changes in noise texture or CT number accuracy with increasing scatter. Exploration of non-linear models is deferred to future work.
In this study, the water-equivalent section of the phantom was used to analyze variability in radiomic features. The phantom body was composed of water-equivalent plastic, enabling feature extraction from multiple locations within an axial plane. This approach allowed the effects of spatially variant noise and CT number fluctuations for a given tissue type, a major limitation of CBCT imaging, to be captured. Radiomic analysis based on other tissue-equivalent inserts was intentionally avoided, as these materials are typically positioned at a single fixed location within the phantom. To evaluate the influence of insert position on feature variability, the insert would need to be repositioned and rescanned. An additional challenge arises from the placement of bone-equivalent inserts, as high-density materials can introduce streak artifacts in CBCT images, creating complex interplay between soft-tissue and bone insert positioning and their combined impact on feature variability. A more comprehensive investigation should therefore examine how different soft- and bone-tissue compositions influence CBCT-derived radiomic features.
It is worth noting that existing radiomics phantoms were not designed to account for patient size variation, as their material inserts are fixed at specific locations within a single phantom geometry. This makes them unsuitable for the present study, which required two phantoms of identical composition but different sizes to isolate the effect of object size on scatter. Furthermore, the textural complexity of phantom insert materials and simple water-equivalent cylinders remains substantially lower than that of human tissues, a recognized limitation of phantom-based radiomics characterization.34,35 The features identified as robust to phantom size variation in this study should therefore be regarded as candidates for further validation in anatomically realistic phantoms and patient cohorts, rather than as a definitive list of clinically robust features.
Intensity- and texture-based features used in outcome response modeling are often tissue- and disease-site specific. For instance, Jimenez-del-Toro et al. developed a 3D-printed phantom with liver-mimicking properties to support radiomics-based differentiation of liver tissue classes. 36 In contrast, the present study analyzed features in water-equivalent regions. Consequently, radiomic features identified as robust to patient size variations in water-equivalent phantoms may not exhibit the same robustness in patient-specific, disease-site radiomics models. Nonetheless, this study underscores the importance of accounting for patient size in CBCT-based radiomics modeling and highlights the potential need to incorporate patient size effects explicitly into predictive models. While a linear regression model was used here to represent size effects, higher-order models may be explored in future work.
To emulate the effects of different patient body habitus, head- and pelvis-sized electron density phantoms were employed. Although radiomic feature values in outcome response modeling are disease site–specific, and thus not expected to be identical between head and pelvis regions, the objective of this study was to evaluate the extent of radiomic feature variability as a function of object size in CBCT images. Therefore, a commonly used electron density phantom was utilized in two different sizes representative of variations in patient anatomy and body habitus. Once more anatomically realistic phantoms that replicate inter-patient anatomical variability become available, future investigations may integrate disease site–specific analyses with patient size effects to more comprehensively evaluate radiomic feature variability.
The findings of this phantom study suggest potential implications for clinical CBCT-based radiomics workflows. In clinical patient cohorts, body habitus varies considerably across individuals, and scatter introduces a systematic size-dependent bias into radiomic feature values that could confound radiomics-based predictive models. The results of this study suggest that effective hardware-based scatter suppression via the CBCT-2DASG technique can reduce this source of bias, thereby improving the consistency of radiomic features across patients of varying size. However, realizing this potential will require comprehensive clinical validation beyond the phantom experiments reported here. From a workflow integration standpoint, the CBCT-2DASG preprocessing pipeline was applied as an offline processing step to raw CBCT projections in this work. Clinical integration would require embedding the CBCT-2DASG data correction pipeline into the existing CBCT reconstruction chain and integrating the 2D ASG hardware with the CBCT detector.
Beyond patient size, several additional factors contribute to variability and uncertainty in radiomic features, including organ motion–induced artifacts in CBCT images. In the context of CBCT, organ motion poses a major challenge to image quality and the robustness of extracted features. Future studies aimed at comprehensively assessing radiomic feature stability should therefore consider both patient size and organ motion effects simultaneously.
Despite the dependence of radiomic feature values on patient size, models can be developed to account for these variations. In this study, a linear model was established to correlate feature values between MDCT images of small and large phantoms. Approximately 50% of the radiomic features in CBCT-2DASG images demonstrated a consistent linear correlation between small- and large-phantom scans, comparable to that observed in MDCT, suggesting that effective scatter suppression preserves the predictable size-dependent behavior of these features. Such a linear relationship could be utilized to transform feature values obtained from patients with smaller body habitus into the feature domain of larger patients, or vice versa. Alternatively, patient size–dependent features could be normalized to a common reference domain representing a nominal patient size.
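A minimal sketch of such a transformation is given below, assuming that a slope and an intercept for the small-to-large mapping of a given feature have already been estimated. The function names and the numerical values are hypothetical and serve only to illustrate the idea.

```python
import numpy as np


def to_large_domain(feature_small, slope, intercept):
    """Map feature values from the small-size domain to the large-size domain."""
    return slope * np.asarray(feature_small, dtype=float) + intercept


def to_small_domain(feature_large, slope, intercept):
    """Inverse mapping, from the large-size domain back to the small-size domain."""
    if slope == 0:
        raise ValueError("A zero slope cannot be inverted.")
    return (np.asarray(feature_large, dtype=float) - intercept) / slope


# Hypothetical model parameters and feature values for illustration only.
slope, intercept = 2.0, 1.0
small_values = np.array([1.0, 2.0, 3.0])

large_equivalent = to_large_domain(small_values, slope, intercept)
round_trip = to_small_domain(large_equivalent, slope, intercept)
```

Normalization to a nominal patient size would follow the same pattern, with each size class mapped to the reference domain through its own fitted parameters.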
The clinical implementation of the CBCT-2DASG technique involves several practical considerations worth addressing explicitly. The use of the 2D ASG does not increase imaging dose. In a prospective clinical trial, the CBCT-2DASG was operated at the same imaging dose as standard clinical CBCT protocols and achieved improved contrast-to-noise ratio compared to standard filtered backprojection reconstructions at the same dose level. 37 Notably, the 2D ASG has higher primary transmission and substantially better scatter rejection than the conventional radiographic ASGs used in CBCT systems. Together, better scatter suppression and high primary transmission improve the contrast-to-noise ratio while maintaining the same imaging dose.20,38 Regarding the weight of the device, the 2D ASG adds a fraction of the weight of the existing flat panel detector assembly. With respect to clinical translation, the prospective clinical trial evaluating the CBCT-2DASG has demonstrated that clinical integration is feasible and that the CBCT system performs as intended under clinical conditions. 37 Hardware cost remains a consideration for future widespread clinical adoption; however, cost-benefit trade-offs are most meaningfully assessed once clinical benefit has been established.
Conclusion
Radiomic features extracted from CBCT images acquired throughout the radiotherapy treatment course have potential for assessing treatment response and predicting toxicity. Such CBCT-based outcome prediction approaches may be particularly valuable in adaptive radiotherapy, enabling modification of treatment regimens on a patient-specific basis to improve tumor control while minimizing normal tissue complications.39,40 However, variations in image quality associated with patient size can influence radiomic feature values, potentially reducing the accuracy of outcome models derived from serial CBCT imaging. Therefore, to effectively integrate CBCT-derived radiomic features into treatment response or toxicity prediction frameworks, it is essential to account for scatter associated with patient size variations and their impact on feature stability and reliability.
These findings suggest that effective scatter suppression via a 2D antiscatter grid is a promising step toward improving the reliability of CBCT-derived radiomic features, which may in turn support more robust radiotherapy outcome prediction in future clinical studies. Additional factors, such as variations in imaging protocols and the impact of organ motion, must also be considered in future research. Addressing these challenges will be essential for improving the robustness and clinical applicability of CBCT-based radiomics.
Footnotes
Author contributions
Funding
This work was funded in part by a grant from NIH/NCI R01CA245270.
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Data availability statement
Data will be made available upon request.
