Abstract
Digital twins (DTs) provide real-time understanding of structural response by coupling measurements with continuously updated models; this capability supports structural health monitoring (SHM) by enabling timely condition assessment, anomaly detection, and decision support. This study presents convolutional neural network (CNN)-based surrogates for predicting structural behavior within a DT framework, with a focus on improving surrogate model generalization and robustness. Because traditional finite element (FE) models remain computationally demanding, which limits real-time SHM, this work introduces deep learning surrogates that replace FE simulations to enable faster analysis. Two existing CNN architectures, SCSNet and StressNet, are trained on diverse datasets representing 2D plate structures subjected to varying loads and boundary conditions. StressNet, which uses multi-channel input encoding, outperforms SCSNet in predictive accuracy and robustness, particularly on unseen scenarios. The results also highlight the importance of dataset diversity in enhancing generalization. To demonstrate the integration of such surrogates into a full DT framework, the study also incorporates an iterative updating strategy that refines model inputs using measured structural behavior (displacements). This component, supported by a set of CNN-based “calculators,” allows bidirectional learning between physical twin measurements and digital twin inputs, enabling continuous model updating. An inverse problem example illustrates the full DT framework in practice. Together, these forward and updating components form a cohesive DT framework that links physical observations with virtual predictions. The results show significant promise for using CNNs to achieve efficient and accurate structural response prediction, paving the way for near real-time SHM and improved infrastructure decision-making through data-driven digital twin technologies.
Keywords
Practical applications
This study presents a proof of concept for using deep learning to predict the structural behavior of a sub-component (a 2D plate) within a digital twin framework. By replacing traditional finite element simulations with fast convolutional neural network (CNN) surrogates, the approach enables near real-time prediction of stress and displacement under varying loads and boundary conditions. Of the two CNN architectures employed, StressNet outperformed SCSNet in accuracy and robustness, particularly in scenarios outside the training set. A key feature of this work is the integration of an updating mechanism, in which ground truth displacement data from the physical structure are used to iteratively refine the digital twin’s input conditions. This updating strategy forms a feedback loop between the physical and digital systems, allowing the twin to stay synchronized with real-world changes. The ability to continuously update predictions based on sensor data obtained from the physical structure makes this digital twin framework especially valuable for structural health monitoring, enabling early damage detection and timely decision-making and promoting infrastructure resilience and sustainability.
Introduction
Infrastructure systems across the globe comprise a diverse set of assets, systems, and networks, both physical and virtual, that are vital to the foundation of our society, including its economy, security, and integrity (Department of Homeland Security, 2020). These systems are complex, interdependent, interconnected, and diverse, encompassing the dwellings that we live in, the water that we drink, the power that we use, the transportation services that move us, and the communication systems that connect us. In the United States, this critical infrastructure is often overlooked and undervalued; much of the infrastructure that serves society is already built and, in many cases, is in a state of disrepair (American Society of Civil Engineers, 2021). However, this state of disrepair has developed over long time horizons and often goes unacknowledged. Ignoring this slow decay is expedient, at least until there is a critical failure such as a bridge or building collapse (Hao, 2010; Peng et al., 2019), a power outage (Ma et al., 2021), or severe flooding (Soldovieri et al., 2021), where lives are lost, property is damaged, or financial impacts are consequential.
Among these infrastructure systems, those that support essential services such as shelter and mobility represent a unique class of assets because they are typically classified as life-safety critical, meaning their failure has the potential to result in significant loss of human life; this is especially true of the structural systems within these classes (Scuro and Fusaro, 2022; Singh and Sehgal, 2021). Proper structural design ensures that these systems perform as intended in their operational environment once constructed; however, evaluating and monitoring the complex behavior of these highly redundant systems remains a long-term challenge, as they typically remain in service for more than 50 years. As societal dependence on technology continues to grow, the underlying physical infrastructure must evolve into the smart and agile infrastructure that the future demands.
The concept of a digital twin provides an emerging technological framework that offers a path toward creating this smart and agile infrastructure of the future. Recently, the National Academies of Sciences, Engineering, and Medicine (NASEM) suggested the following formal definition of a digital twin: “A digital twin is a set of virtual information constructs that mimics the structure, context, and behavior of a natural, engineered, or social system (or system-of-systems), is dynamically updated with data from its physical twin, has a predictive capability, and informs decisions that realize value. The bidirectional interaction between the virtual and the physical is central to the digital twin” (National Academies of Sciences, Engineering, and Medicine, 2024). Central to this definition is the virtual representation of the engineered system, which in the context of a structural system is the model representation of the real-world asset.
In modern structural engineering practice, these models are analyzed using stiffness-based or energy-based solutions such as matrix structural analysis or finite element analysis (FEA). For more complex systems, FEA becomes an essential tool for describing structural behavior. FEA employs a discretization method to divide complicated structures into a finite set of smaller elements, leading to an accurate representation of the geometry and material properties. FEA empowers engineers and researchers to simulate physical phenomena and analyze the resulting behavior, such as stress and deformation fields. The assessment of specific behaviors, particularly stress, becomes essential when considering the potential for structural failure under specific loading and boundary conditions (Jiang et al., 2021). On the other hand, FEA can be computationally demanding and time-consuming, which limits the two-way interaction between the simulation model and the measured data from the physical twin that is required for creating digital twins. To achieve this vision of real-time interaction within a digital twin framework, strategies for expediting the computational digital model are needed to align with the real-time capabilities of sensor measurements. One solution is the use of machine learning models to accelerate the FEA process, where these models act as surrogate models capable of generating near-instant structural response predictions once trained. Several machine learning surrogate approaches have been explored in recent years and are gaining increasing attention in structural mechanics applications. For example, Physics-Informed Neural Networks (PINNs) incorporate governing equations directly into the learning process, providing a physics-guided modeling strategy.
However, they can face challenges related to complex loss function design, geometric representation, and numerical stability, which may affect robustness and generalization in some applications (Nasir et al., 2025; Plankovskyy et al., 2025; Xu et al., 2025). Generative Adversarial Networks (GANs) have also been investigated for structural prediction tasks but may experience training instability or mode collapse in certain settings (Barsha and Eberle, 2025). These approaches represent promising research directions, but their practical implementation in structural mechanics can still present challenges. Consequently, continued exploration of alternative surrogate modeling strategies remains important for improving the stability, robustness, and computational efficiency of structural response prediction.
Motivation
In recent years, the development of digital twin (DT) frameworks has been explored at both the system level and the asset level. These efforts have included advances in both the virtual (simulation) and physical (real-world) components of DTs (Sun et al., 2024). Within the scope of our research, we envision a strategy for an asset-level digital twin that integrates simulation, sensing, and real-time updating. A schematic overview of this full framework is provided in Figure 1, illustrating the broader vision. This manuscript focuses on a single component of that framework, the surrogate modeling of structural behavior, and specifically on extending the generalizability and robustness of deep learning surrogates for FEA that represent the virtual piece. Section 6.4 presents a sample demonstration of how the updating process functions within the digital twin framework. As part of future work, this approach will be extended to incorporate data from the physical system for real-time updates of the virtual model.
Figure 1. Overview of the proposed digital twin framework for structural behavior prediction. Full-field measurement methods, such as image-based techniques, capture the response of a sub-piece of the structure. These data inform a deep learning-based surrogate model, enabling accurate prediction and updating of the virtual model to reflect real-time structural performance.
To overcome the computational challenges of FEA and enable real-time interaction within digital twin frameworks, recent studies have explored the use of machine learning (ML) to predict structural properties in the form of surrogate modeling for FEA (Nie et al., 2020). Properties including elastic strain (Goh et al., 2017; Mardt et al., 2018), macroscopic stiffness and yield strength (Pathan et al., 2019), and anisotropic effective properties of composites (Rao and Liu, 2020) have been predicted using data-driven surrogates. These methods, however, often yield averaged outputs that do not account for local variations. In contrast, recent studies utilizing deep learning techniques, specifically convolutional neural networks (CNNs), have leveraged the strong spatial feature extraction of these networks to predict full-field mechanical responses such as detailed stress distributions. By training robust neural networks, these approaches circumvent the solution of complex system equations and physics-based modeling, ultimately improving computational efficiency. As an example, Nie et al. (2020) presented two CNN architectures to predict 2D linear elastic deformations. The first network, SCSNet, is a single-channel prediction neural network whose single input matrix encodes geometry and loading, while the output matrix provides von Mises stress values, with each entry representing the stress at the corresponding element of the 2D plate. The second network, StressNet, takes input from five channels of identical size representing geometry, loading patterns, and boundary conditions, each provided as a 32 × 24 matrix. The output, similar to the SCSNet framework, is a single-channel 32 × 24 matrix representing the von Mises stress field.
These two CNN architectures were used to evaluate a representative structural member, a cantilever beam with variable loading configurations on the free end. In this work, SCSNet and StressNet demonstrated their capability to predict the von Mises stress field, achieving a mean relative error of 10.43% and 2.04%, respectively. The results validated the viability of a deep learning based simulation protocol for describing the full-field mechanical response of structural systems.
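As a rough illustration of how a mean relative error could be computed for a predicted stress field, the sketch below normalizes the mean absolute error by the largest true stress magnitude. This normalization is an assumption made here for illustration; the text above does not state the exact definition behind the 10.43% and 2.04% figures, and the function name `mean_relative_error` is hypothetical.

```python
import numpy as np


def mean_relative_error(pred, true):
    """Mean absolute error as a percentage of the peak true stress.

    ASSUMPTION: normalizing by max(|true|) is one common convention and is
    not necessarily the definition used in the cited study.
    """
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    mae = np.mean(np.abs(pred - true))   # average element-wise error
    scale = np.max(np.abs(true))         # peak reference stress
    return 100.0 * mae / scale


# Toy 2 x 2 stress field with a uniform prediction offset of 2 units
true_field = np.array([[0.0, 50.0], [100.0, 100.0]])
pred_field = true_field + 2.0
mre = mean_relative_error(pred_field, true_field)
```

A field-level normalization of this kind avoids division by zero at void elements, where the true stress is zero.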
Building on this foundation, this study extends the work of Nie et al. (2020), motivated by the need for a modeling strategy that facilitates the development of asset-level digital twins. This work aims to address a key limitation in the digital twin literature in structural engineering, where previous efforts have faced challenges in data latency and real-time model updating (Sakr and Sadhu, 2024). These limitations can hinder the real-time interaction between physical infrastructure and its digital representation, which is a key requirement of digital twin systems. To address this challenge, this work utilizes CNN-based surrogate models capable of providing rapid structural response predictions once trained, thereby supporting faster model updating within a digital twin framework. Leveraging the model architectures of SCSNet and StressNet, this work formulates new surrogate models using three new datasets to assess their accuracy, robustness, and scalability. A core motivation of this work is the development of surrogate models that can describe the mechanical response of structural components with integrated damage, a precursor to identifying and characterizing unknown damage within a structural system (Dizaji et al., 2021; Lee et al., 2023). To formulate the surrogates, three datasets with internal damage were developed for training and validation. The first dataset considers a two-dimensional plate with a fixed boundary condition on the left side and a load applied on the right side, simulating damage through systematically moving square voids that change location and size across the plate. The second dataset introduces full randomness, where voids, boundary conditions, and loads can appear anywhere on the plate without following any specific structure, determined solely by probabilities.
The third dataset blends structure and randomness by retaining the boundary and loading conditions of the first dataset while incorporating the moving square void patterns. Within these squares, each element can independently be a solid, void, boundary condition, or load, introducing a layer of randomness while preserving the overall geometric structure expected of real-world geometries. These model extensions, along with their corresponding datasets, form the basis for a virtual model representation of a structural component within our proposed asset-level digital twin, a critical feature for coupling the virtual model with the experimental response of the physical twin.
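To make the structured (first) dataset more concrete, the following sketch enumerates plate geometries with a square void swept across the grid. It is a minimal illustration only: the 24-row by 32-column orientation, the void sizes, the stride, and the function names are all assumptions, not the generation settings used in this study. The first and last columns are kept solid so that the fixed left edge and the loaded right edge remain intact.

```python
import numpy as np

ROWS, COLS = 24, 32  # ASSUMED orientation of the 32 x 24 element grid


def square_void_geometry(top, left, size):
    """Return a binary plate (1 = solid, 0 = void) with one square void."""
    plate = np.ones((ROWS, COLS), dtype=np.uint8)
    plate[top:top + size, left:left + size] = 0
    return plate


def sweep_square_voids(sizes=(2, 4, 6), stride=2):
    """Enumerate void positions and sizes (hypothetical sizes and stride)."""
    samples = []
    for size in sizes:
        for top in range(0, ROWS - size + 1, stride):
            # Start at column 1 and stop before the last column so the
            # fixed (left) and loaded (right) edges stay solid.
            for left in range(1, COLS - size, stride):
                samples.append(square_void_geometry(top, left, size))
    return np.stack(samples)


geometries = sweep_square_voids()
```

Each geometry would then be paired with its boundary conditions and load and solved with FEA to obtain the ground truth stress field.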
In real-world applications, structural plate components—such as flanges, webs, and stiffeners in a steel beam—have distinct geometries, boundary conditions, and loading scenarios. These components form interconnected systems, and building a surrogate model for such systems requires a high degree of generalizability to accurately predict unseen structural configurations. This paper focuses on extending the generalizability and robustness of existing surrogate models for FEA (i.e., the virtual representation), advancing toward a digital twin framework that updates in real-time using data from the physical system. The broader framework illustrated in Figure 1 also includes an updating procedure that leverages data from the physical counterpart. A simulated example of this updating process is presented in Section 6.4, while the integration of real-world measurements remains the focus of ongoing work and is beyond the scope of this manuscript. The manuscript is organized as follows: Section 4 discusses related work and background, Section 5 describes the methodology development, Section 6 highlights the results and discussion, Section 7 summarizes the work and conclusions, and Section 8 provides recommendations and future work.
Related works
Literature review
The growing body of literature on deep learning has demonstrated the significant advances that have occurred in the field of machine learning. In the domain of computer vision, a subset of machine learning applied to imagery, this has been particularly true for approaches that leverage Convolutional Neural Networks (CNNs), which are known for learning feature hierarchies directly from image data. Comprehensive reviews of CNNs are available in the rapidly growing body of literature (Cong and Zhou, 2022; Voulodimos et al., 2018); select examples relevant to the mechanics focus of this work are discussed here. In general, CNNs can be described as a machine learning approach that takes images as input, decomposes them into multiple arrays representing the features of the image, and further extracts information from the arrays (Nie et al., 2020). CNNs have proven effective across a number of image processing tasks, including recognition of objects, classes, and categories, and have also been shown to be capable of resolving mechanics-based problems. The early prominence of CNNs in computer vision tasks derives from the success of model architectures such as AlexNet (Krizhevsky et al., 2012), which secured an early success in the ImageNet Large Scale Visual Recognition Challenge, providing a path for leveraging deep model architectures for image recognition tasks. Similarly, LeCun et al. (2015) elaborated on the architecture of CNNs, highlighting their efficiency in handling large datasets. Further advancements include the development of ResNet by He et al. (2016), which addressed the vanishing gradient problem in deep networks. In practical applications, Esteva et al. (2017) demonstrated CNNs’ effectiveness in medical diagnostics, while Vaswani (2017) introduced the transformer model, expanding deep learning beyond traditional CNNs. These developments underscore CNNs’ transformative role in advancing artificial intelligence.
In reviewing related work on surrogate models for stress field prediction, CNNs provide a common thread for translation from model to prediction that crosses discipline boundaries. For example, Liang et al. (2018) developed a deep learning surrogate model for stress and strain analysis of aortic walls. Their study considered mean absolute error (MAE) and normalized mean absolute error (NMAE) in describing model performance, and their results showed that the proposed model was effective as a computationally efficient surrogate model. The model successfully estimated stress distributions, achieving an NMAE of 0.078% in the best case and 1.131% in the worst case. As a step forward from traditional CNN architectures, deep CNNs consist of multiple convolutional layers, often combined with pooling layers, activation functions, and fully connected layers, to extract and process features from input data (Khan et al., 2020). The “deep” nature refers to the substantial depth of these layers, enabling the model to capture complex spatial features and relationships (Khan et al., 2020; Krizhevsky et al., 2012b). Tao et al. (2022) utilized a deep CNN on photoelastic fringe patterns to estimate the stress field. While their proposed network model provided improved image space mapping, the algorithm was only able to calculate the difference of principal stresses as opposed to direct measures of stress. Shao et al. (2023) investigated the use of full CNNs and long short-term memory (LSTM) neural networks in predicting the stress-time sequence of an individual element. Full CNNs can accept input images of any size and make predictions for each pixel, retaining the spatial information of the original image (Fu and Qu, 2018). Unlike traditional CNNs, which include fully connected layers to map the extracted features to a final output, full CNNs rely solely on convolutional layers.
This design allows full CNNs to preserve spatial hierarchies throughout the network, enabling more effective feature extraction compared to traditional CNNs (Long et al., 2015). The full CNN was used to predict the stress distribution of a plate impacted by a bullet, while the LSTM was used to estimate the stress sequence of an individual element on the plate. While their results demonstrated high accuracy and could model the general stress distribution, the surrogate model had some difficulties in capturing local stress variations. Krokos et al. (2022) trained a CNN to generate stress predictions for structures characterized by randomly distributed microscopic features. The CNN was trained on specific sections of the geometry, as opposed to the entire geometry, allowing the model to discern the impact of microscale features on the Tresca stress field. Although the model produced accurate stress field predictions for test data similar to the training data, its accuracy suffered when confronted with unseen test data, including noncircular holes and varying distributions of circular holes. Bhaduri et al. (2022) adapted UNet to output stress field images from composite plates under uniaxial tension. Inputs consisted of various numbers of fibers and spatial arrangements. The model was able to predict stress fields accurately; however, the error in all metrics increased as the number of fibers decreased. Bolandi et al. (2022) employed a deep CNN to emulate FEA and forecast high-resolution stress distributions in loaded two-dimensional steel plates subjected to varying loading and boundary conditions in the elastic region. When compared to finite-element simulations that utilized a partial differential equation solver, the network could predict the von Mises stress distribution with an acceptable level of accuracy.
In related work, Bolandi et al. (2022b) utilized the same concept to predict dynamic stress distributions across 100 time steps by including long short-term memory (LSTM) layers in their deep CNN. While the model was able to predict the stress contours of steel gusset plates, it struggled to reproduce the high stress concentrations at 90-degree corners.
Related research on stress fields has also included neural networks without convolutional layers, such as recurrent neural networks or generative adversarial networks (Jiang et al., 2021; Jokar and Semperlotti, 2021). Liu et al. (2021) presented RitzNet, an unsupervised neural network model employing deep neural network functionalities as the approximation model for the prediction of two-dimensional stress fields. While the network was able to produce stress field results that were comparable to the ground truth, their model struggled with singularities and stress concentrations. Jokar and Semperlotti (2021) introduced Finite Element Neural Network (FENN), combining pre-trained neural network models and FEA to mimic the response of interconnected physical systems. The model leveraged Bidirectional Recurrent Neural Networks, capable of managing variable input sequence sizes and bidirectionality, enabling the network to receive feedback from both preceding and subsequent states. The model evaluated the mechanical response of a one-dimensional elastic bar under static loading, achieving a maximum error of less than 2.5% at each node compared to the reference solution obtained from FEA. Feng and Prabhakar (2021) introduced a neural network designed to achieve stress distributions across different types of composite materials, including models featuring volume fraction randomness and those with spatial randomness. In comparison to StressNet, the proposed model demonstrated enhanced accuracy, particularly in models characterized by volume fraction randomness and significant local stress concentrations. Nevertheless, it encountered challenges in accurately predicting stress in models with spatial randomness, leading to errors in stress predictions. More recently, Jiang et al. (2021) introduced StressGAN, a generative adversarial network (GAN), to generate stress distributions when presented with arbitrary geometries, loads and boundary conditions. 
StressGAN incorporated a conditional Generative Adversarial Network (cGAN) framework, wherein the generator was trained to map input conditions and geometries to output stress distributions, while the discriminator’s role was to differentiate between genuine samples and those generated by the generator. Two datasets were used to evaluate StressGAN’s performance: a fine-mesh multiple-structure dataset and a coarse-mesh cantilever beam dataset. The model was tested on both the complete dataset and subsets categorized by different criteria, revealing that StressGAN outperformed the baseline model, particularly in scenarios featuring intricate geometry and varied load and boundary conditions. Nevertheless, StressGAN encountered challenges when dealing with geometries featuring openings. Yang et al. (2021) used a cGAN to predict strain and stress tensors without prior domain knowledge. In their study, two-dimensional composite material shell geometries from FEM were used to train and test the model, which was able to accurately predict stress and strain tensors while considering the mechanical principles of the soft and brittle materials in the input geometries. Nie et al. (2020) proposed SCSNet and StressNet to predict stress fields in two-dimensional cantilever structures. When trained and validated on the original dataset, which consisted of variations of hole size, shape, and location, SCSNet had a mean relative error of 10.43% and StressNet had a mean relative error of 2.04%. Given their efficiency and the accuracy reported above, these models are adopted in this study and are discussed in greater detail in Section 5.
Contribution
These works have demonstrated the potential for characterizing mechanical behavior using deep learning surrogate models, a requirement for our goal of developing asset-level digital twins for structural systems. Deep learning surrogates enable faster, near real-time analysis for structural health monitoring. One of the advantages of using these models is the ability to predict structural behavior without relying on computationally demanding traditional finite element analysis tools. This advancement would not only accelerate the analysis process but also broaden the scope of applications in structural engineering.
To further advance this digital twin concept, our study builds on the modeling framework developed by Nie et al. (2020), which introduced two convolutional neural networks (CNNs) to predict von Mises stress fields from the structure’s geometry, loading pattern, and boundary conditions. Among the various strategies described in the literature review, this approach provides a robust foundation because of the accuracy of its stress field predictions for a range of structural configurations that are similar in type to our dataset. Therefore, adopting and further extending their approach aligns with our goal of a more generalizable, robust, and experimentally validated framework. The first CNN evaluated, the Single Channel Network (SCSNet), consists of a single input channel and is limited to modeling relatively simple mechanics problems. Its input combines the geometry and the load locations in one channel, while the vector of load magnitudes is concatenated with the intermediate features inside the network. The second network, StressNet, is a Multi-Channel Input Network employing five input channels. This design enables the model to be adaptable and applicable to various two-dimensional conditions. The first channel defines geometry with a value of 1 (solid) or 0 (void), the second and third channels define the load ($q_x$ and $q_y$), and the fourth and fifth channels specify boundary conditions (−1 for constrained and 0 for free). In the original study by Nie et al. (2020), both models were trained and tested on two-dimensional linear elastic cantilever beams subjected to external static loads and were able to successfully forecast stress fields in cantilevered structures with diverse geometries and loads.
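The five-channel StressNet encoding described above can be sketched as a stacked array. This is a minimal illustration under stated assumptions: the grid is stored as 24 rows by 32 columns, the fourth and fifth channels are taken to be the x- and y-direction constraints, and the function name `encode_stressnet_input` and the unit load are hypothetical.

```python
import numpy as np

ROWS, COLS = 24, 32  # ASSUMED orientation of the 32 x 24 element grid


def encode_stressnet_input(solid, load_x, load_y, fixed_x, fixed_y):
    """Stack geometry, load, and boundary-condition maps into 5 channels."""
    geometry = np.asarray(solid, dtype=np.float32)    # 1 = solid, 0 = void
    q_x = np.asarray(load_x, dtype=np.float32)        # x-direction load
    q_y = np.asarray(load_y, dtype=np.float32)        # y-direction load
    # -1 marks a constrained element, 0 a free one (per-direction split
    # between channels 4 and 5 is an ASSUMPTION).
    bc_x = np.where(fixed_x, -1.0, 0.0).astype(np.float32)
    bc_y = np.where(fixed_y, -1.0, 0.0).astype(np.float32)
    return np.stack([geometry, q_x, q_y, bc_x, bc_y], axis=-1)


# Example: plate with a square void, fixed left edge, unit x-load on the right
solid = np.ones((ROWS, COLS), dtype=bool)
solid[8:12, 10:14] = False
load_x = np.zeros((ROWS, COLS))
load_x[:, -1] = 1.0                                   # hypothetical magnitude
load_y = np.zeros((ROWS, COLS))
fixed = np.zeros((ROWS, COLS), dtype=bool)
fixed[:, 0] = True

x = encode_stressnet_input(solid, load_x, load_y, fixed, fixed)
print(x.shape)  # (24, 32, 5)
```

Keeping each physical quantity in its own channel lets loads and constraints appear at any element, which is what allows the same encoding to cover the random datasets.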
While previous CNN-based surrogate models like SCSNet and StressNet have demonstrated success in predicting stress fields for structured 2D problems, they have been primarily trained and tested on datasets with controlled and relatively simple geometries, loading, and boundary conditions. This limits their generalizability to the more complex or irregular cases that are common in real-world structures. In contrast, the present work extends the surrogate modeling framework to substantially more challenging and unstructured scenarios by introducing datasets with unconstrained randomness in geometry, load distribution, and boundary conditions, under which traditional CNN surrogates have not previously been tested. Furthermore, the Controlled Random dataset deliberately isolates different sources of variability, enabling a systematic evaluation of model robustness and failure modes that is not available in prior studies. This dataset is also explicitly designed to assess generalizability by training the CNN on controlled-random configurations and evaluating its performance on a separate, known dataset. This approach enables a systematic assessment of whether exposure to diverse yet structured variability allows the model to generalize to previously unseen and more realistic scenarios, a capability that is quantitatively examined in this work.
Our focus is to extend the robustness of existing FEA-surrogate models to be able to predict unseen case scenarios. This work also integrates these enhanced surrogate models into an iterative inverse-updating workflow. This workflow demonstrates how surrogate predictions can support near real-time digital twin updating, which addresses a key challenge that prior CNN-based stress predictors did not explore.
This approach aims to improve the robustness, generalization, and practical applicability of CNN-based surrogate models, forming a foundational step toward developing asset-level digital twins capable of integrating both simulated and experimental data. The following points summarize the main contributions and highlight the novelty of this study:
• Introduction of three new datasets (structured, fully random, and controlled random) to evaluate model generalization.
• Extension and retraining of SCSNet and StressNet under these datasets to assess failure modes and robustness in unstructured mechanics problems that were not addressed in earlier work.
• This work takes a step toward generalizable digital twins capable of integrating real-world structural uncertainties.
• A near real-time updating example of the digital twin is presented to emphasize the importance of the FEA surrogate within the framework. The updating method, originally introduced as a proof of concept by Zhiyanpour et al. (2025), is used here to demonstrate how the virtual model can respond to physical measurements.
Methodology
This study develops CNN-based surrogate models to replace finite element simulations for predicting the full-field mechanical response of 2D structural components. Two existing architectures, SCSNet and StressNet (Nie et al., 2020), are adapted and upgraded from TensorFlow 1 to TensorFlow 2. SCSNet and StressNet were selected because they were specifically developed for spatial stress prediction tasks and have previously demonstrated strong performance as surrogate finite element models in structural response prediction (Nie et al., 2020). The focus of this study is therefore not on proposing new neural network architectures, but on evaluating how dataset design and input encoding influence the robustness and generalization of surrogate models within a digital twin framework. As shown by Nie et al. (2020), both CNN models can predict the structural behavior of two-dimensional plates under varied loading, geometry, and boundary conditions. This use case reflects planar elements subjected to in-plane loading (e.g., structural connections, plate elements, or tension coupons) and can be validated against experimental data in future studies. A key contribution of this work is the generation of three datasets with increasing levels of randomness to represent different 2D plate configurations. These datasets aim to improve the generalizability and robustness of the models and, with them, the accuracy and versatility of the surrogates. The resulting surrogate model is integrated into a digital twin framework that supports continuous near real-time updating based on observed structural behavior.
Finite element analysis ground truth
Finite element analysis (FEA) is a powerful computational approach for describing structural behavior such as stress, strain, and displacement. FEA is a well-established method, and a comprehensive discussion of it can be found in the literature (Allaire, 1999; Bathe, 2006; Buchanan, 1995; Dhatt et al., 2012; Huebner et al., 2001). In summary, FEA maps the structure into discrete elements that describe the system boundary conditions, loads, and constitutive relationships, enabling the solution of the partial differential equations that describe the full-field mechanical response. For structural systems, this full-field mechanical response is inclusive of the displacements and deformations that define the stress/strain tensors. Within this work, von Mises stress serves as the response measure for assessing the CNNs. For a plate element, the von Mises stress combines the three in-plane stress components into a single equivalent stress value: $\sigma_{xx}$ (normal stress in the x-direction), $\sigma_{yy}$ (normal stress in the y-direction), and $\sigma_{xy}$ (shear stress in the xy-plane). The von Mises stress ($\sigma_{vm}$) is defined in equation (1) as:

$$\sigma_{vm} = \sqrt{\sigma_{xx}^{2} - \sigma_{xx}\sigma_{yy} + \sigma_{yy}^{2} + 3\sigma_{xy}^{2}} \tag{1}$$
The CNN model could be developed for other mechanical quantities (e.g., displacement, deformation, or individual stress components), but the von Mises stress provides an aggregate mechanical response that can also be compared with material failure criteria.
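As a minimal sketch of how the plane-stress von Mises measure can be evaluated element-wise from the three in-plane components, the snippet below uses NumPy. The function name and the numerical values are illustrative assumptions and are not taken from the datasets in this study.

```python
import numpy as np

def von_mises_plane_stress(s_xx, s_yy, s_xy):
    """Equivalent von Mises stress for a plane-stress state (element-wise).

    Implements sqrt(s_xx^2 - s_xx*s_yy + s_yy^2 + 3*s_xy^2).
    """
    s_xx, s_yy, s_xy = (np.asarray(a, dtype=float) for a in (s_xx, s_yy, s_xy))
    return np.sqrt(s_xx**2 - s_xx * s_yy + s_yy**2 + 3.0 * s_xy**2)

# Illustrative values only (hypothetical, not from the study's datasets).
uniaxial = von_mises_plane_stress(100.0, 0.0, 0.0)   # reduces to the normal stress
pure_shear = von_mises_plane_stress(0.0, 0.0, 50.0)  # reduces to sqrt(3) * shear stress

# The same call works on full 24 x 32 stress fields.
field = von_mises_plane_stress(np.full((24, 32), 10.0),
                               np.zeros((24, 32)),
                               np.zeros((24, 32)))
```

The two scalar cases are standard sanity checks: under uniaxial stress the equivalent stress equals the applied stress, and under pure shear it equals the shear stress multiplied by the square root of three.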
Model description
In this study, deep convolutional neural networks (CNNs) are formulated to predict von Mises stress distributions. These CNNs are designed to serve as surrogates for the physics-based representations of structural behavior traditionally derived from FE models. SCSNet and StressNet are described in the following subsections.
SCSNet: The implemented SCSNet model has a single input channel and predicts von Mises stress values. The input channel contains geometry and loading information in matrix format, and the output matrix contains the von Mises stress value for the corresponding element in the two-dimensional plate. SCSNet is structured as a convolutional autoencoder (encoder-decoder) featuring constant CNN layers, allowing the model to learn to encode the input into simpler signals and then reconstruct it. The model takes a 32 pixel by 24 pixel matrix as input. Beyond this input, the model combines multiple complementary layer types with various shapes, including convolutional, max-pooling, reshaping, and fully connected layers. By passing the input through these layers, the model extracts features and predicts stress values. The architecture consists of five convolutional layers; two of these are dedicated to downsampling, and each of those two is followed by a pooling layer. The output of the convolutional layers is flattened through a reshaping function and passed into two fully connected layers. The load vector is then concatenated with the latent features, and the result is processed through additional fully connected layers. Finally, three deconvolutional layers upsample the features to the original input matrix size. Zero padding is used to keep the input and output shapes consistent (Nie et al., 2020). The model architecture is shown in Figure 2 (drawn using the Visualkeras library in Python).
Figure 2. SCSNet architecture with single-channel input and von Mises stress output.
StressNet: To improve on the accuracy and versatility of SCSNet, StressNet was introduced by Nie et al. (2020). The SCSNet model has a simpler architecture and is designed for less complex scenarios. StressNet has multiple input channels and employs a downsampling-and-upsampling structure integrated with five squeeze-and-excitation (SE) ResNet modules. The downsampling component consists of three convolutional layers followed by five squeeze-and-excitation layers, and the upsampling component comprises three deconvolutional layers. The input for StressNet consists of five channels that represent different aspects of geometry, loading patterns, and boundary conditions. Each channel is fed into the model as a two-dimensional 32 by 24 matrix. As with SCSNet, the output of StressNet is a single-channel von Mises stress field with matrix dimensions of 32 by 24. In contrast to SCSNet, the enhanced architecture of StressNet employs variable kernel sizes and residual blocks to mitigate the vanishing gradient problem through adaptive layer depth selection, resulting in a more dynamic and flexible model. The model architecture is shown in Figure 3 (drawn using the Visualkeras library in Python).
Figure 3. StressNet architecture with multi-channel input: geometry, loading, and boundary conditions as inputs, with von Mises stress as the output.
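The multi-channel input encoding can be illustrated with a short NumPy sketch. The text states only that the five channels cover geometry, loading, and boundary conditions; the specific split used here (geometry, boundary condition in x, boundary condition in y, load in x, load in y), the channel order, the row/column orientation, and the function name `encode_multichannel` are all assumptions made for illustration.

```python
import numpy as np

NY, NX = 24, 32  # assumed orientation: 24 rows (y) by 32 columns (x)

def encode_multichannel(geometry, bc_x, bc_y, load_x, load_y):
    """Stack five per-element fields into one (NY, NX, 5) array.

    Channel order is an assumption, not confirmed by the source.
    """
    channels = [np.asarray(c, dtype=np.float32)
                for c in (geometry, bc_x, bc_y, load_x, load_y)]
    for c in channels:
        if c.shape != (NY, NX):
            raise ValueError(f"expected shape {(NY, NX)}, got {c.shape}")
    return np.stack(channels, axis=-1)

# Hypothetical cantilever-style sample: solid plate with one square void,
# left edge fixed in x and y, right edge loaded in x and y.
geometry = np.ones((NY, NX))
geometry[10:14, 12:16] = 0.0          # square void
bc_x = np.zeros((NY, NX)); bc_x[:, 0] = 1.0
bc_y = np.zeros((NY, NX)); bc_y[:, 0] = 1.0
load_x = np.zeros((NY, NX)); load_x[:, -1] = -20.0
load_y = np.zeros((NY, NX)); load_y[:, -1] = 35.0

x = encode_multichannel(geometry, bc_x, bc_y, load_x, load_y)
```

Keeping each physical quantity in its own channel means the network does not have to disentangle loads from supports inside a single matrix, which is one plausible reading of why the multi-channel encoding generalizes better.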
Dataset
To train SCSNet and StressNet, three different datasets were generated through simulation using SolidsPy, a finite element method (FEM) Python library (Gómez and Guarín-Zapata, 2018). SolidsPy is an open-source 2D FEM framework widely used in research and teaching for linear elastic analysis, and it has been employed in several recent studies to generate simulation-based datasets for surrogate modeling and structural mechanics applications (Jiang et al., 2021; Nie et al., 2018, 2020; Wilt et al., 2020). The use of SolidsPy in this work was motivated by its established credibility in the literature. To further validate its suitability, we performed a direct side-by-side comparison with the commercial FEM software ANSYS, observing an average discrepancy of approximately 10% between the two solutions, which is acceptable for data-driven surrogate model training. The Known dataset was generated first. In this dataset, the boundary conditions and loading patterns were defined only at the ends of the plates, and square voids of various sizes were moved over the surface. To generalize the CNN models, a second, more versatile dataset was introduced. This second dataset is referred to as the Random dataset because all elements are generated using a randomization function, with each element potentially representing a void, solid region, boundary condition, or applied load. However, to cover all possible sample scenarios in the second dataset at the resolution of the sample plate, which is 24 × 32, 768⁴ samples would need to be created, which is impractical and not scalable. The Random dataset therefore represents full randomization without consideration of the real-world physical representation of the model. For this reason, a third dataset, termed the Controlled Random dataset, was introduced as a constrained alternative that expands the variability of the Known dataset while remaining more computationally manageable than the fully Random dataset.
The Controlled Random dataset is a constrained version of this randomness: it retains the same boundary conditions at the edges of the plates and uses moving square patterns similar to those of the Known dataset, but applies randomness to the elements within those patterns. The Controlled Random dataset serves as a cornerstone for further extending the generalizability and robustness of the CNN models. The following subsections describe the structure and design of the three datasets in more detail.
Dataset 1 (Known): The first dataset describes a two-dimensional plate fixed in both the longitudinal (x) and transverse (y) directions at the left end, with a uniform load applied at the right end in the x and y directions. Within this geometric constraint, the magnitude of the load is varied from 0 to 100 N/mm² in steps of 20 N/mm², and the orientation angle θ relative to the longitudinal x-axis ranges from 0 to 2π in steps of π/6. This dataset contains 2,561 geometry patterns and 169,026 samples in total. The geometry patterns consist of square voids of varied sizes moved over the surface. A sample of the known dataset for the single-input channel model (SCSNet) is shown in Figure 4(a), and the same sample for the multi-input channel model (StressNet) is shown in Figure 4(b). All two-dimensional samples have the same size of 32 by 24 mm² and use a structured quadrilateral mesh with an element size of 1 mm² for the finite element analysis. The material of the quadrilateral elements was defined as homogeneous, isotropic, and linear elastic, with a Young’s modulus of 210 GPa and a Poisson’s ratio of 0.3.
Figure 4. Example known dataset input/output with load in x = −20 N/mm² and load in y = 35 N/mm²: (a) for SCSNet with single-channel input and von Mises stress as output, and (b) for StressNet with multi-channel input, including geometry, loading, and boundary conditions as inputs, and von Mises stress as output.
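The load cases described above are parameterized by a magnitude and an orientation angle, which resolve into x and y components as in the hedged sketch below. The helper name `load_components` is hypothetical. The example magnitude of 40 N/mm² at 2π/3 is an inference that appears consistent with the (−20, 35) N/mm² sample in the Figure 4 caption; it is not stated in the source. The count of 66 load patterns per geometry is the value reported for the datasets, and how the magnitude and angle grids combine to give 66 is not derived here.

```python
import math

def load_components(magnitude, theta):
    """Resolve a load of given magnitude and angle (radians from the x-axis)."""
    return magnitude * math.cos(theta), magnitude * math.sin(theta)

# Assumed example: 40 N/mm^2 at 120 degrees (4 * pi/6).
px, py = load_components(40.0, 4 * math.pi / 6)
# px is -20 and py is about 34.6, close to the (-20, 35) sample in Figure 4.

# Reported dataset size: geometry patterns times load patterns.
n_geometry_patterns = 2561
n_load_patterns = 66
n_samples = n_geometry_patterns * n_load_patterns
```

The product reproduces the 169,026 samples reported for the Known dataset.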
Dataset 2 (Random): The second dataset describes a two-dimensional plate in which all the elements can be randomized: each element has a 25% probability of being a void and a 75% probability of being solid. A solid element can in turn be a load location, a boundary condition, or a regular solid, with a probability of 33% each. The randomization parameters were empirically selected to introduce sufficient variability in geometry, loading, and boundary conditions while maintaining stable finite element solutions. A void probability of 25% was adopted to introduce noticeable geometric diversity without excessively fragmenting the structural domain. For solid elements, equal probability (33%) for load, boundary condition, or regular solid was used to avoid bias toward any particular mechanical configuration. This balanced sampling increases diversity in the training data and supports improved generalization of the CNN surrogate. The data generation could produce discontinuities in which groups of solid elements were surrounded by voids, like an island. In such cases, these solid elements were converted to voids to ensure model stability and integrity. In this dataset, all plates maintained the same overall geometry and material properties as the known dataset. For the elements with applied loads, the load magnitude was varied from 0 to 30 N/mm² in steps of 5 N/mm², and the angle θ relative to the longitudinal x-axis ranged from 0 to 2π in steps of π/6. For this dataset, 3,000 random scenarios were defined, and based on the different load conditions, the total sample size was 233,922. Figure 5 provides an illustration of a random sample for both the single-input CNN (Figure 5(a)) and the multi-input CNN (Figure 5(b)).
Figure 5. Example random dataset input/output with load in x = 5 N/mm² and load in y = 5 N/mm²: (a) for SCSNet with single-channel input and von Mises stress as output, and (b) for StressNet with multi-channel input, including geometry, loading, and boundary conditions as inputs, and von Mises stress as output.
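The randomization and island-removal rules for the Random dataset can be sketched as follows. This is an illustrative reading, not the authors' generator: the integer labels, the function names, the use of 4-connectivity (elements sharing an edge), and the choice to keep only the largest connected group of non-void elements are all assumptions.

```python
from collections import deque
import numpy as np

VOID, SOLID, BC, LOAD = 0, 1, 2, 3  # hypothetical integer labels

def random_plate(rng, ny=24, nx=32, p_void=0.25):
    """Each element is void with probability p_void; otherwise it is a regular
    solid, a boundary condition, or a load location with equal probability."""
    is_solid = rng.random((ny, nx)) >= p_void
    kind = rng.integers(SOLID, LOAD + 1, size=(ny, nx))
    return np.where(is_solid, kind, VOID)

def remove_islands(plate):
    """Keep the largest 4-connected group of non-void elements; void the rest."""
    ny, nx = plate.shape
    labels = np.zeros((ny, nx), dtype=int)
    sizes = {}
    current = 0
    for i in range(ny):
        for j in range(nx):
            if plate[i, j] == VOID or labels[i, j]:
                continue
            current += 1
            labels[i, j] = current
            queue = deque([(i, j)])
            count = 0
            while queue:  # breadth-first flood fill of one component
                r, c = queue.popleft()
                count += 1
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < ny and 0 <= cc < nx
                            and plate[rr, cc] != VOID and not labels[rr, cc]):
                        labels[rr, cc] = current
                        queue.append((rr, cc))
            sizes[current] = count
    if not sizes:
        return plate.copy()
    keep = max(sizes, key=sizes.get)
    return np.where(labels == keep, plate, VOID)

rng = np.random.default_rng(0)
plate = remove_islands(random_plate(rng))
```

Applying `remove_islands` a second time leaves the plate unchanged, which is a quick check that only one connected structural domain remains.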
Dataset 3 (Controlled Random): The third dataset can be described as a hybrid of the first two datasets, with a constrained degree of randomness. The Controlled Random dataset was introduced as an intermediate step in the dataset design process. After first developing the Known dataset with well-defined structural patterns, a fully Random dataset was generated to expand the variability of the sample space. However, because the fully randomized design space becomes extremely large and impractical to cover comprehensively, a constrained randomization strategy was needed. The Controlled Random dataset therefore preserves the global structural configuration of the Known dataset while introducing controlled element-level randomness within the internal pattern region. This approach expands the diversity of the training samples in a more structured and computationally manageable way, with the goal of improving model robustness and generalization. The square void patterns used in the dataset can also be interpreted as simplified representations of localized structural degradation or material loss, such as corrosion or damage zones, which may occur in real structural systems. While the current implementation uses simplified geometric patterns, this concept provides a foundation for future studies exploring more realistic deterioration mechanisms or damage scenarios. In this dataset, the plate has the same 32 by 24 mm² geometry as the first two datasets, with the same structured mesh size of 1 mm² for the finite element analysis. Similar to the known dataset, the left end is fixed in both the longitudinal and transverse directions, with the load applied in both directions on the right edge. Within this dataset, elements internal to the structure could be loaded based on a randomization function.
Similar to the known dataset, the same moving squares concept was employed to allow for different size patterns, but instead of restricting these patterns to voided regions, randomness is applied. The randomness in this dataset follows the same method used in the random dataset: each element has a 25% probability of being void and a 75% probability of being solid. For solid elements, there is an equal 33% chance of them being designated as a load location, a boundary condition, or remaining as a regular solid. The protocol for eliminating element islands was also employed within this dataset. Since the number of geometry patterns created by moving squares is 2,561, we generated an equal number of random patterns to match this count. By defining the load magnitude similarly to the known dataset, the controlled random dataset resulted in 66 load patterns and a total of 169,026 structural samples. Figure 6 provides an illustration of a sample of the controlled random dataset for both the single-input CNN (Figure 6(a)) and the multi-input CNN (Figure 6(b)).
Figure 6. Example controlled random dataset input/output with load in x = −20 N/mm² and load in y = 35 N/mm²: (a) for SCSNet with single-channel input and von Mises stress as output, and (b) for StressNet with multi-channel input, including geometry, loading, and boundary conditions as inputs, and von Mises stress as output.
Model training and testing
Table 1. Summary of model training/testing.
Loss function and metrics
To evaluate the performance of the CNN models, mean squared error (MSE), mean absolute error (MAE), and mean relative error (MRE) were used to quantify the quality of the model predictions, following Nie et al. (2020). The formulas for MSE, MAE, and MRE are shown in equations (2)–(4). Within these performance measures,
Results and discussion
This section summarizes and discusses the results of model training and testing. Predictions of the von Mises stress distributions for the structural members within the training solution space are included in this section. These results include both the cumulative distribution of stresses, evaluated quantitatively using the MSE, MAE, and MRE metrics, and the spatial distribution of results, which enables qualitative model characterization. Model robustness is then discussed in detail. Finally, the concept of updating in the digital twin, formulated as an inverse problem example, is demonstrated in the last section to illustrate the functionality of the FEA surrogate. This example highlights that the surrogate model is a core component within the broader digital twin framework, with the forward prediction capability being the primary focus of this paper.
Training and testing convergence
For both SCSNet and StressNet, the MSE metric provided an overall measure of model convergence and was used for both training and testing on all three datasets. Results for the two models are presented together in Figures 7–9, one figure per dataset. In each figure, the results are presented on two scales to support visualization of model convergence: (a) arithmetic coordinates and (b) logarithmic coordinates. For consistency across datasets, each model was trained for 5,000 epochs. Similar convergence characteristics were observed when comparing training and testing performance with the mean absolute error (MAE) metric. The final values of all metrics for each model are presented in Table 2 in section 6.2.
Figure 7. MSE curves for the first type of dataset for both CNNs in the testing and training sets; (a) is in arithmetic coordinates, and (b) is in logarithmic coordinates.
Figure 8. MSE curves for the second type of dataset for both CNNs in the testing and training sets; (a) is in arithmetic coordinates, and (b) is in logarithmic coordinates.
Figure 9. MSE curves for the third type of dataset for both CNNs in the testing and training sets; (a) is in arithmetic coordinates, and (b) is in logarithmic coordinates.
Table 2. Summary of SCSNet and StressNet training/testing error metrics.


Model performance
Based on the results of the training and testing process (Figures 7–9), StressNet performed better than SCSNet in terms of both convergence and the final error value after 5,000 epochs of training. In all curves, StressNet consistently exhibits a lower error than SCSNet, demonstrating the superior performance of the multi-input channel CNN. The final MSE, MAE, and MRE values for both CNN models on their training and testing sets are summarized in Table 2, along with the corresponding average von Mises stress values for each dataset. These metrics all quantify the error between prediction and ground truth, but MSE was selected to support more effective convergence because the squaring function amplifies large errors. These summary results indicate relatively poor performance of the SCSNet model and excellent performance of the StressNet model. For the SCSNet model, the relative error described by the MRE ranged from 29.48% to 74.66% for the known and controlled random datasets, respectively, whereas the relative error for the StressNet model ranged from 0.48% to 1.01% for the same two datasets. The results reinforce the observation that the StressNet model with the multi-channel input was ultimately more effective in predicting mechanical response. The results also suggest that StressNet was highly effective on the random dataset, exhibiting a relative error (MRE) of 1.49%.
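The three error measures reported in Table 2 can be sketched in NumPy as below. MSE and MAE follow their standard definitions. The normalization used for MRE here (MAE divided by the largest absolute stress in either field, expressed in percent) is an assumption and may differ from the paper's equation (4). The input values are illustrative only.

```python
import numpy as np

def mse(pred, true):
    """Mean squared error over all elements."""
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    return float(np.mean((pred - true) ** 2))

def mae(pred, true):
    """Mean absolute error over all elements."""
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    return float(np.mean(np.abs(pred - true)))

def mre(pred, true):
    """Assumed definition: MAE normalized by the largest absolute stress
    found in the prediction or the ground truth, in percent."""
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    scale = max(np.max(np.abs(pred)), np.max(np.abs(true)))
    return 100.0 * mae(pred, true) / float(scale)

# Hypothetical 2 x 2 stress fields (not results from this study).
truth = [[0.0, 10.0], [20.0, 40.0]]
prediction = [[1.0, 9.0], [22.0, 40.0]]

errors = (mse(prediction, truth), mae(prediction, truth), mre(prediction, truth))
# errors == (1.5, 1.0, 2.5)
```

Because MSE squares each residual, the single 2-unit error contributes four times as much as each 1-unit error, which illustrates why it penalizes large local deviations more strongly than MAE.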
Evaluating the von Mises stress prediction qualitatively allows for a visual comparison of how training performance evolves and of how the predictions compare with the ground truth results derived from the finite element models. Figure 10 illustrates this evolution for the StressNet model on samples from the testing set of each dataset, demonstrating the improved predictions as the models converge toward the 5,000-epoch training limit. Comparing the predictions from the final epoch with the ground truth reveals no clear visual distinction between them, suggesting that StressNet can describe both the global response and the local response characteristics within the solution space.
Figure 10. StressNet performance at different epochs during training on the (a) known dataset, (b) random dataset, and (c) controlled random dataset. Left to right: epoch = 5, 50, 5000, and the ground truth.
Table 3. Summary of StressNet performance on additional stress components, including stress in xx (S_xx), stress in yy (S_yy), and shear stress (S_xy), on the controlled random dataset.
Model robustness
Table 4. Summary of StressNet training/testing MRE for three different samples in each dataset.
Since StressNet performs better than SCSNet, the results in Table 4 and the following relative error plots are calculated only for the trained StressNet. Each relative error plot shows, for every element over the whole surface, the difference between the StressNet von Mises stress prediction and the ground truth. Figure 11 displays the output of the StressNet model trained on the known dataset and tested on a sample with a configuration similar to those in the known dataset. The relative error plot shows a maximum error of less than 1%. In Figure 12, the relative error is plotted for the model trained on the controlled random dataset and tested on a sample with a configuration similar to those in the controlled random dataset. The error map shows that the relative error is under 1.7% for all elements.
Figure 11. Relative error for the StressNet prediction on a sample from the known dataset while it was trained on the known dataset.
Figure 12. Relative error for the StressNet prediction on a sample from the controlled random dataset while it was trained on the controlled random dataset.

Figure 13 shows the performance of StressNet trained on the controlled random dataset when given an input from the known dataset. The model is accurate for most of the elements, but it still struggles to recognize voids, leading to skewed results across the surface. This issue may stem from the nature of the controlled random dataset, which features similar boundary conditions near the margins but lacks sufficient voids. The colormaps also reveal that predictions are more accurate near the margins than at the voids and in the areas adjacent to them.
Figure 13. Relative error for the StressNet prediction on a sample from the known dataset while it was trained on the controlled random dataset.
Although the model demonstrates strong generalization within similar dataset distributions, several failure cases are observed in cross-dataset evaluation. Higher prediction errors occur when models trained on highly randomized datasets are applied to structured configurations, and vice versa. These discrepancies are particularly noticeable near void boundaries and regions with increased structural complexity. This behavior is attributed to differences in geometric patterns and randomization intensity between datasets, which introduce configurations that are underrepresented in the training data. The results suggest that improved generalization may be achieved by incorporating additional intermediate geometries and varying levels of randomization to better bridge structured and fully randomized configurations; however, this generalization was beyond the scope of our study.
Surrogate integration into digital twin framework
Overall, these results demonstrate that a surrogate CNN model can be an effective alternative to traditional finite element analyses for predicting the mechanical response of a structural component. The results also demonstrate some of the challenges associated with model robustness and generalizability that must be considered in developing these surrogates. Within the context of a digital twin, however, the value of a surrogate is expected to be realized in its engagement with real-world measurements that can be used to inform the virtual model (or surrogate). As an illustration of this concept, the proposed surrogates are reformulated around an inverse solution to identify unknown or uncertain characteristics within the structural system (asset-level digital twin).
To showcase the application of the proposed Convolutional Neural Networks (CNNs) within a digital twin framework, we introduce an inverse problem example in the following sub-section. The inverse problems aim to bridge the gap between observed physical behavior and the underlying parameters that govern it within the finite element analysis surrogate. By solving these inverse problems, we can effectively update the digital twin, ensuring its fidelity to the physical system as conditions change or new data becomes available. This ongoing update process is what transforms a static model into a digital twin, enabling robust structural behavior prediction and informed decision-making.
Application example
For the proposed updating strategy, we developed four convolutional neural networks (CNNs) (Zhiyanpour et al., 2025), referred to here as calculators. The CNNs were trained on the same dataset as the “known dataset.” Each CNN estimates a specific structural parameter from full-field displacements:
• geometry
• boundary conditions
• applied load in the x direction
• applied load in the y direction
All four CNNs share the same architecture. Three convolutional layers (each followed by a ReLU) first extract low- to mid-level features. Two later convolutional layers include max-pooling, which downsamples the feature maps to enlarge the receptive field, reduce computation, and help control overfitting while adding a measure of translation invariance. The resulting feature map is then flattened and passed through three fully connected layers; the final output is reshaped back to the element/grid layout to produce per-element predictions (geometry, boundary state, or loads). After training the four CNNs, we evaluated their performance using the same protocol as during training. Mean squared error (MSE) served as the loss function, with its formulation given in equation (2).
These “calculators” operate collaboratively in an iterative manner. The output of each CNN is used to update the corresponding input parameter for the next iteration. This process continues until the predicted structural characteristics converge to a stable and accurate configuration. In essence, the method solves an inverse problem: it infers the underlying structural parameters from observed surface displacement fields to iteratively update the digital twin. To illustrate the approach, the updating workflow is defined as follows. Given a measured displacement field D from the physical twin and initial digital-twin guesses
Here, C_G, C_B, C_x, and C_y are the four trained CNN “calculators.” The process repeats until the parameter updates are sufficiently small (i.e., convergence).
Each calculator is dedicated to one structural parameter (geometry, boundary conditions, or applied load in the x or y direction) and predicts it from the full-field displacement data (considered ground truth) together with the current estimates of the other unknown parameters.
Consider a cantilever plate (with the same configuration as the known dataset described in the dataset section) as a sample scenario to illustrate this updating process. In the digital twin of a 2D cantilever plate, a new displacement field is acquired from the physical asset via sensors. The updating process begins by feeding the new displacement data into the four pre-trained CNN-powered calculators along with initial guesses for the plate’s geometry, boundary conditions, and applied loads. For instance, the “Geometry Calculator” takes the known displacement, guessed boundary conditions, and guessed loads to predict an updated geometry. Simultaneously, the other calculators update their respective parameters. These newly predicted parameters then become the inputs for the next iteration. This iterative refinement continues until the predictions for geometry, boundary conditions, and loading stabilize, converging to values that accurately explain the observed displacement field. This rapid and accurate updating allows the digital twin to reflect the current state of the physical plate, even when the exact cause of a change is initially unknown.
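The iterative updating described above can be sketched as a simple fixed-point loop. This is a minimal illustration under stated assumptions, not the authors' implementation: the trained CNN calculators are replaced by toy stand-in functions that return the true parameter plus a small bias depending on the errors in the other guesses, the function names are hypothetical, and the convergence measure (relative change of all parameters between consecutive iterations, with `tol=1e-3` mirroring a 0.1% threshold) is one plausible choice.

```python
import numpy as np

def update_digital_twin(D, calculators, guesses, tol=1e-3, max_iter=50):
    """Update all parameters simultaneously until the relative change
    between consecutive iterations drops below tol."""
    state = {k: np.asarray(v, dtype=float) for k, v in guesses.items()}
    iteration = 0
    for iteration in range(1, max_iter + 1):
        new_state = {}
        for name, calc in calculators.items():
            # Each calculator sees the displacement and the *other* guesses.
            others = {k: v for k, v in state.items() if k != name}
            new_state[name] = calc(D, others)
        num = np.sqrt(sum(np.sum((new_state[k] - state[k]) ** 2) for k in state))
        den = np.sqrt(sum(np.sum(new_state[k] ** 2) for k in state))
        state = new_state
        if num / max(den, 1e-12) < tol:
            break
    return state, iteration

def make_toy_calculator(name, truth, coupling=0.1):
    """Toy stand-in for a trained CNN (assumption): returns the true field
    plus a bias proportional to the mean error in the other guesses."""
    def calc(D, others):
        bias = sum(float(np.mean(others[k] - truth[k])) for k in others)
        return truth[name] + coupling * bias
    return calc

ny, nx = 24, 32
truth = {k: np.zeros((ny, nx)) for k in ("geometry", "bc", "load_x", "load_y")}
truth["geometry"][:] = 1.0
truth["bc"][:, 0] = 1.0          # fixed left edge
truth["load_x"][:, -1] = 16.5    # loads on the right edge
truth["load_y"][:, -1] = 10.0

D = np.zeros((ny, nx, 2))        # placeholder displacement field (unused by the toys)
calculators = {k: make_toy_calculator(k, truth) for k in truth}
guesses = {k: np.zeros((ny, nx)) for k in truth}  # all unknowns start at zero

state, n_iter = update_digital_twin(D, calculators, guesses)
```

Updating all four parameters from the previous iteration's guesses, rather than sequentially, matches the description that the calculators update "simultaneously"; a sequential variant would be an equally reasonable design to test.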
Inverse problems of this type are inherently prone to singularities and non-uniqueness, because multiple combinations of geometry, loads, and boundary conditions can produce displacement fields that are indistinguishable within measurement resolution. For example, a slight geometric change near the edge of a void may generate a displacement response similar to that caused by a modest variation in load magnitude or boundary constraints, making the mapping from observed displacements back to structural parameters one-to-many. This ill-posedness is further amplified when all parameters are estimated simultaneously, as cross-correlated effects can obscure the true source of the response. To mitigate these ambiguities, the proposed framework adopts a modular “calculator” architecture that decouples the inference task into four lower-dimensional subproblems, one each for geometry, boundary conditions, and the x- and y-components of loading. By constraining each surrogate network to specialize in a single parameter family, the approach reduces the solution space explored at each iteration and lowers the degree of ill-posedness associated with the global inversion. This decomposition suppresses cross-parameter interference, promotes stable convergence, and ensures that ambiguous or nearly singular cases are resolved through incremental, parameter-specific updates rather than a single monolithic inversion step. As a result, the iterative updating process maintains robustness even when different parameter configurations yield similar displacement fields, enabling the digital twin to converge toward a physically consistent and interpretable solution. In inverse modeling applications, simulation-generated training data are inherently noise-free and may not fully represent the uncertainty present in real-world measurements. This discrepancy can affect model performance when applied to noisy observational data.
Prior studies (Ibrahim et al., 2019) have shown that incorporating representative noise patterns into the training process can improve robustness, leading to enhanced parameter identification accuracy and more stable convergence behavior under noisy conditions. These findings highlight the importance of accounting for measurement uncertainty when extending inverse modeling frameworks to structural health monitoring applications.
In Figure 14, the predictions of all four calculators are shown for a single sample after 8 iterations. The convergence threshold was set to 0.1%, defined as the relative change between predictions from two consecutive iterations, and each iteration of each calculator took approximately 20 milliseconds, supporting the goal of near-real-time updating within a digital twin framework. When evaluated on a CPU-based configuration comparable to a laptop-class system, the full iterative updating framework required approximately 130 ms per iteration, further demonstrating the feasibility of near-real-time deployment without specialized hardware. To illustrate the iteration process in more detail, Figure 15 tracks the evolution of predictions at a specific element located at coordinate (32, 4), with the bottom-left corner defined as the origin. All unknowns (geometry, loading in the x and y directions, and boundary conditions) were initialized to zero. The ground truth values for this element were: geometry = 1, loading in x = 16.5, loading in y = 10, and boundary conditions = 0. After 8 iterations, the predictions converged closely to the ground truth, with maximum errors less than approximately 8% for loading and boundary conditions. However, for a few elements (less than 1%), the model failed to correctly predict voids in the geometry.
Figure 14. Predictions of structural information using calculators with a convergence threshold of 0.1% error, resulting in 8 iterations for all elements in one sample. Each iteration for each calculator took approximately 20 milliseconds. The plots show the ground truth, predicted values, and corresponding relative errors (%).
Figure 15. Convergence of all four calculators (geometry, boundary conditions (BC), load in x, and load in y) for the element at the coordinate (32 mm, 4 mm) over 8 iterations.

Conclusion
Building on the foundational concept of a digital twin for infrastructure systems, this work presents a computational study focused on the virtual component of the digital twin framework. The study develops surrogate models as an alternative to traditional finite element models, a critical step toward real-time evaluation within the broader framework. To this end, it builds on earlier work by Nie et al. (2020) as the foundation for the surrogate modeling framework and seeks to expand the generalizability of the proposed models. Within the broader digital twin framework, these surrogate models are envisioned to characterize the mechanical response of structural systems. A core focus of this work is therefore improving the robustness and generalization of surrogate models through dataset design, while acknowledging that complete coverage of all possible structural configurations remains an open challenge.
To enhance generalizability and robustness, three new datasets were created to evaluate the performance of two deep learning models, SCSNet and StressNet, re-implemented in TensorFlow 2 based on the research by Nie et al. (2020). The model formulations follow that work, but the training and testing datasets were designed to measure surrogate performance and evaluate model robustness, a key step in formulating a digital twin framework for structural systems. The first model, SCSNet, takes a single-channel input and has a less complex architecture than the multi-channel CNN, StressNet. All three datasets consist of two-dimensional plates, and von Mises stress values are calculated for each element using the SolidsPy Python library. The first dataset is the known dataset, with 169,026 samples; the second is the fully random dataset, with 233,922 samples. Because creating all possible scenarios in the random dataset was impractical, the third dataset (the controlled random dataset) combines characteristics of the first and second datasets and contains 169,026 samples. Comparing the results across models leads to the following observations:
• Across all datasets, StressNet performed better than SCSNet, achieving high accuracy and showing promise as a surrogate replacement for traditional finite element analysis in specific scenarios. However, SCSNet had a shorter training time (Table 1), averaging about half that of the multi-channel model, owing to its single-channel input and simplified architecture.
• The StressNet model trained on the controlled random dataset predicted the known inputs significantly better than the model trained on the random dataset. This implies that the robustness of the proposed surrogate model can be achieved through an appropriate training dataset that balances coverage of the training space against data volume. Accuracy and generalizability can be improved further by increasing the dataset size and the void variability.
• The deep learning approach is a promising substitute for traditional finite element analysis, as shown by the error metrics of CNNs that were trained on one type of data and performed well on similar data. This was observed for the model trained on the known dataset and tested on a sample from the same dataset. For example, in Figure 11, which corresponds to a sample from Table 4, the Mean Relative Error (MRE) for predicting von Mises stress values across all elements is 0.16%, with a maximum error below 1%. Similarly, Figure 12 shows a sample from the controlled random dataset tested on a model trained with the same dataset; the average error over all elements in this case is 1.27% (Table 4), and the maximum error in Figure 12 is below 2%. Finally, for random samples tested on the model trained with the random dataset, the overall MRE for all samples in Table 4 is approximately 1%. These examples support the finding that surrogate models can consistently provide reliable predictions when trained and tested on similar datasets.
• By adding random patterns in a controlled format (the controlled random dataset), it is possible to create a more general model that can predict von Mises stress values over the surface of samples with slightly different patterns.
The findings from this work demonstrate the foundation of this concept, but additional training of the fully random model is required for full validation. In this work, the concept of a random pattern was first applied to the random dataset, where randomness was distributed across the entire surface. The model trained on the random dataset performed well only on similar random samples. To address this limitation, the controlled random dataset was introduced, resulting in a significant improvement in predicting von Mises stress. For example, as shown in Table 4, the Mean Relative Error (MRE) for the three presented samples tested on the model trained on the random dataset is around 30%, whereas the MRE for the model trained on the controlled random dataset is less than 8%. A closer examination of Table 2 reveals that the model trained on the controlled random dataset and tested on the known dataset achieves an MRE of just 0.85% on one of the test samples. Furthermore, Figure 13, which shows the prediction, ground truth, and error colormap for this test sample, indicates that the maximum error in predicting von Mises stress for individual elements is below 30%. The higher errors are concentrated near the void, as indicated by the figure.
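The two error figures quoted in these observations, the MRE over all elements and the maximum element-wise error, can be computed as in the following minimal sketch. It assumes the common element-wise form of relative error; excluding elements whose ground-truth stress is zero (for example, void elements) is an assumption made here to avoid division by zero and may differ from the definition used for the reported tables. The function name `relative_error_stats` is hypothetical.

```python
import numpy as np

def relative_error_stats(predicted, truth, eps=1e-12):
    """Return (mean, max) element-wise relative error in percent.

    Elements with |truth| <= eps are skipped; this masking is an
    assumption of the sketch, not a documented part of the study.
    """
    predicted = np.asarray(predicted, dtype=float)
    truth = np.asarray(truth, dtype=float)
    mask = np.abs(truth) > eps
    if not mask.any():
        raise ValueError("ground truth has no nonzero elements")
    rel = np.abs(predicted[mask] - truth[mask]) / np.abs(truth[mask]) * 100.0
    return rel.mean(), rel.max()
```

Reporting both values is useful because a low mean can hide a small region of large errors, such as the elements near a void.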
While the results demonstrate the effectiveness of the proposed surrogate modeling approach for predicting structural responses, several limitations should be noted. The present study is restricted to linear elastic material behavior under static loading conditions, as the training datasets were generated using linear finite element simulations. In addition, the current implementation focuses on two-dimensional structural configurations and considers only solid–void material representations. Therefore, the proposed model has not yet been validated for nonlinear material behavior, dynamic loading scenarios, or structures with more complex material compositions. These limitations define the current scope of applicability of the surrogate model and highlight important directions for future development toward more comprehensive digital twin frameworks.
Moving beyond the surrogate formulation to illustrate the surrogate’s role in a broader digital twin framework, this study also implemented an approach for updating structural information, formulated as an inverse problem, based on known displacement fields. By leveraging four CNNs trained to estimate geometry, boundary conditions, and loading in the x and y directions, the digital twin can iteratively refine its input parameters to align with real-world measurements. Because each iteration takes approximately 20 milliseconds per calculator, the process is consistent with the goal of near-real-time updating. This updating strategy provides a lightweight, data-driven alternative to traditional numerical calibration or retraining methods and allows the digital model to remain synchronized with the physical structure over time. The trained models achieved low error metrics on both the training and testing sets, and the iterative process converged toward the ground truth.
Overall, this work highlights the potential of deep learning–based surrogates for accelerating finite element approximations and enabling fast prediction of structural response. Within the proposed framework, the surrogate model serves as the forward component that provides rapid structural response predictions, while the inverse updating strategy demonstrates how structural parameters such as geometry, loading, and boundary conditions can be iteratively inferred from observed displacement fields. While the present study is conducted within a controlled simulation environment, the results provide insight into how dataset design influences the robustness and generalization of surrogate models. The proposed approach should therefore be viewed as an initial step toward integrating data-driven surrogate models and inverse updating strategies within broader digital twin frameworks, rather than a complete deployment-ready digital twin solution. Future work will focus on expanding dataset diversity, incorporating experimental measurements, and addressing additional real-world uncertainties.
Future work
The proposed surrogate model is expected to serve as a robust component within a generalizable digital twin framework for predicting and monitoring structural behavior. The implications may extend to broader infrastructure systems, augmenting human decision-making and advancing structural health monitoring practices. The results obtained for 2D samples provide a foundation for extending this framework to three-dimensional structural systems, which more closely represent real infrastructure components. Future work will focus on expanding the datasets to include more diverse material behaviors beyond the current solid–void representation, incorporating heterogeneous materials, nonlinear material responses, and more varied loading scenarios. Increasing geometric variability and flexibility in boundary conditions will further improve model robustness. Additionally, future efforts will incorporate real-world experimental data to validate the model’s performance in practical applications. Recent studies have also demonstrated that integrating physics knowledge into CNN loss functions can improve predictive performance (Bolandi et al., 2022). Adapting SCSNet and StressNet to incorporate physics-based constraints represents another promising direction. Future work will also explore alternative deep learning architectures, such as U-Net and attention-based or Transformer-based models, to further evaluate the impact of network design on surrogate model performance within the proposed digital twin framework. Finally, future work will investigate deployment-oriented optimizations, including pruning, quantization, and edge-device implementation.
Acknowledgments
The findings of this work were developed with support from the National Science Foundation under Grant Number 2136724. The research team would also like to acknowledge the contributions of members of the Infrastructure Simulation, Sensing and Evaluation Lab (I-S2EE) research group, including Dr. Mehrdad Shafiei Dizaji and Connor Lyons.
Author contributions
Zahra Zhiyanpour conceived the data generation strategy, developed the simulated datasets, adapted and implemented the deep learning models, and led the manuscript preparation. Ayatollah S. Yehia and Zhidong Zhang contributed to manuscript preparation and technical review. Devin K. Harris supervised the study, defined the research scope, guided the methodology, contributed to the interpretation of results, and assisted in manuscript writing and revision.
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Science Foundation under Grant Number 2136724.
Declaration of conflicting interests
The authors declare that there are no conflicts of interest with respect to the research, authorship, and/or publication of this article. Any opinions, findings, and conclusions or recommendations expressed herein are those of the authors and do not necessarily reflect the views of the National Science Foundation.
