Transformer-based dual-input deep learning framework for remaining useful life prediction using sensor fusion and attention mechanism

Abstract

Accurate prediction of the remaining useful life (RUL) of aircraft engines is critical for predictive maintenance and operational reliability. It enables airlines to plan maintenance operations, thereby avoiding unexpected failures and unnecessary unscheduled downtime. By combining sensor signals with deep learning models, RUL estimation supports safety, maximizes engine utilization, and reduces maintenance costs. This work addresses the problem of precisely forecasting the RUL of aircraft engines using NASA’s Commercial Modular Aero-Propulsion System Simulation (NASA C-MAPSS) dataset, which poses several modeling difficulties, including truncated RUL labels, operational regime shifts, nonlinear degradation patterns, and sensor redundancy, that make conventional modeling approaches inadequate. To overcome these obstacles, this paper proposes a novel transformer-based dual-input model (TDIM), which integrates raw sensor sequences and aggregated statistical features through a parallel encoding design. The TDIM is evaluated against state-of-the-art architectures on the NASA C-MAPSS turbofan engine degradation dataset. Experimental results demonstrate that the TDIM significantly outperforms the existing baselines, achieving an $R^{2}$ of 0.99 and a low root-mean-square error of 3.21 (with 99.15% prediction accuracy). Overall, the results demonstrate that the transformer-based dual-input architecture can effectively address the inherent limitations of the dataset and significantly improve RUL estimation accuracy, paving the way for more robust and reliable predictive maintenance systems.

Keywords

Aircraft engine; deep learning; NASA C-MAPSS dataset; predictive maintenance; remaining useful life

Introduction

The advent of smart industrial systems has led to the increased usage of new technologies like artificial intelligence (AI), machine learning (ML), and sensor-based monitoring. These technological advancements are essential in transforming conventional maintenance strategies into more proactive and intelligent methods.^1,2 Among their most impactful applications is predictive maintenance (PdM), which focuses on anticipating equipment failures before they occur.^3,4 According to McKinsey & Company, PdM can save 30%–50% in machine downtime and extend machine life by 20%–40%, resulting in an improvement in productivity and cost reduction.⁵ It not only reduces unplanned downtimes and maintenance but also improves operational safety and reliability.^6,7 A key activity in PdM is the precise calculation of an asset’s remaining useful life (RUL), which is the time remaining until the system declines below a functioning limit. In safety-critical industries such as aerospace, power generation, and transportation, prediction of RUL is critical to scheduling maintenance and mitigating risk.^8,9 According to Deloitte, precise RUL forecasting is important, as sudden equipment breakdowns are the cause of almost 42% of unplanned downtime in industrial environments.¹⁰ Aircraft turbofan engines, being safety-critical and expensive assets, are prime candidates for PdM systems. For instance, engine failures on aircraft have disastrous outcomes; hence, accurate prediction of RUL facilitates proactive action and optimizes asset availability.¹¹ With the proliferation of high-resolution sensor data from sophisticated machines, there is a mounting need for resilient AI-based models that can learn from time-series signals under diverse operation conditions.^12–15

To realize these objectives, modern industrial PdM architectures combine Industrial Internet of Things (IIoT)-enabled devices and industrial sensors using standardized protocols to sense and forward data to a central core engine. The core engine processes raw data, and then, data are sent through a user interface layer that includes modules such as asset managers, alarms, schedulers, and user managers to yield actionable insights and facilitate intuitive decision-making in industrial environments.^16–18

However, with the rise of IIoT and the availability of high-frequency, multivariate sensor data from industrial machinery, several challenges have emerged. These industrial datasets are typically high-dimensional, nonstationary, and noisy, and they exhibit complex temporal dynamics. These characteristics make it difficult to model and extract meaningful patterns using conventional statistical methods. Therefore, in recent years, ML and deep learning (DL) approaches have been widely used to overcome these restrictions. Recurrent neural networks (RNNs), especially long short-term memory (LSTM) networks and their bidirectional variants (BiLSTM), have demonstrated considerable potential in capturing long-range dependencies in time-series data.^19,20 More recently, transformer architectures, initially developed for natural language processing, have proved to be superior at sequential modeling tasks by exploiting attention mechanisms to access global context.

One prominent benchmark for data-driven prognostics research is the NASA Commercial Modular Aero-Propulsion System Simulation (NASA C-MAPSS) turbofan engine dataset, and in particular its FD001 subset, which represents simulated turbofan engine degradation paths under controlled operational conditions.^21–25 The C-MAPSS dataset is a computer-simulated run-to-failure dataset for aircraft jet engines, commonly used for the development and validation of RUL prediction models. It provides time-series sensor data for multiple engines under various fault modes and operating conditions and has spurred numerous studies on prognostic algorithms.^26,27 However, its complex temporal dependencies, high dimensionality, and noise level pose challenges for conventional ML approaches.

To address this, various DL models have been investigated, such as RNNs; the LSTM and BiLSTM architectures in particular have proved very capable of capturing long-range dependencies. More recently, transformer-based architectures, relying on self-attention mechanisms, have reached state-of-the-art performance in modeling sequential dependencies by effectively capturing global context. Further improvement has been achieved by hybrid architectures such as CNN-LSTM-Attention, multistream transformers, and dual-branch temporal encoders, which fuse spatial and temporal representations. However, most of these models rely on heavy cross-attention mechanisms and redundant temporal branches, which increase the computational cost and the risk of semantic dilution of degradation trends. Despite extensive research into DL architectures for RUL estimation, current approaches suffer from three important limitations: (a) they often regard all sensor data as homogeneous temporal sequences, without considering the complementary nature of dynamic and static degradation information; (b) traditional multistream and attention-based fusion models suffer from redundant temporal encoding, leading to inefficiency and information redundancy; and (c) heavy fusion mechanisms, including cross-attention and multistage fusion, increase the computational overhead, which restricts deployment in real-time industrial systems.

Hence, there is a need for a compact, computationally efficient, and heterogeneous model capable of decoupling short-term dynamic variations from long-term degradation patterns without sacrificing prediction accuracy.^22,28 To bridge these gaps, this study introduces a transformer-based dual-input model (TDIM) for RUL prediction of turbofan engines. Its novelty lies in a heterogeneous dual-stream architecture that decouples sensor data into a dynamic temporal stream (transformer encoder) for sequential sensor data and a static statistical stream (feedforward encoder) for aggregated window-level features. The model further employs a lightweight late-fusion mechanism that avoids heavy cross-attention and substantially reduces computational cost, while also integrating noise reduction, RUL clipping, and attention-based feature weighting to achieve robust temporal learning and generalization. The major contributions of this work are as follows:

This article proposes, implements, and rigorously evaluates a novel TDIM for RUL prediction in turbofan engines, which integrates temporal sequence modeling with aggregated statistical features to capture both short-term dynamics and long-term degradation patterns. A dual-stream model architecture is applied, where sequential and statistical inputs are processed in parallel for enhanced RUL estimation accuracy.

It introduces a heterogeneous dual-stream fusion mechanism that explicitly decouples the temporal and statistical streams before fusion, enabling complementary learning between dynamic and static degradation trends, in contrast to conventional single-input or twin temporal-branch models.

It validates the effectiveness and practical relevance of the proposed TDIM through experiments on the widely used NASA C-MAPSS FD001 dataset, demonstrating superior performance over state-of-the-art baselines in root-mean-square error (RMSE), R-squared ($R^{2}$), and accuracy within $\pm 10$ cycles.

The remainder of the article is organized as follows. The second section offers an extensive literature review, noting progress in DL methods for PdM. The third section describes the NASA Turbofan Engine Degradation dataset and presents the proposed methodology along with the evaluation metrics. The fourth section covers the experimental design, results, and discussion of applying DL models to the NASA C-MAPSS dataset. Finally, the fifth section concludes with the major findings and possible future research directions.

Related work

Various studies have formulated RUL prognostics using data-driven methods that monitor the degradation of a component through sensor readings and apply the resulting prognostics to maintenance planning. Since each component or system has different patterns of degradation, maintenance planning considers RUL prognostics specific to each component, in this case the engine, and suggests individualized maintenance activities. For instance, a study²⁹ presented an alarm-based PdM system for aviation engines with uncertain RUL prognostics. The system dynamically updates RUL forecasts and triggers alarms at specified thresholds, using convolutional neural networks (CNNs) for RUL estimation and integer linear programming for scheduling. The strategy, which was applied to a fleet of 20 aircraft, limits engine breakdown expenses to 7.4% of overall maintenance expenses. With 13.6 engine breakdowns over 10 years, costs increased by 24.3% compared to ideal RUL prognostics. The outcomes show the effectiveness of adaptive scheduling in maximizing dependability and minimizing maintenance expenses. Moreover, to estimate RUL in prognostics, Li et al.³⁰ proposed a deep convolutional neural network (DCNN) method that does not require physical deterioration models or expert knowledge. For feature extraction, normalized raw sensor readings are used as direct inputs through a temporal window technique. The method is experimentally evaluated on the C-MAPSS dataset for aero-engine prognostics. Compared to conventional ML techniques, the DCNN model performs better in estimating RUL. The findings show how DL may be used to improve system reliability and maintenance scheduling in industrial prognostics and health management (PHM).

Lee and Mitici³¹ described a deep reinforcement learning (DRL) approach for predictive aircraft maintenance driven by probabilistic RUL prognostics, in which CNNs with Monte Carlo dropout produce RUL forecasts with quantified uncertainty. By reducing reliance on strict deterioration limits, DRL optimizes maintenance activities. Compared to a mean RUL replacement policy, the technique reduces overall maintenance costs for turbofan aircraft engines by 29.3%. Engine life waste is limited to 12.81 cycles, and 95.6% of unplanned maintenance is avoided. The results demonstrated the effectiveness of DRL in PdM planning that is both cost-effective and adaptive. Likewise, Wang et al.³² described an RUL prediction approach for turbofan engines that combines a random forest (RF) for feature selection with a Bayesian-optimized multilayer perceptron (MLP). While an exponential smoothing technique lowers noise, the RF algorithm identifies the features most relevant to an engine’s lifespan. Bayesian optimization was used to tune the MLP model, which improved prediction accuracy and parameter selection. Assessments conducted on the NASA C-MAPSS dataset indicate a 6.1% drop in RMSE compared to other methods. The approach performed noticeably better in intricate multioperational situations than both classical and ML-based approaches.

The development of RUL prediction models focused on NASA’s C-MAPSS dataset was continued by Isbilen et al.,³³ who explored both ML and DL models, improving accuracy through hybrid strategies such as ensemble learning and similarity-based approaches. While ML models showed practical strengths, DL models achieved the highest accuracy. Peng et al.¹¹ developed a turbofan engine RUL prediction model that uses an improved echo state network with integrated attention for online adaptive RUL estimation and an improved stacked sparse autoencoder for deep feature extraction. This model demonstrated a 75% accuracy improvement over traditional approaches, achieving an RMSE of 10.14 and a NASA score of 197. Its design focused on stable predictions, robust feature extraction, and noise reduction. Similarly, Solis et al.³⁴ proposed a stacked DCNN model for estimating a turbofan engine’s RUL. A first DCNN extracts a low-dimensional feature vector from raw data, and a second DCNN then uses these features to estimate RUL. Bayesian approaches tested on the NASA C-MAPSS dataset are used to optimize the model. The NASA score was 0.64 and the RMSE was 6.24, both of which were better than conventional methods. The approach proved to be reliable and accurate in RUL prediction, placing third in the 2021 PHM Conference Data Challenge. To address the drawbacks of RNNs, such as gradient explosion and the failure to consider spatial data, Peng et al.³⁵ introduced a spatiotemporal attention-based method for RUL prediction of turbofan engines. This model recovers the relationships between spatial and temporal features by combining temporal position encoding with a multihead attention mechanism. With RMSE values of 11.07 (FD001), 18.10 (FD002), 10.73 (FD003), and 17.00 (FD004), the method outperformed existing approaches in evaluations conducted on NASA’s C-MAPSS dataset. Its goal was to increase accuracy and stability, especially in multioperational scenarios. As a result, the method improves PdM by using DL to generate a more accurate RUL estimate. New probabilistic evaluation criteria for RUL prediction of turbofan engines are presented in the study by de Pater and Mitici.²⁹ To forecast RUL distributions, that study uses a CNN in conjunction with Monte Carlo dropout. Accuracy, sharpness, and reliability are assessed using the Continuous Ranked Probability Score (CRPS), weighted CRPS, Coverage, and Reliability Score. The reported RMSE values on the NASA C-MAPSS dataset are 12.76 (FD001) and 18.03 (FD004). By incorporating uncertainty quantification, the proposed metrics offer a comprehensive evaluation format for probabilistic RUL prognostics and improve on conventional RMSE-based evaluations.

Recently, a comprehensive survey³⁶ reviewed CNN, LSTM, and hybrid attention models for RUL prediction in aero-engines, comparing traditional statistical, ML, and DL models. It highlights the advantages of hybrid DL models, especially ones combining CNN, LSTM, and attention mechanisms, for addressing complex degradation patterns. The paper reinforces the importance of proper data pre-processing and appropriate training protocols to enhance prediction accuracy on widely used datasets like NASA C-MAPSS. Overall, it offers valuable insights toward constructing PdM systems in aerospace engineering. In line with the theme of PdM through DL, this work³⁶ goes a step further by focusing on hyperparameter optimization to improve CNN-LSTM performance, thereby offering insights complementary to prior research focused on architectural innovation. Building on this, another study³⁷ presents the design and key specifications of an improved multistage LSTM with clustering (ILSTMC) model for the RUL prediction of aircraft engines. Integrating K-means clustering into LSTM improves the predictive accuracy of multistage applications. On NASA’s C-MAPSS dataset, ILSTMC gives a mean RMSE reduction of 0.85% in the final stage compared to LSTM, and an average reduction of 1.87% per stage. The life-cycle-wise accuracies improved by an average of 0.59% and 1.84% between the same stages. The results thus show that ILSTMC reduces prediction errors relative to LSTM, RNN, and Linear Prediction (LP) techniques, making it a reliable platform for PdM in civil aviation.
While several dual-branch prognostic frameworks have been explored in prior research, such as the Siamese-Attention Augmented Model³⁸ and the EMD-LSTM Dual-Branch approach,³⁹ their architectural formulations differ fundamentally from the design adopted in this work. In existing dual-branch methods, two transformed or parallel temporal sequences, such as paired signal streams or Empirical Mode Decomposition (EMD)-derived temporal components, are commonly processed and combined by interaction-heavy fusion mechanisms, including cross-attention or recurrent fusion. In contrast, the TDIM architecture proposed in this paper integrates two heterogeneous information pathways: a transformer-encoded temporal sequence that captures the dynamic evolution of sensors and a nontemporal statistical representation that summarizes long-term degradation within each window. These streams are encoded using different computational blocks tailored to their respective roles and later combined through a lightweight late-fusion module that avoids cross-branch attention. This design reduces computational complexity while preserving complementary information, which distinguishes TDIM from prior dual-branch models that operate solely on temporal inputs. Recent research has also extended toward more advanced multibranch architectures and health-indicator-driven prognostics, which use horizontally expanded subnetworks to learn supplementary degradation features from parallel temporal views, enhancing robustness and fault tolerance in RUL prediction for both aero-engines and rotating machinery.^40,41 Similarly, comprehensive reviews on Health Indicator (HI)-dependent RUL prediction methodologies^42,43 focus on the integration of domain-driven degradation indicators with DL models for improved interpretability and stability across various types of machinery. Together, these works underscore the increasing importance of hybrid and multibranch frameworks that balance data-driven feature extraction with physically meaningful health trends, an idea closely aligned with the motivation of the proposed TDIM.

In addition to the studies discussed above, various DL-based state-of-the-art prognostic models have further improved RUL prediction through significant architectural innovations. For instance, sensor-aware capsule networks enhance multivariate feature extraction by modeling hierarchical part-whole relationships and improving robustness under sensor noise or failure.⁴⁴ Multiscale cross-channel attention networks, in turn, leverage multiresolution temporal embeddings and cross-sensor attention to capture heterogeneous degradation rates across engine subsystems, thus overcoming the limitations of single-scale temporal models.⁴⁵ Complementing these, conformal-prediction-based frameworks introduce uncertainty-aware RUL estimation by generating statistically reliable prediction intervals rather than point estimates and thus improve decision confidence in safety-critical environments.⁴⁶ In addition, comprehensive surveys published during 2023–2024 reveal emerging trends such as hybrid attention mechanisms, multibranch fusion architectures, and health-indicator-dependent learning, all of which reflect a shift toward models integrating domain knowledge with learned representations. Together, these recent works signal the rapid evolution of DL for aero-engine prognostics and reinforce the need for models capable of jointly capturing fast transient sensor behavior and slow, monotonic degradation patterns, a goal that directly inspired the design of the proposed TDIM framework.

Although numerous models have demonstrated success on the NASA C-MAPSS dataset, limitations persist in effectively capturing both short- and long-range temporal dependencies, motivating the proposed dual-input framework.

Materials and methods

This section describes the dataset, the proposed DL architecture, and the experimental procedures employed for predicting the RUL of turbofan engines. The research uses the publicly available NASA C-MAPSS FD001 dataset, which simulates engine degradation under realistic operational conditions. The section first presents the application context of the dataset. Second, it presents the methodology built around the proposed TDIM. Finally, it presents the performance evaluation strategy, including the metrics, training setup, and benchmarking strategy employed to examine the predictive accuracy and generalizability of the models.

Application and dataset

In the context of PdM, precise RUL estimation of key components like aircraft turbofan engines is vital for reducing unexpected downtime, optimizing maintenance planning, and improving operational safety. This research focuses on RUL prediction for turbofan engines using the NASA C-MAPSS dataset, a widely used benchmark in prognostics research. It contains run-to-failure time-series data from several engines, each exposed to different fault modes and conditions. Every engine starts from a healthy state and gradually deteriorates, with the failure appearing toward the end of the trajectory. The dataset has 26 features: engine ID, time cycle, three operating settings, and 21 sensor measurements, some of which reflect the degradation. There are four data subsets, FD001, FD002, FD003, and FD004, each with distinct operating and fault conditions; this work uses the FD001 subset, as described in Table 1. Each subset has one training set, in which measurements are collected up to engine failure, and one test set. In the test set, sensor recordings are cut off before failure, and the target is to forecast the RUL at that point for each engine. In all instances, every engine starts with a different amount of initial wear. Over time, the health of an engine deteriorates as it approaches failure. The objective is to forecast, for each engine in the test set, the number of operational cycles remaining after its last recorded cycle.

Table 1.

FD001 dataset description.

Attribute	Description
Dataset name	FD001
Source	NASA C-MAPSS
Number of units (engines)	100
Operational conditions	1
Fault modes	1 (single fault mode)
Data files	train_FD001.txt, test_FD001.txt, RUL_FD001.txt
Task	RUL prediction
Data type	Multivariate time-series
Features per time step	26 (1 engine ID, 1 cycle, 3 operational settings, 21 sensor readings)
Sensor measurements	21 sensors (e.g., temperature, pressure, fan speed, etc.)
Format	CSV-like, space-delimited text files
Objective	Learn degradation patterns and predict RUL based on historical sensor data

NASA C-MAPSS: NASA’s Commercial Modular Aero-Propulsion System Simulation; RUL: remaining useful life.

For this research, the FD001 subset of the NASA C-MAPSS dataset, a widely recognized standard for RUL prediction tasks, was used for RUL estimation. It comprises a training set (20,631 rows × 26 columns, matching the counts in Table 2), a test set (13,096 rows × 26 columns), and RUL labels (100 rows × 1 column, one per test engine).
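As a minimal sketch of how such a space-delimited file could be read, the snippet below uses NumPy and checks the 26-column layout listed in Table 1. The column names and the `load_cmapss` helper are illustrative assumptions, not part of the original study, and a two-row in-memory stand-in replaces the real `train_FD001.txt`.

```python
import io
import numpy as np

# Column layout from Table 1: engine ID, cycle, 3 operational settings,
# 21 sensor readings. The names are illustrative, not an official schema.
COLUMNS = (["unit", "cycle"]
           + [f"setting_{i}" for i in range(1, 4)]
           + [f"s_{i}" for i in range(1, 22)])

def load_cmapss(source):
    """Read a whitespace-delimited C-MAPSS file into an (n_rows, 26) array.

    `source` may be a path such as "train_FD001.txt" or a file-like object.
    """
    data = np.loadtxt(source, ndmin=2)
    if data.shape[1] != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, got {data.shape[1]}")
    return data

# Synthetic two-row stand-in for the real file (values are placeholders).
demo_rows = ["1 1 " + " ".join(["0.0"] * 24),
             "1 2 " + " ".join(["0.5"] * 24)]
train = load_cmapss(io.StringIO("\n".join(demo_rows)))
print(train.shape)
```

With the real file, `load_cmapss("train_FD001.txt")` would be expected to return one row per engine cycle.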

The FD001 dataset simulates a fleet of turbofan engines operating under a single operating condition with a consistent failure mode. It contains run-to-failure trajectories of 100 engines, each represented as multivariate time-series data. Each engine instance is monitored across up to 362 cycles, and failure is defined at the final recorded cycle. The data include 21 sensor measurements and 3 operational settings. For pre-processing, sensor noise and irrelevant features were filtered based on correlation and variance analysis, retaining the most informative sensor channels. The RUL target is constructed by assigning a maximum RUL cap (e.g., 125 cycles) and then linearly degrading until the end of life for each engine. This capping reduces target imbalance and enhances model convergence. The dataset is split into a training set (full trajectories) and a test set (partial sequences with provided ground truth RUL values).
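The capped RUL target described above can be sketched as follows. This is an illustrative implementation under the stated assumptions (failure at the last recorded cycle, a fixed cap); the function name `capped_rul` is hypothetical, and the toy example uses a cap of 3 only to keep the output short.

```python
import numpy as np

def capped_rul(unit_ids, cycles, cap=125):
    """Piecewise-linear RUL: (last cycle of the unit - current cycle), capped."""
    unit_ids = np.asarray(unit_ids)
    cycles = np.asarray(cycles, dtype=float)
    rul = np.empty_like(cycles)
    for u in np.unique(unit_ids):
        mask = unit_ids == u
        # Failure is assumed to occur at the final recorded cycle of each unit.
        rul[mask] = cycles[mask].max() - cycles[mask]
    return np.minimum(rul, cap)

# Toy example: unit 1 runs 5 cycles, unit 2 runs 3 cycles.
units = np.array([1, 1, 1, 1, 1, 2, 2, 2])
cycles = np.array([1, 2, 3, 4, 5, 1, 2, 3])
labels = capped_rul(units, cycles, cap=3)
print(labels)  # [3. 3. 2. 1. 0. 2. 1. 0.]
```

Early cycles of unit 1 are held at the cap, after which the label decreases linearly to zero at the last cycle.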

Furthermore, the correlation heatmap of the NASA C-MAPSS turbofan engine deterioration dataset is illustrated in Figure 1, which represents the correlations among sensor readings, operating parameters, and engine cycle data. The majority of sensor readings show only modest connections with one another, suggesting that each one adds something special to the failure prediction models. A strong correlation has been found between “Corrected fan speed (rpm)” and “Corrected core speed (rpm),” suggesting that these predictors were redundant. This observation is critical for feature selection since highly correlated variables might cause computation redundancy and decrease model efficiency. The heatmap employs a color gradient, with dark blue denoting high correlations and red for low correlations. The analysis is helpful in the selection of important features and in reducing multicollinearity for better RUL prediction in PdM tasks.

Figure 1.

Correlation heatmap of sensor readings and operational parameters from the NASA Turbofan Engine Degradation dataset.
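One possible way to turn the variance and correlation analysis into a channel filter is sketched below. The thresholds (`var_tol`, `corr_threshold`) and the greedy keep-first rule are assumptions chosen for illustration; the source does not state the exact criteria used.

```python
import numpy as np

def select_channels(X, var_tol=1e-8, corr_threshold=0.95):
    """Return column indices kept after dropping near-constant columns and
    columns highly correlated with an already kept column."""
    X = np.asarray(X, dtype=float)
    candidates = [j for j in range(X.shape[1]) if X[:, j].var() > var_tol]
    if len(candidates) < 2:
        return candidates
    corr = np.corrcoef(X[:, candidates], rowvar=False)
    kept_pos = []  # positions within `candidates`
    for a in range(len(candidates)):
        if all(abs(corr[a, b]) < corr_threshold for b in kept_pos):
            kept_pos.append(a)
    return [candidates[a] for a in kept_pos]

# Synthetic data: column 1 nearly duplicates column 0, column 3 is constant.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, 2 * a + 0.01 * rng.normal(size=200), b,
                     np.full(200, 518.67)])
selected = select_channels(X)
print(selected)
```

In this toy case the constant column and the near-duplicate column are removed, mirroring how constant sensors and the redundant speed channels would be handled.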

The NASA C-MAPSS sensor readings and operating parameters are summarized in Table 2. It includes descriptive statistics for each parameter, including count, mean, standard deviation, minimum, maximum, and quartiles. The features include engine and cycle indices; rotational speeds in rpm (physical fan and core speeds and their corrected counterparts); temperature readings at the fan inlet and at the Low-Pressure Compressor (LPC), High-Pressure Compressor (HPC), and Low-Pressure Turbine (LPT) outlets; pressure values for the fan inlet, bypass duct, HPC outlet, and HPC static pressure; and operational efficiency indicators, such as fuel–air ratio, engine pressure ratio, and bypass ratio. Together, these features characterize how engine behavior changes as degradation progresses. Based on this statistical assessment, the next phase involves data pre-processing, where noisy, constant, or less informative signals are filtered, and the data are reshaped for time-series modeling through windowing and RUL labeling.

Table 2.

Statistical summary of sensor readings and operational settings from the NASA Turbofan Engine Degradation dataset.

Feature	Count	Mean	SD	Min	25%	50%	75%	Max
Engine	20,631.0	51.506	29.228	1.000	26.000	52.000	77.000	100.000
Cycle	20,631.0	108.808	68.889	1.000	52.000	104.000	156.000	362.000
setting_1	20,631.0	−0.000009	0.00219	−0.0087	−0.0015	0.0000	0.0015	0.0087
setting_2	20,631.0	0.000002	0.00029	−0.0006	−0.0002	0.0000	0.0003	0.0006
setting_3	20,631.0	100.000	0.00000	100.000	100.000	100.000	100.000	100.000
Fan inlet temp (°R)	20,631.0	518.670	0.00000	518.670	518.670	518.670	518.670	518.670
LPC outlet temp (°R)	20,631.0	642.689	0.50005	641.210	642.325	642.640	643.000	644.530
HPC outlet temp (°R)	20,631.0	1590.523	6.1312	1571.040	1586.260	1590.100	1594.380	1616.910
LPT outlet temp (°R)	20,631.0	1408.934	9.0007	1382.250	1402.360	1408.040	1414.555	1441.490
Fan inlet pressure (psia)	20,631.0	14.620	≈0	14.620	14.620	14.620	14.620	14.620
Bypass-duct pressure (psia)	20,631.0	21.610	0.0014	21.600	21.610	21.610	21.610	21.610
HPC outlet pressure (psia)	20,631.0	553.677	0.88509	549.850	552.810	553.440	554.010	556.060
Physical fan speed (rpm)	20,631.0	2388.097	0.070985	2387.900	2388.050	2388.090	2388.140	2388.560
Physical core speed (rpm)	20,631.0	9065.243	20.283	9021.730	9053.100	9060.660	9069.420	9244.590
Engine pressure ratio (P50/P2)	20,631.0	1.300	0.00000	1.300	1.300	1.300	1.300	1.300
HPC outlet static pressure (psia)	20,631.0	47.541	0.26709	46.850	47.350	47.510	47.700	48.530
Ratio fuel flow to Ps30 (pps/psia)	20,631.0	521.414	0.73755	518.690	520.960	521.480	521.950	523.300
Corrected fan speed (rpm)	20,631.0	2388.096	0.071919	2387.880	2388.040	2388.090	2388.140	2388.560
Corrected core speed (rpm)	20,631.0	8143.753	19.076	8099.940	8133.245	8140.540	8148.310	8293.720
Bypass ratio	20,631.0	8.424	0.037505	8.3249	8.4149	8.4389	8.4656	8.5484
Burner fuel–air ratio	20,631.0	0.030	≈0	0.0300	0.0300	0.0300	0.0300	0.0300
Bleed enthalpy	20,631.0	393.206	1.5487	388.000	392.000	393.000	394.000	400.000
Required fan speed	20,631.0	2388.000	0.0000	2388.000	2388.000	2388.000	2388.000	2388.000
Required fan conv. speed	20,631.0	100.000	0.0000	100.000	100.000	100.000	100.000	100.000
HPT cool air flow	20,631.0	38.262	1.4076	38.1400	38.7000	38.9300	38.9500	39.4300
LPT cool air flow	20,631.0	23.289	1.0265	22.8942	23.2218	23.2796	23.3668	23.6184

SD: standard deviation.

Method: a TDIM

The FD001 subset of the NASA C-MAPSS dataset is utilized for RUL estimation. For each unit, the RUL is taken to be the maximum cycle of the unit minus the present cycle, capped at a maximum value (e.g., 125). Fourteen informative sensor signals are selected using statistical criteria such as correlation and variance analysis, and min–max scaling is used for normalization. Pre-processing involves imputation of missing values, normalization of sensor readings, and conversion of the time-series data into fixed-length sequences with a sliding window scheme. Each resulting sequence is then labeled with the corresponding RUL value, forming a supervised learning dataset. This pre-processing scheme ensures data uniformity, facilitates temporal modeling, and prepares the data for input into the subsequent DL architectures.
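The scaling and windowing steps can be sketched for a single engine as follows. This is a hedged illustration: the helper names are hypothetical, and labeling each window with the RUL at its last cycle is a common convention assumed here rather than a detail confirmed by the text. In practice the scaling bounds would be fitted on the training set and the windowing applied engine by engine.

```python
import numpy as np

def min_max_scale(X, lo=None, hi=None):
    """Scale each column to [0, 1]; pass training-set lo/hi to reuse them."""
    X = np.asarray(X, dtype=float)
    lo = X.min(axis=0) if lo is None else lo
    hi = X.max(axis=0) if hi is None else hi
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against constant columns
    return (X - lo) / span, lo, hi

def sliding_windows(X, rul, window=30, stride=1):
    """Cut one engine's trajectory into fixed-length windows.

    Each window is labeled with the RUL at its last cycle (assumption).
    """
    X, rul = np.asarray(X), np.asarray(rul)
    starts = list(range(0, len(X) - window + 1, stride))
    if not starts:  # trajectory shorter than one window
        return np.empty((0, window, X.shape[1])), np.empty((0,))
    seqs = np.stack([X[s:s + window] for s in starts])
    targets = np.array([rul[s + window - 1] for s in starts])
    return seqs, targets

# Synthetic engine: 40 cycles, 14 sensor channels.
rng = np.random.default_rng(1)
raw = rng.normal(size=(40, 14))
rul = np.minimum(np.arange(39, -1, -1), 125)
scaled, lo, hi = min_max_scale(raw)
seqs, targets = sliding_windows(scaled, rul, window=30)
print(seqs.shape, targets[0], targets[-1])  # (11, 30, 14) 10 0
```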

Moreover, to enhance the model’s ability to capture both temporal dynamics and overall degradation trends, statistical feature engineering is applied to each time-series window. For each fixed-size window of W = 30 cycles, the mean, standard deviation, minimum, and maximum are calculated for every selected sensor channel. The window length of 30 cycles was selected because, in the FD001 degradation trajectories, meaningful changes in sensor trends generally develop over medium-range horizons. A smaller window, such as 15–20 cycles, is not sufficient to capture the gradual evolution of degradation, whereas a larger one, such as more than 40 cycles, introduces redundant information and dilutes the model’s sensitivity to localized variations in health. Previous turbofan RUL studies using CNN-LSTM and transformer architectures also report the best performance with windows of 25–35 cycles, further supporting 30 cycles as a balanced choice that captures transient dynamics without unnecessary temporal redundancy.
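The window-level aggregation can be sketched as below. With the 14 selected channels and four descriptors per channel, each window yields 4 × 14 = 56 statistical features. The function name and the use of the population standard deviation are assumptions made for illustration.

```python
import numpy as np

def window_statistics(seqs):
    """Aggregate each window over time into [mean, std, min, max] per channel.

    seqs: array of shape (n_windows, window_length, n_channels)
    returns: array of shape (n_windows, 4 * n_channels)
    """
    seqs = np.asarray(seqs, dtype=float)
    stats = [seqs.mean(axis=1), seqs.std(axis=1),  # population std (assumption)
             seqs.min(axis=1), seqs.max(axis=1)]
    return np.concatenate(stats, axis=1)

# Two synthetic windows of 30 cycles and 14 channels.
seqs = np.arange(2 * 30 * 14, dtype=float).reshape(2, 30, 14)
features = window_statistics(seqs)
print(features.shape)  # (2, 56)
```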

These composite descriptors are then combined with the raw sequence inputs to enable dual-feature modeling, in which the model learns temporal changes and global summary trends at the same time. Trend-type features such as slopes are also extracted to measure directional changes in the sensor signals over time, enhancing the model’s capacity to identify degradation paths. This mixture of trend-based and statistical characteristics enriches the input representation and supports more robust RUL prediction. The dual-input DL architecture, TDIM, is designed to capture both the temporal dynamics and the global statistical aspects of the sensor data, using sequential representations and aggregated information to improve RUL prediction accuracy.^38,39 The following subsections describe this model in detail.
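A slope feature of the kind mentioned above can be obtained from an ordinary least-squares fit of each sensor channel against the time index. The sketch below is one possible formulation, not necessarily the one used in the study; the function name `window_slope` is hypothetical, and the text does not specify how slope features are combined with the 56 statistical descriptors.

```python
def window_slope(values):
    """Least-squares slope of one sensor channel against the time index 0..n-1 (needs n >= 2)."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


rising = [2.0 * t + 1.0 for t in range(30)]   # synthetic channel with a steady upward drift
flat = [5.0] * 30                              # synthetic channel with no trend
```

A positive slope indicates a channel that drifts upward within the window, and a slope near zero indicates a stable channel.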

Proposed TDIM. A TDIM is proposed to exploit both the temporal patterns of the sensor signals and their statistical degradation trends. The complete architecture is shown in Figure 2, and its implementation pipeline is as follows:

(a) Input stage: Each engine trajectory from the C-MAPSS dataset is split into sliding windows. Two views are extracted from every window:

• A multivariate time-series sequence: a short segment (30 time steps) of 14 selected sensors, preserving how the signals evolve over time.

• An aggregated feature set: a compact summary of the same window, obtained by calculating statistical descriptors (mean, standard deviation, minimum, maximum) for each sensor, providing 56 features.

(b) Model stage (TDIM): The model handles the following two inputs in parallel:

• The time-series sequence is projected into a higher-dimensional space and processed with two stacked transformer encoder layers, each with four-head self-attention. This allows the model to learn relationships between sensors and time steps. The whole sequence is then reduced to a single 64-dimensional representation through a temporal pooling operation.

• The statistical features are passed through a small feedforward network that compresses them into a 32-dimensional representation.

• Finally, the two representations are concatenated together to produce a single 96-dimensional feature vector that integrates rich temporal dynamics with overall statistical trends.

(c) Output stage: The model outputs a single scalar value, representing the predicted RUL of the engine for the given cycle window. The dual-input structure balances the capture of short- and long-term degradation patterns. While the transformer encoder captures rapid temporal variations and transient dependencies among sensor readings, the statistical input branch encodes slowly evolving degradation trends aggregated over each time window. By fusing these complementary features in the late fusion layer, the model learns fast-changing sensor dynamics and gradual health deterioration simultaneously. This gives a temporally balanced view of the engine’s condition and supports more stable and accurate RUL prediction.

Figure 2.

Architecture of the proposed TDIM. TDIM: Transformer-based dual-input model.

The model’s mathematical formulation is expressed in Equations (1)–(6). Given an input sequence $X \in R^{T \times d}$ , where $T$ is the window length and $d$ is the number of selected signals, the data are first linearly projected:

Z = X W_{e} + b_{e}

(1)

where $W_{e}$ and $b_{e}$ denote the learnable parameters of the input projection layer.

After obtaining the projected representation $Z \in R^{T \times d_{model}}$ , the transformer derives the query, key, and value matrices through three independent learnable linear projections:

Q = Z W_{Q}, K = Z W_{K}, V = Z W_{V}

(2)

where $W_{Q}, W_{K}, W_{V} \in R^{d_{model} \times d_{k}}$ are the projection parameters associated with each attention head. The resulting matrices form the inputs to the self-attention mechanism, whose output is computed as follows:

Attention (Q, K, V) = softmax (Q K^{T} / \sqrt{d_{k}}) V

(3)

where $Q$ , $K$ , and $V$ are the query, key, and value matrices derived from the input, respectively, and $d_{k}$ is the key dimension. The output is pooled (e.g., mean pooling) to obtain a global sequence representation $h_{seq}$ .
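Equation (3) and the subsequent pooling step can be traced with a small single-head example in plain Python. This is only an illustrative sketch: the helper names are assumptions, the learned projections of Equation (2) are omitted, and the toy matrices stand in for real projected sensor windows.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def attention(q, k, v):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]          # K^T
    scores = matmul(q, k_t)                       # T x T similarity matrix
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


def mean_pool(seq):
    """Average over the time axis to obtain one vector per window."""
    return [sum(col) / len(seq) for col in zip(*seq)]


# Toy input: T = 3 time steps, d_k = 2 (identity projections for brevity).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, weights = attention(x, x, x)
h_seq = mean_pool(out)
```

Each row of `weights` sums to 1 and indicates how strongly one time step attends to the others; mean pooling then collapses the attended sequence into a single vector.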

Simultaneously, aggregated features $x_{agg}$ are passed through a feedforward layer:

h_{agg} = ReLU (W_{a} x_{agg} + b_{a})

(4)

where $h_{agg}$ denotes the hidden representation of the aggregated features; $x_{agg}$ denotes the aggregated statistical features (mean, standard deviation, min, and max); and $W_{a}$ and $b_{a}$ denote the learnable weight and bias of the feedforward layer.

The two vectors are concatenated:

h = [h_{seq}; h_{agg}]

(5)

where $h_{seq}$ denotes the encoded sequence representation after pooling and $h_{agg}$ denotes the hidden representation of the aggregated features. The concatenated vector $h$ is passed through fully connected layers to predict the RUL:

\hat{y} = FC (h)

(6)

where FC denotes a fully connected (dense) layer for RUL regression.
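The dimensions involved in Equations (4)–(6) can be checked with a small forward pass. The sketch below uses randomly initialized weights and a random stand-in for the pooled transformer output, so the predicted value is meaningless; only the shapes (64 + 32 = 96) follow the text. The helper names and the 96 → 64 → 1 head are taken as assumptions from the description of the fusion stage.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible


def linear(x, w, b):
    """Dense layer: one output per weight row."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]


def relu(x):
    return [max(0.0, v) for v in x]


def init(out_dim, in_dim):
    """Small random weights and zero biases (illustrative only)."""
    w = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    return w, [0.0] * out_dim


h_seq = [rng.uniform(-1, 1) for _ in range(64)]   # stand-in for the pooled encoder output
x_agg = [rng.uniform(0, 1) for _ in range(56)]    # stand-in for the 56 window statistics

w_a, b_a = init(32, 56)
h_agg = relu(linear(x_agg, w_a, b_a))             # Eq. (4)
h = h_seq + h_agg                                 # Eq. (5): list concatenation
w1, b1 = init(64, 96)
w2, b2 = init(1, 64)
y_hat = linear(relu(linear(h, w1, b1)), w2, b2)[0]  # Eq. (6): scalar RUL estimate
```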

The model, as illustrated in Figure 2, can learn both high-level statistical summaries and raw temporal patterns because of its dual-input approach, which enhances generalization. The encoded features of both branches are concatenated and used as input for a fully connected regression layer to predict the RUL. This architecture allows the model to utilize both the dynamic evolution of sensor measurements and the context information in the statistical summaries, providing strong performance across different degradation conditions.

Hyperparameter settings. Table 3 summarizes the key hyperparameters chosen for the implementation of the proposed TDIM, which are subsequently used in the algorithmic framework of Algorithm 1.

Table 3.

Hyperparameters used for the proposed TDIM.

S. no.	Hyperparameter	Value/range	Type
1	Layers of transformer	2	Discrete
2	Attention heads (nhead)	4	Discrete
3	Transformer d_model	64	Discrete
4	Aggregated feature size	32	Discrete
5	Aggregation statistics	Mean, SD, min, max	Categorical
6	Sliding window size	30	Discrete
7	Learning rate	0.001	Continuous
8	Batch size	64	Discrete
9	Early stopping patience	5	Discrete
10	Optimizer	Adam	Categorical
11	Scheduler step size	10	Discrete
12	Scheduler gamma	0.5	Continuous

TDIM: Transformer-based dual-input model; SD: standard deviation.

The aforementioned hyperparameters were identified by empirical exploration, informed by previously published RUL studies on the NASA C-MAPSS dataset. Among the configurations tested, a window size of 30 cycles emerged as the best compromise, capturing sufficient degradation context without redundant long-term information. Similarly, increasing the embedding dimension beyond 64 or the number of attention heads beyond 4 did not provide significant gains but added computational cost. Two transformer layers were sufficient to model temporal dependencies without overfitting. A moderate learning rate of 0.001 with the Adam optimizer and early stopping gave stable convergence. Together, these settings yielded consistent validation performance over multiple runs, indicating that the predictive capability of the model is insensitive to modest variations in hyperparameters. The next section presents the algorithmic framework, describing the key steps and procedures adopted to achieve efficient RUL estimation.
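The step schedule listed in Table 3 (step size 10, gamma 0.5, initial learning rate 0.001) halves the learning rate every 10 epochs. A minimal sketch of that rule follows; the function name `step_lr` is an assumption, and epochs are counted from zero.

```python
def step_lr(epoch, base_lr=1e-3, step_size=10, gamma=0.5):
    """Learning rate in effect at a zero-based epoch under a step decay schedule."""
    return base_lr * gamma ** (epoch // step_size)


schedule = [step_lr(e) for e in (0, 9, 10, 20, 30)]
```

Under these settings the learning rate is 0.001 for epochs 0–9, 0.0005 for epochs 10–19, and 0.00025 for epochs 20–29.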

RUL prediction. Algorithm 1 gives pseudocode for the proposed RUL prediction framework with the TDIM. It describes the major steps, namely data pre-processing, feature construction, model training, and validation on the C-MAPSS FD001 dataset.


Algorithm 1 RUL prediction framework.

Input: C-MAPSS FD001 dataset: train_FD001, test_FD001, RUL_FD001
Output: Predicted RUL and evaluation metrics
1. Initialize hyperparameters: WINDOW_SIZE, MAX_RUL, BATCH_SIZE, EPOCHS, d_model = 64, nhead = 4, num_layers = 2, learning rate, scheduler, patience.
2. Load and preprocess data
(a) Read dataset files
(b) Compute RUL = max(cycle) − current cycle
(c) Clip RUL to MAX_RUL
(d) Normalize selected sensor features
3. Generate training samples
Slide a window of length WINDOW_SIZE within each engine trajectory:
• $X_{seq} \in R^{WINDOW\_SIZE \times 14} \leftarrow$ raw sensor window
• $X_{agg} \in R^{56} \leftarrow$ per-sensor {mean, std, min, max} of $X_{seq}$
• $Y \in R \leftarrow$ RUL at window end
• Aggregate all tuples ( $X_{seq}$ , $X_{agg}$ , $Y$ ); split into train (80%) and validation (20%) sets
4. Define TDIM
• Sequence branch: $X_{seq} \to Linear (14 \to d_{model}) \to TransformerEncoder (num\_layers, nhead) \to$ temporal mean pooling $\to h_{seq} \in R^{64}$
• Aggregation branch: $X_{agg} \to MLP (56 \to 64 \to 32, ReLU) \to h_{agg} \in R^{32}$
• Fusion head: $concat (h_{seq}, h_{agg}) \in R^{96} \to Linear (96 \to 64) \to ReLU \to Linear (64 \to 1) \to \hat{y}$ (scalar RUL)
5. Train the model
• For each epoch:
– For each mini-batch:
* Forward pass $\to \hat{y}$ ; loss $\leftarrow SmoothL1 (\hat{y}, Y)$
* Backpropagate; Adam step; zero gradients.
– Evaluate validation loss; StepLR update; apply early stopping on plateau.
6. Evaluate each model
(a) Compute MAE, RMSE, $R^{2}$ , Accuracy
(b) Visualize: predictions, residuals, error curves
7. Compare model performance and conclude

Hence, the proposed algorithm efficiently integrates sequence modeling and feature aggregation to enhance RUL prediction accuracy. The algorithm takes advantage of the self-attention ability of the transformer to model short- and long-term dependencies in sensor measurements. The modular structure enables flexible composition of feature fusion and DL modules. The method shows strong performance on the C-MAPSS FD001 dataset and can be applied to other industrial PdM tasks.

Performance evaluation

The accuracy and generalization ability of the RUL prediction model are assessed with the evaluation metrics defined in Equations (7)–(10):

(a) R-squared ( $R^{2}$ ): It measures how well the model explains the variation in the dataset, where values closer to 1 indicate greater predictive power:

R^{2} = 1 - \frac{\sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}}{\sum_{i = 1}^{n} {(y_{i} - \bar{y})}^{2}}

(7)

(b) Mean squared error (MSE): Because it is based on squared differences, it penalizes larger errors more heavily. Together, these measurements give a holistic view of model reliability and efficiency in real-world PdM scenarios:

MSE = \frac{1}{n} \sum_{i = 1}^{n} {(y_{i} - {\hat{y}}_{i})}^{2}

(8)

(c) RMSE: It quantifies the difference between predicted and actual RUL values in the original units (cycles), where lower values indicate a more accurate model:

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {({\hat{y}}_{i} - y_{i})}^{2}}

(9)

(d) Accuracy: Accuracy at $\pm 10$ cycles is the percentage of RUL estimates that fall within a tolerance band of ±10 cycles around the actual RUL. It is a tolerance-based measure used to gauge the practical reliability of a model in PdM applications, where $\mathbb{1} (\cdot)$ equals 1 when the condition holds and 0 otherwise:

{Accuracy}_{\pm 10} = \frac{1}{n} \sum_{i = 1}^{n} \mathbb{1} (| {\widehat{RUL}}_{i} - {RUL}_{i} | \leq 10) \times 100

(10)
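The four metrics of Equations (7)–(10) can be computed together from paired true and predicted values. The sketch below is illustrative only; the function name `regression_metrics` and the four sample values are assumptions, not results from the study.

```python
import math


def regression_metrics(y_true, y_pred, tol=10):
    """R^2, MSE, RMSE, and the percentage of predictions within +/- tol cycles."""
    n = len(y_true)
    y_bar = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))   # residual sum of squares
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)               # total sum of squares
    mse = ss_res / n
    within = sum(abs(p - t) <= tol for t, p in zip(y_true, y_pred))
    return {
        "r2": 1 - ss_res / ss_tot,
        "mse": mse,
        "rmse": math.sqrt(mse),
        "acc": 100 * within / n,
    }


# Hypothetical clipped RUL labels and predictions for four windows.
y_true = [125, 100, 50, 10]
y_pred = [120, 104, 62, 9]
metrics = regression_metrics(y_true, y_pred)
```

In this toy case three of the four predictions fall within ±10 cycles, so the tolerance accuracy is 75%, while the single 12-cycle miss dominates the squared-error terms.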

Since the FD001 dataset employs clipped RUL targets (maximum RUL = 125 cycles) to mitigate extreme target imbalance, all the evaluation metrics, including $R^{2}$ , MSE, RMSE, and accuracy, were computed on the consistently truncated ground truth labels. This ensures fair and consistent model comparison across all samples. As both the numerator and denominator of the $R^{2}$ computation operate in the same transformed target space, no correction to the formula is required. Preliminary validation further indicated that $R^{2}$ values computed before and after truncation differ only marginally, confirming that the standard formulation remains valid for the clipped targets.

To train the proposed architecture, fixed hyperparameters from earlier experimentation were used. The input sequence length was fixed at 30, and the RUL was clipped to 125 to reduce target skewness. Training ran for a maximum of 50 epochs with a batch size of 64. An embedding size of 64 was used with two encoder layers and four attention heads for the TDIM. The Adam optimizer with a learning rate of 0.001 was used, together with the StepLR scheduler and early stopping. The SmoothL1 loss function was adopted for its robustness to outliers. No automated hyperparameter tuning was performed; however, the empirically selected values yielded stable training and consistent performance on the C-MAPSS FD001 dataset.
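The robustness of the SmoothL1 loss comes from its piecewise form: quadratic for small residuals and linear for large ones. The sketch below assumes the common transition point beta = 1.0, which the text does not state, and the function name `smooth_l1` is hypothetical.

```python
def smooth_l1(y_hat, y, beta=1.0):
    """Smooth L1 loss for one prediction: quadratic below beta, linear above it."""
    diff = abs(y_hat - y)
    if diff < beta:
        return 0.5 * diff * diff / beta
    return diff - 0.5 * beta


# Hypothetical mini-batch: one small residual and one large (outlier) residual.
preds = [10.5, 40.0]
targets = [10.0, 10.0]
batch_loss = sum(smooth_l1(p, t) for p, t in zip(preds, targets)) / len(preds)
```

The 30-cycle outlier contributes 29.5 to the loss rather than the 450 it would contribute under a halved squared error, so a few badly predicted windows do not dominate the gradient.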

Experimental results

This section presents the results of the experiments undertaken to assess the efficacy of the proposed DL architecture for RUL prediction. The evaluation used several criteria, including MAE, RMSE, $R^{2}$ , and accuracy within a $\pm 10$ -cycle range. The experiments used the NASA C-MAPSS FD001 dataset and focused on comparing the proposed model with traditional baselines and assessing its prediction reliability.

Experimental design and setup

All experiments were conducted using the FD001 subset of the NASA C-MAPSS dataset. Sensor readings were normalized using min–max scaling, and sequences were windowed using a fixed size of 30 cycles. Model training was executed on graphics processing unit (GPU)-enabled hardware to accelerate computation. Optimization was performed using the Adam optimizer with an initial learning rate of 0.001. A StepLR scheduler was applied with step_size = 10 and gamma = 0.5 to reduce the learning rate periodically. Early stopping was used to prevent overfitting, with a patience of 15 epochs. The models were trained for a maximum of 100 epochs with a batch size of 64. Table 4 lists the experimental platform specifications, including the local system configuration, cloud computing environment, and supporting software libraries used for model development and evaluation.
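Early stopping on the validation loss can be expressed as a simple patience counter. The following sketch is one common formulation under stated assumptions: the function name `early_stop_epoch`, the strict-improvement criterion, and the sample loss curve are illustrative, and the patience is left as a parameter.

```python
def early_stop_epoch(val_losses, patience):
    """Return the zero-based epoch at which training stops, or None if it never triggers."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, wait = loss, 0      # improvement: remember it and reset the counter
        else:
            wait += 1                 # no improvement this epoch
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation curve that plateaus and then improves slightly.
losses = [5.0, 4.0, 3.0, 3.1, 3.2, 3.05, 2.9]
stop_short = early_stop_epoch(losses, patience=3)
stop_long = early_stop_epoch(losses, patience=5)
```

With a patience of 3 the run stops at epoch 5, before the late improvement is seen, whereas a patience of 5 lets training continue; this illustrates why the patience value interacts with the learning-rate schedule.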

Table 4.

Experimental platform and software specifications.

Platform	Specifications
Local system	Intel Core i5-1240P (12th Gen) @ 1.70 GHz, 8-GB RAM
Cloud environment	Google Colab with NVIDIA Tesla T4 GPU (16 GB)
GPU runtime type	CUDA enabled T4 GPU (Google Colab backend)
Operating system	Windows 11
Python version	Python 3.11
Libraries used	NumPy, Pandas, Scikit-learn, TensorFlow
Development tools	Google Colab

The experimental design of this study is structured to systematically examine the performance and generalization capability of the proposed DL architecture for RUL prediction of turbofan engines under realistic and varying operating conditions. The design emphasizes fair, reproducible, and open benchmarking in the following domains:

(a) Application baselines: Standard models from the prior research, including K-nearest neighbors, support vector machines (SVMs), RFs, and a baseline LSTM model, were employed as reference points.^47–50 These represent established approaches for RUL prediction and facilitate a performance benchmark against the proposed PdM models.

(b) Methodology baselines: An advanced architecture, namely, the TDIM, was designed and systematically evaluated.^19,31,51 Its variants serve as a consistent point of reference for state-of-the-art DL methods to demonstrate the performance gain achieved.

Following the implementation and training of the proposed DL models, the next section presents and discusses the performance results. The results are reported using standard evaluation measures, namely RMSE, MAE, R², and cumulative accuracy (within ±10 cycles).

Results and comparative studies

This section presents the experimental results and performance evaluation of the proposed PdM framework for turbofan engines, specifically addressing the problem of RUL estimation under realistic operational conditions. The focus is on analyzing the predictive accuracy, the error distribution, and the overall impact of RUL forecasts on maintenance scheduling decisions. In addition to minimizing error, the study reports two further evaluation metrics, MAE and accuracy within ±10 cycles, to assess predictive reliability comprehensively. The proposed dual-input architecture used both temporal and statistical features effectively. Visualization of the prediction outcomes further supported the quality of the transformer model, which produced stable prediction curves that closely follow the true RUL values across different engine instances, with little fluctuation and small lag.

The results underscore the advantage of modern architectures such as the TDIM, which not only deliver superior prediction performance but also maintain competitive inference times, making them well suited to real-time PdM applications. To substantiate the real-time feasibility of the TDIM, inference latency was measured. The model yielded an average inference time of 1.47 ms per sample (±0.07 ms) and a measured throughput of approximately 20,877 samples per second; the throughput figure presumably reflects batched inference, since 1.47 ms per sample processed strictly sequentially would correspond to only about 680 samples per second. This level of responsiveness supports deployment in real-time use cases that require fast, cycle-wise updates of RUL estimates, and it highlights the model’s suitability for high-frequency maintenance monitoring, especially in IIoT contexts with critical latency constraints. The cumulative accuracy curve in Figure 3 shows the proportion of predictions that fall within a given error threshold as that threshold increases; the value at $\pm 10$ cycles corresponds to the accuracy metric of Equation (10).

Figure 3.

Cumulative accuracy curve for the TDIM showing the proportion of predictions within increasing RUL error thresholds. TDIM: Transformer-based dual-input model; RUL: remaining useful life.
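A cumulative accuracy curve of this kind can be computed by sweeping the error tolerance and counting the predictions that fall inside it. The sketch below is illustrative; the function name `cumulative_accuracy`, the threshold grid, and the four sample values are assumptions rather than data from the study.

```python
def cumulative_accuracy(y_true, y_pred, thresholds):
    """Percentage of predictions whose absolute error is within each threshold."""
    errors = [abs(p - t) for t, p in zip(y_true, y_pred)]
    return [100 * sum(e <= th for e in errors) / len(errors) for th in thresholds]


# Hypothetical labels and predictions; absolute errors are 5, 4, 12, and 1 cycles.
y_true = [125, 100, 50, 10]
y_pred = [120, 104, 62, 9]
curve = cumulative_accuracy(y_true, y_pred, thresholds=[0, 5, 10, 15])
```

The curve is non-decreasing by construction, and its value at a threshold of 10 equals the ±10-cycle accuracy.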

Actual RUL values and model-predicted values are compared in Figure 4. The great majority of predictions lie close to the ideal diagonal, showing how closely the predictions match the true values across the working range. The tight clustering around the red line suggests that the model predicts well across the full range of RUL values.

Figure 4.

Scatter plot of true versus predicted RUL using TDIM with the red line indicating perfect predictions. TDIM: Transformer-based dual-input model; RUL: remaining useful life.

The distribution of absolute errors in the RUL estimates of the TDIM is shown in the error boxplot in Figure 5. The box represents the interquartile range (IQR), the horizontal line inside it is the median error, and the circles represent outliers. Most errors are low, though some larger outliers exist ( $> 20$ ), which is common in real-world predictions.

Figure 5.

Boxplot of absolute prediction errors for TDIM on RUL estimation. TDIM: Transformer-based dual-input model; RUL: remaining useful life.
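The quantities read off such a boxplot can be reproduced numerically. The sketch below assumes the conventional 1.5 × IQR rule for flagging outliers, which the text does not state explicitly; the function name `boxplot_summary` and the sample errors are hypothetical.

```python
from statistics import quantiles


def boxplot_summary(abs_errors):
    """Quartiles, IQR, and upper-fence outliers for a list of absolute errors."""
    q1, median, q3 = quantiles(abs_errors, n=4, method="inclusive")
    iqr = q3 - q1
    upper_fence = q3 + 1.5 * iqr      # assumed outlier rule (1.5 x IQR)
    outliers = [e for e in abs_errors if e > upper_fence]
    return {"q1": q1, "median": median, "q3": q3, "iqr": iqr, "outliers": outliers}


# Hypothetical absolute errors (cycles) with one large miss.
summary = boxplot_summary([1, 2, 2, 3, 3, 4, 5, 25])
```

Only the upper fence is checked here because absolute errors cannot be negative, so low-side outliers do not arise.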

Overall, a compact box and short whiskers suggest low variance and robust predictions. The cumulative error plot in Figure 6 displays the running total of prediction errors for the TDIM. The histogram in Figure 7 shows the frequency distribution of absolute errors for the TDIM. Most errors lie between 0 and 10, with a peak around 3, indicating that the model often predicts close to the true RUL. A long right tail reveals a few larger errors, suggesting the presence of some outliers or challenging samples.

Figure 6.

Cumulative error curve of TDIM showing deviation between true and predicted RUL values. TDIM: Transformer-based dual-input model; RUL: remaining useful life.

Figure 7.

Absolute error distribution of TDIM predictions for RUL. TDIM: Transformer-based dual-input model; RUL: remaining useful life.

The differences between the true and predicted RUL for 200 samples are displayed in the residual plot for the TDIM in Figure 8. The residuals oscillate around zero with no discernible trend or heteroscedasticity, indicating unbiased forecasts. Figure 9 depicts the actual and predicted RUL for 200 samples using the TDIM. The predicted curve largely follows the true RUL through most of the engine trajectories, reflecting a strong predictive capability. Within the early operational cycles, where degradation patterns are minimal or not observable, the model predicts high RUL values close to the capped maximum of approximately 125 cycles, which agrees with the expected healthy state of the engines. As degradation progresses, the model gradually reduces its predictions, tracking the actual decline in RUL with little lag. Minor deviations, such as regions in which the predicted RUL remains flat while the true RUL decreases, represent transient phases before sufficient evidence of degradation becomes available. Overall, this behavior supports the robustness of the TDIM in handling both early and late life cycle phases, distinguishing healthy and degraded states along the operational timeline of the engines.

Figure 8.

Residual plot showing the difference between actual and predicted RUL using TDIM. TDIM: Transformer-based dual-input model; RUL: remaining useful life.

Figure 9.

True versus predicted RUL using TDIM. TDIM: Transformer-based dual-input model; RUL: remaining useful life.

To validate the effectiveness of the proposed model, its performance was benchmarked against multiple established baseline methods reported in the literature, as summarized in Table 5. These baselines include LSTM, XGBoost, support vector regression (SVR), RF, MLP, gated recurrent unit (GRU), and hybrid models such as the autoregressive integrated moving average support vector machine (ARIMA-SVM). DL models, particularly CNN-LSTM and DNN-RNN, provide higher accuracy, with RMSEs below 15 in certain circumstances. The TDIM outperforms all baselines with an RMSE of 3.21 and an $R^{2}$ of 0.99. To evaluate the improvement offered by the proposed DL architecture, both application-level and methodology-level baselines were considered. Table 5 compares the results of published methods, the application baselines, with the results of the baseline models trained under our unified setup. Compared to recent related works, including GRU (RMSE ∼20.55), deep neural networks (DNN) (RMSE 8.79, $R^{2}$ ∼0.85), and CNN-LSTM (RMSE ∼14.98), the proposed TDIM achieves the lowest RMSE of 3.21 and the highest $R^{2}$ of 0.99. RMSE, a direct measure of prediction error, is treated here as the primary indicator of predictive performance, and its substantial reduction indicates a clear improvement over both traditional ML and DL approaches, which supports the soundness of the proposed dual-input attention mechanism. Although the comparative baselines are dominated by single-model architectures, they are the most widely adopted benchmarks in the recent RUL literature, which keeps the comparisons fair and meaningful. Moreover, the TDIM itself exhibits ensemble-like behavior through its dual-input fusion design, which combines the temporal sensor dynamics and the aggregated statistical representation in a single model. This allows the model to learn complementary degradation patterns without the added complexity of explicit ensemble training.
All experiments and comparative evaluations were conducted using the FD001 subset of the NASA C-MAPSS dataset to ensure a consistent and fair benchmarking environment. Future research may extend the TDIM to an ensemble-based formulation to further improve generalization and robustness across diverse operating conditions. This contrast justifies the necessity of the proposed DL framework, as the internally benchmarked baseline models also fall short of the predictive reliability required for deployment.

Table 5.

State-of-the-art comparative analysis of DL techniques for RUL prediction.

Method	Ref./year	RMSE	Window size	Optimizer	Learning rate	Batch size	Epochs
ARIMA-SVM	⁵²/2019	39.68	—	—	—	—	—
Transformer + TCNN	⁵³/2021	12.31	Full sequence	Adam	—	—	—
MLP / GRU	²⁶/2022	37.56 / 20.55	50	Adam	1e-3	512	—
DNN-RNN	⁵⁴/2022	8.789	100	Adam	—	—	—
CNN-LSTM + CBAM	²⁷/2022	5.50	400	Adam	1e-2	6	400
CNN-LSTM-Attention	³⁶/2024	14.45	40	Adam	—	128	25
CNN-LSTM	⁵⁵/2024	14.98	∼20 (GA tuned)	GA-optimized	GA-optimized	GA tuned	GA tuned
LSTM	⁵⁶/2024	27.33	N/A	XGBoost	0.1	—	—
TMSCNN	⁵⁷/2024	10.26	40	Adam	1e-4	64	—
MLP	⁵⁸/2024	37.56	—	—	—	—	—
MILP	²⁰/2024	11.43	30	Adam	1.5e-3	—	—
TDIM	Proposed work	3.21	30	Adam	1e-3	64	50

TCNN: temporal convolutional neural network; CBAM: convolutional block attention module; TMSCNN: Transformer-based multiscale convolutional neural network; MILP: mixed integer linear programming; DL: deep learning; RUL: remaining useful life; ARIMA-SVM: autoregressive integrated moving average-support vector machine; GRU: gated recurrent unit; MLP: multilayer perceptron; RNN: recurrent neural network; CNN: convolutional neural network; LSTM: long short-term memory; TDIM: Transformer-based dual-input model; GA: genetic algorithm.

Ablation studies

A systematic ablation study was performed to assess the contribution of each architectural component of the proposed TDIM. Several model variants were built by selectively removing or modifying key architectural elements, such as the statistical fusion mechanism and the dual-input structure.

Model m2 (Transformer Only): It eliminates the aggregated statistical input stream and uses raw sensor sequences only. While this model produced fairly robust performance, it consistently lagged behind the full TDIM (m1), indicating the supplementary contribution of statistical features to long-range temporal modeling.

Model m3 (Hybrid Transformer + BiLSTM): It replaces the transformer’s attention-based encoder with a BiLSTM subnetwork. The hybrid architecture yielded impaired performance, reflecting architectural incompatibility and redundancy between sequential and attention modules.

Model m1 (TDIM): It uses both raw sensor sequences and aggregated features with a dual-input attention mechanism and performed the best on all metrics.

These results confirm that the dual-input and fusion structure of the TDIM is essential for improving prediction accuracy and model stability. To provide a fair, repeatable, and impartial evaluation across all model configurations, all experiments were conducted using consistent hyperparameters, data partitions, learning rate schedules, and early termination conditions. As seen from the consistent improvements in RMSE, $R^{2}$ , and prediction accuracy, the results shown in Table 6 further support the TDIM’s efficacy and industrial applicability. The ablation study was also used to gain a better understanding of the impact of key design parameters on model behavior. Three parameters were analyzed: the number of transformer encoder layers (1–4), the size of the input window (20, 25, 30, and 35), and the presence or absence of aggregated trend-based features (Agg = True/False). To determine the impact of each design decision, three model variants, m1 (transformer dual input), m2 (transformer only), and m3 (hybrid transformer + BiLSTM), were assessed. The study shows how the reliability and predictive power of the RUL estimation framework are affected by feature fusion, network depth, and temporal window length.
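The size of the ablation grid follows directly from the three parameters listed above. The short sketch below enumerates it; the variable names are assumptions, and the code only counts configurations without training any model.

```python
from itertools import product

windows = (20, 25, 30, 35)      # input window sizes
layers = (1, 2, 3, 4)           # number of transformer encoder layers
agg = (True, False)             # aggregated features on or off
models = ("m1", "m2", "m3")     # dual input, transformer only, hybrid with BiLSTM

grid = list(product(windows, layers, agg))            # one entry per table row
runs = [(m, *cfg) for m in models for cfg in grid]    # one entry per trained model
```

This gives 32 configurations per model variant and 96 training runs in total, which corresponds to the 32 rows and three column groups of Table 6.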

Table 6.

Results of ablation study showing the impact of different architectural variants on RUL prediction performance using NASA FD001 dataset.

Window	Layers	Agg features	RMSE	MAE	$R^{2}$	Acc (%)	Window	Layers	Agg features	RMSE	MAE	$R^{2}$	Acc (%)	Window	Layers	Agg features	RMSE	MAE	$R^{2}$	Acc (%)
m3 (Hybrid transformer + BiLSTM)							m2 (Transformer only)							m1 (Transformer dual input)
20	1	True	18.81	14.13	0.580	60.72	20	1	True	9.28	5.93	0.951	84.09	20	1	True	8.41	5.99	0.960	84.87
20	1	False	15.61	11.06	0.711	71.06	20	1	False	13.17	8.13	0.901	73.14	20	1	False	8.28	5.78	0.960	87.01
20	2	True	18.33	13.66	0.601	63.22	20	2	True	5.19	3.86	0.984	96.11	20	2	True	4.11	3.25	0.990	97.80
20	2	False	15.40	11.15	0.719	72.06	20	2	False	6.36	3.92	0.977	92.27	20	2	False	4.38	3.56	0.989	97.50
20	3	True	18.83	13.98	0.579	60.86	20	3	True	6.46	4.11	0.977	92.94	20	3	True	11.67	7.30	0.923	75.80
20	3	False	15.48	10.85	0.716	71.74	20	3	False	8.97	5.48	0.954	83.63	20	3	False	6.41	4.37	0.977	92.08
20	4	True	19.14	14.61	0.565	58.98	20	4	True	43.88	36.77	−0.101	10.44	20	4	True	21.79	14.04	0.727	58.52
20	4	False	15.93	11.16	0.699	71.78	20	4	False	16.81	10.77	0.838	64.48	20	4	False	7.91	4.75	0.964	87.26
25	1	True	19.23	14.54	0.572	58.70	25	1	True	7.51	5.88	0.968	85.69	25	1	True	6.58	5.06	0.976	90.13
25	1	False	15.84	11.72	0.745	70.40	25	1	False	13.14	8.31	0.902	75.30	25	1	False	5.68	4.27	0.981	94.57
25	2	True	18.01	13.52	0.625	61.49	25	2	True	6.17	4.70	0.979	94.21	25	2	True	3.98	2.49	0.995	99.78
25	2	False	15.37	11.26	0.730	71.20	25	2	False	6.09	3.42	0.985	94.60	25	2	False	3.61	3.01	0.993	99.26
25	3	True	17.96	13.92	0.627	59.47	25	3	True	6.08	3.36	0.985	96.42	25	3	True	20.35	13.64	0.768	57.46
25	3	False	15.97	10.62	0.741	74.64	25	3	False	5.45	3.06	0.989	96.28	25	3	False	3.47	2.73	0.993	99.15
25	4	True	18.16	13.83	0.619	61.44	25	4	True	5.48	3.79	0.983	93.19	25	4	True	18.78	12.75	0.801	59.36
25	4	False	15.94	11.18	0.742	69.88	25	4	False	7.80	4.79	0.965	86.88	25	4	False	11.70	7.45	0.922	76.01
30	1	True	18.08	14.11	0.632	56.57	30	1	True	6.05	4.73	0.979	91.32	30	1	True	6.15	4.64	0.979	91.47
30	1	False	15.38	10.80	0.767	71.48	30	1	False	12.11	7.75	0.916	73.49	30	1	False	6.39	4.89	0.976	89.93
30	2	True	15.29	11.52	0.734	68.42	30	2	True	5.16	4.39	0.980	96.33	30	2	True	3.21	2.68	0.994	99.15
30	2	False	18.03	12.21	0.737	70.74	30	2	False	5.37	3.63	0.983	93.85	30	2	False	4.05	3.47	0.991	99.06
30	3	True	18.15	14.01	0.629	57.44	30	3	True	5.42	4.20	0.983	91.55	30	3	True	3.59	2.94	0.993	99.09
30	3	False	15.55	11.01	0.762	71.68	30	3	False	5.22	3.67	0.985	94.64	30	3	False	3.60	3.03	0.993	99.29
30	4	True	18.36	14.41	0.621	55.74	30	4	True	42.32	36.91	−0.021	12.05	30	4	True	19.17	12.47	0.786	58.92
30	4	False	15.58	12.13	0.741	71.58	30	4	False	43.63	37.39	−0.076	11.31	30	4	False	4.89	3.90	0.986	95.95
35	1	True	18.33	14.20	0.632	58.28	35	1	True	5.61	3.69	0.988	97.26	35	1	True	5.85	4.63	0.980	92.15
35	1	False	15.31	10.46	0.775	72.08	35	1	False	11.84	7.76	0.919	73.94	35	1	False	5.60	4.39	0.982	93.43
35	2	True	17.90	13.90	0.649	57.32	35	2	True	5.55	3.82	0.988	98.31	35	2	True	3.71	2.19	0.996	99.68
35	2	False	15.51	11.82	0.750	70.15	35	2	False	5.32	3.24	0.989	96.32	35	2	False	3.34	2.84	0.994	99.53
35	3	True	18.40	14.23	0.629	56.54	35	3	True	5.42	3.71	0.989	98.13	35	3	True	5.14	4.19	0.984	94.63
35	3	False	15.38	10.94	0.741	72.79	35	3	False	5.37	3.47	0.989	97.11	35	3	False	3.54	2.52	0.995	99.59
35	4	True	18.24	14.46	0.635	53.65	35	4	True	44.14	37.23	−0.124	11.15	35	4	True	19.20	13.36	0.787	58.04
35	4	False	15.39	10.90	0.773	70.98	35	4	False	5.39	2.52	0.993	98.89	35	4	False	42.13	36.47	−0.034	11.67

RUL: remaining useful life; $R^{2}$: R-squared; RMSE: root mean squared error; MAE: mean absolute error; BiLSTM: bidirectional long short-term memory; Acc: accuracy; Agg: aggregated.
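For readers who want to reproduce the error columns of the table, the following minimal sketch computes RMSE, MAE, and $R^{2}$ from true and predicted RUL values using their standard definitions. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study. The accuracy column is omitted because its definition is not restated in this section.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R^2 for two equal-length sequences of RUL values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    ss_res = sum(e * e for e in errors)              # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R^2 = 1 - SS_res / SS_tot; it is undefined when all targets are equal.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    return {"rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical RUL values in cycles, for illustration only.
y_true = [100.0, 80.0, 60.0, 40.0, 20.0]
good = regression_metrics(y_true, [98.0, 83.0, 58.0, 41.0, 22.0])
bad = regression_metrics(y_true, [20.0, 40.0, 60.0, 80.0, 100.0])
print(good)
print(bad)
```

The second call illustrates why some table entries report a negative $R^{2}$: whenever the squared error of the predictions exceeds that of simply predicting the mean RUL, the ratio $SS_{res}/SS_{tot}$ is greater than one and $R^{2}$ falls below zero.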

According to the ablation results, the proposed model m1 (transformer dual input) performs best. Its lowest error, an RMSE of 3.21 with an $R^{2}$ of 0.994 and an accuracy of 99.15%, is obtained with a window of 30, two layers, and aggregated features enabled, and its best RMSE at every window size is lower than that of m2 (transformer only) and m3 (hybrid BiLSTM). The advantage comes from combining the dual-input fusion with global attention. Performance drops noticeably when the attention, fusion, or transformer modules are removed or modified, which underlines the importance of every architectural element. The results are also sensitive to depth: several four-layer variants degrade markedly, for example m1 at window 35 without aggregated features (RMSE 42.13).

Beyond numerical improvements, the attention heads in the TDIM also provide insight into the physical process of degradation. Examination of the learned attention weights shows that sensors such as high-pressure compressor outlet temperature (T48), corrected core speed (Nc), and fuel flow ratio (Wf) consistently attract higher attention in later life cycles, reflecting known degradation-sensitive parameters in turbofan engines. In contrast, early-cycle attention is dominated by sensors that reflect stable operating conditions, such as fan speed (Nf) or LPC inlet pressure (P2). This progressive shift of attention from healthy-state indicators to degradation-sensitive variables suggests that the model learns meaningful physical relationships rather than purely statistical patterns, which improves the interpretability and engineering trustworthiness of the TDIM framework.
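The "Window" and "Agg features" columns of the ablation table correspond to the two inputs of the dual-input design: a raw sensor window and a vector of aggregated statistics. The sketch below shows one way such inputs could be built from a single engine's sensor history. It is an illustration under stated assumptions, not the authors' preprocessing code: the function name `make_dual_inputs`, the choice of statistics (mean, standard deviation, minimum, maximum), and the sample data are all hypothetical.

```python
import statistics


def make_dual_inputs(sensor_rows, window):
    """Build (raw_window, aggregated_features) pairs from one engine's history.

    sensor_rows: list of per-cycle readings, each a list of floats (one per sensor).
    window: number of consecutive cycles in each sample.
    """
    if window < 2:
        raise ValueError("window must be at least 2")
    samples = []
    # Slide a fixed-length window over the history, one cycle at a time.
    for end in range(window, len(sensor_rows) + 1):
        raw = sensor_rows[end - window:end]
        columns = list(zip(*raw))  # transpose: one tuple of readings per sensor
        agg = []
        for col in columns:
            # Assumed statistics per sensor: mean, population std, min, max.
            agg.extend([
                statistics.fmean(col),
                statistics.pstdev(col),
                min(col),
                max(col),
            ])
        samples.append((raw, agg))
    return samples


# Hypothetical history: 5 cycles, 2 sensors.
history = [[1.0, 10.0], [2.0, 10.0], [3.0, 11.0], [4.0, 12.0], [5.0, 12.0]]
samples = make_dual_inputs(history, window=3)
raw0, agg0 = samples[0]
print(len(samples))
print(raw0)
print(agg0)
```

Under this reading, a configuration with aggregated features disabled would pass only the raw window to the model, while the dual-input variant would encode both elements of each pair in parallel before fusing them.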

Conclusion

This study proposes and tests the TDIM, an advanced DL architecture for RUL prediction, on the NASA C-MAPSS FD001 dataset. The work is motivated by the need for accurate, real-time RUL predictions that reduce unexpected equipment failures and maintenance costs. The experimental findings show that the TDIM significantly outperforms the baseline models, which indicates that DL, and attention-based transformer models in particular, has considerable potential in PdM tasks. By facilitating early failure detection and accurate RUL estimation, such models can play a crucial role in operational efficiency, cost reduction, and safety in industrial settings. These results support the use of transformer-based approaches for developing robust, real-time PdM solutions in industrial environments. The TDIM architecture shows promise for enabling condition-based maintenance strategies that are more proactive and cost-effective and that improve overall operational reliability.

Although the model shows promising results on the C-MAPSS dataset, further research should address generalizability and practical applicability. This study used the FD001 subset as a controlled benchmark because it has the simplest operating condition and a single fault mode, which makes it suitable for validating the proposed TDIM architecture. Most RUL prediction studies use this subset to establish baseline performance before generalizing to the more complicated subsets. Future work will extend the analysis to the remaining C-MAPSS subsets (FD002–FD004), which contain multiple operating and fault conditions, to further validate model robustness and adaptability. It is also necessary to test the model on real-world industrial data with sensor noise, missing values, and operational variability, none of which is captured by the C-MAPSS simulation data. Further gains in prediction accuracy and stability may be obtained from ensemble learning or from hybrid models that combine the strengths of the BiLSTM and transformer architectures. The framework can also be extended with multisensor fusion, transfer learning for domain adaptation, and online learning for continual model updates, all of which would enhance real-world applicability. Integration with Explainable Artificial Intelligence (XAI) techniques will improve the interpretability of and trust in these models, while edge or fog deployment can enable real-time, scalable PdM. The TDIM framework thus advances the development of reliable, interpretable, and intelligent PdM systems and has strong potential in IIoT applications.

Footnotes

Declaration of conflicting interests

The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The authors received no financial support for the research, authorship, and/or publication of this article.

ORCID iD

Deepam Goyal

Data availability statement

The dataset used in this study is the C-MAPSS Turbofan Engine Degradation Simulation Dataset, provided by the Prognostics Center of Excellence at NASA Ames Research Center and publicly available at:
