A hybrid prognostic approach for extra-large-scale bearing using enhanced CERLMDAN-KPCA and MLKELM-AE

Abstract

Fault prognostics and health management (PHM) is essential for ensuring the high reliability and extending the lifespan of extra-large-scale bearings. Within the PHM of extra-large-scale bearings, signal de-noising is the top priority because the weak fault characteristics are almost submerged by strong background noise. Given accurate signal de-noising, the next critical aspect of PHM is remaining useful life (RUL) prediction, which provides guidance for the operation and maintenance of extra-large-scale bearings. In view of these two aspects, a new signal de-noising method is proposed that combines complete ensemble robust local mean decomposition with adaptive noise and kernel principal component analysis (CERLMDAN-KPCA). Subsequently, RUL prediction is conducted using a multi-layer kernel extreme learning machine based auto-encoder (MLKELM-AE). During both signal de-noising and RUL prediction, parameter optimization is carried out to enhance signal decomposition ability and prediction performance. Experimental results demonstrate that MLKELM-AE, combined with CERLMDAN-KPCA-based signal de-noising, achieves superior RUL prediction accuracy for extra-large-scale bearings.

Keywords

extra-large-scale bearing signal de-noising CERLMDAN-KPCA RUL prediction MLKELM-AE

1. Introduction

Fault prognostics and health management (PHM) has been extensively employed to guarantee the safety and reliability of mechanical equipment. Given the critical role of extra-large-scale bearings as transmission components that endure low-speed heavy-load operation, the implementation of PHM for extra-large-scale bearing is imperative. A typical PHM system involves four fundamental steps: (i) data acquisition, (ii) feature extraction, (iii) the construction of health indicator, and (iv) remaining useful life (RUL) prediction (Zio 2022). The first three steps support RUL prediction to enhance prediction accuracy, which can guide the operation and maintenance of extra-large-scale bearing (Ferreira and Gonçalves, 2022). However, the operating condition under low speed and heavy load generates vibration signals contaminated with significant background noise (Liu et al., 2020). Consequently, the primary challenge within the PHM system lies in executing effective weak signal de-noising (Pan et al., 2021).

The extra-large-scale bearing consists of several rotating components, including the cage, the balls, and the inner and outer rings. Therefore, the faults of extra-large-scale bearings often manifest as multi-mode coupling, with various components interacting and causing non-linear and non-stationary vibration signals (Caesarendra and Tjahjowidodo, 2017). To address these complicated signals, multi-scale adaptive decomposition methods, such as empirical mode decomposition (EMD), ensemble EMD (EEMD), local mean decomposition (LMD), and robust LMD (RLMD), are highly suitable due to their purely data-driven nature (Han et al., 2021; Huynh et al., 2021; Jia et al., 2023; Sarmadi et al., 2020). However, these methods are plagued by mode mixing and reconstruction errors (Zheng et al., 2014). These issues can be mitigated by a noise-assisted approach, which adds adaptive Gaussian white noise at each decomposition stage (Zhao et al., 2024). It should be noted that the noise amplitude and the number of ensemble trials are crucial factors in noise-assisted methods, yet there is no standardized criterion for determining these hyperparameters (Zhan et al., 2019). Therefore, combining the noise-assisted approach with a parameter optimization method is a natural choice for enhancing decomposition accuracy. Furthermore, considering that LMD-related methods are more proficient at preserving the amplitude and frequency variations within the raw vibration signal than EMD-related methods (Ali et al., 2023), a new self-adaptive signal decomposition method, named complete ensemble robust local mean decomposition with adaptive noise (CERLMDAN) along with parameter optimization, is introduced to handle the non-linearity and non-stationarity of extra-large-scale bearing vibration signals.
After adaptive decomposition, another challenge in signal de-noising lies in selecting the appropriate low-frequency fault components owing to the low-speed operating condition. To the best of our knowledge, existing research primarily focuses on extracting fault components from short-term vibration signals for de-noising, which is better suited for signals with high rotational speeds and high-energy impact components (Lu et al., 2025; Peng et al., 2022). However, the energy of low-frequency fault components in extra-large-scale bearings is extremely low, making it challenging to identify and select them using traditional methods (Pan et al., 2024). To address this limitation, a novel fault component selection strategy is introduced, tailored to the trend of fault degradation in extra-large-scale bearings.

In addition to accurate signal de-noising, another critical aspect of PHM is RUL prediction, which can guide the operation and maintenance of extra-large-scale bearings. Both deep learning (DL) and machine learning (ML) have demonstrated promising results in the field of bearing RUL prediction (Bai et al., 2023; Schwendemann et al., 2021). Compared with ML, the core advantage of DL lies in its capacity to automatically learn high-level features from data through multi-layer neural networks (Dargan et al., 2020). However, the multi-layer structure necessitates the training of all hidden layers, leading to a greedy iterative adjustment of all parameters layer by layer (Wang et al., 2025). To address these drawbacks, Kasun et al. proposed an unsupervised learning algorithm named extreme learning machine based auto-encoder (ELM-AE), which combines the strong feature expression capability of AE with the efficient model training of ELM (Kasun et al., 2013). With the aid of a layer-by-layer DL structure, a multi-layer ELM-AE can achieve feature expression without resorting to greedy learning and time-consuming calculation (Zhang et al., 2020). However, the main shortcomings of ELM include poor stability, poor robustness, and a tendency to overfit. By replacing the random mapping in ELM with a stable kernel mapping, Huang et al. proposed kernel ELM (KELM) (Huang 2014). This enhancement improved the stability and robustness compared with ELM (Liu et al., 2023). Building on this, a new RUL prediction method for extra-large-scale bearings is proposed based on multi-layer KELM-AE (MLKELM-AE). However, arbitrarily chosen values of the kernel parameter and penalty coefficient can lead to insufficient stability and poor generalization in the MLKELM-AE model (Li et al., 2022). As with adaptive decomposition, parameter optimization is therefore implemented to enhance prediction ability.

Given the aforementioned challenges, a hybrid prognostic approach for extra-large-scale bearings is proposed, which leverages an enhanced CERLMDAN-KPCA and MLKELM-AE. CERLMDAN-KPCA-based signal de-noising is suited for weak low-frequency fault component extraction, especially in extra-large-scale bearings. MLKELM-AE is then utilized to establish the RUL prediction model. Additionally, the hyperparameters of CERLMDAN-KPCA and MLKELM-AE are optimized to enhance signal de-noising and RUL prediction accuracy. Experimental results demonstrate that RUL prediction using the improved MLKELM-AE exhibits high accuracy, making it a suitable choice for extra-large-scale bearing prediction analysis.

The remainder of the paper is organized as follows: Section 2 provides an overview of the systematic approach. The signal de-noising, RUL prediction, parameter optimization, and the procedure of proposed method are shown in Section 3. In Section 4, the proposed method is verified using life-cycle experimental signals. The paper concludes with remarks and suggestions for future research directions in Section 5.

2. Methodology

2.1. CERLMDAN

By integrating the noise-assisted approach with RLMD to bolster its resistance to mode mixing and refine its decomposition outcomes, the new CERLMDAN method is introduced. The steps involved in the decomposition process of CERLMDAN are outlined as below:

Step 1: Gaussian white noise $ε_{0} ω_{i} (t)$ is added to the original signal x(t) I times to obtain a series of I preprocessed sequences (i = 1, 2, …, I),

x_{i} (t) = x (t) + ε_{0} ω_{i} (t)

(1)

where $ε_{0}$ represents the weight coefficient of the Gaussian white noise $ω_{i} (t)$, and I denotes the number of ensemble trials.

Step 2: Perform RLMD on all preprocessed sequences x_i(t) to obtain their first PFs, PF_i1, and take the average as the PF₁ of CERLMDAN. The first residual sequence u₁ is then obtained:

{PF}_{1} (t) = \frac{1}{I} \sum_{i = 1}^{I} {PF}_{i 1} (t)

(2)

u_{1} (t) = x (t) - {PF}_{1} (t)

(3)

Step 3: Add Gaussian white noise into the residual sequence u₁ to construct I new sequences $u_{1} (t) + ε_{1} R_{1} (ω_{i} (t))$ . After RLMD of the I sequences, their average value is calculated to obtain the PF₂, and the difference is made to obtain u₂:

{PF}_{2} (t) = \frac{1}{I} \sum_{i = 1}^{I} R_{1} (u_{1} (t) + ε_{1} R_{1} (ω_{i} (t)))

(4)

u_{2} (t) = u_{1} (t) - {PF}_{2} (t)

(5)

where $R_{j} (\cdot)$ is defined as the j^th PF function decomposed by RLMD. By analogy, the m^th residual sequence can be calculated:

u_{m} (t) = u_{m - 1} (t) - {PF}_{m} (t)

(6)

Step 4: Perform RLMD I times on $u_{m} (t) + ε_{m} R_{m} (ω_{i} (t))$ to obtain the (m+1)^th PF sequence after CERLMDAN decomposition, as shown below:

{PF}_{m + 1} (t) = \frac{1}{I} \sum_{i = 1}^{I} R_{1} (u_{m} (t) + ε_{m} R_{m} (ω_{i} (t)))

(7)

Step 5: Repeat the above steps until residual sequence is either devoid of oscillations or constant, and the expression of the sequence x after CERLMDAN decomposition is as follows:

x (t) = u_{M} (t) + \sum_{m = 1}^{M} {PF}_{m} (t)

(8)
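The steps above can be sketched in a few lines of Python. RLMD itself is not implemented here: `toy_first_pf` (the signal minus a moving average) is a hypothetical stand-in for the first PF returned by RLMD, and the noise weight, ensemble size, fixed number of PFs, and function names are illustrative assumptions rather than the settings used in this work. The sketch only shows how the noise-assisted averaging and the residual updates of equations (1) to (8) fit together.

```python
import numpy as np

def toy_first_pf(signal, window=5):
    """Hypothetical stand-in for R_1(.): signal minus its moving average (odd window)."""
    kernel = np.ones(window) / window
    padded = np.pad(signal, window // 2, mode="edge")
    local_mean = np.convolve(padded, kernel, mode="valid")
    return signal - local_mean

def toy_mth_pf(signal, m, window=5):
    """Stand-in for R_m(.): peel off first components m times, return the m-th."""
    residual = np.array(signal, dtype=float)
    pf = None
    for _ in range(m):
        pf = toy_first_pf(residual, window)
        residual = residual - pf
    return pf

def cerlmdan_sketch(x, eps=0.2, trials=20, n_pfs=3, seed=0):
    """Noise-assisted ensemble decomposition following Eqs. (1)-(8).

    A fixed number of PFs replaces the stopping rule of Step 5 (assumption).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    noises = rng.standard_normal((trials, x.size))
    # Steps 1-2: average the first PF over the noisy copies, Eqs. (1)-(3).
    pf = np.mean([toy_first_pf(x + eps * w) for w in noises], axis=0)
    pfs, residual = [pf], x - pf
    # Steps 3-4: add the m-th PF of the noise to the residual, Eqs. (4)-(7).
    for m in range(1, n_pfs):
        pf = np.mean(
            [toy_first_pf(residual + eps * toy_mth_pf(w, m)) for w in noises],
            axis=0,
        )
        pfs.append(pf)
        residual = residual - pf
    return np.array(pfs), residual
```

By construction, the extracted PFs and the final residual add up to the input signal, which mirrors equation (8).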

2.2. KPCA

In the realm of fault detection, squared prediction error (SPE) statistic in KPCA quantifies the discrepancy between each sample and the statistical model in terms of change trends, functioning as an indicator of external data alterations within the model. The formula for SPE is given as below:

SPE = {‖ φ (x) - φ_{k} (x) ‖}^{2} = \sum_{i = 1}^{n} t_{i}^{2} - \sum_{i = 1}^{k} t_{i}^{2}

(9)

where k and n denote the number of principal components and the feature dimension, respectively, t_i represents the projection on the i^th feature direction, φ(·) denotes the nonlinear mapping associated with the kernel function, and φ_k(x) is the reconstruction of φ(x) from the first k principal components. The threshold of SPE is then obtained through the following formula:

{SPE}_{\lim} = θ_{1} \cdot {[\frac{c_{α} h_{0} \cdot \sqrt{2 θ_{2}}}{θ_{1}} + 1 + \frac{θ_{2} h_{0} (h_{0} - 1)}{θ_{1}^{2}}]}^{1 / h_{0}}

(10)

where c_α represents the critical value of the Gaussian distribution at the confidence level 100(1−α)%, $h_{0} = 1 - 2 θ_{1} θ_{3} / (3 θ_{2}^{2})$, $θ_{d} = \sum_{j = k + 1}^{n} λ_{j}^{d}$ (d = 1, 2, 3), and λ_j denotes the j^th eigenvalue of the covariance matrix of X.
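As a minimal sketch of equations (9) and (10), the following Python functions compute the SPE of one sample from its projections and the SPE control limit from the eigenvalues. The function names and the example eigenvalues are hypothetical, and the projections and eigenvalues are assumed to come from a KPCA model fitted elsewhere.

```python
import numpy as np
from statistics import NormalDist

def spe_from_scores(scores, k):
    """Eq. (9): energy of all n projections minus energy of the first k."""
    scores = np.asarray(scores, dtype=float)
    return float(np.sum(scores**2) - np.sum(scores[:k] ** 2))

def spe_limit(eigenvalues, k, confidence=0.99):
    """Eq. (10): SPE control limit built from the n - k discarded eigenvalues."""
    lam = np.sort(np.asarray(eigenvalues, dtype=float))[::-1][k:]
    theta1, theta2, theta3 = (np.sum(lam**d) for d in (1, 2, 3))
    h0 = 1.0 - 2.0 * theta1 * theta3 / (3.0 * theta2**2)
    # c_alpha: standard normal quantile at the chosen confidence level.
    c_alpha = NormalDist().inv_cdf(confidence)
    bracket = (
        c_alpha * h0 * np.sqrt(2.0 * theta2) / theta1
        + 1.0
        + theta2 * h0 * (h0 - 1.0) / theta1**2
    )
    return float(theta1 * bracket ** (1.0 / h0))

# Illustrative eigenvalues only; a sample is flagged when its SPE exceeds the limit.
example_limit = spe_limit([5.0, 3.0, 1.0, 0.5, 0.2, 0.1], k=2, confidence=0.99)
```

A stricter confidence level raises the limit, so fewer samples are flagged as deviating from the reference model.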

2.3. KELM

Given N training samples, denoted as {( x _j, t _j)}, where x _j = [x_j1, x_j2,…, x_jn ]^T∈Rⁿ represents the input data and t _j = [t_j1, t_j2,…, t_jm ]^T∈R^m is the corresponding target output, the ELM network model, comprising L hidden nodes and the activation function g(·), can be formulated as below:

t_{j} = \sum_{i = 1}^{L} β_{i} g (ω_{i} \cdot x_{j} + b_{i}), j = 1, 2, \cdot \cdot \cdot, N

(11)

where ω _i denotes the weight vector connecting the input nodes to the i^th hidden layer node, b _i is the bias term associated with the i^th hidden layer node, and β _i represents the output weight vector connecting the output layer to the hidden layer. The simplification of equation (11) can be implemented as below:

H β = Y

(12)

where Y represents the expected output and H denotes the hidden layer nodes output. H = [h ( x ₁)^T, …, h ( x _N)^T]^T, h( x ) = [g ( ω ₁ x + b ₁),···, g ( ω _L x + b _L)]. Based on the theory of generalized inverses, the solution to equation (12) is given by the following equation:

β = H^{+} Y

(13)

where H ⁺ represents the generalized inverse of the output matrix H of hidden layer.

In KELM, a kernel matrix Ω is introduced to replace the random matrix HH ^T of ELM framework:

Ω_{ELM} = H H^{T} : Ω_{ELM (i, j)} = K (x_{i}, x_{j})

(14)

The kernel function is denoted as K ( x _i, x _j). When dealing with datasets without prior knowledge, the radial basis function kernel (RBF) is typically employed:

K (x_{i}, x_{j}) = \exp (- \frac{{‖ x_{i} - x_{j} ‖}^{2}}{2 σ^{2}})

(15)

where σ denotes the width of the kernel function, and the output of KELM is provided as below:

\begin{array}{l} f (x) = h (x) H^{T} {(\frac{I}{C} + {H H}^{T})}^{- 1} Y = \\ {[\begin{array}{c} K (x, x_{1}) \\ ⋮ \\ K (x, x_{N}) \end{array}]}^{T} {(\frac{I}{C} + Ω_{ELM})}^{- 1} Y \end{array}

(16)

where I denotes the unit matrix and C serves as the penalty coefficient, balancing the proportion between empirical and structural risk.
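Equations (14) to (16) translate almost directly into NumPy. The sketch below is an illustration under stated assumptions, not the implementation used in this work: the function names are hypothetical, the toy data are a noise-free sine curve, and a large C is chosen deliberately so that the model nearly interpolates the training points.

```python
import numpy as np

def rbf_kernel(a, b, sigma):
    """Eq. (15): RBF kernel matrix between the rows of a and the rows of b."""
    sq = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def kelm_fit(x_train, y_train, c=1e6, sigma=1.0):
    """Solve (I/C + Omega) alpha = Y, the training part of Eq. (16)."""
    omega = rbf_kernel(x_train, x_train, sigma)
    return np.linalg.solve(np.eye(len(x_train)) / c + omega, y_train)

def kelm_predict(x_new, x_train, alpha, sigma=1.0):
    """Eq. (16): kernel row [K(x, x_1), ..., K(x, x_N)] times alpha."""
    return rbf_kernel(x_new, x_train, sigma) @ alpha

# Toy usage on a noise-free sine curve (illustrative data only).
x_train = np.linspace(0.0, 2.0 * np.pi, 30).reshape(-1, 1)
y_train = np.sin(x_train).ravel()
alpha = kelm_fit(x_train, y_train)
y_fit = kelm_predict(x_train, x_train, alpha)
y_mid = kelm_predict(np.array([[1.0]]), x_train, alpha)
```

No iterative training is involved: fitting reduces to one linear solve, and the only free parameters are σ and C.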

2.4. AE

AE is a neural network model that consists of an encoder and a decoder (Wang et al., 2016). The process of AE can be divided into two steps: encoding and decoding. The primary objective of AE is to minimize the reconstruction error, thereby learning a compact and efficient representation of the input data.

2.5. MFO

Moth-flame optimization (MFO) is a novel optimization approach based on swarm intelligence, inspired by the lateral orientation behavior exhibited by nocturnal moths (Mirjalili 2015). Since the MFO algorithm is a population-based algorithm, the set of moths and flames is represented in a matrix M and F as follows:

M = [\begin{array}{cccc} m_{1, 1} & m_{1, 2} & \cdots & m_{1, d} \\ m_{2, 1} & m_{2, 2} & \cdots & m_{2, d} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ m_{n, 1} & m_{n, 2} & \cdots & m_{n, d} \end{array}], F = [\begin{array}{cccc} F_{1, 1} & F_{1, 2} & \cdots & F_{1, d} \\ F_{2, 1} & F_{2, 2} & \cdots & F_{2, d} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ F_{n, 1} & F_{n, 2} & \cdots & F_{n, d} \end{array}]

(17)

where n is the number of moths and d is the number (dimension) of variables.

To accurately replicate the behavior of moths, the position update of each moth corresponding to the flame can be defined as below:

M_{i} = S (M_{i}, F_{j})

(18)

where S represents the spiral function. M _i and F _j signify the i^th moth and the j^th flame, respectively.

The main updating mechanism of moths is logarithmic spiral shown as below:

S (M_{i}, F_{j}) = D_{i} \cdot e^{b t} \cdot \cos (2 π t) + F_{j}

(19)

where D_i represents the distance between the i^th moth and the j^th flame, D_i = | F _j- M _i|. b is a constant that defines the shape of the logarithmic spiral, and t denotes a random number chosen from the interval [r, 1], where r is the convergence constant that decreases linearly from −1 to −2 as the iterations progress. To shift the search gradually from exploration toward exploitation of the best solutions, an adaptive mechanism is introduced to dynamically reduce the number of flames:

{flame}_{no} = round (N - l \cdot \frac{N - 1}{L})

(20)

where N and l denote the maximum number of flames and current iteration times, respectively. L signifies the total number of maximum iterations.
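Equations (17) to (20) can be combined into a compact optimizer. The following Python sketch is a simplified reading of MFO, not a reference implementation: the function name and default values are hypothetical, and the flame update (keeping the best positions found so far) and the clipping of moths to the search bounds are assumptions of this sketch.

```python
import numpy as np

def mfo_minimize(objective, lower, upper, n_moths=30, max_iter=200, b=1.0, seed=0):
    """Minimal moth-flame optimization loop following Eqs. (18)-(20)."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    moths = rng.uniform(lower, upper, size=(n_moths, dim))
    flames, flame_fit = None, None
    for l in range(1, max_iter + 1):
        moths = np.clip(moths, lower, upper)  # assumption: clip to bounds
        moth_fit = np.array([objective(m) for m in moths])
        if flames is None:
            pool, pool_fit = moths, moth_fit
        else:
            pool = np.vstack([flames, moths])
            pool_fit = np.concatenate([flame_fit, moth_fit])
        # Flames are the best positions found so far, sorted by fitness.
        order = np.argsort(pool_fit)[:n_moths]
        flames, flame_fit = pool[order].copy(), pool_fit[order].copy()
        # Eq. (20): the number of flames shrinks linearly from N to 1.
        flame_no = int(round(n_moths - l * (n_moths - 1) / max_iter))
        r = -1.0 - l / max_iter  # convergence constant: -1 down to -2
        for i in range(n_moths):
            j = min(i, flame_no - 1)  # surplus moths share the last flame
            dist = np.abs(flames[j] - moths[i])
            t = (r - 1.0) * rng.random(dim) + 1.0  # random number in [r, 1]
            # Eq. (19): logarithmic spiral around flame j.
            moths[i] = dist * np.exp(b * t) * np.cos(2.0 * np.pi * t) + flames[j]
    return flames[0], float(flame_fit[0])
```

For example, `mfo_minimize(lambda v: float(np.sum(v**2)), [-5, -5], [5, 5])` searches for the minimum of a two-dimensional sphere function; in the proposed approach the objective would instead be one of the fitness functions defined in Section 3.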

3. Hybrid prognostic approach using enhanced CERLMDAN-KPCA and MLKELM-AE

3.1. CERLMDAN-KPCA

Through signal decomposition by CERLMDAN, vibration signals of extra-large-scale bearing can be divided into several PFs ordered from high frequency to low frequency. Following this decomposition, a new strategy for selecting fault-related components is introduced, leveraging statistical detection through KPCA. The energy of low-frequency components occupied by fault characteristics tends to increase over time, while that of high-frequency components associated with background noise remains constant. In view of this point, the new selection strategy evaluates the trend of each decomposed functions throughout the entire operation life rather than focusing on short-term signals. The detailed CERLMDAN-KPCA procedure is outlined as below:

Step 1: Collect the entire life vibration signal x_n(t) (n = 1, 2, …, N). The first segment x₁(t) represents the normal working signal without faults.

Step 2: Decompose each segment into M PFs through CERLMDAN, and set PF_mn as the m^th (m = 1, 2, …, M) PF of x_n(t).

Step 3: Divide each PF_mn into a matrix K_mn.

Step 4: K_m1 is regarded as the normal KPCA model, and project K_mn into K_m1 to obtain SPE_mn. Calculate the root mean square of SPE_mn, R_mn, and take S_mn (S_mn = R_mn –R_m1) as evaluation indicator of change trend.

Step 5: Define the weighted cumulative value (WCV) of S_mn as a selection criterion that brings out the general trend of each PF:

{WCV}_{m} = \sum_{n = 2}^{N} (S_{m n} \times S_{m n} / \sum_{i = 1}^{M} S_{i n})

(21)

where $\sum_{i = 1}^{M} S_{i n}$ denotes the sum of the indicators of the M PFs decomposed from x_n(t). The square of S_mn serves to accentuate the evolving tendency of the m^th PF.

Step 6: Choose the fault-related PFs with larger WCV, and then reconstruct the entire life-cycle vibration signals x(t)^R. Figure 1 demonstrates the detailed signal de-noising procedure.
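Steps 5 and 6 can be illustrated with a short Python sketch of equation (21). It assumes the indicators S_mn have already been computed and stored in an M × N array whose first column corresponds to the healthy segment; the function names, the example values, and the number of PFs kept are hypothetical, and each column sum is assumed to be nonzero.

```python
import numpy as np

def weighted_cumulative_value(s):
    """Eq. (21): WCV_m = sum over n = 2..N of S_mn^2 / sum_i S_in."""
    s = np.asarray(s, dtype=float)
    later = s[:, 1:]                    # segments n = 2, ..., N
    column_sum = later.sum(axis=0)      # sum over the M PFs of each segment
    return np.sum(later**2 / column_sum, axis=1)

def select_pfs(s, n_keep):
    """Indices of the n_keep PFs with the largest WCV (Step 6)."""
    wcv = weighted_cumulative_value(s)
    return np.argsort(wcv)[::-1][:n_keep]

# Illustrative indicators for M = 2 PFs over N = 3 segments.
s_example = np.array([[0.0, 1.0, 2.0],
                      [0.0, 1.0, 1.0]])
wcv_example = weighted_cumulative_value(s_example)
```

In this toy case the first PF grows faster over the segments, so it receives the larger WCV and would be retained for reconstruction.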

3.2. Parameter optimization of CERLMDAN

The balance between the number of ensemble trials I and the noise amplitude ε in CERLMDAN affects both the decomposition precision and the decomposition efficiency. A smaller ε may fail to alter the extreme point distribution, leading to an uneven scale of extreme points. Conversely, a larger ε can result in an increased number of decompositions and heightened computational complexity. Besides, a larger number of ensemble trials I improves the decomposition effect but also increases the computational cost. As a result, an enhanced CERLMDAN methodology is proposed using MFO to improve its adaptive decomposition capability.

Step 1: Initialize the parameters of MFO, such as the number of moths, the maximum number of iterations, and the logarithmic spiral shape constant.

Step 2: Utilize the adaptive mechanism in equation (20) to decrease the number of flames.

Step 3: Fitness function is designated using the mean envelope spectrum entropy (MESE), as outlined in equation (22), and utilize MFO to optimize the parameters ε and I. Evaluate the MESE for each moth to determine the optimal one that exhibits the best performance, which is referred to as the flame.

MESE = \frac{1}{k} \sum_{p = 1}^{k} H_{e} (p)

(22)

Figure 1.

Flow chart of proposed CERLMDAN-KPCA.

The calculation formula of envelope spectrum entropy H_e of decomposed PF is as follows:

H_{e} = - \sum_{j = 1}^{N} q_{j} \log_{2} (q_{j})

(23)

q_{j} = g (j) / \sum_{j = 1}^{N} g (j)

(24)

where k and N represent the number of decomposed PFs and data samples, respectively. q_j denotes the normalized value of the envelope spectrum of decomposed PFs, which is defined as g(j).

Step 4: Adjust the moth position using equation (19) and subsequently obtain its corresponding fitness value. Then, reorganize the sequence of flames based on the optimal solution identified.

Step 5: Repeat Steps 2 to 4 to initiate the next generation, continuing this process until the iteration count reaches the maximum or the fitness function no longer improves on the global optimum.

Step 6: Find the optimal set of (ε, I) and construct the signal decomposition CERLMDAN accordingly.
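The fitness function of Step 3 can be sketched as follows. The text does not state how the envelope is obtained, so this sketch assumes a Hilbert-transform envelope computed with the FFT and removes the mean of the envelope before taking its spectrum; both choices, as well as the function names, are assumptions made for illustration.

```python
import numpy as np

def hilbert_envelope(x):
    """Envelope as the magnitude of the FFT-based analytic signal (assumption)."""
    n = x.size
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1 : n // 2] = 2.0
    else:
        weights[1 : (n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * weights))

def envelope_spectrum_entropy(pf):
    """Eqs. (23)-(24): Shannon entropy of the normalized envelope spectrum."""
    envelope = hilbert_envelope(np.asarray(pf, dtype=float))
    g = np.abs(np.fft.rfft(envelope - envelope.mean()))  # mean removed (assumption)
    q = g / g.sum()
    q = q[q > 0]  # skip empty bins, since 0 * log(0) is taken as 0
    return float(-np.sum(q * np.log2(q)))

def mese(pfs):
    """Eq. (22): mean envelope spectrum entropy over the decomposed PFs."""
    return float(np.mean([envelope_spectrum_entropy(pf) for pf in pfs]))
```

A PF dominated by a periodic modulation concentrates its envelope spectrum in a few lines and yields a low entropy, whereas a noise-like PF spreads it out and yields a high entropy, which is why a smaller MESE indicates a cleaner decomposition.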

3.3. MLKELM-AE

Through integrating KELM with AE, the reconstruction of input signals is accomplished by KELM-AE, as illustrated in Figure 2(a). Hidden layer output can generate encoded representation input. By stacking multiple KELM-AE layers, the multi-layer structure is capable of extracting high-level representations from the input features. Consequently, a novel RUL prediction model, termed MLKELM-AE, is introduced. This model consists of multi-layer KELM-AE, followed by a prediction layer utilizing KELM. Figure 2(b) shows the detailed structure diagram of MLKELM-AE.

Figure 2.

Structure diagram: (a) KELM-AE and (b) MLKELM-AE.

For KELM-AE with k layers, the j^th layer weight output is presented as below:

β_{j} = {(\frac{I}{C} + Ω_{ELM})}^{- 1} X_{j}, j = 1, 2, . . ., k

(25)

where X _j is the j^th layer input. The j^th layer output can be obtained through X _j and β _j, which is taken as the (j+1)^th layer input (Cheng et al., 2019). The k^th layer output of KELM-AE serves as the input for the prediction layer KELM, and the final output of the whole MLKELM-AE is obtained by equation (16). The MLKELM-AE prediction model is mainly divided into high-level features extraction of samples by multi-layer network and RUL prediction, which is shown as follows:

Step 1: Parameter initialization: this includes penalty coefficient and kernel parameter.

Step 2: Multi-layer feature extraction: this calculates the hidden layer weights and outputs of KELM-AE layer by layer, and determines all network weights in the multi-layer model.

Step 3: RUL prediction: this sets the last layer output of KELM-AE as the input for prediction layer KELM and output RUL.
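The three steps above can be sketched in Python as follows. Equation (25) fixes the layer weights, but the text does not spell out how the layer output is formed from X_j and β_j; this sketch assumes the mapping X_{j+1} = tanh(X_j β_j^T), which is one common choice and should be read as an assumption. The function names and default parameter values are likewise hypothetical.

```python
import numpy as np

def rbf_kernel(a, b, sigma):
    """RBF kernel matrix between the rows of a and the rows of b, Eq. (15)."""
    sq = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_mlkelm_ae(x, y, n_layers=2, c=100.0, sigma=1.0):
    """Stack KELM-AE layers (Eq. 25), then fit a KELM prediction layer (Eq. 16)."""
    n = len(x)
    betas, layer_in = [], np.asarray(x, dtype=float)
    for _ in range(n_layers):
        omega = rbf_kernel(layer_in, layer_in, sigma)
        beta = np.linalg.solve(np.eye(n) / c + omega, layer_in)  # Eq. (25)
        betas.append(beta)
        layer_in = np.tanh(layer_in @ beta.T)  # ASSUMPTION: layer output mapping
    omega = rbf_kernel(layer_in, layer_in, sigma)
    alpha = np.linalg.solve(np.eye(n) / c + omega, np.asarray(y, dtype=float))
    return {"betas": betas, "train_repr": layer_in, "alpha": alpha, "sigma": sigma}

def predict_mlkelm_ae(model, x_new):
    """Propagate new samples through the stacked layers and the KELM output."""
    h = np.asarray(x_new, dtype=float)
    for beta in model["betas"]:
        h = np.tanh(h @ beta.T)
    kernel_row = rbf_kernel(h, model["train_repr"], model["sigma"])
    return kernel_row @ model["alpha"]
```

Each layer is obtained from a single linear solve, which reflects the absence of iterative fine-tuning in the multi-layer structure.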

The training process of traditional deep learning involves unsupervised pre-training and supervised fine-tuning, requiring the simultaneous training of all hidden layers through a layer-by-layer greedy iterative adjustment of all parameters. This approach not only exhibits high time complexity but also allows for the layer-wise propagation of systematic bias. In contrast, the proposed MLKELM-AE is composed of multiple independent KELM-AE layers. In a single KELM-AE, the weights and output are solely dependent on its input, thus eliminating the need for fine-tuning. Furthermore, the multi-layer KELM-AE architecture mitigates the random fluctuations in model output caused by the random matrix H in ELM-AE. Consequently, the MLKELM-AE model proposed in this study for extra-large-scale bearing RUL prediction enhances the mapping capability for nonlinear features while offering superior efficiency and speed.

3.4. Parameter optimization of MLKELM-AE

The uncertainty of the kernel parameter σ and the penalty coefficient C in MLKELM-AE leads to a prediction model with reduced generalization performance and unsatisfactory stability. A smaller σ value makes the RBF kernel behave similarly to a polynomial kernel, while a larger σ value causes it to resemble a linear kernel. A larger C value places a greater emphasis on empirical risk minimization. Consequently, σ and C significantly influence the capability of the RUL prediction model. To tackle this challenge, we propose an enhanced MLKELM-AE methodology using the MFO algorithm. Similar to MFO-CERLMDAN, the updated process of MFO-MLKELM-AE can be generalized as outlined below:

(1) Update the Step 3 of MFO-CERLMDAN: Fitness function is designated using the mean squared error.

(2) Update the Step 6 of MFO-CERLMDAN: Find the optimal set of (σ, C) and construct the RUL prediction model MLKELM-AE accordingly.

3.5. The procedure of hybrid prognostic approach

The proposed method comprises two essential steps. Initially, MFO-CERLMDAN-KPCA-based signal de-noising is implemented to enhance the fault characteristic information of extra-large-scale bearing. Subsequently, MFO-MLKELM-AE is utilized to predict RUL. The step-by-step executive process is detailed as follows:

Step 1: Gather entire life vibration signals from the extra-large-scale bearing.

Step 2: Implement signal de-noising by MFO-CERLMDAN-KPCA to the entire vibration signals.

Step 3: Extract multi-domain features, followed by normalizing the entire dataset and dividing it into training and testing sets.

Step 4: Train the MFO-MLKELM-AE model on the training set, with the extracted multi-domain features serving as input, to obtain the optimized MLKELM-AE RUL prediction model for the extra-large-scale bearing.

Step 5: Utilize the trained MFO-MLKELM-AE model to predict the RUL. The procedure depicting the proposed method is presented in Figure 3.

4. Experimental verification and analysis

4.1. Extra-large-scale bearing test rig

As shown in Figure 4, the extra-large-scale bearing test rig employs three bidirectional hydraulic cylinders, namely, Cylinder 1, Cylinder 2, and Cylinder 3, for loading. Among them, Cylinder 1 and Cylinder 2 apply loads in opposite directions with different magnitudes, achieving the application of axial force and overturning moment. Cylinder 3, on the other hand, applies radial load to the tested bearing through a top flange. The tested extra-large-scale bearing is rotated by the accompanied bearing, driven by hydraulic motor Cylinder 4.

Figure 4.

Test rig for extra-large-scale bearing.

Figure 3.

Procedure of the proposed RUL prediction method.

The QNA-730-22 extra-large-scale bearing is chosen for this test, classified as a single-row internal-tooth four-point contact bearing. Considering the different characteristics of the signals, the sampling rate of the vibration signal is set at 2048 Hz, and that of the other monitored signals is set at 10 Hz. The accelerated life test, illustrated in Figure 4, lasted for 11 days in total. At the conclusion of the test, the tested extra-large-scale bearing had undergone severe failure, causing it to seize. The outer ring raceway exhibited extensive fatigue spalling and wear. Additionally, some of the balls suffered fatigue fractures, while the inner ring raceway displayed severe pitting corrosion, as illustrated in Figure 4.

To demonstrate the process of deterioration, Figure 5 presents the vibration signal throughout extra-large-scale bearing lifecycle. From Figure 5, it is evident that vibration signal progressively intensifies as faults initiate and worsen. Therefore, the vibration signal serves as an indicator of the extra-large-scale bearing health status throughout its entire lifecycle. However, the presence of significant background noise obscures the subtle fault characteristics.

Figure 5.

Life-cycle vibration signal.

As depicted in Figure 6, the graph showcases the entire life-cycle trend of alterations in grease temperature and driving torque. It can be noted that the change tendencies of the two are essentially in accord. Hence, the characteristic curves of temperature and driving torque can also indicate the process of performance deterioration of extra-large-scale bearing.

Figure 6.

Life-cycle characteristic signals: (a) temperature and (b) driving torque.

Additionally, it is worth noting that vibration signals are more sensitive to faults compared with temperature and torque signals. Nevertheless, the advent of large data volumes, variability, and diversity brings new opportunities for advancing signal processing techniques. Consequently, data fusion emerges as a viable option for enhancing RUL prediction, and its superiority will be demonstrated in the following sections.

4.2. Signal de-noising

Firstly, the procedure commences by segmenting the raw signal into 11 parts based on the test duration, followed by computing the corresponding PFs through MFO-CERLMDAN. Next, divide each PF into a matrix with multiple dimensions, with the normal matrix functioning as the reference for the healthy state in the KPCA model. The SPE statistics are derived through the projection of the multi-dimensional matrix into the healthy state KPCA model. Subsequently, calculate the WCV of decomposed PFs, and PFs with larger WCV are selected. Finally, reconstruct the life-cycle signals by utilizing the selected PFs. Figure 7(a) depicts the de-noised effect obtained through the application of MFO-CERLMDAN-KPCA.

Figure 7.

De-noised life-cycle vibration signal: (a) MFO-CERLMDAN-KPCA and (b) CERLMDAN-KPCA.

To verify the effectiveness of parameter optimization, we utilize CERLMDAN-KPCA with empirical value (ε = 0.2 and I = 100) as a benchmark for comparison. Figure 7(b) depicts the de-noised results obtained using CERLMDAN-KPCA. From Figure 7, the following insights can be derived: (1) Both MFO-CERLMDAN-KPCA and CERLMDAN-KPCA effectively eliminate considerable white noise contamination in life-cycle vibration signals. (2) Compared with CERLMDAN-KPCA, MFO-CERLMDAN-KPCA significantly reduces reconstruction errors with suitable decomposition parameters, making fault characteristics clearer from an overall trend observation. Therefore, MFO-CERLMDAN-KPCA stands as the optimal method for signal de-noising of extra-large-scale bearing.

4.3. RUL prediction

The performance degradation of low-speed extra-large-scale bearings, from normal operation to failure, can span months or even years. While feature extraction allows for the monitoring of this temporal degradation, different features exhibit varying sensitivities to specific faults during distinct stages (Bhavsar et al., 2022). Furthermore, the efficacy of analytical methods often varies across different operational phases. To maximize the preservation of fault information, this study extracts comprehensive features from both the time domain and the time-frequency domain to holistically reflect the health condition of the bearing.

Time-domain features are primarily categorized into dimensional and dimensionless indicators. Dimensional features (e.g., kurtosis, mean, variance, and root mean square) characterize the impact energy of the vibration signal. In contrast, dimensionless features (e.g., waveform index, peak index, margin index, and skewness) are widely utilized for condition monitoring and fault recognition in rotary equipment, as they are independent of signal amplitude. Consequently, eight representative time-domain features are selected. To mitigate information loss under non-stationary conditions, time-frequency analysis is employed to map one-dimensional time-domain signals into a two-dimensional time-frequency representation. As a representative time-frequency method, wavelet packet transform (WPT) decomposes signals into various frequency bands with adaptive time-frequency resolution. In this paper, WPT is utilized to divide the vibration signal into eight frequency bands and calculate the energy spectrum of each band, yielding eight time-frequency domain features.
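A minimal sketch of the sixteen features is given below. Several details are assumptions made for illustration: the waveform, peak, and margin indices are taken as the usual shape, crest, and clearance factors, population moments are used for kurtosis and skewness, and a Haar wavelet packet is used because the text does not name the wavelet. The function names are hypothetical, and the eight bands are returned in natural (filter-bank) order rather than sorted by frequency.

```python
import numpy as np

def time_domain_features(x):
    """Eight time-domain features of one signal segment."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    centered = x - mean
    var = centered.var()
    rms = np.sqrt(np.mean(x**2))
    peak = np.max(np.abs(x))
    return {
        "mean": mean,
        "variance": var,
        "rms": rms,
        "kurtosis": np.mean(centered**4) / var**2,
        "skewness": np.mean(centered**3) / var**1.5,
        "waveform_index": rms / np.mean(np.abs(x)),
        "peak_index": peak / rms,
        "margin_index": peak / np.mean(np.sqrt(np.abs(x))) ** 2,
    }

def haar_packet_energies(x, level=3):
    """Energies of the 2**level wavelet packet bands (Haar wavelet, assumption).

    The signal length must be divisible by 2**level.
    """
    bands = [np.asarray(x, dtype=float)]
    for _ in range(level):
        split = []
        for band in bands:
            even, odd = band[0::2], band[1::2]
            split.append((even + odd) / np.sqrt(2.0))  # low-pass branch
            split.append((even - odd) / np.sqrt(2.0))  # high-pass branch
        bands = split
    return np.array([np.sum(band**2) for band in bands])
```

Because the Haar packet transform is orthonormal, the eight band energies sum to the energy of the input segment, which gives a quick sanity check on the extracted time-frequency features.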

After feature extraction, these different domain features are used as the input vector of the prediction model at time t and the corresponding real RUL as model output. The aim is to build the relationship between the features and RUL of extra-large-scale bearing. The entire life-cycle multi-domain features are divided into 660 samples, with the training and testing dataset allocated at a ratio of 2:1. Based on preliminary experiments conducted in our lab, we set the number of layers in the KELM-AE to 2, resulting in satisfactory prediction performance with the involved methods. For parameter optimization, MFO is employed, with the C and σ for MLKELM-AE optimized within ranges of 0.1 to 1000.0 and 0.01 to 100.00, respectively.

To quantitatively assess the impact of feature extraction within the MLKELM-AE, three strategies for combining features are employed in RUL prediction: (i) Strategy 1: time-domain features; (ii) Strategy 2: time-frequency domain features; and (iii) Strategy 3: time-domain and time-frequency domain features. Figure 8 presents the RUL prediction results obtained from testing datasets using Strategies 1, 2, and 3. To quantitatively evaluate the performance of each strategy in RUL prediction, the mean absolute error (MAE), root mean square error (RMSE), and mean absolute error ratio (MAER) are calculated for comparative analysis. These indicators assess prediction accuracy from different perspectives, but they share one property: the smaller the MAE, RMSE, and MAER are, the lower the prediction error is, signifying a stronger prediction capability. The mathematical formulations for MAE, RMSE, and MAER are provided below:

MAE = \frac{1}{n} \sum_{i = 1}^{n} | y (i) - \hat{y} (i) |

(26)

RMSE = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} {[y (i) - \hat{y} (i)]}^{2}}

(27)

MAER = \frac{1}{n} \sum_{i = 1}^{n} | \frac{y (i) - \hat{y} (i)}{y (i)} |

(28)

where n represents the number of samples, y(i) denotes the actual value, and \hat{y}(i) denotes the predicted value.
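
Equations (26) to (28) translate directly into code. The sketch below is illustrative only; the function names and the sample values are assumptions and are not taken from the experiments.

```python
import math

def mae(y, y_hat):
    """Mean absolute error, equation (26)."""
    return sum(abs(a - p) for a, p in zip(y, y_hat)) / len(y)

def rmse(y, y_hat):
    """Root mean square error, equation (27)."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(y, y_hat)) / len(y))

def maer(y, y_hat):
    """Mean absolute error ratio, equation (28).

    Each term divides by the actual value y(i), so the ratio is undefined
    for any sample whose actual value is zero.
    """
    return sum(abs((a - p) / a) for a, p in zip(y, y_hat)) / len(y)

# Hypothetical actual and predicted values, for illustration only.
actual = [2.0, 4.0, 8.0]
predicted = [1.0, 5.0, 6.0]
scores = (mae(actual, predicted), rmse(actual, predicted), maer(actual, predicted))
```

For these hypothetical values the absolute errors are 1, 1, and 2, so MAE is 4/3, RMSE is the square root of 2, and MAER is (0.5 + 0.25 + 0.25)/3 = 1/3.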

Figure 8.

RUL prediction results of different strategies: (a) Strategy 1, (b) Strategy 2, and (c) Strategy 3.

Table 1 summarizes the detailed prediction outcomes of the three strategies. From Figure 8 and Table 1, it is evident that MLKELM-AE with Strategy 3 achieves higher prediction accuracy than with the other two strategies on all three metrics. This can be attributed to the fact that multi-domain features carry more comprehensive and reliable characteristic information than single-domain features, thus better characterizing the life degradation process of the extra-large-scale bearing.

Table 1.

RUL prediction results of different strategies.

Input strategies	MAE	RMSE	MAER
Strategy 1	0.566	0.7793	0.1627
Strategy 2	1.1138	1.8583	0.4405
Strategy 3	0.2866	0.4971	0.0794
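
As a worked reading of Table 1, the relative error reduction achieved by Strategy 3 over each single-domain strategy can be computed from the tabulated values. The helper name `relative_reduction` is hypothetical, and the numbers are copied from Table 1 without modification.

```python
# (MAE, RMSE, MAER) per input strategy, copied from Table 1.
TABLE_1 = {
    "Strategy 1": (0.566, 0.7793, 0.1627),
    "Strategy 2": (1.1138, 1.8583, 0.4405),
    "Strategy 3": (0.2866, 0.4971, 0.0794),
}

def relative_reduction(baseline, value):
    """Fraction by which `value` is lower than `baseline`."""
    return (baseline - value) / baseline

reductions = {}
for name in ("Strategy 1", "Strategy 2"):
    reductions[name] = [
        relative_reduction(b, v)
        for b, v in zip(TABLE_1[name], TABLE_1["Strategy 3"])
    ]
    print(name, ["%.1f%%" % (100.0 * r) for r in reductions[name]])
```

For example, the MAE of Strategy 3 is (0.566 - 0.2866) / 0.566, or roughly 49%, lower than that of Strategy 1.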

To further demonstrate the effectiveness of parameter optimization for signal de-noising, we calculate the RUL prediction results using vibration signals de-noised by MFO-CERLMDAN-KPCA and by CERLMDAN-KPCA. For simplicity, only Strategy 3 is selected as the input feature. Figure 9 and Table 2 illustrate the prediction results. Predictions based on CERLMDAN-KPCA are less accurate than those based on MFO-CERLMDAN-KPCA on all three metrics (for example, an MAE of 0.3736 versus 0.2866). Both the visual comparison (Figure 9) and the quantitative comparison (Table 2) therefore indicate that the proposed parameter optimization for signal decomposition is essential for RUL prediction on such low signal-to-noise ratio vibration signals.

Figure 9.

RUL prediction results of different de-noising methods: (a) CERLMDAN-KPCA and (b) MFO-CERLMDAN-KPCA.

Table 2.

RUL prediction results of different de-noising methods.

De-noising methods	MAE	RMSE	MAER
CERLMDAN-KPCA	0.3736	0.5303	0.1759
MFO-CERLMDAN-KPCA	0.2866	0.4971	0.0794

To assess the parameter optimization performance of the MFO adopted in this research, particle swarm optimization (PSO), genetic algorithm (GA), and bat algorithm (BA) are employed as comparison algorithms. Figure 10 and Table 3 present the RUL prediction results of the four models. MFO-MLKELM-AE attains the lowest MAE, RMSE, and MAER among the four optimized models, indicating that MFO reaches a better fitness value and hence a better combination of C and σ than the other three algorithms. Furthermore, to verify the effectiveness of parameter optimization, random values are assigned to C and σ. The resulting RUL prediction accuracy is shown in Figure 11 and Table 4. The prediction accuracy of MFO-MLKELM-AE significantly surpasses that of the non-optimized MLKELM-AE under all three strategies. Consequently, the combination of C and σ exerts a notable influence on the capability of MLKELM-AE, which confirms the necessity of using swarm intelligence algorithms to optimize the hyperparameters.

Figure 10.

RUL prediction results of different optimization methods: (a) PSO, (b) GA, (c) BA, and (d) MFO.

Table 3.

RUL prediction results of different optimization methods.

Optimization methods	MAE	RMSE	MAER
PSO-MLKELM-AE	0.3394	0.5208	0.1053
GA-MLKELM-AE	0.3396	0.5278	0.1032
BA-MLKELM-AE	0.3095	0.5248	0.0951
MFO-MLKELM-AE	0.2866	0.4971	0.0794

Figure 11.

RUL prediction results without parameter optimization: (a) Strategy 1, (b) Strategy 2, and (c) Strategy 3.

Table 4.

RUL prediction results without parameter optimization.

Parameter optimization	Strategy 1			Strategy 2			Strategy 3
Parameter optimization	MAE	RMSE	MAER	MAE	RMSE	MAER	MAE	RMSE	MAER
MLKELM-AE	0.6729	0.8909	0.2229	2.0931	2.6211	2.0106	0.5604	0.7395	0.1794
MFO-MLKELM-AE	0.566	0.7793	0.1627	1.1138	1.8583	0.4405	0.2866	0.4971	0.0794

As described in Section 3.3, MLKELM-AE is rooted in ELM theory, and its two distinguishing elements are the multi-layer structure and kernel mapping. We therefore evaluate MLKELM-AE by comparing it with the multi-layer extreme learning machine based auto-encoder (MLELM-AE), single-layer ELM, and KELM models. The MLELM-AE consists of multiple ELM-AE layers followed by an ELM prediction layer. To ensure a fair comparison, MFO is employed to optimize the weights and biases in both MLELM-AE and ELM, as well as the kernel parameter and penalty coefficient in KELM. The results, displayed in Figure 12 and Table 5, show that the proposed MFO-MLKELM-AE achieves higher prediction accuracy than the other three models. The multi-layer structure clearly benefits the kernel-based models, since MFO-MLKELM-AE outperforms MFO-KELM on all three metrics, whereas MFO-MLELM-AE improves on MFO-ELM in MAER and only marginally in MAE, with a higher RMSE. Additionally, incorporating the kernel function improves the prediction accuracy of MFO-MLKELM-AE and MFO-KELM over MFO-MLELM-AE and MFO-ELM, respectively. In conclusion, the combination of multi-layer structure and kernel mapping enhances the ability of the MFO-MLKELM-AE model to extract multi-dimensional nonlinear features, leading to better generalization and robustness.
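
To make the role of kernel mapping concrete, the following minimal sketch implements a single-output KELM regressor in the closed form of Huang (2014), in which the output weights solve (I/C + Ω)β = T, with Ω the kernel matrix of the training samples. It covers only the KELM prediction layer, not the multi-layer MLKELM-AE. The Gaussian kernel parameterization exp(-||a - b||² / (2σ²)), the function names, and the sample data are assumptions made for illustration.

```python
import math

def rbf_kernel(a, b, sigma):
    """Gaussian kernel between two feature vectors (assumed parameterization)."""
    squared = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-squared / (2.0 * sigma ** 2))

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def kelm_fit(x_train, targets, c, sigma):
    """Output weights beta = (I/C + Omega)^-1 T."""
    n = len(x_train)
    regularized = [
        [rbf_kernel(x_train[i], x_train[j], sigma) + (1.0 / c if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return solve(regularized, targets)

def kelm_predict(x_train, beta, sigma, x_new):
    """Prediction f(x) = sum_i beta_i * K(x, x_i)."""
    return sum(b * rbf_kernel(x_new, xt, sigma) for b, xt in zip(beta, x_train))

# Hypothetical one-dimensional features with a linearly decreasing RUL target.
x_train = [[0.0], [1.0], [2.0], [3.0]]
rul = [3.0, 2.0, 1.0, 0.0]
beta = kelm_fit(x_train, rul, c=1e6, sigma=0.5)
fitted = [kelm_predict(x_train, beta, 0.5, x) for x in x_train]
```

A large C weakens the regularization so the model nearly interpolates the training targets, while a small C smooths the fit. This is why the pair (C, σ) is the natural target of the hyperparameter search.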

Figure 12.

RUL prediction results of ELM-based methods: (a) MFO-ELM, (b) MFO-MLELM-AE, (c) MFO-KELM, and (d) MFO-MLKELM-AE.

Table 5.

RUL prediction results of ELM-based methods.

ELM-based models	MAE	RMSE	MAER
MFO-ELM	1.0047	1.1959	0.3171
MFO-MLELM-AE	1.0044	1.2977	0.2669
MFO-KELM	0.5631	0.7507	0.1643
MFO-MLKELM-AE	0.2866	0.4971	0.0794

To further evaluate the influence of multi-domain features on the ELM-based prediction models, we compare their performance when time-domain features, time-frequency features, and their combination are used as inputs. The RUL prediction results, listed in Table 6, show that multi-domain features (Strategy 3) outperform both single-domain feature sets in terms of MAE, RMSE, and MAER for every model. MFO-MLKELM-AE is the most accurate model under Strategies 1 and 3. Under Strategy 2, it attains the lowest MAER, while MFO-KELM yields slightly lower MAE and RMSE. These findings suggest that the integration of multi-domain features with the MFO-MLKELM-AE model, as introduced in this paper, offers an effective approach for RUL prediction of extra-large-scale bearing.

Table 6.

RUL prediction results of ELM-based methods with different strategies.

ELM-based models	Strategy 1			Strategy 2			Strategy 3
ELM-based models	MAE	RMSE	MAER	MAE	RMSE	MAER	MAE	RMSE	MAER
MFO-ELM	1.1131	1.3193	0.3371	1.884	2.4256	1.8465	1.0047	1.1959	0.3171
MFO-MLELM-AE	1.0404	1.4343	0.3323	1.988	2.5041	1.817	1.0044	1.2977	0.2669
MFO-KELM	0.7367	0.9112	0.2374	1.1117	1.7079	0.5548	0.5631	0.7507	0.1643
MFO-MLKELM-AE	0.566	0.7793	0.1627	1.1138	1.8583	0.4405	0.2866	0.4971	0.0794

Apart from vibration signals, damage to extra-large-scale bearings can manifest through other characteristic signals such as grease temperature and driving torque. To meet the requirement for signal diversity in RUL prediction and to leverage multi-sensor information effectively, multiple characteristic indicators are integrated to comprehensively reflect the operational status of the extra-large-scale bearing. An RUL prediction model is established using 16-dimensional vibration features combined with 2-dimensional auxiliary parameters as model input (Strategy 4). The RUL prediction results, illustrated in Figure 13 and Table 7, demonstrate that the inclusion of auxiliary parameters improves the prediction outcomes of all four ELM-related prediction models. This indicates that the auxiliary parameters contribute additional characteristics of the extra-large-scale bearing from different perspectives, ultimately improving the accuracy of the RUL prediction model.
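
The construction of the Strategy 4 input can be sketched as a simple concatenation of the 16 vibration features with the 2 auxiliary parameters. The column-wise min-max scaling shown afterwards is an assumption added for illustration, on the grounds that the auxiliary signals have different physical units. The text does not state which normalization, if any, was applied, and all names and sample values are hypothetical.

```python
def build_strategy4_input(vibration_features, auxiliary_parameters):
    """Concatenate 16 vibration features and 2 auxiliary parameters into one vector."""
    if len(vibration_features) != 16 or len(auxiliary_parameters) != 2:
        raise ValueError("expected 16 vibration features and 2 auxiliary parameters")
    return list(vibration_features) + list(auxiliary_parameters)

def min_max_scale(samples):
    """Scale every column of a list of equal-length rows to [0, 1] (assumed step)."""
    columns = list(zip(*samples))
    lows = [min(col) for col in columns]
    # A constant column has zero span; use 1.0 so it maps to 0 instead of dividing by zero.
    spans = [(max(col) - low) or 1.0 for col, low in zip(columns, lows)]
    return [[(v - low) / span for v, low, span in zip(row, lows, spans)]
            for row in samples]

# Hypothetical values for one time step.
vibration = [0.1 * k for k in range(16)]
auxiliary = [45.0, 120.0]
sample = build_strategy4_input(vibration, auxiliary)
scaled = min_max_scale([[0.0, 10.0], [5.0, 20.0], [10.0, 10.0]])
```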

Figure 13.

RUL prediction results utilizing Strategy 4: (a) MFO-ELM, (b) MFO-MLELM-AE, (c) MFO-KELM, and (d) MFO-MLKELM-AE.

Table 7.

RUL prediction results of Strategy 4.

ELM-based models	Strategy 3			Strategy 4
ELM-based models	MAE	RMSE	MAER	MAE	RMSE	MAER
MFO-ELM	1.0047	1.1959	0.3171	0.5415	0.6628	0.2612
MFO-MLELM-AE	1.0044	1.2977	0.2669	0.6947	0.9111	0.2285
MFO-KELM	0.5631	0.7507	0.1643	0.1404	0.1971	0.061
MFO-MLKELM-AE	0.2866	0.4971	0.0794	0.058	0.0905	0.018

4.4. Evaluation of different models

Comparing only ELM-related models is not sufficient to demonstrate the superiority of the proposed method in RUL prediction. Therefore, classical ML techniques, namely back propagation (BP) and least squares support vector regression (LSSVR), and a DL method, deep belief network (DBN), are also considered. For a fair comparison, a DBN with two restricted Boltzmann machine layers is utilized for feature representation (Pan et al., 2023). On this basis, DBN-ELM and DBN-KELM, which use ELM and KELM as prediction layers, respectively, are selected. The kernel parameter and penalty coefficient of LSSVR and the weights and biases of BP are optimized using MFO. Likewise, MFO is employed to determine the parameters of ELM and KELM within the DBN-ELM and DBN-KELM models. The prediction results are presented in Figures 14 and 15 and Tables 8 and 9. Four key conclusions can be drawn: (1) Models employing Strategy 4 exhibit higher prediction accuracy than those using Strategy 3. (2) LSSVR outperforms BP under both Strategy 3 and Strategy 4. Compared with MLKELM-AE, however, LSSVR shows visibly larger deviations in the later stages of the service life. (3) DBN-ELM is less accurate than DBN-KELM in RUL prediction. Moreover, feature representation via DBN does not improve prediction performance relative to the KELM-AE structure with the same number of layers. (4) MLKELM-AE is the most accurate of the evaluated models. The outstanding capability of the proposed method is primarily attributed to MLKELM-AE, which effectively captures fault information during the degradation process of the extra-large-scale bearing. Meanwhile, data fusion plays a crucial role in retaining sufficient fault information.

Figure 14.

RUL prediction results of ML algorithms: (a) MFO-BP+Strategy 3, (b) MFO-BP+Strategy 4, (c) MFO-LSSVR+Strategy 3, and (d) MFO-LSSVR+Strategy 4.

Figure 15.

RUL prediction results of DL algorithms: (a) DBN-ELM+Strategy 3, (b) DBN-ELM+Strategy 4, (c) DBN-KELM+Strategy 3, and (d) DBN-KELM+Strategy 4.

Table 8.

RUL prediction results of ML algorithms.

ML	Strategy 3			Strategy 4
ML	MAE	RMSE	MAER	MAE	RMSE	MAER
MFO-BP	1.7525	2.1074	0.5189	1.337	1.7223	0.4754
MFO-LSSVR	0.3103	0.5279	0.2276	0.1031	0.3214	0.1312
MFO-MLKELM-AE	0.2866	0.4971	0.0794	0.058	0.0905	0.018

Table 9.

RUL prediction results of DL algorithms.

DL	Strategy 3			Strategy 4
DL	MAE	RMSE	MAER	MAE	RMSE	MAER
DBN-ELM	1.1448	1.4439	0.3604	0.724	0.9447	0.1967
DBN-KELM	0.6665	1.1292	0.1486	0.3296	0.7954	0.0752
MFO-MLKELM-AE	0.2866	0.4971	0.0794	0.058	0.0905	0.018

In summary, the primary areas of analysis in the hybrid prognostic approach encompass the following aspects:

(1) Data-Centric Prognostic Approach: The model input is of paramount importance in data-driven prediction. Combining CERLMDAN-KPCA-based signal de-noising with multi-domain feature extraction retains as much fault information as feasible, thereby ensuring high-quality model input.

(2) Model-Centric Prognostic Approach: The multi-layer structure of KELM-AE, equipped with a prediction layer, allows for the consideration of both high-level feature representation and prediction accuracy.

Therefore, the integration of an effective data-centric approach with a model-centric prediction method underscores the high performance of this methodology, making it particularly suited for the prognostic analysis of extra-large-scale bearing.

5. Summary and conclusion

This paper presents a hybrid prognostic approach for extra-large-scale bearing using enhanced CERLMDAN-KPCA and MLKELM-AE. MFO-CERLMDAN-KPCA-based signal de-noising, coupled with multi-domain feature extraction, effectively characterizes the life degradation process of extra-large-scale bearing, outperforming CERLMDAN-KPCA and single-domain features. Additionally, auxiliary parameters such as grease temperature and driving torque enhance model accuracy, providing deeper insights into the operating condition of the extra-large-scale bearing. Four ELM-related models (ELM, MLELM-AE, KELM, and MLKELM-AE) are analyzed in this study. Experimental results demonstrate that the proposed MLKELM-AE, with its multi-layer structure and kernel mapping, surpasses the other three models. In terms of parameter optimization, MFO-MLKELM-AE performs better than BA-MLKELM-AE, GA-MLKELM-AE, and PSO-MLKELM-AE, and outperforms the non-optimized MLKELM-AE by an even larger margin. When compared with DL techniques such as DBN-ELM and DBN-KELM, as well as ML methods such as BP and LSSVR, MLKELM-AE exhibits superior RUL prediction capability.

Future work will extend the proposed framework to diverse network architectures to evaluate its generalizability across various deep learning paradigms. Additionally, comparative analyses will be conducted to investigate the adaptability of the method to other rotating machinery, accounting for variations in dynamics, load conditions, and measurement sensitivity. These efforts will further validate the generalization capability of the proposed approach. Ultimately, an online monitoring system is planned to deploy this methodology in real-world scenarios, thereby advancing proactive maintenance capabilities for extra-large equipment.

Footnotes

ORCID iD

Yubin Pan

Funding

The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by funding from the National Natural Science Foundation of China (52205106) and Natural Science Foundation of Jiangsu Province (BK20210547).

Declaration of conflicting interests

The authors declared no conflicts of interest in relation to the research, authorship, and/or publication of this article.

References

Ali, Prasad, Xiang, et al. (2023) Ensemble robust local mean decomposition integrated with random forest for short-term significant wave height forecasting. Renewable Energy 205: 731–746. https://doi.org/10.1016/j.renene.2023.01.108

Bai, Noman, et al. (2023) Diversity entropy-based Bayesian deep learning method for uncertainty quantification of remaining useful life prediction in rolling bearings. Journal of Vibration and Control 29(21-22): 5053–5066. https://doi.org/10.1177/10775463221129930

Bhavsar, Vakharia, Chaudhari, et al. (2022) A comparative study to predict bearing degradation using discrete wavelet transform (DWT), tabular generative adversarial networks (TGAN) and machine learning models. Machines 10(3): 176. https://doi.org/10.3390/machines10030176

Caesarendra and Tjahjowidodo (2017) A review of feature extraction methods in vibration-based condition monitoring and its application for degradation trend estimation of low-speed slew bearing. Machines 5(4): 21. https://doi.org/10.3390/machines5040021

Cheng, Zhao, Wang, et al. (2019) Multi-label learning with kernel extreme learning machine autoencoder. Knowledge-Based Systems 178: 1–10. https://doi.org/10.1016/j.knosys.2019.04.002

Dargan, Kumar, Ayyagari, et al. (2020) A survey of deep learning and its applications: a new paradigm to machine learning. Archives of Computational Methods in Engineering 27: 1071–1092. https://doi.org/10.1007/s11831-019-09344-w

Ferreira and Gonçalves (2022) Remaining useful life prediction and challenges: a literature review on the use of machine learning methods. Journal of Manufacturing Systems 63: 550–562. https://doi.org/10.1016/j.jmsy.2022.05.010

Han, Wang, et al. (2021) Roller bearing fault diagnosis based on LMD and multi-scale symbolic dynamic information entropy. Journal of Mechanical Science and Technology 35: 1993–2005. https://doi.org/10.1007/s12206-021-0417-3

Huang (2014) An insight into extreme learning machines: random neurons, random features and kernels. Cognitive Computation 6: 376–390. https://doi.org/10.1007/s12559-014-9255-2

Huynh ANL, Deo, Ali, et al. (2021) Novel short-term solar radiation hybrid model: long short-term memory network integrated with robust local mean decomposition. Applied Energy 298: 117193. https://doi.org/10.1016/j.apenergy.2021.117193

Jia, Wang, Jiang, et al. (2023) Weak fault detection of rolling element bearing combining robust EMD with adaptive maximum second-order cyclostationarity blind deconvolution. Journal of Vibration and Control 29(9-10): 2374–2391. https://doi.org/10.1177/10775463221080229

Kasun LLC, Zhou, Huang, et al. (2013) Representational learning with ELMs for big data. IEEE Intelligent Systems 28(6): 31–34.

Tang and Yang (2022) A new hybrid prediction model of air quality index based on secondary decomposition and improved kernel extreme learning machine. Chemosphere 305: 135348. https://doi.org/10.1016/j.chemosphere.2022.135348

Liu, Zhang and Carrasco (2020) Vibration analysis for large-scale wind turbine blade bearing fault detection with an empirical wavelet thresholding method. Renewable Energy 146: 99–110. https://doi.org/10.1016/j.renene.2019.06.094

Liu, Peng, et al. (2023) Reliable composite fault diagnosis of hydraulic systems based on linear discriminant analysis and multi-output hybrid kernel extreme learning machine. Reliability Engineering & System Safety 234: 109178. https://doi.org/10.1016/j.ress.2023.109178

Zhou, et al. (2025) Damage detection of thin plates by fusing variational mode decomposition and spectral entropy. Structural Health Monitoring 24(1): 481–495. https://doi.org/10.1177/14759217241239989

Mirjalili (2015) Moth-flame optimization algorithm: a novel nature-inspired heuristic paradigm. Knowledge-Based Systems 89: 228–249. https://doi.org/10.1016/j.knosys.2015.07.006

Pan, Wang, Chen, et al. (2021) Nonstationary signal de-noising method of slow-speed large-size slewing bearing using robust local mean decomposition. International Conference on Intelligent Equipment and Special Robots (ICIESR 2021) 12127: 638–642.

Pan, Wang, Chen, et al. (2023) Fault recognition of large-size low-speed slewing bearing based on improved deep belief network. Journal of Vibration and Control 29(11-12): 2829–2841. https://doi.org/10.1177/10775463221085856

Pan, Wang, Chen, et al. (2024) A new fault component selection strategy based on statistical detection for slewing bearing weak signal de-noising. Transactions of the Institute of Measurement and Control 46(11): 2222–2239. https://doi.org/10.1177/01423312241234409

Peng, Ouyang, Gui, et al. (2022) A multi-indicator fusion-based approach for fault feature selection and classification of rolling bearings. IEEE Transactions on Industrial Informatics 19(8): 8635–8643. https://doi.org/10.1109/tii.2022.3220905

Sarmadi, Entezami and Daneshvar Khorram (2020) Energy-based damage localization under ambient vibration and non-stationary signals by ensemble empirical mode decomposition and Mahalanobis-squared distance. Journal of Vibration and Control 26(11-12): 1012–1027. https://doi.org/10.1177/1077546319891306

Schwendemann, Amjad and Sikora (2021) A survey of machine-learning techniques for condition monitoring and predictive maintenance of bearings in grinding machines. Computers in Industry 125: 103380. https://doi.org/10.1016/j.compind.2020.103380

Wang, Yao and Zhao (2016) Auto-encoder based dimensionality reduction. Neurocomputing 184: 232–242. https://doi.org/10.1016/j.neucom.2015.08.104

Wang, Ding, Xiao, et al. (2025) A multi-layer network perspective on green finance and clean energy industry synergistic development and mutual reinforcement: mechanism analysis, correlation effect and enhancement path. Renewable Energy 240: 122209. https://doi.org/10.1016/j.renene.2024.122209

Zhan, Zhang, et al. (2019) Fault feature extraction and diagnosis of rolling bearings based on enhanced complementary empirical mode decomposition with adaptive noise and statistical time-domain features. Sensors 19(18): 4047. https://doi.org/10.3390/s19184047

Zhang, Xiao, et al. (2020) Non-iterative and fast deep learning: multilayer extreme learning machines. Journal of the Franklin Institute 357(13): 8925–8955. https://doi.org/10.1016/j.jfranklin.2020.04.033

Zhao, Huang, Xiao, et al. (2024) Week-ahead hourly solar irradiation forecasting method based on ICEEMDAN and TimesNet networks. Renewable Energy 220: 119706. https://doi.org/10.1016/j.renene.2023.119706

Zheng, Cheng and Yang (2014) Partly ensemble empirical mode decomposition: an improved noise-assisted method for eliminating mode mixing. Signal Processing 96: 362–374. https://doi.org/10.1016/j.sigpro.2013.09.013

Zio (2022) Prognostics and health management (PHM): where are we and where do we (need to) go in theory and practice. Reliability Engineering & System Safety 218: 108119. https://doi.org/10.1016/j.ress.2021.108119