Uncertainty-driven dynamic ensemble framework for rotating machinery fault diagnosis under time-varying working conditions

Abstract

Multi-scale ensemble learning combines features at different resolutions, thereby improving fault diagnostic accuracy. However, under time-varying speed conditions, how well each scale characterizes fault features changes with speed, and existing ensemble strategies struggle to ensure that only effective feature information is combined when multi-scale features are fused. Accordingly, we propose an uncertainty-driven dynamic ensemble Bayesian convolutional neural network (DEBCNN) framework. The uncertainty of the results of the different scale models is used to dynamically determine their weights in the ensemble framework, which reduces the influence of irrelevant features on the diagnostic results. With the proposed dynamic ensemble strategy, the framework can exploit the fault feature information best suited to each rotational speed in the final diagnostic results. Experiments on motor and bearing datasets illustrate the superiority of this strategy over other techniques. This study provides useful insights for further research on fault diagnosis of rotating machinery at time-varying speeds.

Keywords

Rotating machinery, fault diagnosis, dynamic ensemble, Bayesian convolutional neural network, time-varying working condition

1. Introduction

Rotating machinery, such as motors and bearings, has been utilized extensively in the industrial sector and is a crucial component of manufacturing machinery for the aerospace, wind energy, transportation, and rail industries (Liu et al., 2023; Zhu et al., 2023). However, in the real world, owing to complex working conditions, continuous operation, and other factors, rotating machinery is prone to malfunction. If not diagnosed and repaired immediately, these malfunctions may seriously impact the system (Osornio-Rios et al., 2023; Xu et al., 2024).

Working state recognition, which analyzes the signal characteristics of rotating machines during operation, has been widely used for electric motors, bearings, gearboxes, and other rotating machines. Gangsar and Tiwari summarized in detail the application of signal analysis in motor fault diagnosis (Gangsar and Tiwari, 2020). Ruan et al. accurately diagnosed various types of bearing faults by combining signal processing and artificial intelligence algorithms (Ruan et al., 2023). Zhou and Tang used vibration signal processing and wavelet neural networks to predict the remaining useful life (Zhou and Tang, 2023). Feng et al. reviewed signal processing-based gear wear monitoring, prediction, and health management technologies in detail and showed that vibration-based gear wear monitoring and remaining life prediction technologies have good application prospects (Feng et al., 2023a). Feng et al. also used cyclic correntropy and the Wasserstein distance to construct gearbox health indicators, combined with a gated recurrent unit (GRU) network, to achieve vibration-based prediction of gear health (Feng et al., 2023b).

Under time-varying speed conditions, the signals of rotating machines are no longer periodic, and the spectral features related to the current state of the machine become ambiguous, which further increases the difficulty of fault diagnosis. Therefore, fault diagnosis under time-varying speed conditions has become a popular research topic. Traditionally, order tracking (Fyfe and Munck, 1997) and time–frequency analysis have been used to improve diagnostic accuracy under time-varying speed conditions. Wang et al. proposed a method based on computed order tracking (COT) and variational mode decomposition for the fault diagnosis of rolling bearings under variable speed conditions (Wang et al., 2017). Yu et al. proposed a multi-synchrosqueezing transform (MSST) for machine fault diagnosis under highly time-varying speeds (Yu et al., 2019). However, order tracking requires the current rotational speed to be extracted, time–frequency analysis involves an intermediate parameter selection step, and both require significant computational effort, which makes it difficult to meet the requirements of big-data settings.

Deep learning methods, such as convolutional neural networks (CNNs) (Gong et al., 2024; Tarek and Sameh, 2024), deep belief networks (DBNs) (Pan et al., 2023), and auto-encoders (AE) (Zhao et al., 2023), have been successfully applied to industrial fault diagnosis. However, achieving good diagnostic performance with deep learning methods typically requires large amounts of independent and identically distributed data, which are difficult to obtain in real-world scenarios, especially under variable-speed conditions. During the fault feature learning phase, the nonstationarity of fault features under varying speeds complicates the learning of the fault models. Furthermore, in the diagnosis phase, discrepancies between the learned model features and features to be diagnosed can lead to overfitting. Therefore, improving the ability of models to learn fault features from non-identically distributed data will further improve the diagnostic accuracy of deep-learning-based fault diagnosis methods under time-varying speed conditions.

Ensemble learning integrates multiple learners to create a combined model with higher accuracy than any single model. Wang et al. used ensemble learning for diagnosis with unbalanced fault data and improved the ability of the model to handle such data (Wang et al., 2023a). Ye et al. used ensemble learning to integrate multi-sensor data for bearings, which improved the ability of the model to learn fault features and diagnose faults under noisy conditions (Ye et al., 2024). Li et al. used ensemble learning to enable the model to learn latent fault features from data across different domains, thereby improving the accuracy of cross-domain fault diagnosis (Li et al., 2021). As a result, ensemble learning effectively addresses the degradation of fault diagnostic accuracy caused by the overfitting of individual models when dealing with data that are not independent and identically distributed.

The ensemble strategy is crucial to the success of ensemble learning because it directly determines the contribution of different submodels to the final diagnostic results. Commonly used ensemble strategies include majority voting (Imane et al., 2023), weighted voting (Wang et al., 2023c), and Dempster–Shafer (DS) evidence theory (Wang et al., 2023b). Zhou et al. employed a weighted-average strategy to combine two ResNet18 models with two shallow CNN models for fault diagnosis in rotating machinery (Zhou et al., 2023). Xu et al. combined transmission vibration signals from both the time and frequency domains using weighted voting to enhance the robustness and generalization of transmission fault diagnosis (Xu et al., 2022). Wang et al. applied the DS evidence theory in fault diagnosis using multiple sensors and selected key information from multiple sensor features to diagnose the final fault type (Wang et al., 2023b).

However, most of the ensemble strategies mentioned above rely on the accuracy of the submodels to determine their weights in the ensemble model. The main reasons for the degradation of diagnostic performance under time-varying speed conditions are that the model learns fault features from ambiguous data and that the learned features deviate from those of the samples to be diagnosed. These two issues lead to uncertainty in the fault diagnosis results under time-varying conditions, which, in turn, reduces the diagnostic accuracy. To address this issue, this study proposes a multi-scale dynamic ensemble strategy based on model uncertainty for the effective diagnosis of rotating machinery faults at time-varying speeds. The main contributions of this study are as follows:

(1) We provide a new perspective by transforming the fault diagnosis problem under variable-speed conditions into an uncertainty-reduction problem. The proposed model could be applied to a range of time-varying speed intervals.

(2) The uncertainty in the diagnostic results for different time-varying speed samples was measured using Bayesian principles and Monte Carlo sampling.

(3) We propose an uncertainty-based dynamic ensemble strategy for the fault diagnosis of rotating machinery under time-varying speed conditions, which addresses the uncertainty of the submodels and improves fault diagnosis performance through more fundamental uncertainty reduction.

2. Methodology

2.1. Bayesian CNNs

The optimum parameters of a standard CNN model are point estimates, that is, fixed values obtained through forward propagation of the training samples and backward propagation of the errors. Because the parameters are fixed, feeding the same input into the trained model several times always yields the same output and hence the same decision, so the uncertainty of the decision cannot be reflected. In contrast, the training objective of the Bayesian CNN (BCNN) model shifts from finding the best parameters to finding the best parameter distribution, known as the posterior distribution $p (λ | D)$; the parameters of the BCNN model are thus described by probability distributions rather than conventional point estimates (Wang and Yeung, 2016; Zhou et al., 2022).

p (λ | D) = \frac{p (D | λ) p (λ)}{p (D)} = \frac{p (D | λ) p (λ)}{\int_{λ} p (D | λ) p (λ) d λ}

(1)

The likelihood function $p (D | λ)$ represents the probability of the observed data $D$ given the parameter $λ$. $p (λ)$ is the prior distribution, which represents the prior belief about the distribution of the parameter before the data $D$ are observed. Both $p (D | λ)$ and $p (λ)$ are easy to handle. However, the computation of the evidence $p (D)$ is difficult and requires integration over the entire parameter space, because it is the probability of observing the given data under all possible parameter configurations: $\int_{λ} p (D | λ) p (λ) d λ$. The above equation becomes more challenging to apply when dealing with high-dimensional parameter spaces or sophisticated models because the integral may be so intricate that it cannot be evaluated analytically.
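As a toy illustration of Eq. (1), the sketch below evaluates a posterior on a grid for a single scalar parameter, where the evidence reduces to a finite sum. The coin-flip model, the uniform prior, and the function name `grid_posterior` are assumptions made for illustration only and are not part of the BCNN. For a network with many parameters the same sum would run over an intractably large space, which is why an approximation is needed.

```python
import math

def grid_posterior(heads, flips, grid_size=101):
    """Posterior over a coin's head probability lambda on a uniform grid.

    Hypothetical toy model: binomial likelihood and uniform prior.
    """
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    prior = [1.0 / grid_size] * grid_size
    # p(D | lambda): probability of the observed flips for each grid value.
    likelihood = [
        math.comb(flips, heads) * lam**heads * (1.0 - lam) ** (flips - heads)
        for lam in grid
    ]
    # p(D): the evidence, here a sum over every candidate parameter value.
    evidence = sum(l * p for l, p in zip(likelihood, prior))
    posterior = [l * p / evidence for l, p in zip(likelihood, prior)]
    return grid, posterior, evidence

grid, posterior, evidence = grid_posterior(heads=7, flips=10)
map_index = max(range(len(grid)), key=lambda i: posterior[i])
print(f"posterior mass sums to {sum(posterior):.6f}")
print(f"most probable grid value: {grid[map_index]:.2f}")
```

A grid with 101 points is cheap for one parameter, but the number of grid points grows exponentially with the number of parameters.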

2.2. Variational inference

Variational inference introduces a variational distribution $q_{φ} (λ)$, also known as an approximating distribution, and adjusts its parameters $φ$ so that it is as close as possible to the true posterior $p (λ | D)$. The Kullback–Leibler (KL) divergence, defined as follows, measures the discrepancy between two distributions and is used here to compare the variational distribution with the true posterior (Li et al., 2022).

K L [q (x) ‖ p (x)] = E_{q (x)} [\log \frac{q (x)}{p (x)}] = \int q (x) \log \frac{q (x)}{p (x)} d x

(2)

Therefore, the KL divergence between the variational distribution $q_{φ} (λ)$ and the true posterior $p (λ | D)$ is

K L [q_{φ} (λ) ∥ p (λ | D)] = K L [q_{φ} (λ) ∥ p (λ)] - E_{q_{φ} (λ)} [\log p (D | λ)] + \log p (D)

(3)

where $\mathrm{KL} [q_{φ} (λ) ∥ p (λ)]$, also known as the complexity cost, measures the discrepancy between $q_{φ} (λ)$ and the prior $p (λ)$, and $E_{q_{φ} (λ)} [\log p (D | λ)]$ is called the likelihood cost, which describes how well the model fits the data. The negative of the sum of the first two terms in the above equation is called the evidence lower-bound objective (ELBO):

\mathrm{ELBO} = E_{q_{φ} (λ)} [\log p (D | λ)] - \mathrm{KL} [q_{φ} (λ) ∥ p (λ)]

Because $\log p (D)$ does not depend on $φ$, minimizing the KL divergence between the variational distribution and the true posterior is equivalent to maximizing the ELBO. The objective for the parameter $φ$ of the variational distribution $q_{φ} (λ)$ over the weights is therefore defined as

\begin{array}{l} φ^{*} = \underset{φ}{\arg \min} \left\{ \mathrm{KL} [q_{φ} (λ) ∥ p (λ)] - E_{q_{φ} (λ)} [\log p (D | λ)] \right\} \\ = \underset{φ}{\arg \max} \mathrm{ELBO} \end{array}

(4)
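For readers who want the intermediate step, Eq. (3) can be checked by substituting Bayes' rule from Eq. (1) into the definition in Eq. (2). The derivation below is a standard identity written in the notation of this section; it is added for clarity and is not taken from the original text.

```latex
\begin{aligned}
\mathrm{KL}[q_{φ}(λ) \,\|\, p(λ | D)]
  &= E_{q_{φ}(λ)}\left[\log q_{φ}(λ) - \log p(λ | D)\right] \\
  &= E_{q_{φ}(λ)}\left[\log q_{φ}(λ) - \log p(D | λ) - \log p(λ) + \log p(D)\right] \\
  &= \mathrm{KL}[q_{φ}(λ) \,\|\, p(λ)] - E_{q_{φ}(λ)}\left[\log p(D | λ)\right] + \log p(D)
\end{aligned}
```

The last line uses the fact that $\log p (D)$ is constant under the expectation over $q_{φ} (λ)$. Since that term does not involve $φ$, only the first two terms matter for the optimization in Eq. (4).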

Assuming that the parameters are independent of each other, the loss function of the network can be obtained using the Monte Carlo sampling approximation, as follows:

L_{\mathrm{ELBO}} (D, φ) \approx \sum_{m = 1}^{M} \left[ \log p (D | λ^{(m)}) - \log q_{φ} (λ^{(m)}) + \log p (λ^{(m)}) \right]

(5)

where $M$ denotes the number of samples drawn from $q_{φ} (λ)$.
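The Monte Carlo estimate in Eq. (5) can be sketched on a one-parameter toy problem. Everything specific here is an assumption for illustration: a Gaussian variational distribution with mean `mu` and standard deviation `sigma`, a standard normal prior, a unit-variance Gaussian likelihood, and the helper names `log_normal_pdf` and `elbo_estimate`. A real BCNN would apply the same sum to every weight and optimize it by gradient descent.

```python
import math
import random

def log_normal_pdf(x, mean, std):
    """Log density of a univariate normal distribution."""
    return -0.5 * math.log(2.0 * math.pi * std**2) - (x - mean) ** 2 / (2.0 * std**2)

def elbo_estimate(data, mu, sigma, num_samples, rng):
    """Sum over M draws of log p(D|lam) - log q(lam) + log p(lam), as in Eq. (5)."""
    total = 0.0
    for _ in range(num_samples):
        lam = rng.gauss(mu, sigma)                      # lam^(m) drawn from q_phi
        log_likelihood = sum(log_normal_pdf(x, lam, 1.0) for x in data)
        log_q = log_normal_pdf(lam, mu, sigma)          # variational density
        log_prior = log_normal_pdf(lam, 0.0, 1.0)       # assumed N(0, 1) prior
        total += log_likelihood - log_q + log_prior
    return total

data = [1.8, 2.1, 2.4, 1.9, 2.2]
# Same seed for both calls so the comparison uses the same random draws.
near_posterior = elbo_estimate(data, mu=1.7, sigma=0.4, num_samples=200, rng=random.Random(0))
far_from_posterior = elbo_estimate(data, mu=-3.0, sigma=0.4, num_samples=200, rng=random.Random(0))
print(near_posterior > far_from_posterior)
```

A variational distribution centred near the data yields a larger estimate than one centred far away, which is the behaviour the maximization in Eq. (4) relies on.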

2.3. Dynamic ensemble strategy

The BCNN parameters $λ$ follow the variational distribution $q_{φ} (λ)$. When a diagnostic task is performed after training, sampling $M$ times from $q_{φ} (λ)$ yields parameters $λ^{(m)}$, $m = 1, 2, \dots, M$, which define $M$ different networks. Feeding the sample $x_{i}$ to be diagnosed into the network with parameters $λ^{(m)}$ gives that network's predicted probabilities for the fault types, $p ({\hat{y}}_{i}^{(m)} | x_{i}, λ^{(m)})$. Assuming a total of $K$ fault types, the predicted probability for the kth fault type is

p ({\hat{y}}_{i}^{(m, k)} | x_{i}, λ^{(m)})

(6)

where $m$ indexes the mth sampling of the parameters, $x_{i}$ is the ith sample to be diagnosed, and $k$ indexes the kth fault type. After $M$ samplings, the prediction probability for the kth fault type is obtained as follows:

p ({\hat{y}}_{i}^{k} | x_{i}) = \frac{1}{M} \sum_{m = 1}^{M} p ({\hat{y}}_{i}^{(m, k)} | x_{i}, λ^{(m)})

(7)

Averaging over the sampled networks prevents any single draw from dominating the diagnostic result and renders the predicted probabilities of the different fault types more reliable. The uncertainty in the diagnostic results is reflected in the difference between the predicted probabilities of the correct and incorrect fault labels. When the model finds a sample difficult to classify, the predicted probabilities of the different fault types are close to one another, which leads to a higher entropy of the predicted probabilities. In contrast, when the model assigns a dominant probability to one fault type, the probabilities of the other fault types are extremely small, which leads to a lower entropy. Thus, the uncertainty of the diagnostic results can be measured using the entropy of the prediction averaged over the sampled networks.

\begin{array}{l} E [\hat{y} | x_{i}] = - \sum_{k = 0}^{K} p ({\hat{y}}_{i}^{k} | x_{i}) \log (p ({\hat{y}}_{i}^{k} | x_{i})) \\ = - \sum_{k = 0}^{K} (\frac{1}{M} \sum_{m = 1}^{M} p ({\hat{y}}_{i}^{(m, k)} | x_{i}, λ^{(m)})) \log (\frac{1}{M} \sum_{m = 1}^{M} p ({\hat{y}}_{i}^{(m, k)} | x_{i}, λ^{(m)})) \end{array}

(8)
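Equations (7) and (8) can be sketched in a few lines. The function names `mc_predictive` and `predictive_entropy` and the example probability vectors are hypothetical; in practice each inner list would be the softmax output of one sampled network for the same input $x_{i}$.

```python
import math

def mc_predictive(mc_probs):
    """Average M sampled class-probability vectors for one input (Eq. (7))."""
    num_draws = len(mc_probs)
    num_classes = len(mc_probs[0])
    return [sum(draw[k] for draw in mc_probs) / num_draws for k in range(num_classes)]

def predictive_entropy(probs):
    """Entropy of the averaged prediction (Eq. (8)); zero entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

# Hypothetical outputs of M = 3 sampled networks for two different inputs.
confident_draws = [[0.90, 0.05, 0.05], [0.85, 0.10, 0.05], [0.95, 0.03, 0.02]]
ambiguous_draws = [[0.40, 0.35, 0.25], [0.30, 0.40, 0.30], [0.35, 0.30, 0.35]]

confident_entropy = predictive_entropy(mc_predictive(confident_draws))
ambiguous_entropy = predictive_entropy(mc_predictive(ambiguous_draws))
print(confident_entropy < ambiguous_entropy)
```

The entropy is largest for a uniform prediction and approaches zero as one class dominates, which matches the behaviour described above.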

The uncertainty of all diagnostic samples can be measured as the sum of the entropy values of all samples:

E = \sum_{i = 1}^{I} E [\hat{y} | x_{i}]

(9)

where I represents the sample size. Normally, the higher the entropy value, the higher the degree of uncertainty, the lower the validity of its diagnostic results, and the lower the weight it should have in the ensemble model. Therefore, in the ensemble strategy proposed in this study, the inverse of the entropy was used as the weight:

w_{l} = \frac{1}{E}

(10)

where $l$ denotes the lth submodel, $l = 1, 2, \dots, L$, and $E$ is the entropy computed from the predictions of that submodel. After obtaining the weights $w_{l}$ of the different submodels, the final prediction of the ensemble model, the dynamic ensemble BCNN (DEBCNN), is obtained by combining the predictions of the submodels. For the sample $x_{i}$, the final prediction of the ensemble model is

\begin{array}{l} P^{k} = \sum_{l = 1}^{L} w_{l} \times p_{l} ({\hat{y}}_{i}^{k} | x_{i}) = \sum_{l = 1}^{L} \frac{1}{E} \times p_{l} ({\hat{y}}_{i}^{k} | x_{i}) \\ = \sum_{l = 1}^{L} \frac{1}{- \sum_{k = 0}^{K} (\frac{1}{M} \sum_{m = 1}^{M} p_{l} ({\hat{y}}_{i}^{(m, k)} | x_{i}, λ^{(m)})) \log (\frac{1}{M} \sum_{m = 1}^{M} p_{l} ({\hat{y}}_{i}^{(m, k)} | x_{i}, λ^{(m)}))} \\ \times \frac{1}{M} \sum_{m = 1}^{M} p_{l} ({\hat{y}}_{i}^{(m, k)} | x_{i}, λ^{(m)}) \end{array}

(11)

where $P^{k}$ represents the final prediction probability of the ensemble model for the fault-type $k$ , $l$ represents the lth sub-BCNN model, and $m$ represents the mth sampling.

The final predicted fault label is

\mathrm{Label} = \underset{k}{\arg \max} P^{k}
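The weighting in Eqs. (10) and (11) and the final label selection can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `dynamic_ensemble`, the small constant `eps` that guards against division by zero when a submodel is fully confident, and the example probabilities are all hypothetical. Each input vector stands for one submodel's Monte Carlo averaged prediction for the same sample.

```python
import math

def entropy(probs):
    """Shannon entropy of one probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def dynamic_ensemble(submodel_probs, eps=1e-12):
    """Combine submodel predictions with inverse-entropy weights.

    submodel_probs: list of L probability vectors, one per submodel.
    eps: assumed safeguard, not part of Eq. (10).
    """
    weights = [1.0 / (entropy(p) + eps) for p in submodel_probs]
    num_classes = len(submodel_probs[0])
    # P^k = sum over submodels of w_l * p_l(k), left unnormalized as in Eq. (11).
    scores = [
        sum(w * p[k] for w, p in zip(weights, submodel_probs))
        for k in range(num_classes)
    ]
    label = max(range(num_classes), key=lambda k: scores[k])
    return label, scores, weights

# One confident submodel (class 1) and two hesitant submodels leaning to class 0.
example = [
    [0.05, 0.90, 0.05],
    [0.40, 0.35, 0.25],
    [0.45, 0.30, 0.25],
]
label, scores, weights = dynamic_ensemble(example)
print(label, [round(w, 3) for w in weights])
```

In this example the confident submodel receives the largest weight and decides the label, even though two of the three submodels individually prefer class 0. Normalizing the scores would not change the selected label.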

Figure 1 shows a flowchart of the classification framework. Three CNNs, each with convolutional layer kernels of different sizes, are combined using an ensemble approach. The contribution of each classifier to the results is determined using its predicted uncertainty.

Figure 1.

Dynamic ensemble strategy.

2.4. Proposed structure for fault diagnosis

Figure 2 depicts the DEBCNN-based fault diagnosis framework for rotating machinery under time-varying speed conditions. This process consists of four main steps.

Step 1: Collect experimental data: The working conditions of the rotating machinery with different types of faults under time-varying speed conditions are simulated to collect vibration signals.

Step 2: Data split: Because signals are continuously collected, the continuous working signals are divided into multiple samples of a certain length.

Step 3: DEBCNN model training: Using the vibration signal as input, produce the corresponding outputs of each submodel. The contributions of the submodels in the ensemble model are determined based on the uncertainty of their results to complete the model output. Complete the model structure training and save the network parameters using repeated iterative optimization.

Step 4: Fault diagnosis and analysis of the results: Based on the saved ensemble model, the predicted probabilities for different fault types finally yield the diagnosis type, providing the corresponding magnitude of uncertainty.

Figure 2.

Running flow of the proposed framework.

3. Test verification

3.1. Case 1: Motor dataset

3.1.1. Dataset description

The data used in Case 1 were obtained from the laboratory bench shown in Figure 3, which consists of a motor, bearing, gearbox, and brake. Eight different motor health conditions were simulated in the experiment: healthy motor (HM), motor with rotor unbalance (MRU), motor with rotor eccentricity (MRE), motor with rotor bending (MRB), motor with bearing fault (MBF), motor with broken bar (MBB), motor with winding short circuit (MWSC), and motor with voltage imbalance (MVI).

Figure 3.

The test bench of the motor.

The acceleration sensor continuously sampled the motor-operating signals of the different health states for 20 s at a sampling frequency of 25,600 Hz. During this acquisition period, the motor speed was accelerated to 3000 r/min and then decelerated to 600 r/min. Figure 4 shows time-domain graphs of the vibration signals recorded for the various motor health states. Because the gathered vibration data are long time series, a sliding-window segmentation was applied to divide each continuous signal into multiple samples and thereby obtain more training samples. Each sample had a length of 4,096 points, and the sliding window step was set to 1,024 points. After segmentation, there were 497 samples per health state and 3,976 samples overall.
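The segmentation described above can be sketched as follows; the helper names `sliding_windows` and `count_windows` are hypothetical. With 20 s of data at 25,600 Hz, a window of 4,096 points, and a step of 1,024 points, the window count reproduces the sample numbers quoted in the text.

```python
def sliding_windows(signal, window, step):
    """Split a 1-D sequence into overlapping windows of fixed length."""
    last_start = len(signal) - window
    return [signal[start:start + window] for start in range(0, last_start + 1, step)]

def count_windows(num_points, window, step):
    """Number of full windows that fit into a signal of num_points samples."""
    if num_points < window:
        return 0
    return (num_points - window) // step + 1

# Small demonstration on a toy sequence.
toy_segments = sliding_windows(list(range(10)), window=4, step=2)

# Sample counts for the motor dataset settings stated in the text.
points_per_state = 20 * 25600
samples_per_state = count_windows(points_per_state, window=4096, step=1024)
total_samples = 8 * samples_per_state
print(samples_per_state, total_samples)
```

Because the step is one quarter of the window length, consecutive samples overlap by 75%, so a random train/test split places strongly correlated segments on both sides.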

Figure 4.

Vibration signals for the health state of eight different motors under variable speed conditions.

3.1.2. Results and discussion

Various comparative tests were conducted to verify the superiority of the proposed method. The comparison methods included existing multi-scale ensemble strategies and advanced time-varying speed fault diagnosis methods. The different comparison methods are as follows:

(1) EMD-MSE-DT: The number of intrinsic mode functions (IMFs) for empirical mode decomposition (EMD) is 10; the scale of multi-scale sample entropy (MSE) is 20; the decision tree (DT) is a C4.5 decision tree algorithm with 12 trials and gain ratio = 1.12 (Wu et al., 2016).

(2) RP-CNN: The recurrence plot (RP) image size is 128, and the CNN model has three network layers with 32, 64, and 128 convolutional kernels (Wang et al., 2020).

(3) STFT-EfficientNetv2: The short-time Fourier transform (STFT) image size is 128, and EfficientNetv2 mainly consists of mobile inverted bottleneck convolution (MBconv) and fused mobile inverted bottleneck convolution (Fused-MBconv) (Qu et al., 2022).

(4) Voting ensemble: Categories voted by submodels as final predictions.

(5) Averaging ensemble: Calculate the mean of the predictions of various models as the prediction of the ensemble model, with all models having the same weight in the ensemble model.

(6) C-CNN: A cascade-CNN (C-CNN) is constructed based on the CNN architecture and is composed of a dilated convolution layer and cascade structure (Wang et al., 2021).

(7) DS theory ensemble: The output generated by each CNN model is considered an evidence source, where the probability distribution of each category can be considered as an evidence distribution. These evidence distributions will be transformed into the probability mass function of the DS theory.

(8) MK-ResCNN: The multi-scale kernel residual CNN (MK-ResCNN) consists of three parallel ResCNN modules, each consisting of 64, 128, and 256 residual convolution blocks (Liu et al., 2020).
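For reference, the two simplest baselines in the list, the voting ensemble (4) and the averaging ensemble (5), can be sketched as below. The function names and the example probabilities are hypothetical, and ties in the vote are broken by the order in which classes are first seen, which is an assumption rather than something specified in the text.

```python
from collections import Counter

def majority_vote(submodel_probs):
    """Each submodel votes for its most probable class; the most common vote wins."""
    votes = [max(range(len(p)), key=lambda k: p[k]) for p in submodel_probs]
    return Counter(votes).most_common(1)[0][0]

def average_ensemble(submodel_probs):
    """Equal-weight mean of the submodel probabilities, then the most probable class."""
    num_models = len(submodel_probs)
    num_classes = len(submodel_probs[0])
    mean = [sum(p[k] for p in submodel_probs) / num_models for k in range(num_classes)]
    return max(range(num_classes), key=lambda k: mean[k]), mean

# One confident submodel (class 1) and two hesitant submodels leaning to class 0.
example = [
    [0.05, 0.90, 0.05],
    [0.40, 0.35, 0.25],
    [0.45, 0.30, 0.25],
]
vote_label = majority_vote(example)
mean_label, mean_probs = average_ensemble(example)
print(vote_label, mean_label)
```

The two baselines disagree on this example: voting ignores how confident each submodel is, whereas averaging takes the probabilities into account but still gives every submodel the same weight.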

The diagnostic accuracy of different diagnostic methods and the variance of multiple experiments are shown in Figure 5. The EMD-MSE-DT method is constrained by parameters such as the number of IMFs in EMD and the scale used for MSE. Furthermore, the DT fault diagnosis model struggles to extract deep fault characteristics, resulting in the lowest performance in terms of accuracy and stability in multi-diagnosis scenarios. In contrast, RP and STFT feature maps effectively capture the distinctive characteristics of various fault types, and the integration of CNNs further enhances fault diagnosis accuracy by leveraging the advantages of two-dimensional (2D) images. The ensemble of multi-scale characteristics within the model significantly improved the precision of the fault diagnosis. Specifically, the combination of the C-CNN, DS evidence theory, and MK-ResCNN achieved a diagnostic accuracy of 98%. The accuracy and stability of multiple diagnoses can be further enhanced by optimizing a multi-scale ensemble strategy. The proposed method employs a dynamic ensemble strategy that adjusts the weights of various scales to adapt to different rotational speeds, thereby reducing the impact of features interfering with the diagnostic results while incorporating multi-scale features.

While focusing on the accuracy of diagnostic results, one should also emphasize their reliability. The horizontal coordinate in Figure 6 indicates the uncertainty of the diagnostic results, and the vertical coordinate represents the density corresponding to the uncertainty. Compared with the other three strategies, the uncertainty of the proposed ensemble strategy is closer to zero. Under time-varying speed conditions, the difference in fault characteristics of the same fault type owing to varying speeds directly leads to uncertainty in the diagnostic results. A concentration of uncertainty near zero suggests a large gap between the probabilities of the correct and incorrect labels in the prediction results, thereby reducing the probability that the current diagnosis is incorrect. Consequently, both the accuracy of the diagnostic results and their reliability across multiple experiments improve.

Figure 5.

Diagnostic results of the proposed method and the comparison method: (a) accuracy and (b) variance.

Figure 6.

Comparison of the proposed method with the uncertainty of the diagnostic results of existing ensemble strategies.

3.1.3. Anti-noise performance test

To evaluate the noise robustness of the proposed method, different levels of Gaussian white noise were added to the raw signals of motors with various fault types. The magnitude of background noise is typically measured using the signal-to-noise ratio (SNR), which is defined as the ratio of the useful signal power to the noise power in the signal and is expressed in decibels (dB).
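Adding white Gaussian noise at a prescribed SNR can be sketched as follows, using the usual decibel form $\mathrm{SNR} = 10 \log_{10} (P_{signal} / P_{noise})$. The function names and the synthetic sine signal are assumptions for illustration; the original text does not specify how the noise was generated.

```python
import math
import random

def signal_power(values):
    """Mean squared amplitude of a sequence."""
    return sum(v * v for v in values) / len(values)

def add_white_noise(signal, snr_db, rng):
    """Return signal plus zero-mean Gaussian noise scaled to the target SNR in dB."""
    noise_power = signal_power(signal) / (10.0 ** (snr_db / 10.0))
    noise_std = math.sqrt(noise_power)
    return [v + rng.gauss(0.0, noise_std) for v in signal]

def measured_snr_db(clean, noisy):
    """SNR in dB computed from the clean signal and the added noise."""
    noise = [n - c for c, n in zip(clean, noisy)]
    return 10.0 * math.log10(signal_power(clean) / signal_power(noise))

rng = random.Random(0)
# Synthetic 50 Hz sine sampled at 25,600 Hz, standing in for a vibration signal.
clean = [math.sin(2.0 * math.pi * 50.0 * t / 25600.0) for t in range(20000)]
noisy = add_white_noise(clean, snr_db=-6.0, rng=rng)
achieved = measured_snr_db(clean, noisy)
print(f"achieved SNR: {achieved:.2f} dB")
```

At −6 dB the noise power is roughly four times the signal power, so the fault-related components are largely buried in noise.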

The results are shown in Figure 7. As the noise in the signal increases (SNR gradually decreases), the complexity of the signal gradually increases, and the original signal features are submerged, making it more difficult for the diagnostic model to extract fault features. However, the proposed method consistently achieves stable diagnostic results under different noise conditions, and the diagnostic accuracy of the DEBCNN remains close to 97% even at SNR = −6 dB.

Figure 7.

Diagnostic accuracy of the DEBCNN model on the motor dataset under different noise conditions.

To further compare the adaptability of models with different numbers of scales under noisy conditions, single-scale, two-scale, three-scale, and four-scale models were built by setting up networks with convolutional kernel sizes of 3, 5, 7, and 9, respectively, and tested at SNR = −2 dB. The t-distributed stochastic neighbor embedding (T-SNE) results for the test sets of the different models are shown in Figure 8. Compared with the multi-scale models, the single-scale model struggles to distinguish each fault type effectively. As the number of scales increases, the separation among fault types improves, as shown in Figure 8(c), where the fault types are well differentiated and the overlap between fault categories is reduced. Even in noisy environments, the three-scale ensemble model achieved diagnostic results similar to those under the noise-free condition. Increasing the number of scales further did not improve the ability of the model to discriminate between different fault types; instead, it increased the time required for the diagnostic process. Therefore, a three-scale dynamic ensemble model was used in this study.

Figure 8.

The t-distributed stochastic neighbor embedding (T-SNE) visualization results of different scale models: (a) single-scale; (b) two-scale; (c) three-scale; (d) four-scale.

3.2. Case 2: Bearing dataset

3.2.1. Dataset description

The University of Ottawa time-varying speed bearing fault dataset was gathered on the experiment bench shown in Figure 9 (Huang and Baddour, 2018). The motor powers the shaft, while an AC drive controls and adjusts its speed. The healthy and experimental bearings, both of type ER16K, support the left and right sides of the shaft. The experimental bearing was exchanged to simulate multiple health states, including the normal state (NOR), ball fault (BF), outer ring fault (OF), inner ring fault (IF), and mixed fault (MF) containing the above three fault types. The vibration signals of the experimental bearing in each health state were recorded using an accelerometer (Type 623C01) at 200 kHz. The shaft speed was measured using incremental encoders. Each time-varying speed run lasted 10 s, and the accelerometer recorded the vibration signal continuously over this period.

Figure 9.

Bearing test platform of Ottawa University.

For each bearing health status, four time-varying rotational speed tests were simulated, and the entire experimental set is presented in Table 1. Acceleration, deceleration, acceleration after deceleration, and deceleration after acceleration are the four time-varying rotational speed tests. Each set of experiments was used to generate independent datasets. Time-domain images of a healthy bearing at four time-varying rotational speeds are shown in Figure 10.

Table 1.

Different operating mode settings for the Ottawa bearing fault dataset with five health conditions.

Health condition	Category	Data A speed range (Hz)	Data B speed range (Hz)	Data C speed range (Hz)	Data D speed range (Hz)
NOR	0	15.02-26.99	30.22-13.78	15.65-25.04-19.53	26.15-16.57-26.15
BF	1	13.02-27.90	27.90-14.74	14.48-22.18-13.95	26.16-16.57-23.05
OF	2	13.48-26.64	24.41-10.27	13.95-23.54-17.01	24.41-14.83-19.53
IF	3	13.48-28.73	26.15-11.96	15.02-22.18-13.48	23.05-15.65-23.05
CF	4	13.09-27.90	26.15-12.20	15.02-22.78-15.02	23.05-15.52-23.43

Figure 10.

Time-domain diagrams of normal-state bearings under four variable-speed conditions: (a) acceleration, (b) deceleration, (c) acceleration and then deceleration, and (d) deceleration and then acceleration.

The raw vibration signals were segmented into samples of length 4,096, giving 488 samples per health state in each dataset. Four tasks were performed during the experiment. In each task, one dataset was chosen for model training and the remaining three datasets were used for model testing, so that the training and testing data came from different working conditions. Table 2 lists the four tasks.

Table 2.

Detailed datasets for four tasks.

Diagnosis tasks	Train data	Test data
Task 1	Data A	Data B + Data C + Data D
Task 2	Data B	Data A + Data C + Data D
Task 3	Data C	Data A + Data B + Data D
Task 4	Data D	Data A + Data B + Data C
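The train/test assignment in Table 2 follows a simple pattern: each dataset serves once as the training set while the other three form the test set. A small sketch that generates the same assignment is given below; the function name `build_tasks` and the dictionary layout are hypothetical.

```python
def build_tasks(dataset_names):
    """Train on one dataset and test on all others, once per dataset."""
    tasks = {}
    for index, train_name in enumerate(dataset_names, start=1):
        tasks[f"Task {index}"] = {
            "train": train_name,
            "test": [name for name in dataset_names if name != train_name],
        }
    return tasks

tasks = build_tasks(["Data A", "Data B", "Data C", "Data D"])
for task_name, split in tasks.items():
    print(task_name, split["train"], "->", " + ".join(split["test"]))
```

Because every test set comes from speed profiles that were not seen during training, each task measures how well the model transfers across working conditions.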

3.2.2. Results and discussion

The diagnostic accuracies of the proposed method and eight comparison methods for the four tasks are shown in Figure 11. Each method executed 10 diagnostic tests for different tasks to obtain the final average accuracy and variance.

Figure 11.

Diagnostic results of the proposed and compared methods in four tasks: (a) accuracy and (b) variance.

Compared with the motor dataset, the bearing dataset has only five health states but adds different time-varying speed working conditions. On the one hand, the difference between the acceleration and deceleration processes increases the variance within the data used for model training, which makes it more difficult to extract fault features. On the other hand, because the test data come from other working processes with different data distributions, the mismatch between the features learned during training and those encountered during testing further reduces the effectiveness of fault diagnosis. Because it can select relevant features from features at different resolutions and suppress interfering features, the proposed method provides higher diagnostic accuracy than other state-of-the-art fault diagnosis methods under time-varying speeds and multiple working processes. The superiority of the proposed strategy is further demonstrated by the volatility of the results across multiple experiments: the lower volatility indicates that the proposed method effectively reduces the impact of different acceleration and deceleration processes on the diagnostic results.

Regarding the uncertainty of the diagnostic results, the uncertainty of the proposed ensemble strategy was concentrated close to zero for all four diagnostic tasks, as shown in Figure 12. For task b, the accuracies of the averaging and voting ensembles were 98.41% and 97.97%, respectively. However, the uncertainty distributions of these two strategies are nearly identical, which suggests that although the averaging ensemble has a higher overall accuracy, in an individual diagnosis it behaves like the voting ensemble: it wavers between the correct and incorrect fault types and has insufficient confidence in its result. The diagnostic accuracy reflects the proportion of correctly diagnosed samples among all samples, whereas the uncertainty reflects the confidence of the model in a particular diagnostic result. The high accuracy and low uncertainty of the proposed ensemble strategy indicate that the model is stable, reliable, capable of making safe and accurate predictions in most cases, and effective in reducing the impact of time-varying speed on fault diagnostic results.

Figure 12.

Uncertainty distribution of the diagnostic results of the proposed ensemble and comparison strategies in the four diagnostic tasks.

3.2.3. Anti-noise performance test

Simulated interference noise with an SNR of −2 dB was added to the original signals of the four tasks, and the noise robustness of the DEBCNN model was then tested on each task. The DEBCNN achieved diagnostic accuracies of 98.72%, 98.74%, 98.28%, and 98.22% in the four noise-containing diagnostic tasks, as shown in Figure 13. These accuracies are close to those of the tasks with no noise added, indicating that the DEBCNN model has good noise resistance. The presence of noise further widens the differences in how effective the individual scales are for diagnosing fault types. When multi-scale features are fused, it is therefore necessary to reduce the influence of interfering features on the final diagnostic results and to increase the weight of useful features. The dynamic ensemble strategy does this, reducing the influence of the noise component while still combining multi-scale features.

Figure 13.

Diagnostic accuracy of the DEBCNN model under different noise conditions.
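As a rough illustration of how a signal can be corrupted at a prescribed SNR, the sketch below scales white Gaussian noise so that the signal-to-noise power ratio equals the target value in decibels. The noise type, the sine test signal, and the function name `add_noise_at_snr` are assumptions made for illustration; the noise model used in the experiments may differ.

```python
import numpy as np

def add_noise_at_snr(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise scaled to the requested SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Hypothetical test signal: a 50 Hz sine sampled at 10 kHz for about 1 s.
t = np.arange(0, 1.0, 1e-4)
clean = np.sin(2 * np.pi * 50 * t)
noisy = add_noise_at_snr(clean, snr_db=-2.0, rng=np.random.default_rng(0))

# Empirical SNR of the generated sample, for a sanity check.
measured_snr_db = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

At −2 dB the noise power exceeds the signal power by a factor of about 1.58, so the fault signature is weaker than the interference it is buried in.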

Figure 14.

t-SNE visualization results of different scale models in four tasks: (a) Task 1, (b) Task 2, (c) Task 3, and (d) Task 4.

From the t-SNE results in Figure 14, it can be observed that the multi-scale models had better noise resistance than the single-scale model in the four diagnostic tasks of the bearing dataset. For Task 1, the diagnostic accuracies of the single-scale, two-scale, three-scale, and four-scale models were 93.83%, 95.67%, 98.72%, and 98.69%, respectively. The single-scale and two-scale models had difficulty distinguishing the composite fault from the inner ring fault, whereas the three-scale model substantially alleviated this problem, reducing the misdiagnosis rate between fault types and improving diagnostic accuracy. The inner ring fault and the composite fault are coupled to some extent, so features extracted at a single scale are insufficient to separate them effectively. Combining features at different scales improves the separability of similar fault types.
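The misdiagnosis rate between two similar fault types can be quantified from a confusion matrix. The following sketch shows one way to do so; the class indices, labels, and predictions are hypothetical and are not taken from the experiments.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def pairwise_confusion_rate(cm, a, b):
    """Share of samples from classes a and b that were assigned to the other class."""
    confused = cm[a, b] + cm[b, a]
    total = cm[a].sum() + cm[b].sum()
    return confused / total

# Hypothetical class indices: 0 = healthy, 1 = inner ring fault, 2 = composite fault.
y_true = np.array([0, 0, 1, 1, 1, 1, 2, 2, 2, 2])
y_pred = np.array([0, 0, 1, 1, 2, 1, 2, 1, 2, 2])

cm = confusion_matrix(y_true, y_pred, n_classes=3)
rate = pairwise_confusion_rate(cm, 1, 2)
```

Comparing this rate across the single-scale and multi-scale models would show directly whether adding scales separates the two coupled fault types.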

4. Conclusion

This study proposes an uncertainty-driven dynamic ensemble strategy for the fault diagnosis of rotating machinery under time-varying speed conditions. Based on the principle of the BCNN, the uncertainty of the fault diagnosis results under time-varying speed conditions was measured. Multi-scale features improved the robustness of the prediction results, and the uncertainties of the different scale models were used to determine their weights in the ensemble framework. This method can reduce the influence of speed-sensitive features and retain fault features, thereby ensuring the reliability of the fault diagnosis results under time-varying speed conditions. The analysis verified the effectiveness of the proposed strategy: compared with traditional ensemble strategies, the proposed method has advantages in both diagnostic accuracy and the corresponding uncertainty.
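The idea of uncertainty-driven weighting can be sketched for a single sample as follows. The inverse-entropy weighting used here is an assumption chosen for illustration and is not necessarily the weighting rule of the DEBCNN; the per-scale probabilities and the function name `dynamic_ensemble` are likewise hypothetical.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Entropy of each row of class probabilities."""
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + eps), axis=-1)

def dynamic_ensemble(probs_per_scale, eps=1e-12):
    """Fuse per-scale class probabilities for one sample.

    probs_per_scale has shape (n_scales, n_classes).
    ASSUMPTION: each scale is weighted by 1 / uncertainty, then normalized.
    """
    probs = np.asarray(probs_per_scale, dtype=float)
    weights = 1.0 / (entropy(probs) + eps)
    weights /= weights.sum()
    fused = weights @ probs  # weighted sum of the per-scale probability vectors
    return fused, weights

# Hypothetical outputs of three scale models for one sample (three fault types).
probs = np.array([[0.97, 0.02, 0.01],   # confident in class 0
                  [0.30, 0.40, 0.30],   # uncertain, leans to class 1
                  [0.35, 0.40, 0.25]])  # uncertain, leans to class 1

fused, weights = dynamic_ensemble(probs)
dynamic_pred = int(np.argmax(fused))

# Static baselines for comparison: plain averaging and majority voting.
average_pred = int(np.argmax(probs.mean(axis=0)))
votes = np.bincount(np.argmax(probs, axis=1), minlength=probs.shape[1])
voting_pred = int(np.argmax(votes))
```

In this toy case majority voting follows the two uncertain models, whereas the uncertainty-weighted fusion gives most of the weight to the confident model and follows its prediction.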

Future work will explore the performance of this method under multi-component coupled faults and small-sample conditions.

Footnotes

Acknowledgments

The authors thank the editors and anonymous reviewers for their helpful comments.

Declaration of conflicting interests

The author(s) declare no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclose the receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (51879056) and the Shandong Provincial Natural Science Foundation (ZR2023QE009). The author also thanks the mentor for suggesting changes to the content of the paper.

ORCID iDs

Renjie Zhu

Chong Yao

Yun Ke
