Application of adversarial reciprocal point learning in open set fault diagnosis of rolling bearing

Abstract

In industrial scenarios, bearing operating conditions are complex and unknown faults may occur unexpectedly, which usually reflect some new modes. This can result in the recognition failure of traditional intelligent algorithms based on the closed set assumption. To address this issue of open set fault diagnosis (OSFD), an OSFD approach of rolling bearing is proposed based on adversarial reciprocal point learning (ARPL) and efficient multi-scale attention (EMA). First, ARPL is introduced to diagnose the bearing faults under open set scenarios, which considers the deep distribution of unknown classes in learners by using an adversarial mechanism, achieving better open set recognition ability. Then, the EMA is employed to improve the open set classification performance of the ARPL model by interacting with information without channel dimensionality reduction. Finally, the effectiveness of the proposed method is evaluated on the bearing datasets. The experimental results show that the proposed ARPL-EMA model can effectively identify the unknown faults and its OSFD performance is superior to the comparative methods.

Keywords

Open set fault diagnosis, adversarial reciprocal point learning, attention mechanism, rolling bearing

1. Introduction

Rotating machinery is widely used in industry, and its fault diagnosis has received extensive attention for many years because a reliable fault diagnosis system can considerably reduce the losses caused by unplanned downtime (Chen et al., 2023b; Zhao et al., 2020). The rolling bearing is one of the most widely used and most easily damaged mechanical components in rotating machines. Its health condition directly influences the safe and reliable operation of the whole machine (Jiao et al., 2024). Therefore, the fault diagnosis of rolling bearings is of great importance for improving the safety and economy of the entire mechanical system.

With the widespread application of artificial intelligence, many intelligent fault diagnosis methods for rolling bearings have been developed over the past decades (Han et al., 2018; He et al., 2023). Among these, the deep learning (DL) approaches, which have powerful feature learning capability and nonlinear transformation ability, have achieved great success in bearing fault classification and recognition (Men et al., 2024; Zhang et al., 2023b). However, most of the previously proposed intelligent fault diagnosis methods rely on the closed set assumption that the training samples and testing samples share the same labels (Geng et al., 2020). In realistic industrial scenarios, owing to the harsh environment and complex operating conditions of machinery, new bearing fault types that never appeared in training may occur unexpectedly, and the diagnosis performance of the traditional DL-based methods degrades greatly when handling this open set fault diagnosis (OSFD) problem. Therefore, an effective OSFD method for rolling bearings is urgently required to accurately classify the known fault types and recognize the unknown fault types simultaneously.

To address the challenges of OSFD, diverse methods have been put forward, and they can be mainly divided into two categories: discriminative and generative methods (Chen et al., 2024; Geng et al., 2020). The main goal of discriminative methods is to obtain explicit boundary information between the known and unknown classes. Zhang et al. (2022) put forward an OSFD approach for rolling bearings based on an improved OpenMax and developed an open set convolutional neural network to accurately recognize new bearing fault classes. Yu et al. (2021) coupled extreme value theory (EVT) with a deep model to realize known fault classification as well as unknown fault detection, and further achieved cross-domain OSFD by designing different weight distributions. In Zhang et al. (2023a), the distances of intra-class and inter-class samples were computed by means of multi-sample distance fusion, and the generalized Pareto distributions of these two distance distributions were estimated by EVT to identify the unknown faults. Based on the subspace learning method, Tian et al. (2018) achieved open set fault recognition by adaptively setting the threshold according to the test data. Mei et al. (2024) realized the rejection of unknown faults by means of a variational encoder classifier and EVT. Zhang et al. (2024) constructed a discrimination framework to strengthen the model’s ability to recognize unknown samples, and used neighborhood clustering learning to accomplish cross-domain fault diagnosis. Chen et al. (2023a) developed a pair of discriminators based on EVT and Shannon entropy to decide whether a sample is unknown or not. The posterior inference method was introduced by Wu et al. (2024) to obtain the open set recognition weight for successful identification of unknown classes. To deal with the issue of open set domain adaptation, Wang et al. (2024) adopted a feature clustering and separation strategy to distinguish known and unknown categories.

By contrast, generative methods obtain the ability to detect unknown faults by generating negative samples to simulate the open set. In the study by Liu et al. (2023), the traditional generative adversarial network (GAN) was improved by a regularization module and used for the generation of open data, transforming the open set issue into a pseudo-closed set problem. In Peng et al. (2022), soft Brownian offset sampling and a shrinkage autoencoder were employed to generate negative samples, and the generated samples were utilized in the model training phase to achieve better unknown fault identification. Sun et al. (2023) utilized a prototype network and a reconstruction network to obtain reconstructed signals, and calculated the correlation coefficients between the reconstructed signals and the original signals to recognize the unknown faults.

Although various discriminative and generative OSFD approaches have been developed for the open set fault recognition of rolling bearings, each has its own limitations, and most of them do not consider the deep distribution of unknown classes in learners, which leads to potential open space risk (Chen et al., 2022). To handle the challenges of open set recognition (OSR), Chen et al. (2022) proposed a new OSR method called adversarial reciprocal point learning (ARPL) to model the potential unknown space and estimate the unknown distribution from the open space. Different from the existing OSR approaches, ARPL introduces the novel concept of the reciprocal point to formulate the open space risk; the reciprocal point is the opposite of the prototype of a known category and helps to reduce the overlap between known and unknown classes. Additionally, an adversarial mechanism between the reciprocal points and known classes is developed to generate confusing samples, which improves the model’s ability to discriminate the unknown distribution. In this way, ARPL can obtain excellent open set classification performance compared with traditional OSR methods. For this reason, ARPL is introduced in this study to identify the known and unknown faults of rolling bearings under open set scenarios.

On the other hand, whether the feature extractor is able to extract informative features has a strong influence on the model’s diagnosis performance (Ren et al., 2023). The rolling bearing usually operates under complex conditions, and the strong noise in the bearing vibration signal makes feature extraction more difficult for the model. For this reason, it is necessary to further enhance the OSFD performance of ARPL from the point of view of feature extraction. Because they can extract features containing key information, different kinds of attention mechanisms have been adopted to improve the feature extraction capability of models in the field of fault diagnosis (Tang et al., 2023; Xie et al., 2023). Efficient multi-scale attention (EMA) (Ouyang et al., 2023) is a newly developed attention module which can effectively enhance a model’s feature representation ability. In comparison with the widely used coordinate attention (CA) (Tong et al., 2023) and convolutional block attention module (CBAM) (Li et al., 2022), EMA has better feature extraction ability and is more efficient in terms of computational overhead by using a universal convolution to avoid channel dimensionality reduction. In order to extract more key state information under the complex and variable running conditions of rolling bearings, the EMA module is incorporated into the ARPL model in this paper to further strengthen the bearing OSFD performance.

To sum up, a new open set fault diagnosis method for rolling bearings is proposed in this paper based on the ARPL model with EMA improvement. The contributions of this study are as follows. First, the ARPL method is introduced for the open set fault identification of rolling bearings. Unlike the traditional OSR methods, ARPL models the deep distribution of unknown classes in learners and minimizes the overlap between the known and unknown distributions using the reciprocal point and the adversarial mechanism, which allows it to achieve better bearing fault diagnosis performance under open set situations. Second, a new feature extractor module is constructed by incorporating EMA into the ARPL model to improve the feature extraction capability on key state information, further enhancing the open set recognition performance of ARPL. Third, extensive experiments on our own dataset under different OSFD tasks validate the enhanced performance and superiority of the proposed ARPL-EMA approach. The rest of this article is organized as follows. Section 2 introduces the theoretical background of the proposed methodology. Section 3 describes the proposed fault diagnosis method. In Section 4, the effectiveness of the proposed method is verified on our own experimental dataset. The conclusion is given in Section 5.

2. Theoretical backgrounds

2.1. Adversarial reciprocal point learning

2.1.1. Reciprocal point

Consider a set of $n$ labeled samples $D_{L} = \{(x_{1}, y_{1}), \dots, (x_{n}, y_{n})\}$ with $N$ known classes, in which $y_{i} \in \{1, \dots, N\}$ is the label of $x_{i}$, and a test set $D_{T} = \{t_{1}, \dots, t_{u}\}$, in which the label of $t_{i}$ belongs to $\{1, \dots, N\} \cup \{N + 1, \dots, N + U\}$ and $U$ is the number of unknown classes. The deep embedded space of class $k$ is denoted by $S_{k}$, and the corresponding open space is denoted by $O_{k}$. $O_{k}$ is divided into two subspaces: the positive open space $O_{k}^{pos}$ occupied by the other known classes, and the remaining infinite unknown space, which is the negative open space $O_{k}^{neg}$, that is, $O_{k} = O_{k}^{pos} \cup O_{k}^{neg}$.

The samples of class $k$, $D_{L}^{k} \in S_{k}$, the samples of the other known classes, $D_{L}^{\neq k} \in O_{k}^{pos}$, and the samples $D_{U} \in O_{k}^{neg}$ from $R^{d}$ outside $D_{L}$ are defined as the positive training data, the negative training data, and the potential unknown data, respectively.

The reciprocal point $P^{k}$ of class $k$ is taken as a latent representation of the samples in $D_{L}^{\neq k} \cup D_{U}$. Therefore, the samples of $O_{k}$ should be closer to the reciprocal point $P^{k}$ than the samples of $S_{k}$, which is formulated as

\max (ζ (D_{L}^{\neq k} \cup D_{U}, P^{k})) \leq d, \forall d \in ζ (D_{L}^{k}, P^{k})

(1)

where $ζ (\cdot, \cdot)$ represents the set of distances between all samples of the two sets.

The reciprocal point of each class is represented by an $m$-dimensional vector, which is learned jointly with a deep embedding function $C$ with parameters $θ$. The distance $d (C (x), P^{k})$ between a sample $x$ and the reciprocal point $P^{k}$ is calculated as

\begin{array}{l} d_{e} (C (x), P^{k}) = \frac{1}{m} \cdot {‖ C (x) - P^{k} ‖}_{2}^{2} \\ d_{d} (C (x), P^{k}) = C (x) \cdot P^{k} \\ d (C (x), P^{k}) = d_{e} (C (x), P^{k}) - d_{d} (C (x), P^{k}) \end{array}

(2)

In both spatial position and angular direction, every known class is opposite to its reciprocal point.

According to the nature of reciprocal points, the probability that the sample $x$ belongs to class $k$ is proportional to the distance between $C (x)$ and the reciprocal point $P^{k}$, and the final probability can be denoted as

P (y = k ∣ x, C, P) = \frac{e^{γ d (C (x), P^{k})}}{\sum_{i = 1}^{N} e^{γ d (C (x), P^{i})}}

(3)

where $γ$ is a hyperparameter, which controls the conversion between distance and probability. The parameter $θ$ is learned by minimizing the negative log-probability of the true class $k$:

L_{c} (x; θ, P) = - \log P (y = k ∣ x, C, P)

(4)

However, minimizing equation (4) alone does not constrain the open space, so $S_{k}$ and $O_{k}$ can still overlap, that is, an open space risk remains.
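The following is a minimal pure-Python sketch of equations (2) to (4), assuming the embedding $C (x)$ has already been computed and is passed in as a plain list. The function names, the two-dimensional toy feature, and the default `gamma=1.0` are illustrative assumptions, not the settings used in this study.

```python
import math

def reciprocal_distance(feature, point):
    """d = d_e - d_d from equation (2)."""
    m = len(feature)
    d_e = sum((f - p) ** 2 for f, p in zip(feature, point)) / m  # scaled squared Euclidean distance
    d_d = sum(f * p for f, p in zip(feature, point))             # dot product
    return d_e - d_d

def class_probabilities(feature, points, gamma=1.0):
    """Softmax over gamma * d(C(x), P^i), equation (3)."""
    logits = [gamma * reciprocal_distance(feature, p) for p in points]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classification_loss(feature, points, label, gamma=1.0):
    """Negative log-probability of the true class, equation (4)."""
    return -math.log(class_probabilities(feature, points, gamma)[label])

# Toy example (hypothetical values): the feature lies on top of P^1 and far from P^0,
# so it is assigned to class 0, the class whose reciprocal point it is farthest from.
feature = [1.0, 0.0]
points = [[-1.0, 0.0], [1.0, 0.0]]
probs = class_probabilities(feature, points)
```

In this toy case the sample is assigned to the class whose reciprocal point is most distant, which is the opposite of prototype-based classification.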

2.1.2. Open set loss

In multi-class OSR scenarios, the open spaces of multiple classes are united into one global open space $O_{G}$

O_{G} = \cap_{k = 1}^{N} (O_{k}^{pos} \cup O_{k}^{n e g})

(5)

where total open space risk can be limited by restricting open space risk in each known category.

To separate $S_{k}$ and $O_{k}$ , the formula is built as follows:

\max (ζ (D_{L}^{\neq k} \cup D_{U}, P^{k})) \leq R

(6)

Then, the open space risk can be bounded by constraining the aforementioned distance as

L_{o} (x; θ, P^{k}, R) = \max (d_{e} (C (x), P^{k}) - R, 0)

(7)

where $R$ is a learnable margin.

By combining equations (4) and (7) to address simultaneously the empirical classification risk and open space risk, the total loss function can be designated as

L (x, y; θ, P, R) = L_{c} (x; θ, P) + λ L_{o} (x; θ, P, R)

(8)

where $λ$ stands for the weight of the adversarial margin loss.
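As an illustration of equations (7) and (8), the sketch below evaluates the margin loss and the total loss for hand-picked numbers. The feature values, the margin `2.0`, and the classification loss `0.2` are hypothetical; in training, $R$ is learned rather than fixed. The default `lam=0.1` mirrors the value of $λ$ selected later in this paper.

```python
def euclidean_term(feature, point):
    """d_e from equation (2): squared Euclidean distance divided by the dimension m."""
    m = len(feature)
    return sum((f - p) ** 2 for f, p in zip(feature, point)) / m

def open_space_loss(feature, point, margin):
    """Equation (7): penalize only the part of d_e that exceeds the margin R."""
    return max(euclidean_term(feature, point) - margin, 0.0)

def total_loss(classification_loss, feature, point, margin, lam=0.1):
    """Equation (8): classification loss plus the weighted open space loss."""
    return classification_loss + lam * open_space_loss(feature, point, margin)

# A sample inside the margin contributes nothing; one outside it is penalized linearly.
inside = open_space_loss([1.0, 0.0], [0.0, 0.0], margin=2.0)   # d_e = 0.5
outside = open_space_loss([3.0, 0.0], [0.0, 0.0], margin=2.0)  # d_e = 4.5
combined = total_loss(0.2, [3.0, 0.0], [0.0, 0.0], margin=2.0)
```

The hinge form means that known samples are only pulled back toward the reciprocal point once they leave the ball of radius $R$, which is how the loss bounds the embedded space of each known class.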

2.2. Instantiated adversarial enhancement

To further reduce the open space risk generated by the confusing unknown data, the ARPL uses a new training strategy to learn a confused generator to improve the discriminant ability of the classifier for various new distributions.

2.2.1. Confused generator learning

Different from the traditional GAN, the generator is employed to recover the confusing samples from $O_{G}$ rather than known samples from $S_{k}$ . Figure 1 shows the instantiated adversarial enhancement framework which mainly includes three parts: the classifier C, the discriminator D, and the generator G.

Figure 1.

The framework of the confused generator training.

Given the generator outputs $G (z_{i})$ for noise vectors $\{z_{1}, z_{2}, \dots, z_{n}\}$ drawn from a prior distribution $P_{pri} (z)$ and the known samples $\{x_{1}, x_{2}, \dots, x_{n}\}$, the real and generated samples can be discriminated by the optimized discriminator

\max_{D} \frac{1}{n} \sum_{i = 1}^{n} [\log D (x_{i}) + \log (1 - D (G (z_{i})))]

(9)

To deceive the discriminator, the generated samples are expected to be closer to the known classes

\max_{G} \frac{1}{n} \sum_{i = 1}^{n} [\log (D (G (z_{i})))]

(10)
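The two objectives in equations (9) and (10) can be evaluated numerically as in the sketch below. It assumes the discriminator outputs $D (\cdot)$ are already available as probabilities in the open interval (0, 1); the function names and the example values are hypothetical.

```python
import math

def discriminator_objective(d_real, d_fake):
    """Equation (9): mean of log D(x_i) + log(1 - D(G(z_i))), maximized over D."""
    n = len(d_real)
    return sum(math.log(r) + math.log(1.0 - f) for r, f in zip(d_real, d_fake)) / n

def generator_objective(d_fake):
    """Equation (10): mean of log D(G(z_i)), maximized over G."""
    return sum(math.log(f) for f in d_fake) / len(d_fake)

# A discriminator that separates real from generated samples scores higher on (9) ...
sharp = discriminator_objective([0.9, 0.8], [0.1, 0.2])
confused = discriminator_objective([0.5, 0.5], [0.5, 0.5])
# ... while the generator scores higher on (10) when its samples look more real.
weak_generator = generator_objective([0.1, 0.2])
strong_generator = generator_objective([0.5, 0.5])
```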

The adversarial mechanisms between known classes and reciprocal points are introduced to confuse generators by creating samples close to each center $P^{k}$ of the open space. Then, the generator is optimized by the classifier

\max_{G} \frac{1}{n} \sum_{i = 1}^{n} [- \frac{1}{N} \sum_{k = 1}^{N} S (z_{i}, P^{k}) \cdot \log (S (z_{i}, P^{k}))]

(11)

where $S (z_{i}, P^{k}) = \mathrm{softmax} (d_{e} (C (G (z_{i})), P^{k}))$.

According to Shannon entropy (Ren et al., 2023), equation (11) is maximized when all the values $S (z_{i}, P^{k})$ are equal. Finally, the generator is optimized by

\max_{G} \frac{1}{n} \sum_{i = 1}^{n} [\log (D (G (z_{i}))) + β \cdot H (z_{i}, P)]

(12)

where $β$ controls the weight of the information entropy loss and $H (z_{i}, P)$ is the information entropy function, that is, the bracketed term of equation (11).
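The entropy term can be sketched as follows, reading $H (z_{i}, P)$ as the bracketed term of equation (11). The distances $d_{e} (C (G (z_{i})), P^{k})$ are assumed to be precomputed and passed in as a list, and `beta=0.1` is a placeholder value, not a setting reported in this paper.

```python
import math

def softmax(values):
    top = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def reciprocal_entropy(distances):
    """Bracketed term of equation (11): -(1/N) * sum_k S_k * log(S_k)."""
    probs = softmax(distances)
    n = len(probs)
    return -sum(p * math.log(p) for p in probs if p > 0.0) / n

def generator_score(d_fake, distances, beta=0.1):
    """Per-sample term of equation (12): log D(G(z_i)) + beta * H(z_i, P)."""
    return math.log(d_fake) + beta * reciprocal_entropy(distances)

# Equal distances to every reciprocal point give the largest entropy,
# which is the "confusing sample" the generator is trained to produce.
uniform = reciprocal_entropy([1.0, 1.0, 1.0])
peaked = reciprocal_entropy([0.0, 10.0, 0.0])
```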

2.2.2. Classifier learning

Considering the generated unknown samples and the ultimate target of training a better feature space, the classifier C is optimized as

\min_{C} \frac{1}{n} \sum_{i = 1}^{n} [L (x_{i}, y_{i}) - β \cdot H (z_{i}, P)]

(13)

where $L$ represents the total loss of ARPL in equation (8).

Equation (13) shows that the known and generated samples are fed to the classifier together, which results in inaccurate normalization statistics because the known and confusing samples follow different distributions. To disentangle this mixed distribution, Auxiliary Batch Normalization (ABN) is used to guarantee that separate normalization statistics are computed for the confusing samples only. Specifically, Batch Normalization (BN) (Ioffe and Szegedy, 2015) normalizes the input features with the mean and variance calculated within each minibatch, so these features should come from a single or similar distribution. As shown in Figure 1, ABN helps to disentangle the mixed distributions by keeping separate BNs for features belonging to different domains, effectively blocking the negative influence of the confusing samples on the distinction of known classes. Finally, the discriminator and the classifier can be improved simultaneously with the confused generator.
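The effect that ABN is meant to avoid can be reproduced with a small numerical sketch. It is a simplified illustration on scalar features, assuming normalization without the learnable scale and shift of a full BN layer; the function names and sample values are hypothetical.

```python
def batch_norm(batch, eps=1e-5):
    """Normalize scalar features with the mean and variance of this batch only."""
    mean = sum(batch) / len(batch)
    var = sum((v - mean) ** 2 for v in batch) / len(batch)
    return [(v - mean) / (var + eps) ** 0.5 for v in batch]

def auxiliary_batch_norm(known, generated):
    """Keep separate statistics for known and generated (confusing) samples."""
    return batch_norm(known), batch_norm(generated)

known = [1.0, 2.0, 3.0]
generated = [101.0, 102.0, 103.0]  # deliberately drawn from a very different range

# Shared statistics: the known features collapse to almost the same value near -1.
mixed_known = batch_norm(known + generated)[:len(known)]
# Separate statistics: the known features stay zero-mean and well spread.
separate_known, separate_generated = auxiliary_batch_norm(known, generated)
```

With shared statistics the differences among the known samples are almost erased, whereas separate statistics preserve them.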

2.3. Efficient multi-scale attention

Considering the complex and varying working conditions of rolling bearing and the difficulty in extracting discriminative features, the EMA attention mechanism is introduced to improve performance of ARPL model by enhancing the feature extraction ability on key state information. EMA is a cross-space learning method, and it can interact with information without channel dimensionality reduction and lighten the computational burden (Ouyang et al., 2023). Its architecture is presented in Figure 2.

Figure 2.

Efficient multi-scale attention architecture.

For the input feature map X, EMA divides it into G groups along the channel dimension. Each sub-feature group learns to obtain attention weights to strengthen the feature representation of different regions in the bearing feature image. The grouping process is designated as

X = [X_{0}, X_{1}, \dots, X_{G - 1}], X_{i} \in R^{C / / G \times H \times W}

(14)

Subsequently, in the 1 × 1 branch, two 1D global average pooling operations are adopted to encode the channel along the two spatial directions, respectively, while in the 3 × 3 branch only a single 3 × 3 kernel is stacked to capture the multi-scale feature representation.

Following the output of the 1 × 1 and 3 × 3 branches, 2D global average pooling coding is utilized to adjust the channel weights, which can be described as

z_{c} = \frac{1}{H \times W} \sum_{j}^{H} \sum_{i}^{W} x_{c} (i, j)

(15)

After the above implementation, the matrix dot product operations are employed to fuse the information from the two branches to obtain the final output feature map of EMA.
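Two of the building blocks above, the channel grouping of equation (14) and the 2D global average pooling of equation (15), can be sketched in pure Python as follows. This covers only these two steps, not the full EMA module; the nested-list layout `[channel][row][column]` and the requirement that the channel count be divisible by the number of groups are assumptions of this sketch.

```python
def split_into_groups(feature_map, groups):
    """Equation (14): split C channels into G groups of C // G channels each."""
    channels = len(feature_map)
    if channels % groups != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    size = channels // groups
    return [feature_map[g * size:(g + 1) * size] for g in range(groups)]

def global_average_pool(channel):
    """Equation (15): average one H x W channel down to a single value z_c."""
    height = len(channel)
    width = len(channel[0])
    return sum(sum(row) for row in channel) / (height * width)

# Four 2 x 2 channels, where channel c is filled with the constant value c.
feature_map = [[[float(c)] * 2 for _ in range(2)] for c in range(4)]
grouped = split_into_groups(feature_map, groups=2)
pooled = [[global_average_pool(ch) for ch in group] for group in grouped]
```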

3. The proposed bearing fault diagnosis method

In order to address the challenge faced by the traditional OSFD methods, a new OSFD approach for rolling bearings is developed based on the ARPL model with EMA improvement. The flow chart of the proposed method is given in Figure 3.

(1) The bearing vibration signals under different working conditions are acquired from the bearing experimental test rig with an accelerometer.

(2) The collected data is then transformed by converting one-dimensional vibration signals into two-dimensional grayscale images, which are separated into the training and testing sets.

(3) The training samples are learned by the EMA-improved ARPL model and the trained ARPL-EMA model is established.

(4) The known and unknown samples of different fault types in the testing sets are identified by the trained ARPL-EMA model to realize the open set fault classification of rolling bearing.

Figure 3.

Flow chart of the proposed method.

4. Experimental study

4.1. Dataset description and experiment settings

To validate the effectiveness of the proposed method, bearing experimental data were collected from the test bench in our laboratory. The test rig, illustrated in Figure 4, consists of a drive motor, a motor controller, bearing seats, and acceleration sensors. The test bearing is ER16k and the sampling frequency is 25.6 kHz. Three types of faults, inner race fault (IRF), outer race fault (ORF), and ball fault (BF), are introduced, and two rotation speeds of 1800 r/min and 3000 r/min are considered. With two fault sizes for each fault type plus the normal state, the dataset includes seven classes in total. The details of the experimental data are presented in Table 1, and the temporal waveforms of the bearing vibration signals are given in Figure 5.

Figure 4.

The bearing test rig.

Table 1.

The description of our dataset.

Bearing state	Fault size (mm)	Speed (r/min)	Label
Normal	0	3000	0
Inner race fault	0.3	1800	1
Inner race fault	0.4	3000	2
Outer race fault	0.5	1800	3
Outer race fault	0.6	3000	4
Ball fault	0.5	1800	5
Ball fault	0.8	3000	6

Figure 5.

Time-domain waveforms of the bearing vibrations under seven different conditions.

To better assess the performance of the proposed approach, two types of OSFD tasks are set up, as detailed in Table 2: Task A contains only one unknown fault type, while Task B includes two unknown fault classes. For each task type, different tasks with randomly selected unknown fault classes are taken into account. For the dataset, each working condition contains 100 samples, of which 50 samples are randomly chosen for training and the remaining 50 samples for testing. Considering the training efficiency in real applications, all the samples are segmented into 1024-point slices, which are converted into grayscale images of size 32 × 32 as the input of the ARPL-EMA model. To reduce the effect of randomness, the mean of ten trials is taken as the final result.
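The slicing and image conversion described above can be sketched as follows. The paper does not state how amplitudes are mapped to gray levels or whether slices overlap, so the min-max scaling to 0–255, the row-wise reshaping, and the non-overlapping slicing used here are assumptions, and the sine wave merely stands in for a measured vibration signal.

```python
import math

def slice_signal(signal, length=1024):
    """Cut a 1-D signal into consecutive non-overlapping slices (assumed scheme)."""
    return [signal[i:i + length] for i in range(0, len(signal) - length + 1, length)]

def to_gray_image(segment, side=32):
    """Map a 1024-point slice to a 32 x 32 grid of gray levels in 0..255."""
    if len(segment) != side * side:
        raise ValueError("segment length must equal side * side")
    lo, hi = min(segment), max(segment)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant segment
    pixels = [round(255 * (v - lo) / span) for v in segment]  # assumed min-max scaling
    return [pixels[r * side:(r + 1) * side] for r in range(side)]  # row-wise reshape

# Synthetic stand-in for a vibration record (not data from the test rig).
signal = [math.sin(0.05 * i) for i in range(2500)]
slices = slice_signal(signal)
image = to_gray_image(slices[0])
```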

Table 2.

Settings of the OSFD tasks.

Task	Training set	Testing set	Unknown
A1	0,1,3,4,5,6	0,1,2,3,4,5,6	2
A2	0,1,2,3,5,6	0,1,2,3,4,5,6	4
A3	0,1,2,3,4,5	0,1,2,3,4,5,6	6
B1	0,3,4,5,6	0,1,2,3,4,5,6	1,2
B2	0,1,2,5,6	0,1,2,3,4,5,6	3,4
B3	0,1,2,3,4	0,1,2,3,4,5,6	5,6
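The task settings in Table 2 can be encoded compactly, with the unknown classes of each task derived as the test labels that are absent from the training set. The dictionary layout, the function names, and the use of `-1` as the label for rejected samples are illustrative choices, not part of the original experiments.

```python
ALL_CLASSES = set(range(7))  # labels 0..6 from Table 1; every task tests on all of them

TRAINING_CLASSES = {
    "A1": {0, 1, 3, 4, 5, 6},
    "A2": {0, 1, 2, 3, 5, 6},
    "A3": {0, 1, 2, 3, 4, 5},
    "B1": {0, 3, 4, 5, 6},
    "B2": {0, 1, 2, 5, 6},
    "B3": {0, 1, 2, 3, 4},
}

def unknown_classes(task):
    """Classes that appear only in the testing set of the given task."""
    return sorted(ALL_CLASSES - TRAINING_CLASSES[task])

def open_set_label(task, true_label, unknown_label=-1):
    """Target label at test time: the class itself if known, else one shared 'unknown' label."""
    return true_label if true_label in TRAINING_CLASSES[task] else unknown_label
```

For example, `unknown_classes("B1")` yields the two inner race fault classes 1 and 2, matching the last column of Table 2.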

4.2. Comparative methods and evaluation metrics

To demonstrate the superiority of the proposed ARPL-EMA method, three other typical and extensively used OSR approaches are employed for comparison: sparse representation based open set recognition (SROSR) (Zhang et al., 2016), OpenMax (Bendale and Boult, 2016), and the extreme value machine (EVM) (Rudd et al., 2017). Additionally, to verify the feature extraction ability of EMA, the results of ARPL with and without EMA improvement are also compared.

Unlike the evaluation indexes of closed set recognition, the evaluation of OSR also needs to consider the identification performance on the unknowns. Based on the previous studies (Geng et al., 2020; Chen et al., 2023a), two commonly utilized OSR evaluation metrics are adopted in this paper: the open set accuracy $A_{o}$ and the Youden index J. The open set accuracy $A_{o}$ is a trivial extension of accuracy, which is used to assess the model’s ability to identify unknown classes, and it can be denoted as

A_{o} = \frac{\sum_{i = 1}^{C} (T P_{i} + T N_{i}) + T U}{\sum_{i = 1}^{C} (T P_{i} + T N_{i} + F P_{i} + F N_{i}) + (T U + F U)}

(16)

where $TP_{i}$, $FN_{i}$, $FP_{i}$, and $TN_{i}$ represent the true positives, false negatives, false positives, and true negatives of the $i$-th of the $C$ known classes, respectively. TU and FU represent the true and false rejections of the unknown faults.
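Equation (16) can be checked with a short worked example. The per-class counts and the values of TU and FU below are invented purely for illustration and are not taken from the experiments in this paper.

```python
def open_set_accuracy(per_class_counts, tu, fu):
    """Equation (16): per-class correct decisions plus true rejections, over all decisions."""
    correct = sum(c["tp"] + c["tn"] for c in per_class_counts) + tu
    total = sum(c["tp"] + c["tn"] + c["fp"] + c["fn"] for c in per_class_counts) + tu + fu
    return correct / total

# Hypothetical confusion counts for two known classes, plus 45 true and 5 false rejections.
counts = [
    {"tp": 48, "tn": 50, "fp": 1, "fn": 1},
    {"tp": 47, "tn": 49, "fp": 2, "fn": 2},
]
a_o = open_set_accuracy(counts, tu=45, fu=5)  # (98 + 96 + 45) / (200 + 50) = 0.956
```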

The Youden index is a metric representing the model’s ability to avoid failure. A larger J value indicates a better ability of the model to distinguish the known and unknown classes. Its definition is as follows:

J = R + S - 1

(17)

where $R = \frac{1}{C} \sum_{i = 1}^{C} \frac{T P_{i}}{T P_{i} + F N_{i}}$ is the recall and $S = \frac{T N}{T N + F P}$ is the specificity.

In addition, the accuracy $A_{c}$ and the F1 score are employed for the closed set performance evaluation, which are, respectively, designated as

A_{c} = \frac{\sum_{i = 1}^{C} (T P_{i} + T N_{i})}{\sum_{i = 1}^{C} (T P_{i} + T N_{i} + F P_{i} + F N_{i})}

(18)

F 1 = 2 \times \frac{P \times R}{P + R}

(19)

where $P = \frac{1}{C} \sum_{i = 1}^{C} \frac{T P_{i}}{T P_{i} + F P_{i}}$ is the precision and $R = \frac{1}{C} \sum_{i = 1}^{C} \frac{T P_{i}}{T P_{i} + F N_{i}}$ is the recall.
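A small sketch of the Youden index in equation (17) and the F1 score in equation (19) is given below. The counts are hypothetical, and because the paper writes TN and FP in $S$ without a class index, treating them as aggregate counts is an assumption of this sketch.

```python
def macro_precision(counts):
    """P: mean over classes of TP_i / (TP_i + FP_i)."""
    return sum(c["tp"] / (c["tp"] + c["fp"]) for c in counts) / len(counts)

def macro_recall(counts):
    """R: mean over classes of TP_i / (TP_i + FN_i)."""
    return sum(c["tp"] / (c["tp"] + c["fn"]) for c in counts) / len(counts)

def f1_score(counts):
    """Equation (19): harmonic mean of macro precision and macro recall."""
    p, r = macro_precision(counts), macro_recall(counts)
    return 2 * p * r / (p + r)

def youden_index(counts, tn, fp):
    """Equation (17): J = R + S - 1, with S = TN / (TN + FP) from aggregate counts (assumed)."""
    return macro_recall(counts) + tn / (tn + fp) - 1

# Hypothetical counts for two known classes.
counts = [
    {"tp": 45, "fp": 5, "fn": 5},
    {"tp": 40, "fp": 10, "fn": 0},
]
f1 = f1_score(counts)                     # P = 0.85, R = 0.95
j = youden_index(counts, tn=90, fp=10)    # 0.95 + 0.90 - 1 = 0.85
```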

4.3. Experimental results

To validate the performance of the proposed ARPL-EMA model for fault diagnosis of rolling bearings under open set scenarios, experiments are conducted on the aforementioned dataset. The parameters of the ARPL model recommended in Chen et al. (2022) are used for analysis. Considering that the weight parameter $λ$ in equation (8) has an effect on the open set identification performance, its influence on the diagnosis performance under different OSFD scenarios is investigated. The recognition results of the ARPL-EMA with different $λ$ values are shown in Table 3, in which the highest metric values are bolded. From Table 3, it can be seen that the best performance is obtained when $λ = 0.1$. Moreover, there is little difference between the results of $λ = 0.01$ and $λ = 1$. In this study, $λ = 0.1$ is selected.

Table 3.

The influence of parameter $λ$ on the diagnosis performance under different OSFD tasks.

$λ$	Metrics	A1	A2	A3	B1	B2	B3
0.001	$A_{c}$	95.84	94.25	94.66	93.84	93.48	92.25
	$A_{o}$	94.57	93.44	93.18	93.04	92.51	91.75
	F1	95.49	95.08	95.69	94.42	94.08	92.99
	J	93.56	93.12	93.79	92.49	92.67	91.05
0.01	$A_{c}$	97.63	97.84	97.50	96.60	96.32	96.80
	$A_{o}$	96.93	96.73	95.43	95.79	95.70	95.46
	F1	98.80	98.90	97.90	97.44	96.94	96.94
	J	95.64	96.05	95.70	93.61	94.64	93.71
0.1	$A_{c}$	99.43	99.81	99.70	99.19	99.74	99.32
	$A_{o}$	99.25	99.72	99.33	99.03	99.55	98.17
	F1	99.67	99.93	99.78	99.41	99.79	99.58
	J	99.11	98.93	98.57	98.27	99.04	97.78
1	$A_{c}$	96.81	97.87	96.74	95.98	96.02	95.46
	$A_{o}$	95.72	96.64	95.42	94.84	94.53	94.85
	F1	97.66	98.44	97.52	96.81	96.95	96.96
	J	95.04	95.52	94.45	94.05	94.45	94.01

Then, to demonstrate the superiority of the proposed method, comparison experiments with the aforementioned three algorithms are carried out, and the experimental results are listed in Table 4. From Table 4, it can be observed that the proposed method outperforms the other three methods on every metric under all diagnosis tasks. All four metric values of the proposed ARPL-EMA are above 97%, with the lowest being the J of 97.78% under task B3. It is noteworthy that, compared with tasks A, the performance differences between the ARPL-EMA and the other models are larger under tasks B, especially for the evaluation index J. This suggests that the advantage of the proposed method becomes more pronounced as the number of unknown classes increases. To demonstrate this comparison more intuitively, the diagnosis results over the different evaluation indices are given in Figure 6.

Table 4.

The fault identification results of different methods for all tasks.

Tasks	Metrics	SROSR	EVM	OpenMax	Proposed
A1	$A_{c}$	92.56	95.57	97.63	99.43
	$A_{o}$	91.15	93.50	97.12	99.25
	F1	93.64	96.97	98.10	99.67
	J	92.55	92.35	94.96	99.11
A2	$A_{c}$	93.21	95.71	97.95	99.81
	$A_{o}$	93.02	93.96	97.86	99.72
	F1	94.47	96.84	98.26	99.93
	J	91.78	92.26	95.68	98.93
A3	$A_{c}$	91.79	96.12	96.83	99.70
	$A_{o}$	90.54	94.88	96.71	99.33
	F1	92.74	97.22	97.69	99.78
	J	89.73	93.23	95.05	98.57
B1	$A_{c}$	89.85	91.15	92.51	99.19
	$A_{o}$	87.75	89.34	90.65	99.03
	F1	90.07	92.21	93.71	99.41
	J	85.98	88.78	89.49	98.27
B2	$A_{c}$	90.09	93.52	94.69	99.74
	$A_{o}$	88.23	92.41	93.45	99.55
	F1	91.37	94.69	95.21	99.79
	J	86.48	91.46	92.24	99.04
B3	$A_{c}$	89.56	91.54	91.66	99.32
	$A_{o}$	88.78	90.11	89.46	98.17
	F1	90.87	92.82	93.45	99.58
	J	86.24	87.41	88.94	97.78

Figure 6.

Comparison results of different methods over different evaluation metrics.

Furthermore, to investigate the influence of EMA on the OSFD performance of the proposed approach, experiments based on the original ARPL model are also conducted. The comparison results with and without EMA improvement are given in Table 5, from which we can see that the proposed ARPL-EMA obtains better performance than the original ARPL over all the A tasks and B tasks. It can also be seen that the performance improvement on tasks B is more significant than that on tasks A, indicating that EMA has excellent feature extraction ability, especially for the tasks with more unknown fault classes. A possible reason is that EMA uses a universal convolution to avoid channel dimensionality reduction and thus offers the ARPL model more discriminative fault information, improving the OSFD performance.

Table 5.

Comparison results of ARPL with and without EMA improvement.

Methods	Metrics	Task A			Task B
Methods	Metrics	A1	A2	A3	B1	B2	B3
ARPL	$A_{c}$	98.64	98.69	98.57	97.14	98.43	96.95
	$A_{o}$	97.85	97.95	97.75	96.88	97.97	95.75
	F1	99.09	99.24	99.17	97.93	98.67	97.24
	J	96.23	96.79	96.55	95.19	96.54	94.81
ARPL + EMA	$A_{c}$	99.43	99.81	99.70	99.19	99.74	99.32
	$A_{o}$	99.25	99.72	99.33	99.03	99.55	98.17
	F1	99.67	99.93	99.78	99.41	99.79	99.58
	J	99.11	98.93	98.57	98.27	99.04	97.78
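The claim that EMA helps more on tasks B than on tasks A can be checked directly from the J rows of Table 5. The sketch below only re-computes differences of the tabulated values; the variable and function names are illustrative.

```python
# Youden index J (%) copied from Table 5.
ARPL_J = {"A1": 96.23, "A2": 96.79, "A3": 96.55, "B1": 95.19, "B2": 96.54, "B3": 94.81}
ARPL_EMA_J = {"A1": 99.11, "A2": 98.93, "A3": 98.57, "B1": 98.27, "B2": 99.04, "B3": 97.78}

def mean_gain(task_prefix):
    """Average J improvement (percentage points) of ARPL + EMA over ARPL for one task type."""
    tasks = [t for t in ARPL_J if t.startswith(task_prefix)]
    return sum(ARPL_EMA_J[t] - ARPL_J[t] for t in tasks) / len(tasks)

gain_a = mean_gain("A")  # (2.88 + 2.14 + 2.02) / 3, about 2.35 points
gain_b = mean_gain("B")  # (3.08 + 2.50 + 2.97) / 3 = 2.85 points
```

On this metric the average gain is about 2.35 percentage points on tasks A and 2.85 on tasks B.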

Finally, the t-distributed stochastic neighbor embedding (t-SNE) (Peng et al., 2022) algorithm is utilized for feature dimension reduction to visually demonstrate the feature extraction performance of EMA. Figures 7 and 8 illustrate the visualization results with and without EMA on tasks A and B, respectively. In these figures, each color represents one fault class and the red color stands for the unknown faults. As can be seen, compared with ARPL, a better clustering effect is achieved by the features extracted with ARPL-EMA. In addition, the features obtained by ARPL-EMA show less overlap between the known and unknown classes than those extracted by ARPL. These comparison results further verify the capability of EMA to extract key condition information.

Figure 7.

Visualization results of features extracted by ARPL and ARPL-EMA on A1–A3 tasks. (a) ARPL. (b) ARPL-EMA.

Figure 8.

Visualization results of features extracted by ARPL and ARPL-EMA on B1–B3 tasks. (a) ARPL. (b) ARPL-EMA.

5. Conclusion

In this study, a new OSFD method for rolling bearings is proposed based on ARPL with EMA improvement. Unlike the traditional open set recognition methods that ignore the deep distribution of unknown classes, ARPL can model the potential unknown space and the unknown distribution with the concept of the reciprocal point and the adversarial mechanism, which achieves better OSFD performance. Moreover, EMA is adopted to improve the feature extraction capability of the ARPL model by providing more abundant and discriminative state information. The experimental analysis indicates that the proposed method can effectively complete the open set fault recognition of rolling bearings, and the comparison results show that the proposed ARPL-EMA exhibits the best open set identification performance compared with the SROSR, OpenMax, and EVM models.

It should be noted that the proposed method only considers the working conditions of different rotation speeds for the experimental verification. The influence of working conditions of different operation loads needs to be further explored in the future. Meanwhile, the open set identification of bearing compound faults has not been tested. The following works will be concentrated on the open set diagnosis of compound faults, which is more practical in real industrial applications.

Acknowledgments

The authors are grateful for the support of the National Natural Science Foundation of China (No. 52205111) and the support of Shanghai Weichangmeng Intelligent Technology Co., Ltd.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: this work was supported by the National Natural Science Foundation of China (No. 52205111) and Shanghai Weichangmeng Intelligent Technology Co., Ltd.

ORCID iD

Keheng Zhu

References

Bendale, Boult (2016) Towards open set deep networks. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Las Vegas, NV, USA, 27–30 June 2016, 1563–1572.

Chen, Peng, Wang, et al. (2022) Adversarial reciprocal points learning for open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 44(11): 8065–8081.

Chen, Wang, et al. (2023a) Open-set classification for signal diagnosis of machinery sensor in industrial environment. IEEE Transactions on Industrial Informatics 19(3): 2574–2584.

Chen, Zhang, Zhu, et al. (2023b) An adaptive activation transfer learning approach for fault diagnosis. IEEE 28(5): 2645–2656.

Chen, Tao, Liu, et al. (2024) Open-set fault recognition and inference for rolling bearing based on open fault semantic subspace. IEEE Transactions on Instrumentation and Measurement 73: 1–11.

Geng, Huang, Chen (2020) Recent advances in open set recognition: a survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 43(10): 3614–3631.

Han, Jiang, Zhao, et al. (2018) Comparison of random forest, artificial neural networks and support vector machine for intelligent diagnosis of rotating machinery. Transactions of the Institute of Measurement and Control 40(8): 2681–2693.

Zou, Jin, et al. (2023) Intelligent fault diagnosis of train bearing based on ISTOA-VMD and SE-WDCNN. Journal of Vibration and Control 30(15–16): 3572–3583.

Ioffe, Szegedy (2015) Batch normalization: accelerating deep network training by reducing internal covariate shift. In: International conference on machine learning, Lille, France, 6–11 July 2015, 448–456.

Jiao, Ding, et al. (2024) Fault diagnosis of rolling bearing based on BP neural network with fractional order gradient descent. Journal of Vibration and Control 30(9–10): 2139–2153.

Liu (2022) Intelligent fault diagnosis of rolling bearings under imbalanced data conditions using attention-based deep learning method. Measurement 189: 110500.

Liu, Deng, et al. (2023) Transforming the open set into a pseudo-closed set: a regularized GAN for domain adaptation in open set fault diagnosis. IEEE Transactions on Instrumentation and Measurement 72: 1–12.

Mei, Zhu, Liu, et al. (2024) Conditional variational encoder classifier for open set fault classification of rotating machinery vibration signals. IEEE Transactions on Industrial Informatics 20: 3038–3049.

Men, Tang, et al. (2024) A new multi-modal time series transformation method and multi-scale convolutional attention network for railway wagon bearing fault diagnosis. Journal of Vibration and Control 10775463241276024.

Ouyang, Zhang, et al. (2023) Efficient multi-scale attention module with cross-spatial learning. In: ICASSP 2023-2023 IEEE international conference on acoustics, speech and signal processing (ICASSP), Rhodes Island, Greece, 4–10 June 2023: IEEE, 1–5.

Peng, Xie, et al. (2022) Open-set fault diagnosis via supervised contrastive learning with negative out-of-distribution data augmentation. IEEE Transactions on Industrial Informatics 19(3): 2463–2473.

Ren, Wang, Shen, et al. (2023) Dual classifier-discriminator adversarial networks for open set fault diagnosis of train bearings. IEEE Sensors Journal 23: 22040–22050.

Rudd, Jain, Scheirer, et al. (2017) The extreme value machine. IEEE Transactions on Pattern Analysis and Machine Intelligence 40(3): 762–768.

Sun, Yang, Lin (2023) An open set diagnosis method for rolling bearing faults based on prototype and reconstructed integrated network. IEEE Transactions on Instrumentation and Measurement 72: 1–10.

Tang, Wang, Yang, et al. (2023) An improved prototypical network with L2 prototype correction for few-shot cross-domain fault diagnosis. Measurement 217: 113065.

Tian, Wang, Zhang, et al. (2018) A subspace learning-based feature fusion and open-set fault diagnosis approach for machinery components. Advanced Engineering Informatics 36: 194–206.

Tong, Liu, Zheng, et al. (2023) Multi-sensor information fusion and coordinate attention-based fault diagnosis method and its interpretability research. Engineering Applications of Artificial Intelligence 124: 106614.

Wang, Shi, Sun, et al. (2024) Open-set domain adaptation via feature clustering and separation for fault diagnosis. IEEE Sensors Journal 24: 16347–16361.

Shu, et al. (2024) Unknown-class recognition adversarial network for open set domain adaptation fault diagnosis of rotating machinery. Journal of Intelligent Manufacturing 1–19.

Xie, Liu, Ding, et al. (2023) Self-attention metric learning based on multi-scale feature fusion for few-shot fault diagnosis. IEEE Sensors Journal 23: 19771–19782.

Zhao, Zhang, et al. (2021) Deep-learning-based open set fault diagnosis by extreme value theory. IEEE Transactions on Industrial Informatics 18(1): 185–196.

Zhang, Patel (2016) Sparse representation-based open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 39(8): 1690–1696.

Zhang, Zhou, et al. (2022) Intelligent bearing fault diagnosis based on open set convolutional neural network. Mathematics 10(21): 3953.

Zhang, Nie, Shao, et al. (2023a) Multi-sample-distances-fusion-and generalized-Pareto-distribution-based open-set fault diagnosis of rolling bearing. Nonlinear Dynamics 111(12): 11407–11428.

Zhang, Zhou, Zhang, et al. (2023b) A personalized federated learning-based fault diagnosis method for data suffering from network attacks. Applied Intelligence 53(19): 22834–22849.

Zhang, Chen, et al. (2024) Integrating intrinsic information: a novel open set domain adaptation network for cross-domain fault diagnosis with multiple unknown faults. Knowledge-Based Systems 299: 112100.

Zhao, et al. (2020) Deep learning algorithms for rotating machinery intelligent diagnosis: an open source benchmark study. ISA Transactions 107: 224–255.