Abstract
To address the insufficient samples and weak fault features of audio signals in plunger pump fault diagnosis, this paper proposes a plunger pump fault diagnosis method based on audio signals combined with meta-transfer learning (MTL-PAFD). The method takes audio signals of the plunger pump, acquired by a single sensor, as samples. Processing with a Gammatone filter bank effectively improves the representation ability of the audio signal under strong noise interference. Combined with meta-transfer learning, few-shot fault diagnosis of the plunger pump is then realized. In addition, according to the practical needs of plunger pump fault diagnosis, the test method of meta-transfer learning is improved so that unknown fault classes can be processed adaptively. Experimental results show that MTL-PAFD achieves a fault diagnosis accuracy of 91.41% for seen classes and, after fast adaptive learning, an accuracy of 89.64% when identifying unseen fault classes.
1. Introduction
The reciprocating plunger pump is the core component of the hydraulic system. It offers high pressure and economic practicality, so it is widely used in various fields of industrial production (Bie et al., 2021a). To ensure the regular operation and maintenance of the plunger pump, many researchers have studied its fault diagnosis. The conventional practice is to extract features from the vibration signal of the plunger pump (Gao et al., 2021) and then apply a machine learning method (Bie et al., 2019) for pattern recognition. Although this kind of traditional plunger pump fault diagnosis method has achieved good results, it has limitations in practical applications: it depends strongly on the quality of feature extraction, and its generalization ability is difficult to guarantee.
Audio signal detection is a relatively novel fault diagnosis method, with the advantages of non-contact measurement and fewer sensors. However, because audio signals are easily aliased during propagation, and the sound field of mechanical equipment such as plunger pumps is complex, it is difficult to analyze the mechanism of audio signals, which limits their application in fault diagnosis. With the development of artificial intelligence technology, data-driven methods have made the application of audio signals in mechanical fault diagnosis feasible (Bie et al., 2021b). For the fault diagnosis of the plunger pump, however, it is not easy to acquire enough fault audio signals, so the direct use of a deep learning algorithm leads to overfitting of the trained model and low accuracy in actual tests.
To solve the above problems, few-shot learning (Zhang et al., 2021), represented by meta-learning (Wu et al., 2020), shows outstanding performance, allowing neural networks to achieve performance similar to that of conventional deep learning under few-shot conditions. Among these methods, metric-based meta-learning (Wang et al., 2021) and model-agnostic meta-learning (MAML) (Zhang et al., 2020; Li et al., 2021a) have mainly been used in the field of mechanical fault diagnosis. MAML (Finn et al., 2017) is a representative meta-learning algorithm that makes the model focus on improving its fast learning ability. After a few gradient descent steps, the trained model can quickly adapt to new classes not seen in the training data. Therefore, the algorithm has a strong generalization ability (Li et al., 2021b). Unlike metric-based meta-learning methods such as matching networks (Vinyals et al., 2016), prototype networks (Snell et al., 2017), and relation networks (Sung et al., 2018), which require specific network models, MAML can theoretically be combined with almost any neural network. Meta-transfer learning (MTL) (Sun et al., 2019, 2020) achieves significant improvements while inheriting the advantages of MAML. Because MTL allows the classifier to be pre-trained on data from another domain and then transferred to the target data, it is a feasible approach to few-shot fault diagnosis.
In this paper, we propose a novel fault diagnosis method of plunger pump based on audio signal combined with meta-transfer learning (MTL-PAFD). The fault datasets of the plunger pump are built from the Gammatone spectrograms of the audio signals. This converts fault diagnosis into few-shot image classification and effectively improves the representation ability of the audio signal of the plunger pump. Besides, a few-shot learning test method suitable for fault diagnosis is proposed, which enables the fault diagnosis model to process untrained faults adaptively.
The remainder of this paper is organized as follows. Section 2 briefly introduces the related background. In Section 3, the principles and steps of the proposed method are elaborated. The experiment and discussion for the fault diagnosis of the plunger pump are given in Section 4. Finally, Section 5 concludes this article.
2. Related background
2.1. Meta-transfer learning (MTL)
This method (Sun et al., 2019) introduces the idea of transfer learning into MAML. It no longer updates all the parameters of the network model and can therefore be combined with a more powerful DNN (Huang et al., 2017). As shown in Figure 1, MTL first pre-trains a DNN on large-scale datasets. The convolutional layers are treated as feature extractors.
Figure 1. MTL flow chart.
Next, update the classifier
By freezing the feature extractor
2.2. Hard-task (HT) meta-batch
In this section, we introduce the specific implementation of HT. The task flow is shown in Figure 2. For each episode task
Figure 2. The computation flow of hard task meta-batch.
However, the collected audio signal is a one-dimensional time series, and its fault features are not prominent. In addition, the actual test signal is a single audio sample without a fault label. Whether a new fault class is present still depends on human judgment, which prevents the advantages of MTL from being fully exploited. Therefore, to combine the MTL method with the fault diagnosis of the plunger pump, both the input samples and the test method need to be adapted.
3. The proposed MTL-PAFD
The proposed MTL-PAFD method takes the Gammatone spectrogram of the audio signal of the plunger pump as the signal sample. We first train the few-shot fault classification model of the audio of the plunger pump through MTL. In addition, based on the classification results of multi-frame audio signals, the MTL test method is improved so that it can adaptively process new fault classes of the plunger pump. The method consists of four steps: (1) feature extraction of the audio signal of the plunger pump; (2) building of the few-shot audio dataset of the plunger pump; (3) training of the audio fault diagnosis model of the plunger pump; (4) fine-tuning of the fault diagnosis model and test. The complete MTL-PAFD logic diagram is shown in Figure 3.
Figure 3. MTL-PAFD logic diagram.
3.1. Feature extraction of the audio signal of plunger pump
During the operation of the plunger pump, sound is generated by the vibration and friction of the machine. However, the change in sound caused by a fault of the plunger pump is often subtle, and it is difficult to identify the fault directly from the audio signal. To improve the representation ability of audio signals, features must be extracted from the one-dimensional audio signals, whose fault information is not obvious. In the reciprocating cycle of a plunger pump, some fault sounds are abnormal for only a few milliseconds, so it is difficult to accurately reflect the dynamic characteristics of the plunger pump using frequency-domain information alone. We therefore used time-frequency domain information as the feature. Besides, experienced maintenance personnel can judge whether there is a fault from the change in sound. Therefore, we introduced the Gammatone spectrogram, which simulates the acoustic characteristics of the human cochlea. Compared with the linear spectrogram and Mel spectrogram commonly used in audio signal processing, the Gammatone filter bank used in the Gammatone spectrogram has more robust anti-noise performance (Du et al., 2021). It is more suitable for cases of strong background noise, such as the plunger pump. The impulse response of the Gammatone filter bank is as follows
Figure 4. Gammatone filter bank.
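For orientation, the commonly used textbook form of a Gammatone impulse response is g(t) = a t^(n-1) exp(-2πbt) cos(2πf_c t + φ) for t ≥ 0, where n is the filter order, f_c the centre frequency, and b the bandwidth. The sketch below samples this response and spaces centre frequencies on the ERB-rate scale. The fourth-order filter, the Glasberg-Moore ERB bandwidth with the factor 1.019, the frequency range, and all function names are illustrative assumptions and not necessarily the settings used in this work.

```python
import math


def erb(fc_hz):
    """Equivalent rectangular bandwidth in Hz (Glasberg-Moore approximation)."""
    return 24.7 * (4.37 * fc_hz / 1000.0 + 1.0)


def erb_rate(f_hz):
    """Map a frequency in Hz to the ERB-rate scale."""
    return 21.4 * math.log10(4.37 * f_hz / 1000.0 + 1.0)


def inverse_erb_rate(e):
    """Map an ERB-rate value back to a frequency in Hz."""
    return (10 ** (e / 21.4) - 1.0) * 1000.0 / 4.37


def centre_frequencies(f_low, f_high, n_filters):
    """Centre frequencies equally spaced on the ERB-rate scale."""
    lo, hi = erb_rate(f_low), erb_rate(f_high)
    step = (hi - lo) / (n_filters - 1)
    return [inverse_erb_rate(lo + i * step) for i in range(n_filters)]


def gammatone_ir(fc_hz, fs_hz, duration_s=0.025, order=4, phase=0.0):
    """Sampled Gammatone impulse response, normalised to a peak of 1."""
    b = 1.019 * erb(fc_hz)  # assumed bandwidth scaling
    n_samples = round(duration_s * fs_hz)
    ir = []
    for i in range(n_samples):
        t = i / fs_hz
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * fc_hz * t + phase))
    peak = max(abs(v) for v in ir)
    return [v / peak for v in ir]


if __name__ == "__main__":
    cfs = centre_frequencies(50.0, 8000.0, 32)  # hypothetical band limits
    ir = gammatone_ir(cfs[10], 16000.0)         # hypothetical sample rate
    print(len(cfs), len(ir))
```

Convolving the audio signal with each channel's impulse response gives one band-limited signal per centre frequency; the per-channel energies over time then form the rows of a Gammatone spectrogram.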
After the audio signal of the plunger pump was filtered by the Gammatone filter bank, the Gammatone spectrogram was generated by the short-time Fourier transform (STFT) (Verstraete et al., 2017).
Figure 5 shows the spectrogram of the audio signal of the plunger pump in a normal state. The plunger pump's audio signal covers a wide range of frequencies, and there is substantial noise interference across the entire frequency band. After Gammatone filtering, the mid-frequency and low-frequency parts of the audio signal were significantly enhanced, and the transition of the high-frequency component was smoother. This made the useful information in the signal more prominent, which is similar to the human ear's ability to distinguish strong and weak signals. These characteristics are beneficial to subsequent fault diagnosis. Moreover, by converting the audio signal of the plunger pump into a Gammatone spectrogram, fault diagnosis was transformed into image classification, so that more mature image classification methods could be applied.
Figure 5. Spectrogram of the audio signal before and after filtering.
3.2. Building of the few-shot audio dataset of the plunger pump
First of all, data expansion is required because only a limited number of plunger pump fault audio signals can be acquired. We cut the signal under each fault into multiple signal segments. So that the dynamic characteristics of the plunger pump can be expressed, each segment must contain at least two reciprocating movements of the plunger pump. After converting the audio signal of each segment into a Gammatone spectrogram, we built the audio fault dataset of the plunger pump. Then, the training set
Furthermore, to apply the training mode of MTL, we randomly sampled the support set and the query set from the various data classes. The smallest training unit is an N-way-K-shot task, whether for the training set, the validation set, or the test set. Given an N-way-K-shot task, N refers to the number of classes contained in the task, and K refers to the number of samples contained in each class. In general, the larger the K or the smaller the N, the higher the accuracy of the few-shot classification. Since the plunger pump has few usable fault classes, and the accuracy of fault identification needs to be improved as much as possible, we used the 5-way-5-shot task to sample the dataset. Each task had a 5-way-5-shot support set, and the samples of the query set came from the remaining samples in the five classes.
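The episode sampling described above can be sketched as follows. This is a minimal illustration, assuming the dataset is held as a dictionary from class label to a list of samples; the function name, the `n_query` parameter, and the toy data are hypothetical and not taken from this work.

```python
import random


def sample_episode(dataset, n_way=5, k_shot=5, n_query=5, rng=None):
    """Draw one N-way-K-shot task.

    dataset: dict mapping class label -> list of samples (assumed layout).
    Returns (support, query), each a list of (sample, episode_label) pairs.
    Query samples are drawn from the samples left over after the support
    set has been chosen, so the two sets never overlap.
    """
    rng = rng or random.Random()
    classes = rng.sample(sorted(dataset), n_way)
    support, query = [], []
    for episode_label, cls in enumerate(classes):
        items = rng.sample(dataset[cls], k_shot + n_query)
        support += [(x, episode_label) for x in items[:k_shot]]
        query += [(x, episode_label) for x in items[k_shot:]]
    return support, query


if __name__ == "__main__":
    # Toy stand-in: 9 classes with 20 spectrogram identifiers each.
    toy = {f"fault_{c}": [f"fault_{c}_seg_{i}" for i in range(20)]
           for c in range(9)}
    support, query = sample_episode(toy, rng=random.Random(0))
    print(len(support), len(query))
```

Labels are re-indexed from 0 to N-1 inside each task, which is the usual convention in episodic training because the classifier head only needs to separate the N classes present in that task.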
3.3. Training of the audio fault diagnosis model of the plunger pump
According to the MTL, we pre-trained a DNN network on a large-scale dataset as a feature extractor
The next step is to choose a DNN as the basic network model. An overly deep network is not suitable under few-shot conditions. Considering the network's performance, we chose ResNet-12 as the basic network model. Following the MTL training method described in Section 2, after pre-training on the mini-ImageNet dataset, a new classifier
3.4. Adaptability improvement of the fault diagnosis model of the plunger pump
Although the fault diagnosis model trained in the above steps could identify faults, the input form of the model is still the N-way-K-shot task. Moreover, if the test signal belongs to a new untrained fault class, it is still necessary to provide the model with labeled samples of the same class for fast learning. These conditions are difficult to meet when fault diagnosis is implemented in practice, so the trained model needs further improvement. Therefore, we proposed a test method that can adaptively process new faults for the test and implementation of the MTL-PAFD strategy. The flow chart is shown in Figure 6.
Figure 6. MTL-PAFD test flow chart.
When the input sample belongs to an unknown fault class, it falls outside the distribution of the known samples. The maximum softmax value output by the classifier for an unknown sample is often lower than that for a known sample, so this characteristic can be used to distinguish them. At the start of the test, the acquired audio signal of the plunger pump was first divided into multi-frame signals. Then, the Gammatone spectrogram of each frame was extracted according to the method described in Section 3.1 and input to the trained fault classification model one by one for classification. If the classification results were consistent, the signal was considered to belong to a trained fault class, and the fault class with the higher consistency was output directly. If the classification results were inconsistent, the current signal was regarded as an unseen class and assigned a new label number. After that, we combined the training data with the unseen class data into new N-way-K-shot tasks and fine-tuned the network model through MTL. Finally, the updated fault classification model was used to reclassify the signal and output the fault classification result.
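The multi-frame consistency check can be sketched as below. This is a hedged illustration only: the consistency score is taken here as the share of frames that agree with the most frequent predicted class, and the threshold value `known_threshold=0.8`, the function names, and the returned dictionary layout are assumptions made for the example, not the judgment rules of this work.

```python
from collections import Counter


def split_frames(signal, frame_len):
    """Cut a 1-D signal into non-overlapping frames of equal length."""
    last_start = len(signal) - frame_len
    return [signal[i:i + frame_len] for i in range(0, last_start + 1, frame_len)]


def judge_consistency(frame_predictions, known_threshold=0.8):
    """Decide whether a test signal belongs to a trained class.

    frame_predictions: predicted class label for each frame of one signal.
    known_threshold: hypothetical cut-off on the agreement ratio.
    """
    counts = Counter(frame_predictions)
    label, votes = counts.most_common(1)[0]
    consistency = votes / len(frame_predictions)
    if consistency >= known_threshold:
        # Frames agree: report the majority class directly.
        return {"status": "known", "label": label, "consistency": consistency}
    # Frames disagree: treat the signal as an unseen class, to be given a
    # new label and used for fine-tuning before it is reclassified.
    return {"status": "unseen", "label": None, "consistency": consistency}


if __name__ == "__main__":
    print(judge_consistency([3, 3, 3, 3, 3, 3, 3, 3, 3, 1]))
    print(judge_consistency([0, 1, 2, 3, 4, 0, 1, 2, 3, 4]))
```

In a full pipeline, `frame_predictions` would come from running the trained classifier on the Gammatone spectrogram of each frame returned by `split_frames`.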
Since the audio training samples of the plunger pump are balanced across classes, the accuracy was used directly as the criterion for judging the consistency of the classification results. The judgment rules are as follows
To ensure that the test signal of unknown fault class is not misjudged as a known class,
4. Experiments and validation
4.1. Experimental settings
The working parameters of the plunger pump.

Experimental plunger pump platform.
Sample division of audio signals.
To ensure the consistency of experimental conditions, all experiments were carried out in the same hardware environment: Intel Xeon(R) Gold 6226R@2.90GHz 2.89 GHz CPU, 256 GB RAM, NVIDIA Quadro RTX 6000 GPU, Windows 10 Professional Edition, CUDA 11.1 and PyTorch 1.10.1.
4.2. Experiment of feature extraction of signals
To verify the validity of the adopted Gammatone spectrogram as the signal feature, the linear spectrogram without filtering, the Kalman filter, the median filter, the Mel filter bank, and the Gammatone spectrogram were each used to produce the audio signal features. We built five few-shot datasets as described in Section 3.2. Taking plunger pitting as an example, the spectrograms of the five kinds of plunger pump signals are compared in Figure 8.
Figure 8. Comparison of different filter effects.
Classification and comparison of audio signal features of plunger pump (%).
The experimental results show that although the Kalman filter and the median filter, which are commonly used for noise reduction of time-domain signals, can effectively remove signal noise, they also remove fault features. Therefore, their final fault classification accuracy is lower than that of the Mel filter bank. Meanwhile, the Gammatone spectrogram has more robust anti-noise performance and achieved the highest fault classification accuracy on both the validation set and the test set.
4.3. MTL-PAFD model training
In Section 4.2, the classification model was fine-tuned according to MTL. To verify the improved test method of the plunger pump fault diagnosis model (the test results are given in Section 4.4), the plunger pump audio fault diagnosis model was retrained. The training accuracy and loss curves on set
Figure 9. MTL-PAFD model performance. (a) Accuracy (b) Loss.
The time and memory consumption comparison results.
It can be seen that although the training time and memory consumption of MTL are higher than those of transfer learning, its validation accuracy is significantly higher than that of the other two methods because of its fast adaptation. Compared with MAML, MTL improves the computational efficiency of the model while achieving higher accuracy.
4.4. Test of fault diagnosis of plunger pump
Classification consistency judgment results.
In Section 4.3, the
Figure 10 shows the final fault diagnosis results of the fine-tuned MTL-PAFD model, where the test signals of faults 0–8 are from the validation set of classes available in the training phase, and faults 9–13 are from the test set of classes unseen in the training phase. With the proposed MTL-PAFD method, even if the test signals belonged to classes that had not been seen in the training phase, the audio fault diagnosis model of the plunger pump could adaptively adjust the network structure and parameters. The fine-tuned network could identify new fault classes and achieved an average fault diagnosis accuracy of 89.64%.
Figure 10. Fault diagnosis test accuracy confusion matrix.
5. Conclusion
In this paper, we proposed a fault diagnosis method of plunger pump based on audio signal combined with meta-transfer learning (MTL-PAFD), and the following conclusions were obtained:
(1) A method for fault diagnosis of the plunger pump using audio signals was proposed, which exploits the fact that audio signals allow non-contact measurement and require fewer sensors. Through feature extraction with the Gammatone filter bank, the method effectively improved the representation ability of the audio signal of the plunger pump under strong noise interference and addressed the problem of weak fault features in the audio signal.
(2) Meta-transfer learning was combined with fault diagnosis. The audio fault diagnosis of the plunger pump was transformed into image classification through the Gammatone spectrogram. Combined with the performance advantages of meta-transfer learning in few-shot classification, this addressed the lack of signal samples in audio fault diagnosis of the plunger pump. When only known fault classes were identified, the accuracy of the plunger pump fault diagnosis model based on meta-transfer learning reached 91.41%.
(3) The test method of meta-transfer learning was improved to deal with unknown fault classes adaptively. The experimental results show that when an unknown fault occurs, the method can use the trained fault diagnosis model to perform fast adaptive learning and adjust the structure and parameters of the model. It could then identify new fault classes, reaching an accuracy of 89.64%.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was funded by “National Natural Science Foundation of China [No. 62071493].”
