Abstract
Cross-domain fault diagnosis of planetary gearboxes remains a significant challenge due to complex operating conditions and pronounced domain-specific distribution discrepancies. To address this issue, this study proposes a novel cross-domain fault diagnosis method based on multi-channel information fusion and domain-invariant representation learning. First, the synchrosqueezing S-transform (SSST) is employed to fuse and transform raw multi-channel vibration signals collected under varying working conditions into discriminative three-channel time–frequency representations, effectively enhancing fault-related feature expression. To mitigate domain shift, a global–local domain discrepancy metric strategy is introduced, which simultaneously measures and minimizes global distribution differences and local subdomain discrepancies, thereby promoting more effective domain confusion. Subsequently, a unified diagnostic framework is constructed based on the ResNet-50 architecture, enabling joint feature extraction and domain adaptation in an end-to-end manner. Experiments conducted on two planetary gearbox datasets demonstrate that the proposed method outperforms existing methods in terms of cross-domain diagnostic accuracy and robustness.
Keywords
1. Introduction
Planetary gearboxes, as critical components in rotating machinery, are extensively employed in industries such as wind power, aerospace, and rail transportation due to their high transmission efficiency, compact structure, and strong load-bearing capacity (Liu et al., 2023). However, their complex internal structure and frequent operation under high-speed, heavy-load, and harsh environmental conditions result in vibration signals that are highly non-stationary and nonlinear, posing significant challenges for accurate fault diagnosis (Liu et al., 2023; Peng et al., 2025). Once a fault occurs, it may lead to severe equipment failure, safety hazards, and substantial economic losses (Alabsi et al., 2024; Zhang et al., 2024). Therefore, developing effective and robust fault diagnosis methods for planetary gearboxes is of great importance to ensure the safe and reliable operation of mechanical systems, especially under variable working conditions and cross-domain scenarios (Han et al., 2024; Liu et al., 2024).
Feature extraction and pattern recognition are two critical steps in fault diagnosis (Z. Zhu et al., 2023). The complexity of gearbox vibration signals makes joint time-frequency analysis more effective than analysis in the time domain or the frequency domain alone (Liu et al., 2023). The S-transform (ST) is a widely used time-frequency analysis technique that is well suited to analyzing the impact features of non-stationary signals. However, its further development is limited by suboptimal time-frequency resolution (Wang et al., 2023). To address this issue, the synchrosqueezing S-transform (SSST), which integrates the ST with the synchrosqueezing transform (SST), has been proposed to enhance the time-frequency resolution of the traditional ST (X. Zheng et al., 2020). Meng et al. (2020) employed the SSST to process strongly time-varying signals and combined it with a convolutional neural network (CNN) to extract image features from the time-frequency representations, thereby achieving accurate fault diagnosis of gearboxes. S. Li et al. (2023) proposed a fault state identification method for diesel engines by integrating the SSST with a vision transformer. This approach leverages the advantages of the SSST in handling nonlinear and non-stationary signals, as well as the image classification capability of the vision transformer. Chen and Zheng (2023) employed the SSST to convert the acquired sensor signals into time-frequency images. Frequency-domain features were extracted using a CNN, fused via an attention mechanism, and further processed by a gated recurrent unit to capture time-frequency features. Finally, classification was performed by a softmax layer, enabling effective fault diagnosis of wind turbines. These studies have demonstrated the effectiveness of the SSST in fault diagnosis.
However, these methods primarily focus on the transformation and application of single-channel information, while the synergistic contribution of multi-channel data has been overlooked.
Multi-channel signals can comprehensively reflect the operating condition of mechanical equipment, thereby enabling thorough mining of fault-related information. They have been widely applied in scenarios such as early fault diagnosis, where fault features are weak. By fusing multi-channel vibration signals into high-dimensional images and feeding them into convolutional neural networks (CNNs), the superior capabilities of CNNs in image feature extraction can be effectively leveraged for accurate fault classification (Guo et al., 2023). Azamfar et al. (2020) proposed a fault diagnosis method based on motor current signature analysis. This approach utilizes a two-dimensional convolutional neural network (2D-CNN) architecture to fuse data obtained from multiple current sensors and performs classification directly, without manual feature extraction. T. Li et al. (2021) proposed a multi-channel information extraction and fusion framework in non-Euclidean space, developing a graph convolutional neural network model with multiple receptive fields that significantly enhances the expressiveness of the learned representations. Liu et al. (2023) proposed a multi-source time-frequency feature fusion method based on image-to-matrix transformation, matrix concatenation, and matrix-to-image reconstruction. In addition, Peng et al. (2020) proposed a multi-branch, multi-scale convolutional neural network capable of automatically learning and fusing rich and complementary fault information from multiple signal components and time scales of vibration signals.
Nevertheless, these methods rely heavily on the assumption that the distribution of the test data is consistent with that of the training data. In practice, operating conditions are frequently non-stationary and diverse, while labeled training samples are scarce. Such distribution shifts significantly impair the generalization ability of trained models, resulting in reduced diagnostic accuracy and hindering their deployment in real-world industrial settings (Liu et al., 2025; Misbah et al., 2024). To address these challenges, domain adaptation methods developed in the field of image recognition have emerged as a promising solution (Sun and Saenko, 2016; Y. Zhu et al., 2021). These methods aim to reduce the distribution discrepancy between the source and target domains, so that the knowledge learned from labeled source domain data can be transferred to the target domain even when labeled target domain data are scarce or unavailable (Qian et al., 2023; Wang et al., 2025).
Among the various domain adaptation strategies, discrepancy-based methods and adversarial-based methods are two mainstream paradigms. In discrepancy-based methods, a domain discrepancy term is incorporated into the loss function to explicitly measure and minimize the distribution discrepancy (Peng et al., 2026). For example, J. Li et al. (2024) employed a multi-kernel maximum mean discrepancy approach to measure and minimize domain discrepancy, and Liang et al. (2023) combined local maximum mean discrepancy (LMMD) with a residual network to address fault diagnosis under variable speed conditions. Cao et al. (2022) introduced the Cauchy kernel-induced maximum mean discrepancy and applied it to gearbox cross-domain diagnosis. In adversarial-based methods, a domain discriminator is trained jointly with the feature extractor through minimax optimization to learn domain-invariant representations (Alabsi et al., 2024; Han et al., 2024). More recently, graph neural network-based approaches have also been explored to model topological relationships among multi-sensor signals for cross-domain diagnosis (T. Li et al., 2021; Liu et al., 2025), and Transformer-based methods have shown strong capabilities in capturing long-range dependencies in vibration signals (Chen and Zheng, 2023). However, these methods focus on either global domain discrepancy (e.g., the maximum mean discrepancy, MMD) or local domain alignment (e.g., LMMD), and neglect the synergistic contribution of global and local information when evaluating cross-domain feature discrepancy. Global metrics such as MMD align the marginal distributions of the two domains but are insensitive to class-conditional differences, whereas local metrics such as LMMD achieve class-wise alignment within subdomains but overlook the overall distribution consistency.
For planetary gearboxes under variable speeds, where both an overall distribution shift and fault-specific local discrepancies coexist, jointly exploiting global and local alignment is expected to provide a more comprehensive and balanced measure of domain divergence.
On this basis, a novel cross-domain fault diagnosis method based on multi-channel information fusion is proposed in this paper. First, the SSST is employed to fuse and convert raw multi-channel vibration signals under varying working conditions into three-channel time-frequency representations. To address domain discrepancies, a global-local domain discrepancy metric strategy is introduced to measure and minimize both global and local distribution differences simultaneously, thus promoting domain confusion. A diagnostic framework is then established based on the ResNet-50 architecture, which integrates feature extraction and domain-invariant representation learning. Finally, extensive experiments are conducted to validate the feasibility and superiority of the proposed method. The main contributions of this paper are as follows:
(1) A novel signal fusion strategy is proposed, in which the SSST is employed to convert multi-channel vibration signals under varying working conditions into unified three-channel time-frequency representations, effectively enhancing the feature richness for cross-domain diagnosis of planetary gearboxes.
(2) A global-local domain discrepancy measurement scheme is developed, improving domain adaptability in fault representation learning.
(3) A ResNet-50-based cross-domain fault diagnosis framework is constructed, integrating the fused multi-channel time-frequency inputs and the domain discrepancy metrics, and is experimentally validated on planetary gearbox datasets to demonstrate its superior performance.
The rest of this paper is organized as follows. Section 2 presents the fundamental principles and advantages of the SSST. Section 3 details the framework of the proposed method. Section 4 describes the data sources and preprocessing techniques used in this study. Section 5 validates the effectiveness of the proposed approach. Section 6 concludes the paper.
2. Synchrosqueezing S-Transform
The SSST integrates the S-transform with synchrosqueezing techniques, enabling a more precise characterization of non-stationary signals in the time-frequency domain. By applying the SST to the time-frequency matrix obtained via the ST, the original time-frequency region is compressed, thereby enhancing the time-frequency resolution. This method is particularly well-suited for the time-frequency analysis of non-stationary vibration signals, as it allows for more accurate localization and tracking of transient components within the signal. Specifically, let x(t) represent the signal to be analyzed; its ST is given by (Stockwell et al., 1996):
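For reference, the standard continuous form of the ST from Stockwell et al. (1996) can be written as below. The symbols $\tau$ (time shift) and $f$ (frequency) and the name $ST_x$ are notation assumed here for illustration and may differ from the authors' notation.

$$
ST_x(\tau, f) = \int_{-\infty}^{+\infty} x(t)\,\frac{|f|}{\sqrt{2\pi}}\,\exp\!\left(-\frac{(\tau - t)^2 f^2}{2}\right)\exp\!\left(-\mathrm{i}\,2\pi f t\right)\mathrm{d}t
$$

The Gaussian window has standard deviation $1/|f|$, so it narrows as the frequency increases, which gives the ST its frequency-dependent resolution.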
To perform synchrosqueezing of the ST result along the frequency axis, the instantaneous frequency of the signal x(t) must be calculated. Based on the ST output, the instantaneous frequency of x(t) can be expressed as:
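A commonly used estimate is sketched below, with $ST_x(\tau, f)$ denoting the ST of $x(t)$ at time shift $\tau$ and frequency $f$ (assumed notation). It is defined wherever $ST_x(\tau, f) \neq 0$.

$$
\hat{f}_x(\tau, f) = f + \frac{1}{\mathrm{i}\,2\pi\,ST_x(\tau, f)}\,\frac{\partial ST_x(\tau, f)}{\partial \tau}
$$

As a quick check, for a single complex tone $x(t) = A\,e^{\mathrm{i}2\pi f_0 t}$ the ST satisfies $\partial ST_x / \partial \tau = \mathrm{i}\,2\pi (f_0 - f)\,ST_x$, so the estimate returns $\hat{f}_x = f_0$ for every analysis frequency $f$.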
The SSST of the signal can thus be obtained as:
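One discrete form of the squeezing step that appears in the SSST literature is given below as a hedged sketch; the exact weighting used by the authors cannot be confirmed from the text. Here $f_k$ are the discrete ST frequencies, $\Delta f_k = f_k - f_{k-1}$, $f_l$ is the center of the $l$-th squeezed frequency bin, $L_f$ is its width, and $\hat{f}_x(\tau, f_k)$ is the instantaneous frequency estimated from the ST (all assumed notation).

$$
SSST_x(\tau, f_l) = \frac{1}{L_f} \sum_{f_k:\;|\hat{f}_x(\tau, f_k) - f_l| \le L_f/2} \left|ST_x(\tau, f_k)\right|\, f_k\, \Delta f_k
$$

Each ST coefficient is thus moved along the frequency axis to the bin nearest its estimated instantaneous frequency, which concentrates energy that the ST spreads across neighboring frequencies.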
To verify the advantage of the SSST in signal feature extraction, the following simulated signal is constructed:
The signal is sampled at a frequency of 1000 Hz over a duration of 10 seconds. Both the ST and the SSST are applied for analysis. The resulting time-frequency representations are shown in Figure 1. After adding white noise to the signal, the corresponding results are presented in Figure 2.
Figure 1. Time-frequency diagram of signal x(t): (a) ST; (b) SSST.
Figure 2. Time-frequency diagram of signal x(t) after adding noise: (a) ST; (b) SSST.

It can be observed that both the ST and the SSST accurately capture the instantaneous frequency variations of the original signal prior to noise addition. However, the high-frequency component x3(t) appears more blurred in the ST time-frequency representation compared to SSST. After the addition of white noise, the overall signal clarity in the time-frequency domain is noticeably reduced, with the degradation being particularly evident for the x3(t) component. While both ST and SSST can still accurately identify low-frequency components, the high-frequency content becomes completely indistinguishable in the ST. In contrast, SSST effectively reassigns the dispersed energy back to its central frequency, maintaining high time-frequency resolution even in the presence of noise.
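The energy reassignment described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the FFT-based discretization, the function names `s_transform` and `synchrosqueeze`, the relative threshold, and the weighting by the source bin index are all assumptions made for illustration, and the test signal is a single tone rather than the simulated signal of this section.

```python
import numpy as np

def s_transform(x):
    """Discrete S-transform computed with the FFT.

    Returns (st, dst). st[k, j] is the ST at frequency bin k and sample j.
    dst[k, j] is (1 / (i*2*pi)) * d(st)/d(tau), expressed in frequency bins.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    m = np.fft.fftfreq(n) * n            # wrapped bin offsets 0, 1, ..., -2, -1
    half = n // 2
    st = np.zeros((half + 1, n), dtype=complex)
    dst = np.zeros_like(st)
    st[0, :] = x.mean()                  # zero-frequency row is the signal mean
    for k in range(1, half + 1):
        window = np.exp(-2.0 * np.pi**2 * m**2 / k**2)  # Gaussian window in frequency
        shifted = np.roll(spectrum, -k) * window
        st[k, :] = np.fft.ifft(shifted)
        dst[k, :] = np.fft.ifft(shifted * m)
    return st, dst

def synchrosqueeze(st, dst, rel_threshold=1e-3):
    """Move |ST| along the frequency axis to the nearest instantaneous-frequency bin."""
    n_freq, n_time = st.shape
    out = np.zeros((n_freq, n_time))
    mag = np.abs(st)
    keep = mag > rel_threshold * mag.max()   # ignore coefficients that are too small
    keep[0, :] = False                       # skip the zero-frequency row
    rows, cols = np.nonzero(keep)
    inst = rows + np.real(dst[rows, cols] / st[rows, cols])
    target = np.rint(inst).astype(int)
    valid = (target >= 0) & (target < n_freq)
    # Weight each coefficient by its source bin index, with unit bin spacing (assumed convention).
    np.add.at(out, (target[valid], cols[valid]), mag[rows, cols][valid] * rows[valid])
    return out

# Single tone located exactly on frequency bin 20 of a 256-sample record.
n = 256
tone = np.cos(2 * np.pi * 20 * np.arange(n) / n)
st, dst = s_transform(tone)
ssst = synchrosqueeze(st, dst)
st_profile = np.abs(st).mean(axis=1)     # time-averaged ST magnitude per frequency bin
ssst_profile = ssst.mean(axis=1)         # time-averaged squeezed energy per frequency bin
```

For this tone, `st_profile` is spread over a few dozen bins around bin 20, whereas `ssst_profile` is nonzero only at bin 20, which is the sharpening effect attributed to the SSST in the discussion above.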
3. Proposed method
3.1. Methodological framework
This section presents a detailed description of the proposed method. As illustrated in Figure 3, the proposed method consists of three main components: a multi-channel information fusion module, a feature extractor, and a domain adaptation module. The multi-channel information fusion module constructs the model input based on the SSST approach. The feature extractor, built upon the ResNet50 architecture (He et al., 2016), extracts fault-related information and domain-invariant features from the fused input. The domain adaptation module facilitates the learning of transferable features by constructing global and local mean discrepancy measures. Finally, a classifier is applied to the learned transferable features to enable cross-domain fault diagnosis of gearboxes. The detailed parameter settings of the proposed method are summarized in Table 1.
Figure 3. The framework of the proposed method.
Table 1. Structure and parameters of ResNet50.
3.2. Multi-channel information fusion
To fully exploit the time-frequency information of vibration signals, a multi-channel information fusion approach inspired by the color channels of images is proposed, as illustrated in Figure 4. Given the superior time-frequency representation capability of the SSST, it is employed to preprocess the acquired vibration signals and extract time-frequency feature matrices. In typical equipment health monitoring setups, sensors are installed in the horizontal, vertical, and axial directions, and vibration signals from each direction can be individually transformed into time-frequency coefficient matrices. Based on this, an independent feature channel is constructed for each directional monitoring signal, forming a multi-channel fused feature map. The specific steps are as follows:
(1) The vibration signals acquired from the three orthogonal directions (x, y, and z) of the monitored object are processed using the SSST to obtain the corresponding time-frequency coefficient matrices.
(2) Based on equation (6), the time-frequency coefficient matrices are normalized.
(3) The normalized matrix $s^{i*}_{mn}$ is rounded and converted into an 8-bit unsigned integer type, which is commonly used for image storage, yielding one pixel matrix per direction.
(4) The pixel matrices are stacked as the three channels of a single image, forming the fused time-frequency representation.
Figure 4. Multi-channel information fusion method.
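The fusion steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: min-max scaling to the range [0, 255] stands in for the normalization of equation (6), the function names `to_uint8` and `fuse_channels` are hypothetical, and random matrices replace real SSST coefficient magnitudes.

```python
import numpy as np

def to_uint8(matrix):
    """Min-max scale a real matrix to [0, 255] and round to 8-bit unsigned integers."""
    matrix = np.asarray(matrix, dtype=float)
    lo, hi = matrix.min(), matrix.max()
    if hi == lo:                          # constant matrix: avoid division by zero
        return np.zeros(matrix.shape, dtype=np.uint8)
    scaled = 255.0 * (matrix - lo) / (hi - lo)
    return np.rint(scaled).astype(np.uint8)

def fuse_channels(tf_x, tf_y, tf_z):
    """Stack three per-direction pixel matrices into one H x W x 3 image."""
    return np.stack([to_uint8(tf_x), to_uint8(tf_y), to_uint8(tf_z)], axis=-1)

# Placeholder time-frequency magnitudes for the three sensor directions.
rng = np.random.default_rng(0)
demo = [np.abs(rng.normal(size=(192, 192))) for _ in range(3)]
image = fuse_channels(*demo)
```

Each direction is normalized independently in this sketch, so a weak channel is not suppressed by a strong one; whether the authors normalize per channel or jointly is not stated in the text.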

3.3. Domain adaptation module
Another advantage of applying deep learning to fault diagnosis of mechanical equipment lies in the model’s inherent complexity, which endows it with feature transfer learning capabilities. This allows the model to effectively adapt to discrepancies between source and target domain data, thereby addressing diagnostic challenges under varying operating conditions, different fault severities, different but related equipment or simulation models, and incomplete information sources (H. Zheng et al., 2019). Variations in factors such as operating conditions often lead to distribution differences between the training dataset (source domain) and the testing dataset (target domain). Such discrepancies significantly degrade the generalization performance of classifiers trained solely on the source domain when applied to the target domain. The concept of transfer learning aims to address this issue by leveraging deep neural networks to map both source and target domains into a shared feature space, in which the distribution discrepancy is minimized, thereby reducing domain divergence.
The cross-domain diagnostic model builds upon conventional deep neural networks by introducing an adaptation layer between the feature extraction module and the classifier to quantify the discrepancy between source and target domain data (Mao et al., 2022). The introduction of the adaptation layer shifts the network’s optimization objective from minimizing classification error alone to jointly minimizing both the classification error and the domain discrepancy loss (Shao and Kim, 2024):
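A generic form of such a joint objective is sketched below. The symbols $\mathcal{L}_{cls}$ (classification loss), $\mathcal{L}_{d}$ (domain discrepancy loss) and the trade-off weight $\lambda$ are notation assumed here; the text does not state whether or how the authors weight the two terms.

$$
\mathcal{L} = \mathcal{L}_{cls} + \lambda\,\mathcal{L}_{d}, \qquad \lambda \ge 0
$$

Setting $\lambda = 0$ recovers a network trained on the source domain only, which corresponds to the no-adaptation baseline.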
For the classification error, this paper employs the cross-entropy loss function
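In its standard form for labeled source samples, the cross-entropy loss can be written as follows, where $n_s$ is the number of source samples, $C$ the number of health states, $y_{ic}$ the one-hot label and $p_{ic}$ the softmax output for sample $i$ and class $c$ (assumed notation).

$$
\mathcal{L}_{cls} = -\frac{1}{n_s} \sum_{i=1}^{n_s} \sum_{c=1}^{C} y_{ic} \log p_{ic}
$$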
For the domain discrepancy loss, this study adopts a strategy that combines both global and local distribution differences to achieve a comprehensive measure of domain divergence. Specifically, the global distribution discrepancy is measured using MMD, while the local discrepancy is quantified using LMMD. It can be represented as:
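A minimal NumPy sketch of a combined global-local discrepancy is given below. It rests on several assumptions not confirmed by the text: a single Gaussian kernel with bandwidth `sigma`, an unweighted sum of the MMD and LMMD terms, LMMD weights taken from source labels and target class probabilities, and the hypothetical function names used here.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Pairwise Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * sigma**2))

def weighted_mmd2(xs, xt, ws, wt, sigma=1.0):
    """Squared MMD between two weighted samples; each weight vector sums to 1."""
    k_ss = ws @ gaussian_kernel(xs, xs, sigma) @ ws
    k_tt = wt @ gaussian_kernel(xt, xt, sigma) @ wt
    k_st = ws @ gaussian_kernel(xs, xt, sigma) @ wt
    return k_ss + k_tt - 2.0 * k_st

def mmd2(xs, xt, sigma=1.0):
    """Global term: every sample in a domain carries the same weight."""
    ws = np.full(len(xs), 1.0 / len(xs))
    wt = np.full(len(xt), 1.0 / len(xt))
    return weighted_mmd2(xs, xt, ws, wt, sigma)

def lmmd2(xs, ys, xt, pt, sigma=1.0):
    """Local term: class-wise MMD averaged over classes.

    ys holds integer source labels; pt holds target class probabilities
    (for example softmax outputs used as soft pseudo-labels).
    """
    n_classes = pt.shape[1]
    onehot = np.eye(n_classes)[ys]
    total = 0.0
    for c in range(n_classes):
        ws, wt = onehot[:, c], pt[:, c]
        if ws.sum() == 0 or wt.sum() == 0:   # class absent from this batch
            continue
        total += weighted_mmd2(xs, xt, ws / ws.sum(), wt / wt.sum(), sigma)
    return total / n_classes

def global_local_discrepancy(xs, ys, xt, pt, sigma=1.0):
    """Unweighted sum of the global and local terms (assumed combination)."""
    return mmd2(xs, xt, sigma) + lmmd2(xs, ys, xt, pt, sigma)

# Synthetic features: identical domains versus a target shifted by a constant.
rng = np.random.default_rng(0)
feats = rng.normal(size=(40, 4))
labels = np.arange(40) % 4
probs = np.eye(4)[labels]
aligned = global_local_discrepancy(feats, labels, feats, probs)
shifted = global_local_discrepancy(feats, labels, feats + 3.0, probs)
```

In this toy case `aligned` is zero and `shifted` is clearly positive. In training, the same quantity would be computed on mini-batch features and minimized together with the classification loss.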
In summary, this study proposes a method that integrates multi-channel information fusion with global-local discrepancy measurement to achieve cross-domain fault diagnosis of planetary gearboxes under complex operating conditions. The main steps are as follows: (1) Multi-channel vibration signals of the gearbox are collected under various operating conditions and fault states. These signals are converted into equal-length time series samples in MATLAB. (2) The time series samples are downsampled to half of their original sampling frequency. The SSST is then applied to transform the single-channel signals into time-frequency coefficient matrices, where the width represents temporal information and the height represents frequency information. (3) The coefficient matrices obtained from each single channel are resized to a uniform dimension and normalized. The processed grayscale images from individual channels are stacked to form three-channel color images. (4) A transfer learning-based neural network is constructed under the PyTorch framework, using ResNet50 as the backbone. Cross-domain fault diagnosis is performed on samples from different image domains.
4. Data description
4.1. The planetary gearbox test rig
The feasibility of the proposed method is validated using a planetary gearbox vibration test rig with preset faults. The test rig consists of a motor, a planetary gearbox, a load motor, and couplings, as shown in Figure 5. By controlling the motor and load motor, different rotational speeds and torques are applied to the gearbox. To simulate the effects of load variation and fault-induced excitation under real operating conditions, typical faults such as pitting, broken tooth, and crack are artificially introduced into the planetary gears, as shown in Figure 6.
Figure 5. Planetary gearbox test rig.
Figure 6. Four health states of tested gears: (a) Normal, (b) Pitting, (c) Broken tooth, (d) Crack.

4.2. Data acquisition and preprocessing
Acceleration sensors are used to collect vibration signals from the gearbox under various combinations of rotational speeds and torques, with different fault types introduced. The signals are measured in three directions: radial-horizontal, radial-vertical, and axial. The sensor placement locations are illustrated in Figure 5. The sampling frequency is set to 8192 Hz, and the duration of each sampling period is 20 seconds.
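The record size implied by these settings, and one possible way of cutting a record into equal-length, half-rate samples, can be sketched as follows. The segment length of 4096 points and the pair-averaging downsampler are assumptions for illustration only; the text does not give the segment length or the resampling routine used in MATLAB.

```python
import numpy as np

FS = 8192            # sampling frequency in Hz (from the text)
DURATION = 20        # record length in seconds (from the text)
SEGMENT_LEN = 4096   # hypothetical segment length, not given in the text

def segment(signal, length):
    """Cut a 1-D record into non-overlapping equal-length segments."""
    signal = np.asarray(signal, dtype=float)
    n_seg = signal.size // length
    return signal[: n_seg * length].reshape(n_seg, length)

def halve_rate(segments):
    """Halve the sampling rate by averaging adjacent sample pairs along the last axis."""
    segments = np.asarray(segments, dtype=float)
    usable = segments.shape[-1] - segments.shape[-1] % 2
    trimmed = segments[..., :usable]
    return trimmed.reshape(*trimmed.shape[:-1], -1, 2).mean(axis=-1)

record = np.zeros(FS * DURATION)          # placeholder for one channel of one record
segments = segment(record, SEGMENT_LEN)   # equal-length time series samples
samples = halve_rate(segments)            # samples at an effective rate of FS / 2
```

With these values, one 20-second record contains 163,840 points per channel and yields 40 samples of 2048 points each at an effective rate of 4096 Hz.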
Table 2. Details of experimental datasets.
Furthermore, the SSST is applied to transform each dataset into a set of time-frequency images with a fixed size of 192 × 192, which are used as inputs to the neural network for fault diagnosis of planetary gearboxes under variable operating conditions. Channels A, B, and C represent the three signal acquisition directions. The fault categories considered include broken tooth, normal, pitting, and crack. Taking the normal signal from Dataset 1 as an example, the preprocessing procedure is illustrated in Figure 7.
Figure 7. Data preprocessing process.
5. Experimental verification of cross-domain fault diagnosis
Table 3. Training hyperparameters.
5.1. Cross-domain fault diagnosis at constant rotational speed
Dataset 1 is used as the source domain, while Datasets 2 and 3 serve as target domains, to investigate the impact of rotational speed variation on fault recognition performance under constant-speed conditions. The source and target domain data are input into the proposed transfer learning network and into the original ResNet50 network for comparative testing. Figure 8 illustrates the variation of diagnostic accuracy with the number of training iterations in the first trial.
Figure 8. Diagnostic accuracy convergence curve under constant rotational speed.
As observed in the figure, the ResNet50 network without transfer learning achieves faster convergence, but due to its limited transferability, the final stable accuracy remains below 70%. In contrast, with identical network structures, the recognition accuracy when transferring from Dataset 1 to Dataset 2 is higher than that from Dataset 1 to Dataset 3. This is mainly attributed to the greater operational discrepancy between Datasets 1 and 3 compared to Datasets 1 and 2, which hinders effective feature space alignment and domain discrepancy minimization within the network. Fine-tuning network parameters and hyperparameters can moderately improve fault classification performance.
To further explore recognition accuracy under different transfer directions, Datasets 2 and 3 are also used as source domains. The results are presented in Figure 9. In the plot, “2-1” denotes transfer from Dataset 2 to Dataset 1, and the rest follow the same convention. It can be clearly observed that the proposed model consistently achieves high accuracy across various transfer tasks, demonstrating strong generalization capability.
Figure 9. Comparison of cross-domain diagnostic accuracy under constant rotational speed.
5.2. Cross-domain fault diagnosis at variable speeds
Gearboxes often operate under varying conditions when faults occur. To simulate the speed ramp-up and fluctuating conditions encountered during real-world operations, Dataset 1 is selected as the source domain, while Datasets 4 and 5 are used as target domains to investigate the fault diagnosis performance when transferring from constant speed to linearly and sinusoidally varying speeds. Ten repeated experiments are conducted, and the results are shown in Figure 10.
Figure 10. Comparison of recognition accuracy under different rotational speed changing trends.
As illustrated in the figure, over the repeated experiments the network exhibits relatively small fluctuations in recognition accuracy for Dataset 5, achieving an average accuracy of 88.57% ± 1.33%, which is comparable to the accuracy obtained under constant-speed target domain conditions. This indicates that the proposed method offers good classification stability under linearly varying speed conditions. In contrast, when the target domain features sinusoidal speed variation, the recognition accuracy fluctuates more strongly, with an average accuracy of only 65.53% ± 5.54%. The lower accuracy under sinusoidal speed variation is primarily attributed to two factors. First, the source domain is collected at a constant speed of 150 r/min, whereas the target domain exhibits a considerably broader speed range, which significantly amplifies the marginal distribution shift. Second, the relatively low frequency of sinusoidal speed variation causes samples of the same fault category to fall within different segments of the speed profile, thereby increasing intra-class variability.
5.3. The impact of channel number on diagnosis performance
Table 4. Transfer tasks under different channel numbers.

Figure 11. Comparison of recognition accuracy under different numbers of channels.
As shown in the figure, the diagnostic accuracy achieved with the fused three-channel samples is consistently higher than that obtained with each of the three single-channel sample types. Since the network parameters tend to stabilize after 800 iterations, the average diagnostic accuracy over the 800th to 1000th iterations is calculated to quantitatively compare the performance differences across the four transfer tasks.
To further evaluate the contribution of multi-channel fusion, the accuracy improvement rate of the three-channel samples over single-channel samples is defined as:
Specifically, the average accuracies for Transfer Tasks IV, V, VI, and VII (denoted as MA4, MA5, MA6, and MA7, respectively) are calculated to be 91.16%, 78.61%, 85.44%, and 79.47%. Based on these results, the accuracy improvement rate of the fused three-channel samples over the three types of single-channel samples is quantified as 12.3%.
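Under the assumption that the improvement rate is the relative gain of MA4 over the mean of MA5, MA6, and MA7, the reported figure can be reproduced with the short calculation below. The function name `improvement_rate` is hypothetical, and the definition is inferred from the reported numbers rather than taken from the authors' equation.

```python
def improvement_rate(fused, singles):
    """Relative gain (in percent) of the fused accuracy over the mean single-channel accuracy."""
    baseline = sum(singles) / len(singles)
    return 100.0 * (fused - baseline) / baseline

# Accuracies in percent as reported in the text: MA4, then MA5, MA6, MA7.
rate = improvement_rate(91.16, [78.61, 85.44, 79.47])
```

The mean single-channel accuracy is about 81.17%, so the gain of roughly 9.99 percentage points corresponds to a relative improvement of about 12.3%.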
5.4. Comparative study and ablation analysis
To comprehensively evaluate the proposed method, three groups of comparative and ablation experiments are designed in this section: (1) comparison with mainstream domain adaptation methods, (2) ablation study on the domain discrepancy metric strategy, and (3) comparison of different feature extraction methods. All experiments use the same ResNet-50 backbone, and results are reported as mean and standard deviation over ten independent runs with different random seeds.
5.4.1. Comparison with mainstream domain adaptation methods
Table 5. Comparison with different domain adaptation methods.
Bold values denote the best results among all compared methods
As shown in Table 5, the no-adaptation baseline yields the lowest accuracy across all tasks, with an average of only 58.98%, confirming the significant impact of domain shift on diagnostic performance. DDC achieves limited improvement due to its single-layer single-kernel design. DAN and Deep CORAL show moderate gains through MMD alignment and second-order statistics matching, respectively. DANN achieves competitive results through adversarial training, but exhibits the largest standard deviation among all methods. DSAN performs well by incorporating subdomain-level alignment, reaching an average accuracy of 80.32%. The proposed method consistently achieves the highest accuracy across all transfer tasks, with an average of 87.72%, outperforming the best baseline DSAN by 7.40 percentage points. This superiority can be attributed to the complementary effects of simultaneous global and local alignment, which will be further analyzed in the following ablation study.
5.4.2. Ablation study on domain discrepancy metrics
Table 6. Ablation study on domain discrepancy metrics.
Bold values denote the best results among all compared methods
Several observations can be drawn from Table 6. First, both MMD-only and LMMD-only substantially outperform the no-adaptation baseline, confirming the effectiveness of domain alignment. Second, LMMD-only achieves higher accuracy than MMD-only across all three tasks, with an average improvement of 7.07 percentage points, indicating that class-conditional alignment plays a more critical role than marginal distribution alignment in the cross-domain fault diagnosis of planetary gearboxes. Third, the proposed MMD + LMMD combination further improves the average accuracy to 87.72%, surpassing MMD-only and LMMD-only by 13.46 and 6.39 percentage points, respectively. This demonstrates that global alignment and local alignment address different aspects of domain shift: MMD reduces the overall marginal distribution discrepancy, while LMMD refines the alignment at the subdomain level. Their integration leads to more comprehensive domain confusion and thus yields the best diagnostic performance.
5.4.3. Impact of feature extraction methods
To further investigate the impact of different feature extraction methods on planetary gearbox fault identification performance, a new set of transfer learning tasks is established using Dataset 1 as the source domain and Dataset 5 as the target domain. Multi-channel inputs generated by the SSST, by the ST, and directly from the raw data are employed as network inputs, respectively. The fault identification accuracy for each fault category under the different feature processing methods is illustrated in Figure 12. As can be observed from Figure 12, the proposed method maintains high and stable diagnostic accuracy across all fault categories, achieving an average accuracy of 88.57% with a standard deviation of 1.33%. In contrast, the method employing the ST as the feature extraction approach yields a lower diagnostic accuracy of 75.43% ± 3.31%, while the method using raw data as input achieves only 61.38% ± 3.46%.
Figure 12. Comparison of recognition accuracy under different feature extraction methods.
To intuitively demonstrate the superiority of the proposed method in fault classification, t-distributed stochastic neighbor embedding (t-SNE) is employed to project the output features of the network’s fully connected layer into a two-dimensional space for each transfer task. As shown in Figure 13, when no feature processing or domain adaptation is applied, the similarity between the source and target domains is low, and the inter-class separability within the target domain is poor.
Figure 13. Comparison of t-SNE visualization under different feature extraction methods.
By incorporating ST with transfer learning, both the inter-domain similarity and inter-class separability are notably improved. Furthermore, the combination of SSST and the transfer learning network achieves even better performance, clearly distinguishing different fault types across both domains. It also minimizes the domain shift for the same fault class between the source and target domains, resulting in the largest inter-class distance and the smallest intra-class distance.
5.5. Validation on a public dataset
Table 7. Transfer tasks on the SEU gearbox dataset.
Table 8. Results on the SEU gearbox dataset.
Bold values denote the best results among all compared methods
The results on the SEU dataset are consistent with the findings on the self-built test rig presented in Sections 5.1-5.4, confirming that the proposed method generalizes well across different experimental platforms, fault types, and operating conditions.
6. Conclusions
This paper proposes a cross-domain fault diagnosis method for gearboxes based on multi-channel information fusion. The method transforms raw multi-channel vibration signals under different operating conditions into three-channel time-frequency representations using the SSST, and employs a global-local domain discrepancy measurement strategy to achieve cross-domain feature alignment and domain confusion. Finally, experimental validation is conducted on a planetary gearbox vibration test rig. The main conclusions are as follows: (1) Multi-channel information fusion combined with a global-local domain discrepancy measurement strategy enables cross-domain fault diagnosis under various operating conditions, including constant and variable rotational speeds. The smaller the speed difference between the source and target domains, the higher the diagnostic accuracy. (2) Compared with single-channel samples, the fused three-channel samples provide significantly higher diagnostic accuracy when used as input to the network, with an overall improvement rate of up to 12.3%. (3) The use of SSST for processing time-varying vibration signals yields more stable and accurate diagnostic results compared to other methods such as the ST. When integrated with a deep transfer learning network, this approach can effectively reduce the adverse impact of working condition variations on diagnosis performance.
It should be noted that this study adopts a single network model for transfer learning-based fault diagnosis. The effects of different network architectures and transfer tasks under larger rotational speed differences have not been thoroughly investigated. Future work will further explore fault diagnosis under more complex operating conditions and varying fault severity levels.
Acknowledgment
The authors would like to thank the editor and referees for their valuable comments.
Author contributions
Funding
The authors disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Natural Sciences Foundation of Jiangxi Province (Grant No. 20252BAC20008), Open Topic of the Hunan Engineering Research Center of Precision Manufacturing Technology for Rotating Components of Railway Vehicles (Grant No. KFJJ2025101), National Natural Science Foundation of China (Grant No. 52565011), Early-Career Young Scientists and Technologists Project of Jiangxi Province (Grant No. 20244BCE52159), and Research Project of State Key Laboratory of Mechanical System and Vibration (Grant No. MSV202508).
Declaration of conflicting interests
The authors declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
