VAE-based feature extraction for SVM classification of rotor broken bar faults in induction motors

Abstract

Broken-bar faults in induction motors can degrade performance and compromise reliability. Conventional fault diagnosis methods based on time- and frequency-domain feature extraction require expert knowledge and extensive manual processing, limiting their real-time applicability. This work proposes a framework combining Variational Autoencoders (VAE) and Support Vector Machines (SVM) for automatic fault diagnosis with severity assessment. The VAE extracts informative latent features from high-dimensional vibration signals while performing nonlinear dimensionality reduction. These features are then used to train an SVM that classifies the operating mode as normal or as one of several broken rotor bar severity levels. The method was validated on a publicly available dataset collected from a 1 HP induction motor under various load and rotor fault conditions. Results demonstrate that the VAE-SVM framework achieves high classification accuracy and reliably discriminates between vibration signal classes across multiple sensor placements under the considered experimental conditions. These results suggest that the proposed approach is a promising framework for automatic fault diagnosis, with potential applications in condition monitoring and predictive maintenance. Further validation under more diverse operating conditions and real industrial environments is required to confirm its robustness and practical applicability.

Keywords

broken rotor bar faults; vibration signals; variational autoencoder; SVM

1. Introduction

Induction motors play a crucial role in powering a wide array of machinery and equipment across industries. Their integration with Industry 4.0 technologies represents a significant advancement in industrial processes, offering enhanced performance, reliability, and efficiency. By incorporating sensors into induction motors, data on various parameters such as temperature, vibration, current, and other relevant signals can be continuously collected. The analysis of this data through Artificial Intelligence (AI) facilitates the detection of patterns, trends, and irregularities, offering insights into potential faults or avenues for refining motor performance.

Time-domain statistical feature extraction, including metrics such as the mean, standard deviation, skewness, and kurtosis, is extensively employed in the analysis of sensor signals. The combination of these temporal statistical features enables the extraction of meaningful information for industrial system monitoring and fault diagnosis. Such features are often incorporated into machine learning models to build accurate and robust condition-monitoring frameworks (Ezziane et al., 2023). In this context, the study reported in Gangsar and Tiwari (2019) investigates the application of a Support Vector Machine for the classification of ten electrical and mechanical motor faults. The method is based on the extraction of fourteen time-domain features, encompassing statistical metrics such as the mean, standard deviation, crest factor, and kurtosis, computed from both current and vibration signals.

Frequency-domain feature extraction represents a powerful approach for fault diagnosis in industrial processes, as it emphasizes variations in the spectral components of measured signals. Such variations are essential for signal analysis and provide insights into fault-related changes occurring in different industrial systems. The distinctive features identified in the frequency domain form the basis of the signal’s frequency signature. Previous studies have combined spectral analysis with machine learning techniques to diagnose motor faults. For instance, rotor bar faults were detected using sideband amplitudes, with decision trees and neural networks achieving accuracies up to 98% (Chisedzi and Muteba, 2023). Similarly, inter-turn short circuits in stator windings were identified from characteristic frequency components using an SVM, achieving 94.7% accuracy (Pandarakone et al., 2016), demonstrating the effectiveness of frequency-domain features for fault diagnosis.

Time- and frequency-domain feature extraction methods are generally coupled with dimensionality reduction techniques, including Principal Component Analysis (PCA) (Marmouch et al., 2017) and Linear Discriminant Analysis (LDA) (Zhou et al., 2021). These methods facilitate selecting the most informative features and correcting potential errors in the estimation of characteristic fault frequency amplitudes.

Owing to rapid advances in deep learning, image-based approaches have attracted considerable attention for fault diagnosis using measured signals. The core concept is to convert measured signals into two-dimensional representations that can be processed by Convolutional Neural Networks (CNNs). CNNs facilitate automatic extraction of informative features from the input signals, leading to substantial improvements in detection and diagnostic accuracy. A variety of image analysis techniques are widely used in industrial fault diagnosis to transform signals into image representations. These include spectrograms (Tami et al., 2024), wavelet scalograms (Hasan et al., 2021), Hilbert–Huang transform scalograms (Du et al., 2022), Gramian Angular Fields (GAFs) (Zhou et al., 2022), Markov Transition Fields (MTFs) (Memariam et al., 2023), and Recurrence Plots (RPs) (Tarek and Sameh, 2024), each providing a unique perspective on the signal’s image characteristics.

Beyond CNN-based approaches, recurrent neural network architectures, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) networks, have been developed to capture temporal dependencies inherent in measured signals, thereby improving robustness under noisy and dynamic operating conditions. A hierarchical LSTM-based deep network was introduced in Yu et al. (2019) for feature learning and fault recognition of rolling bearings. More recently, attention-enhanced LSTM architectures have demonstrated superior fault detection performance compared with conventional LSTM models by enabling selective focus on informative temporal features (Khaniki et al., 2023). In Yahui et al. (2021), a GRU-based framework was proposed to learn representative features from vibration signals, followed by a multilayer perceptron for fault recognition. Aref et al. (2026) introduced a Dual-Attention CNN–GRU framework with Per-Regime Scaling for predictive maintenance and Remaining Useful Life (RUL) prediction, achieving reliable performance on the challenging C-MAPSS FD002 dataset.

Recently, Transformer-based architectures using self-attention mechanisms have been investigated for machinery fault diagnosis, offering advantages in improving feature extraction capabilities from time series data (Yuhong et al., 2022). These models have shown promising results in detecting faults in rotating machinery by capturing long-range relationships in vibration signals (Yandong et al., 2023). In Tang et al. (2022), a Vision Transformer (ViT) model combined with a wavelet transform and a soft voting method was proposed to improve diagnostic accuracy further.

Despite the advantages of conventional feature extraction approaches, they often require extensive domain expertise and result in high-dimensional representations. While deep learning models such as CNNs, LSTMs/GRUs, and Transformers have achieved state-of-the-art performance in fault diagnosis, they typically require substantial computational resources. To overcome these limitations, this work proposes a hybrid framework combining a variational autoencoder with a Support Vector Machine. The VAE automatically extracts informative, low-dimensional features from sensor signals while capturing the underlying data distribution (Spina et al., 2024), reducing the need for manual feature engineering. The SVM, known for its robustness across classification and regression tasks, leverages these features to provide reliable, accurate fault diagnosis. Although alternative kernel-based models, such as Relevance Vector Machines (RVMs), have been explored in the literature and may offer improved computational efficiency, they do not yield significant gains in classification accuracy compared to SVMs (Kuai et al., 2024). This VAE–SVM pipeline, which combines the strengths of generative deep learning with the proven stability of SVM-based models, was validated on a publicly available dataset, achieving high classification accuracy while indicating favorable computational efficiency.

The remainder of this paper is organized as follows. Section 2 presents the experimental database used in this work and describes the experimental workbench characteristics. Section 3 introduces the theoretical background of the proposed method, including Variational Autoencoders and Support Vector Machines. Section 4 reports the experimental results and performance evaluation. Finally, the conclusion and future recommendations are given in Section 5.

2. Database

In this study, we use an experimental database developed by the Laboratory of Intelligent Automation of Processes and Systems and the Laboratory of Intelligent Control of Electrical Machines, accessible at https://ieee-dataport.org/open-access/ (Treml et al., 2020). The dataset provides synchronized electrical and mechanical measurements designed for induction motor fault diagnosis and rotor broken-bar severity analysis. The signals were recorded from a 1 HP induction motor operating under controlled laboratory conditions, with all signals sampled synchronously over an 18-second acquisition period. The motor was tested at 12.5, 25, 37.5, 50, 62.5, 75, 87.5, and 100% of full load. Each experimental condition was repeated 10 times, yielding multiple independent acquisition trials. In total, 400 experimental signal acquisitions (5 operating conditions × 8 load levels × 10 repetitions) are available for dataset construction and validation of the proposed approach. Table 1 presents detailed characteristics of the workbench and the specific conditions under which the data were collected.

Table 1.

The experimental workbench characteristics.

Induction motor	1 HP, 220 V/380 V, 3.02 A/1.75 A, 4 poles, f_s = 60 Hz, 4.1 Nm, 1715 rpm, and 34 bars
Fault condition	Healthy, one broken bar, two adjacent broken bars, three adjacent broken bars, and four adjacent broken bars
Measured electrical signals	The voltage and the current in each phase
Measured vibration signals	Radial vibration on the driven side; radial vibration on the non-driven side; vibration at the base; vibration in the housing; axial vibration on the driven side
Current probes	Yokogawa 96033 model; output: 10 mV/A
Accelerometers	Frequency range: 5 to 2000 Hz; sensitivity: 10 mV/mm/s
Sampling frequency	Vibration signals: 7600 Hz; electrical signals: 50000 Hz

3. Theoretical background

3.1. Variational autoencoders (VAEs)

Variational autoencoders, introduced by Kingma and Welling (2013), are a class of probabilistic models designed to discover latent, low-dimensional representations in data and serve as a method for dimensionality reduction.

VAEs are a form of autoencoder. Traditional autoencoders consist of an encoder and a decoder network (Figure 1), where the encoder compresses the input data X into a lower-dimensional representation Z (latent space), and the decoder produces a reconstruction X’ of the original data from this representation. By learning to compress and reconstruct input data, autoencoders can capture essential features in a lower-dimensional space, facilitating tasks such as dimensionality reduction, denoising, and data generation (Kingma and Welling, 2013).

Figure 1.

The autoencoder architecture.

During training, the autoencoder aims to minimize the reconstruction error between the input data and the reconstructed data according to equation (1)

R_{loss} = \frac{1}{n} \sum_{i = 1}^{n} {(X_{i} - X_{i}^{'})}^{2}

(1)

where n is the number of elements in X and X’.

VAEs extend this concept by introducing probabilistic principles into the encoding process. VAEs encode each input as a probability distribution characterized by its mean and variance (Figure 2). This enables VAEs to capture the uncertainty inherent in the data and to generate novel samples that capture the variability in the input distribution (Givnan et al., 2022).

Figure 2.

The VAEs architecture.

The loss function used to train VAEs typically consists of the sum of two terms: the reconstruction loss and the Kullback–Leibler (KL) divergence between the approximate posterior and the prior distribution over latent variables. The KL divergence term serves as a regularizer: it ensures that the distribution of latent variables produced by the encoder closely matches a predefined distribution, typically a standard normal distribution, that is, a Gaussian distribution with zero mean and unit variance. Under this prior, the KL divergence term can be expressed as (Jakubowski et al., 2022):

KL_{loss} = - \frac{1}{2} \sum_{i = 1}^{k} \left( 1 + \log ({Z_{σ_{i}}}^{2}) - {Z_{μ_{i}}}^{2} - {Z_{σ_{i}}}^{2} \right)

(2)

where k represents the dimensionality of the latent space, and Z_{μ_{i}} and Z_{σ_{i}} are the mean and standard deviation of the Gaussian distribution parameterized by the encoder network for the input X.
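As a minimal sketch of how equations (1) and (2) combine into a training objective, the plain-Python functions below evaluate both terms for a single sample. The function names are illustrative. The sketch assumes that the encoder outputs the log-variance log(Z_σ²), as the z_log_var layer name in Table 2 suggests, and that the two terms are added without weighting, which the text does not specify.

```python
import math


def reconstruction_loss(x, x_rec):
    """Mean squared error between an input and its reconstruction, equation (1)."""
    n = len(x)
    return sum((a - b) ** 2 for a, b in zip(x, x_rec)) / n


def kl_loss(z_mean, z_log_var):
    """KL divergence to a standard normal prior, equation (2).

    Assumption: z_log_var[i] = log(sigma_i^2), so sigma_i^2 = exp(z_log_var[i]).
    """
    return -0.5 * sum(
        1.0 + log_var - mean ** 2 - math.exp(log_var)
        for mean, log_var in zip(z_mean, z_log_var)
    )


def vae_loss(x, x_rec, z_mean, z_log_var):
    """Unweighted sum of both terms (the weighting is an assumption)."""
    return reconstruction_loss(x, x_rec) + kl_loss(z_mean, z_log_var)
```

When the encoder output coincides with the prior (zero mean, unit variance), the KL term is zero, which is the sense in which it acts as a regularizer on the latent space.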

3.2. Support vector machines (SVMs)

Support Vector Machines offer a powerful approach to both linear and nonlinear classification tasks. SVMs are renowned for their excellent generalization performance, making them widely used in various applications where accurate predictions on unseen data are essential. As illustrated in Figure 3, SVMs aim to find the hyperplane that maximizes the margin between classes. By maximizing the margin, SVMs inherently seek the decision boundary that best separates the classes, which often results in better generalization to unseen data. The data points that lie closest to the hyperplane are called support vectors. These are the critical points that define the decision boundary.

Figure 3.

Optimal hyperplane and support vectors in SVM classification.

For linearly separable data, the hyperplane can be represented by the equation

f (X) = w^{T} X + b = 0

(3)

where X is the input feature vector, w is the weight vector perpendicular to the hyperplane, and b is the bias term.

The margin is the distance between the hyperplane and the nearest data point from either class (support vectors). The optimal hyperplane that separates the classes with the maximum margin while ensuring that all data points are correctly classified can be found by solving the optimization problem (Cortes and Vapnik, 1995):

\begin{cases} \min_{w, b} \frac{1}{2} {‖ w ‖}^{2} \\ y_{i} (w^{T} X_{i} + b) \geq 1, & \text{for each data point } (X_{i}, y_{i}) \end{cases}

(4)

where y_i represents the class label.

Lagrange duality allows us to transform the primal optimization problem into its dual form. The Lagrangian is then defined as the objective function along with the constraints multiplied by their respective Lagrange multipliers α_i ≥ 0 (Cortes and Vapnik, 1995)

L (w, b, α) = \frac{1}{2} {‖ w ‖}^{2} - \sum_{i = 1}^{n} α_{i} (y_{i} (w^{T} X_{i} + b) - 1)

(5)

where α = (α₁,…, α_n) is the set of Lagrange multipliers.

Since practical datasets are rarely perfectly separable, slack variables ξ_i ≥ 0 are introduced to handle non-separable data and improve robustness to noise and outliers. The soft-margin optimization problem becomes:

\begin{cases} \min_{w, b, ξ} \frac{1}{2} {‖ w ‖}^{2} + C \sum_{i = 1}^{n} ξ_{i} \\ \begin{array}{l} y_{i} (w^{T} X_{i} + b) \geq 1 - ξ_{i} \\ ξ_{i} \geq 0, \quad i = 1, \dots, n \end{array} \end{cases}

(6)

where C > 0 is the regularization parameter controlling the trade-off between maximizing the margin and minimizing classification errors.

After minimizing the Lagrangian of the soft-margin problem with respect to w, b, and ξ, the dual formulation is obtained:

\begin{cases} \max_{α} \sum_{i = 1}^{n} α_{i} - \frac{1}{2} \sum_{i, j = 1}^{n} α_{i} α_{j} y_{i} y_{j} X_{i}^{T} X_{j} \\ \begin{array}{l} 0 \leq α_{i} \leq C, \quad i = 1, \dots, n \\ \sum_{i = 1}^{n} α_{i} y_{i} = 0 \end{array} \end{cases}

(7)

The decision function for classifying a new observation X is given by (Cortes and Vapnik, 1995)

f (X) = \sum_{i = 1}^{n} α_{i} y_{i} X_{i}^{T} X + b

(8)

The kernel trick allows SVMs to address nonlinear classification problems by implicitly mapping the input features into a higher-dimensional space where the classes might be linearly separable (Huh, 2015). This allows us to perform linear classification in the higher-dimensional space without explicitly computing the transformed feature vectors. SVMs support various types of kernel functions, such as linear, polynomial, radial basis function, and sigmoid kernels (Chandra and Bedi, 2021). The Radial Basis Function (RBF) kernel, used in this study, is expressed as follows:

K (X_{i}, X_{j}) = \exp \left( - \frac{{‖ X_{i} - X_{j} ‖}^{2}}{2 σ^{2}} \right)

(9)

where X_i and X_j are input feature vectors, and σ is a hyperparameter that controls the spread of the kernel.

For SVMs with a nonlinear kernel, the final decision function for classifying a new data point X can be expressed as follows (Chandra and Bedi, 2021):

f (X) = \sum_{i = 1}^{n} α_{i} y_{i} K (X_{i}, X) + b

(10)
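Equations (9) and (10) can be illustrated with a short plain-Python sketch. The support vectors, multipliers, labels, and bias below are hand-picked toy values chosen only to make the arithmetic easy to follow; they do not come from a trained model, and the function names are illustrative.

```python
import math


def rbf_kernel(xi, xj, sigma):
    """RBF kernel of equation (9): exp(-||xi - xj||^2 / (2 sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


def decision_function(x, support_vectors, alphas, labels, bias, sigma):
    """Kernel decision function of equation (10) for a binary problem."""
    return bias + sum(
        alpha * label * rbf_kernel(sv, x, sigma)
        for sv, alpha, label in zip(support_vectors, alphas, labels)
    )


# Toy values for illustration only (not a trained model).
support_vectors = [[0.0, 0.0], [2.0, 0.0]]
alphas = [1.0, 1.0]
labels = [+1, -1]
bias = 0.0

# A point close to the positive support vector gets a positive score.
score_near_positive = decision_function(
    [0.1, 0.0], support_vectors, alphas, labels, bias, sigma=1.0
)
# A point equidistant from both support vectors lies on the decision boundary.
score_midpoint = decision_function(
    [1.0, 0.0], support_vectors, alphas, labels, bias, sigma=1.0
)
```

The sign of the score gives the predicted class; its magnitude shrinks as the point moves away from all support vectors, at a rate set by σ.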

4. Results and evaluation

To evaluate the performance of the proposed approach in determining rotor cage fault severity, we built a database of 40,000 measurements per sensor by segmenting the signals from three distinct vibration sensors into windows of 800 samples. For each operating mode, 8000 measurements were collected, comprising 1000 measurements per load level and 100 measurements per experimental repetition. Each load condition includes ten independent experimental trials, from each of which 100 non-overlapping windows were extracted. The resulting dataset is balanced across both load conditions and fault severities, with all eight load conditions represented, ensuring comprehensive coverage while minimizing redundancy. The assessment relies on vibration data acquired from three strategically positioned accelerometers, placed to capture radial and axial vibrations on the driven side and radial vibrations on the non-driven side of the motor. Data from each sensor were analyzed independently to evaluate the robustness of the proposed approach across different sensor locations and vibration characteristics.
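The windowing arithmetic above can be checked with a small sketch. It assumes that the 100 windows are taken contiguously from the start of each 18-second recording; the text does not state which part of the recording was used, and the function name is illustrative.

```python
def split_into_windows(signal, window_size=800, n_windows=100):
    """Cut a recording into consecutive, non-overlapping windows.

    Assumption: windows start at the first sample of the recording.
    """
    needed = window_size * n_windows
    if len(signal) < needed:
        raise ValueError(f"need at least {needed} samples, got {len(signal)}")
    return [
        signal[k * window_size:(k + 1) * window_size] for k in range(n_windows)
    ]


SAMPLING_RATE_HZ = 7600  # vibration channels (Table 1)
DURATION_S = 18          # acquisition period per trial

# Placeholder recording with the length of one vibration trial.
recording = [0.0] * (SAMPLING_RATE_HZ * DURATION_S)
windows = split_into_windows(recording)

n_modes, n_loads, n_repetitions = 5, 8, 10
total_windows = n_modes * n_loads * n_repetitions * len(windows)
```

One trial contains 136,800 vibration samples, of which 100 windows of 800 samples use 80,000, and the 400 trials together give the 40,000 windows per sensor reported above.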

Dataset splitting was performed at the experimental-trial level to prevent data leakage, ensuring that all signal windows (100) originating from the same trial were assigned to the same subset. For each sensor, the 40,000 samples (400 trials) were divided into training (70%: 280 trials, 28,000 samples), validation (15%: 60 trials, 6000 samples), and test (15%: 60 trials, 6000 samples) sets using a stratified partitioning strategy based on fault severity to preserve class distribution across all subsets. Load conditions were not stratified and were randomly distributed at the trial level across the training, validation, and test sets, ensuring experimental-trial independence. This procedure ensures class balance while treating load conditions as an uncontrolled operational variable, enabling a realistic evaluation setting.
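A trial-level split of this kind can be sketched as follows. This is one possible implementation under stated assumptions, not the procedure actually used: the trial identifiers are hypothetical, the shuffling uses Python's random module, and the seed value 42 is borrowed from the reproducibility seed mentioned for training.

```python
import random


def split_trials(trials, fractions=(0.70, 0.15, 0.15), seed=42):
    """Stratify by fault class and split whole trials, never single windows.

    trials: list of (trial_id, fault_class) pairs.
    Returns three lists of trial ids: train, validation, test.
    """
    rng = random.Random(seed)
    by_class = {}
    for trial_id, fault_class in trials:
        by_class.setdefault(fault_class, []).append(trial_id)

    train, val, test = [], [], []
    for fault_class in sorted(by_class):
        ids = by_class[fault_class]
        rng.shuffle(ids)  # load levels end up randomly distributed
        n_train = round(fractions[0] * len(ids))
        n_val = round(fractions[1] * len(ids))
        train += ids[:n_train]
        val += ids[n_train:n_train + n_val]
        test += ids[n_train + n_val:]
    return train, val, test


# 5 fault classes x 8 load levels x 10 repetitions = 400 trials (hypothetical ids).
trials = [
    (f"c{c}_l{l}_r{r}", c)
    for c in range(5) for l in range(8) for r in range(10)
]
train_ids, val_ids, test_ids = split_trials(trials)
WINDOWS_PER_TRIAL = 100
```

Because all 100 windows of a trial follow their trial id, no window from a training trial can appear in the validation or test sets.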

The block diagram in Figure 4 provides an overview of the architecture employed in this study. The probability distribution describing the latent-space representation is obtained from a variational dense autoencoder. Variational dense autoencoders are characterized by their utilization of densely connected layers. The encoder network initially accepts the input data. It sequentially processes the data through a series of fully connected layers arranged hierarchically, gradually reducing the input data’s dimensionality. Each neuron in a given layer is connected to every neuron in the subsequent layer, facilitating the extraction of intricate features and patterns from the input. Symmetrically, the decoder network of a dense autoencoder mirrors the architecture of the encoder.

Figure 4.

VAE-based feature extraction and RBF-SVM for multi-class fault severity classification.

Table 2 summarizes the VAE model architecture selected for its optimal performance. Prior to model training, the vibration signals were scaled using min–max normalization to the range [0, 1]. The VAE was trained using the Adam optimizer with a learning rate of 0.001 and a batch size of 512. Training was performed for 50 epochs, with early stopping applied if the validation loss did not improve by at least 0.0001 over 10 consecutive epochs. The activation function used in all hidden layers was ReLU, while a Sigmoid activation was applied to the output layer. The loss function combined the mean squared error for reconstruction with the Kullback–Leibler divergence, as defined in equations (1) and (2). A fixed random seed (42) was set to ensure reproducibility of the results. Figure 5 illustrates the evolution of the training and validation losses over the training process for the three vibration signals considered in this work.
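The early-stopping rule described above can be written as a few lines of plain Python. The sketch assumes that an epoch counts as an improvement when the validation loss drops by at least 0.0001 below the best value seen so far; the exact comparison used by the training framework is not given in the text, and the function name is illustrative.

```python
def stopping_epoch(val_losses, min_delta=1e-4, patience=10):
    """Return the 0-based epoch at which training would stop, or None.

    Training stops once `patience` consecutive epochs fail to improve the
    best validation loss by at least `min_delta`.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if best - loss >= min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None
```

For example, a validation loss that plateaus after the second epoch triggers the stop ten epochs later, whereas a loss that keeps improving by at least 0.0001 per epoch runs for the full 50 epochs.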

Table 2.

VAE model architecture and parameters.

Layer (type)	Output Shape	Param
Dense (Input layer)	(None, 800)	0
Dense	(None, 400)	320400
Dropout (10%)	(None, 400)	0
Dense	(None, 100)	40100
z_mean (Dense)	(None, 20)	2020
z_log_var (Dense)	(None, 20)	2020
Sampling	(None, 20)	0
Dense	(None, 100)	2100
Dropout (10%)	(None, 100)	0
Dense	(None, 400)	40400
Dense	(None, 800)	320800
Total params: 727,840 Trainable params: 727,840 Non-trainable params: 0
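The parameter counts in Table 2 follow from the usual count for a fully connected layer, inputs × outputs plus one bias per output. The short check below reproduces them; the layer labels are descriptive names added here for readability.

```python
def dense_params(n_inputs, n_outputs):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_outputs + n_outputs


# (label, inputs, outputs) for each trainable layer of Table 2.
layers = [
    ("encoder dense", 800, 400),
    ("encoder dense", 400, 100),
    ("z_mean", 100, 20),
    ("z_log_var", 100, 20),
    ("decoder dense", 20, 100),
    ("decoder dense", 100, 400),
    ("decoder output", 400, 800),
]

counts = [dense_params(n_in, n_out) for _, n_in, n_out in layers]
total_params = sum(counts)
```

The input, dropout, and sampling layers contribute no trainable parameters, so the sum over the seven dense layers equals the reported total.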

Figure 5.

Evolution of the training and validation loss over the training iterations: (a) radial vibrations on the driven side, (b) radial vibrations on the non-driven side, and (c) axial vibrations on the driven side.

Figures 6–8 show, for a 20-dimensional latent space, the vibration signals measured under the various operating modes considered. Each figure superimposes the measured vibration signal and the autoencoder-reconstructed signal, and plots the corresponding reconstruction error to highlight the differences between the two. This representation provides a clear visual evaluation of the model’s reconstruction capability. These visual results indicate that the original data are accurately reproduced across all the operating modes examined.

Figure 6.

Comparison of normalized radial vibration signals measured on the driven side, the autoencoder-based reconstruction, and the associated reconstruction error under the investigated operating conditions.

Figure 7.

Comparison of normalized radial vibration signals measured on the non-driven side, the autoencoder-based reconstruction, and the associated reconstruction error under the investigated operating conditions.

Figure 8.

Comparison of normalized axial vibration signals measured on the driven side, the autoencoder-based reconstruction, and the associated reconstruction error under the investigated operating conditions.

The histograms in Figure 9 illustrate the distribution of reconstruction errors on the test data for a latent space of dimension 20. To assess reconstruction performance more fully, several statistical indicators are considered, including the mean, median, and standard deviation of the reconstruction error, as well as the correlation coefficient between the original and reconstructed signals. Together, these metrics quantify both the average deviation and the similarity between the reconstructed outputs and the original signals. Overall, the results show that the latent space effectively reflects the underlying structure of the input data, enabling faithful reconstruction of the original signals. This indicates that the learned representation preserves the essential information of the input signals, in accordance with the autoencoder’s objective, thereby justifying the use of the latent space as a compact representation instead of the original high-dimensional data.

Figure 9.

Histograms of reconstruction errors on the test data: (a) radial vibrations on the driven side, (b) radial vibrations on the non-driven side, and (c) axial vibrations on the driven side.

Principal Component Analysis (PCA) is used to analyze the distribution of the encoder’s latent representations. Figure 10 presents the PCA projection of the test data for a 20-dimensional latent space. Each point corresponds to a data sample, while distinct colors indicate different operating modes. The spatial proximity of the points reflects the autoencoder’s learned similarity. The results show that the different operating modes (normal and fault severity levels) can be visually distinguished, demonstrating the autoencoder’s ability to extract meaningful features from the original data during the encoding stage. As the latent representations are not linearly separable, a nonlinear kernel is required for the SVM classifier to achieve effective discrimination between the different operating modes.

Figure 10.

Principal component analysis (PCA) of the encoded test data: (a) radial vibrations on the driven side, (b) radial vibrations on the non-driven side, and (c) axial vibrations on the driven side.

In the final stage, a Support Vector Machine classifier with an RBF kernel is trained using the latent representations as input features and the associated labels as target outputs. The SVM hyperparameters (C = 100, γ = 1/(2σ²) = 0.1) were optimized via cross-validation on the training set. The SVM implementation adopts a one-vs-one (OvO) multi-class approach for the five operating modes. The trained model is then evaluated on unseen test data to assess its classification performance and generalization ability. This evaluation validates the reliability of the SVM classifier trained on latent features extracted by the neural autoencoder.
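Two details of this configuration can be made concrete with a small sketch: the kernel width σ implied by γ = 0.1 through γ = 1/(2σ²), and the number of binary classifiers that a one-vs-one scheme trains for five classes. The voting function and the hypothetical duel outcomes are illustrative only; tie-breaking, which real implementations handle in their own way, is not modeled.

```python
from collections import Counter
from itertools import combinations

CLASSES = [
    "normal", "1 broken bar", "2 broken bars", "3 broken bars", "4 broken bars",
]

# Kernel width implied by gamma = 1 / (2 sigma^2).
gamma = 0.1
sigma = (1.0 / (2.0 * gamma)) ** 0.5


def ovo_pairs(classes):
    """All unordered class pairs; one binary SVM is trained per pair."""
    return list(combinations(classes, 2))


def ovo_vote(pairwise_winners):
    """Majority vote over pairwise duels: {(class_a, class_b): winner}."""
    votes = Counter(pairwise_winners.values())
    return votes.most_common(1)[0][0]


pairs = ovo_pairs(CLASSES)

# Hypothetical outcome: "2 broken bars" wins each duel it takes part in;
# every other duel goes to the first class of the pair.
winners = {
    pair: ("2 broken bars" if "2 broken bars" in pair else pair[0])
    for pair in pairs
}
predicted = ovo_vote(winners)
```

With five operating modes, the scheme trains 5 × 4 / 2 = 10 binary classifiers, and γ = 0.1 corresponds to σ² = 5.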

Table 3 summarizes the results obtained for various latent space dimensionalities. The results indicate that latent spaces of dimension 10 or more yield consistently strong performance, demonstrating the autoencoder’s capability to encode and reconstruct the most informative signal characteristics. However, a significant decline in accuracy is observed when the latent space dimensionality is reduced below this value, as illustrated by the results reported for a 3-dimensional latent space. This behavior is expected, since excessively compressed latent representations cannot preserve the discriminative information required to distinguish between operating modes. Therefore, the latent dimension represents a trade-off between representation compactness and diagnostic performance and was selected empirically based on validation results.

Table 3.

Accuracy performance of the autoencoder–SVM approach as a function of the latent space dimensionality.

Latent space dimensionality	Radial vibrations on the driven side	Radial vibrations on the non-driven side	Axial vibrations on the driven side
50	100%	99.63%	99.31%
40	100%	99.42%	99.53%
30	100%	99.43%	99.23%
20	99.98%	99.1%	99.28%
10	99.42%	99.05%	98.83%
3	80.55%	77.08%	64.17%

Figure 11 illustrates the confusion matrices computed on the test dataset for the three vibration signals, using an autoencoder with a 20-dimensional latent space. These matrices provide a comprehensive assessment of the SVM classifiers’ ability to accurately distinguish among the different vibration signal classes, based on their representations in the latent space learned by the autoencoder.

Figure 11.

Confusion matrices computed on the test data using the autoencoder–SVM approach with a 20-dimensional latent space: (a) radial vibrations on the driven side, (b) radial vibrations on the non-driven side, and (c) axial vibrations on the driven side.

For a latent dimension of 20, classification performance is quantitatively evaluated using class-wise precision, recall, and F1 Scores. In addition, macro-averaged metrics are reported in Table 4 to summarize the overall performance for each signal.

Table 4.

Class-wise performance comparison (%) for latent dimension of 20.

Signal	Radial vibrations on the driven side			Radial vibrations on the non-driven side			Axial vibrations on the driven side
Class	Precision	Recall	F1-score	Precision	Recall	F1-score	Precision	Recall	F1-score
Normal	100	100	100	99.75	99.92	99.83	100	99.92	99.96
1 broken bar	99.92	100	99.96	97.76	98.08	97.92	97.86	99.17	98.51
2 broken bars	100	100	100	100	99.83	99.92	99.92	99.83	99.87
3 broken bars	100	100	100	99.75	99.75	99.75	99.75	100	99.88
4 broken bars	100	99.92	99.96	98.24	97.92	98.08	98.9	97.50	98.2
Macro Avg	99.98	99.98	99.98	99.1	99.1	99.1	99.29	99.28	99.28
Accuracy	99.98			99.1			99.28
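The relation between the class-wise and macro-averaged entries of Table 4 can be reproduced with a few lines of Python, shown here for the non-driven-side radial sensor. The function names are illustrative. Because the inputs are the rounded values printed in the table, recomputed F1 scores may differ from the tabulated ones in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def macro_average(values):
    """Unweighted mean over classes."""
    return sum(values) / len(values)


# Class-wise precision and recall (%) copied from Table 4,
# radial vibrations on the non-driven side, in row order.
precision = [99.75, 97.76, 100.0, 99.75, 98.24]
recall = [99.92, 98.08, 99.83, 99.75, 97.92]

f1 = [f1_score(p, r) for p, r in zip(precision, recall)]
macro_precision = macro_average(precision)
macro_recall = macro_average(recall)
```

Macro averaging weights every class equally, which is appropriate here because the dataset is balanced across the five operating modes.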

These results demonstrate that the proposed VAE–SVM framework effectively learns discriminative representations from vibration signals while maintaining high classification accuracy for the investigated operating modes. Because the classifier operates on a 20-dimensional latent vector rather than on the 800-sample input window, the results also suggest a potential reduction in the computational complexity of the classification stage.

To further assess the effectiveness of the proposed method, a comparative analysis is conducted against previously reported approaches in the literature using the same dataset. This comparison includes both classical feature-based approaches and recent representation-learning techniques, enabling a fair evaluation of the proposed method under identical experimental conditions. The detailed results are summarized in Table 5.

Table 5.

Comparison results across previous studies using the same dataset.

Reference	Signals	Features	Methods	Accuracy (%)
(Tarek and Sameh, 2024)	Radial vibrations on the driven side; radial vibrations on the non-driven side; axial vibrations	Recurrence plots	GoogLeNet	96.8; 89.6; 92.8
(Tarek and Sameh, 2024)		Recurrence plots from multi-sensor vibration signals	GoogLeNet	100
(Misra et al., 2022)	Axial vibrations	Features from the time domain	KNN; Decision Tree; Random Forest	77.37; 83.8; 86.8
		Features from the frequency domain	KNN; Decision Tree; Random Forest	80.53; 81.71; 85.92
		Spectrograms	VGG16; InceptionV3; MobileNetV2	95.33; 94.00; 97.67
(Dişli et al., 2023)	Radial vibrations on the driven side; stator current	Scalogram from radial vibrations	ResNet18/SVM	99
(Dişli et al., 2023)	Radial vibrations on the driven side; stator current	Radial vibrations signal and stator current features	ResNet18/SVM	100
(Kumar et al., 2026)	Radial vibrations on the driven side; radial vibrations on the non-driven side; axial vibrations	Gramian angular field	EfficientNetB3	99.83
		Scalograms		99.80
		Spectrograms		94.53
		Raw time series data	1D-CNN	80.13

It can be observed that classical machine learning approaches based on time- and frequency-domain features exhibit lower performance, as reported by Misra et al. (2022), highlighting the limitations of manually engineered features in capturing complex fault characteristics. Deep learning approaches based on image transformations significantly improve classification accuracy but require computationally expensive preprocessing steps and large model architectures. Likewise, methods that combine convolutional neural networks with SVM classifiers or rely on multi-sensor fusion strategies achieve very high accuracy but generally at increased computational cost.

In contrast, the proposed VAE–SVM framework achieves comparable or superior classification accuracy while operating on compact probabilistic latent representations learned directly and independently from each vibration signal. By jointly performing nonlinear dimensionality reduction and feature learning, the proposed method provides an effective trade-off between diagnostic accuracy, computational efficiency, and model generalization. These results support the robustness and practical suitability of the proposed approach under the considered experimental conditions.

5. Conclusion

In this study, a VAE-SVM approach was proposed for the automatic detection and classification of broken-bar faults in three-phase induction motors. Variational autoencoders were employed to extract informative latent features from vibration signals while simultaneously reducing data dimensionality. These latent representations were subsequently used as inputs for Support Vector Machine classifiers to evaluate fault classification under the considered operating conditions.
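As a reminder of what the latent features are, the following minimal sketch shows the two operations that define a Gaussian VAE latent space in the standard formulation of Kingma and Welling (2013): the reparameterized sample and the closed-form KL regularizer. The encoder outputs `mu` and `log_var` are hypothetical values for a single vibration window, and using the latent mean as the SVM input is an assumption of this sketch rather than a detail stated here.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), element-wise."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))


# Hypothetical encoder output for one vibration window (latent dimension 4).
mu = [0.8, -1.2, 0.1, 0.0]
log_var = [-2.0, -1.5, -0.2, 0.0]

rng = random.Random(0)
z_train = reparameterize(mu, log_var, rng)  # stochastic sample, used while training the VAE
z_feature = list(mu)                        # deterministic feature vector (assumed SVM input)
kl = kl_to_standard_normal(mu, log_var)     # regularizer added to the reconstruction loss
```

The KL term vanishes only when the encoder output matches the standard normal prior, which is what keeps the latent space compact and well behaved for a downstream classifier.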

The results show that the proposed VAE-SVM framework effectively captures the essential characteristics of vibration signals and enables accurate classification of normal operating conditions and broken bar fault severity levels. Classification accuracies remained high for latent spaces with dimensions up to 10, while a noticeable performance degradation occurred for very low-dimensional representations. The SVM classifier demonstrated good discriminative capability across the different studied fault classes, reflecting the usefulness of the learned latent features.
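The classification stage can be illustrated with a deliberately simplified sketch: a binary linear SVM trained by stochastic subgradient descent on the regularized hinge loss, applied to synthetic two-dimensional "latent" vectors. Everything here is an assumption made for illustration: the data are randomly generated rather than taken from the motor dataset, the kernel is linear, and only two classes are used, whereas the study addresses several severity levels.

```python
import random


def train_linear_svm(X, y, lam=0.01, eta=0.01, epochs=100, seed=0):
    """Binary linear SVM (labels -1/+1) fitted by SGD on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    b = 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            score = sum(wj * xj for wj, xj in zip(w, X[i])) + b
            # L2 regularization shrinks the weights at every step.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # The hinge loss contributes only when the sample violates the margin.
            if y[i] * score < 1.0:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
                b += eta * y[i]
    return w, b


def predict(w, b, x):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0.0 else -1


def make_latents(n, center, label, rng):
    """Synthetic stand-in for latent vectors of one class (not real motor data)."""
    return [([rng.gauss(c, 0.5) for c in center], label) for _ in range(n)]


rng = random.Random(42)
train = make_latents(100, (2.0, 2.0), 1, rng) + make_latents(100, (-2.0, -2.0), -1, rng)
test = make_latents(50, (2.0, 2.0), 1, rng) + make_latents(50, (-2.0, -2.0), -1, rng)

w, b = train_linear_svm([x for x, _ in train], [label for _, label in train])
accuracy = sum(predict(w, b, x) == label for x, label in test) / len(test)
print(f"held-out accuracy on synthetic latents: {accuracy:.2f}")
```

A multi-class version would typically combine several such binary classifiers (for example one-vs-rest), and a nonlinear kernel may be needed when the latent classes are not linearly separable.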

Overall, integrating VAEs as an automatic feature-extraction and dimensionality-reduction stage significantly reduces the complexity of the classification process and eliminates the need for manual feature engineering. However, the evaluation in this study is limited to a publicly available dataset under controlled experimental conditions. While the results confirm the approach’s effectiveness under these conditions, further investigations are required to assess its robustness in more realistic industrial scenarios, including variable-speed operation, noisy environments, and different machine types.

Future work will focus on improving generalization by considering a wider range of operating conditions and extending the framework to multi-fault and more complex fault scenarios. In addition, the proposed approach will be evaluated using different types of measured signals (electrical and acoustic signals) across various industrial processes.

Footnotes

Acknowledgements

The author expresses his sincere gratitude to the Laboratory of Intelligent Automation of Processes and Systems and the Laboratory of Intelligent Control of Electrical Machines, School of Engineering of São Carlos, University of São Paulo (USP), Brazil, for providing access to the open-source database.

ORCID iD

Tarek Aroui

Funding

The author received no financial support for the research, authorship, and/or publication of this article.

Declaration of conflicting interests

The author declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

References

Aref, Sahar and Omid (2026) Condition-aware AI for predictive maintenance: dual-attention CNN-GRU with per-regime scaling. Expert Systems with Applications 315: 131582. https://doi.org/10.1016/j.eswa.2026.131582

Chandra and Bedi (2021) Survey on SVM and their application in image classification. International Journal of Information Technology 13: 1–11. https://doi.org/10.1007/s41870-017-0080-1

Chisedzi and Muteba (2023) Detection of broken rotor bars in cage induction motors using machine learning methods. Sensors 23(22): 9079. https://doi.org/10.3390/s23229079

Cortes and Vapnik (1995) Support-vector networks. Machine Learning 20: 273–297. https://doi.org/10.1007/BF00994018

Dişli, Gedikpinar and Sengur (2023) Deep transfer learning-based broken rotor fault diagnosis for induction motors. Turkish Journal of Science and Technology 18(1): 275–290. https://doi.org/10.55525/tjst.1261887

Zhang, Fang, et al. (2022) Motor bearing fault diagnosis based on Hilbert–Huang transform and convolutional neural networks. In: IEEE transportation electrification conference and expo (ITEC Asia-Pacific), Haining, China, 28-31 October 2022, pp. 1–5. https://doi.org/10.1109/ITECAsia-Pacific56316.2022.9941910

Ezziane, Houassine, Moulahoum, et al. (2023) Evaluating the severity of transformer winding faults using FRA and artificial intelligence. Russian Electrical Engineering 94: 138–142. https://doi.org/10.3103/S1068371223020050

Gangsar and Tiwari (2019) Support vector machine-based fault diagnostics of induction motors for a practical situation of multi-sensor limited data case. Measurement 135: 694–711. https://doi.org/10.1016/j.measurement.2018.12.011

Givnan, Chalmers, Fergus, et al. (2022) Anomaly detection using autoencoder reconstruction upon industrial motors. Sensors 22(9): 3166. https://doi.org/10.3390/s22093166

Hasan, Rai, Ahmad, et al. (2021) A fault diagnosis framework for centrifugal pumps by scalogram-based imaging and deep learning. IEEE Access 9: 58052–58066. https://doi.org/10.1109/ACCESS.2021.3072854

Huh M-H (2015) Kernel-trick regression and classification. Communications for Statistical Applications and Methods 22(2): 201–207. https://doi.org/10.5351/CSAM.2015.22.2.201

Jakubowski, Stanisz, Bobek, et al. (2022) Anomaly detection in asset degradation process using variational autoencoder and explanations. Sensors 22(1): 291. https://doi.org/10.3390/s22010291

Khaniki MAL, Mirzaeibonehkhater and Manthouri (2023) Enhancing fault detection in induction motors using LSTM-attention neural networks. In: 9th international conference on control, instrumentation and automation (ICCIA), Tehran, Iran, 20-21 December 2023, pp. 1–5. https://doi.org/10.1109/ICCIA61416.2023.10506369

Kingma and Welling (2013) Auto-encoding variational Bayes. arXiv preprint arXiv:1312.6114. https://doi.org/10.48550/arXiv.1312.6114

Kuai, Civera, Coletta, et al. (2024) Cointegration strategy for damage assessment of offshore platforms subject to wind and wave forces. Ocean Engineering 304: 117692. https://doi.org/10.1016/j.oceaneng.2024.117692

Kumar (2026) Image-based fault detection and severity classification of broken rotor bars in induction motors using EfficientNetB3. Energies 19(4): 1110. https://doi.org/10.3390/en19041110

Marmouch, Aroui and Koubaa (2017) Induction machine faults diagnosis by statistical neural networks with selection variables based on principal component analysis. In: 18th international conference on sciences and techniques of automatic control and computer engineering (STA), Monastir, Tunisia, 21-23 December 2017, pp. 99–103. https://doi.org/10.1109/STA.2017.8314887

Memariam, Seshu and Huang (2023) Control valve stiction detection using Markov transition field and deep convolutional neural network. The Canadian Journal of Chemical Engineering 101(11): 6114–6125. https://doi.org/10.1002/cjce.25054

Misra, Kumar, Sayyad, et al. (2022) Fault detection in induction motor using time domain and spectral imaging-based transfer learning approach on vibration data. Sensors 22(21): 8210. https://doi.org/10.3390/s22218210

Pandarakone, Mizuno and Nakamura (2016) Frequency spectrum investigation and analytical diagnosis method for turn-to-turn short-circuit insulation failure in stator winding of low voltage induction motor. IEEE Transactions on Dielectrics and Electrical Insulation 23(6): 3249–3255. https://doi.org/10.1109/TDEI.2016.006095

Spina, Luiz FOC, Wallthynay, et al. (2024) Comparison of autoencoder architectures for fault detection in industrial processes. Digital Chemical Engineering 12: 100162. https://doi.org/10.1016/j.dche.2024.100162

Tami, Masri, Hasasneh, et al. (2024) Transformer-based approach to pathology diagnosis using audio spectrogram. Information 15(5): 253. https://doi.org/10.3390/info15050253

Tang and Wang (2022) A novel fault diagnosis method of rolling bearing based on integrated vision transformer model. Sensors 22(10): 3878. https://doi.org/10.3390/s22103878

Tarek and Sameh (2024) Improved deep-learning rotor fault diagnosis based on multi vibration sensors and recurrence plots. Journal of Vibration and Control 31(9–10): 1874–1883. https://doi.org/10.1177/10775463241250367

Treml, Flauzino, Suetake, et al. (2020) Experimental database for detecting and diagnosing rotor broken bar in a three phase induction motor. IEEE Dataport. https://ieee-dataport.org/open-access/experimental-databasedetecting-and-diagnosing-rotor-broken-bar-three-phase-induction

Yahui, Taotao, Xufeng, et al. (2021) Fault diagnosis of rotating machinery based on recurrent neural networks. Measurement 171: 108774. https://doi.org/10.1016/j.measurement.2020.108774

Yandong, Jinjin, Zhengquan, et al. (2023) Diagnosisformer: an efficient rolling bearing fault diagnosis method based on improved transformer. Engineering Applications of Artificial Intelligence 124: 106507. https://doi.org/10.1016/j.engappai.2023.106507

Gao, et al. (2019) A novel hierarchical algorithm for bearing fault diagnosis based on stacked LSTM. Shock and Vibration 2019: 2756284. https://doi.org/10.1155/2019/2756284

Yuhong, Lei and Yushu (2022) A time series transformer based method for the rotating machinery fault diagnosis. Neurocomputing 494: 379–395. https://doi.org/10.1016/j.neucom.2022.04.111

Zhou, Yan, Ren, et al. (2021) Rolling bearing fault diagnosis using transient-extracting transform and linear discriminant analysis. Measurement 178: 109298. https://doi.org/10.1016/j.measurement.2021.109298

Zhou, Long, Sun, et al. (2022) Bearing fault diagnosis based on Gramian angular field and DenseNet. Mathematical Biosciences and Engineering 19(12): 14086–14101. https://doi.org/10.3934/mbe.2022656