A fuzzy support vector machine based on environmental membership and its application to motor fault classification

Abstract

To weaken the effects of the outliers or noise in classification, a fuzzy support vector machine (FSVM) based on environmental fuzzy membership is proposed. The environmental fuzzy membership considers not only the number of the similar samples nearby but also the distribution of the samples nearby. As more information of the samples is considered, the reliability and robustness of the FSVM is further enhanced, which can improve the classification performance, especially for overlapping samples. The classification performance of the proposed method is validated by numerical case studies, an experimental study for a breast cancer dataset, and an application to motor fault classification. Compared with the FSVM based on the k-nearest neighbor algorithm, the proposed method obtains more robust and accurate classification rates in all case studies.

Keywords

Environmental fuzzy membership, fuzzy support vector machine, motor fault classification, k-nearest neighbor algorithm, motor fault diagnosis

1. Introduction

Owing to their reliability, low cost, robustness, and ease of control, motors are widely used in industry (Arabaci and Bilgin, 2009). Condition monitoring and fault diagnosis are therefore important for keeping motors in good operating condition. Furthermore, advanced diagnosis and prognostics of motor faults have been shown to reduce the cost of maintenance and the probability of unexpected failure (Ayaz et al., 2006). However, because the fault mechanisms are complicated and the characteristic frequencies of different faults alias with one another, classifying motor faults accurately remains challenging.

As a result of the development of data mining techniques, the classification of motor faults has attracted considerable attention in recent years. Many signal processing techniques have been used in this field, such as fast Fourier transform (FFT; Kabla and Mokrani, 2011; Akar, 2013), short time Fourier transform (STFT; Nandi et al., 2011; Cabal-Yepez et al., 2013; Wang et al., 2013a), wavelet transform (Yaqub et al., 2012; Gritli et al., 2013), blind source separation (Cheng et al., 2011, 2012a, 2012b, 2014), Hilbert transform (HT; Pineda-Sanchez et al., 2009; Xu et al., 2013), and empirical mode decomposition (Antonino-Daviu et al., 2012). Intelligent methods also have been considered and widely used, such as artificial neural networks (Arabaci and Bilgin, 2010; Bingol and Pacaci, 2012) and support vector machines (SVMs; Ebrahimi and Faiz, 2010; Banerjee and Das, 2012; Yaqub et al., 2013). All of these methods make great contributions to the classification of motor faults. Torkaman et al. (2011) applied FFT to classify bearing failures based on the characteristic frequencies of faults. Vulli et al. (2009) applied STFT to isolate the occurrence of a particular fault in the time domain. Zhang et al. (2012) applied wavelets to the classification of motor faults, especially for broken bars. To overcome the disadvantages of FFT for the induction motors at low slip, Xu et al. (2013) proposed an improved Hilbert method by combining the HT and estimation of signal parameters via rotational invariance technique (ESPRIT). However, the mechanism of motor faults is complicated, and fault features are difficult to extract, which makes it still a challenging problem for motor fault classification, especially multiple classification. Furthermore, most of these methods are applied to diagnose specific faults rather than multiple faults. 
Therefore, this paper proposes a novel classification method based on the fuzzy support vector machine (FSVM) to overcome such problems for motor fault classification.

An SVM is a powerful tool for classification problems, and many SVM-based classification methods have been proposed and applied to real applications (Sabzekar et al., 2011; Keskes et al., 2013; Liu et al., 2013; Wang et al., 2013b). However, they are all sensitive to outliers or noise. To overcome such a problem, Lin and Wang (2002) proposed the FSVM, which has been widely studied and applied to engineering applications in recent years. However, the definition of fuzzy membership is a major issue of the FSVM and a general criterion is still lacking. The most challenging problem is that the noise or outliers influence the definition of fuzzy memberships. Lin and Wang (2002) defined a fuzzy membership based on the distances between the samples and their center. Zhang et al. (2006) define a fuzzy membership by combining an affinity with the distances between the samples and their center. Heo et al. (2010) used k-nearest neighbor (KNN) to define the fuzzy membership. Wei and Wu (2012) used a Gaussian kernel-based definition to calculate the fuzzy membership.

However, all these proposed methods still cannot provide an effective way to overcome the effects of the outliers or noise. Therefore, we propose a novel definition of fuzzy membership entitled environmental fuzzy membership, which focuses on the local distribution so that the noise or outliers cannot influence the global fuzzy membership of samples. Our definition contains two factors: average distance and the ratio of similar samples and heterogeneous samples in a local area. Based on these two factors, we classify all the samples into three clusters, and the fuzzy membership is calculated by different methods in each cluster. The performance of our method is validated by numerical case studies, an experimental study for a breast cancer dataset, and an application to motor fault classification. The results show that our method can effectively weaken the effects of the outliers or noise, and adaptively classify all the samples into the right cluster with a high correct rate (near 100%), while the performance of the traditional FSVM based on KNN is influenced by the parameter k and the classification correct rates of some samples are very low (even 19.03%). Therefore, the proposed FSVM with environmental fuzzy membership can effectively solve the problems caused by the outliers or noise.

The rest of this paper is organized as follows. In Section 2, the theory of FSVM is introduced. In Section 3, the disadvantages of the fuzzy membership are described, and the environmental fuzzy membership is also proposed and introduced. In Sections 4 and 5, numerical case studies, an experimental study on a breast cancer dataset, and an application to motor fault classification are presented, and the performances of the FSVM based on KNN and our method are comparatively studied and discussed. In Section 6, we present our conclusions.

2. Fuzzy support vector machine

The FSVM, a modification of the SVM proposed by Lin and Wang (2002), assigns a fuzzy membership to each training sample. With different fuzzy memberships, the samples make different contributions to the classification. Therefore, the effects of noise or outliers can be weakened by means of the fuzzy membership.

The computational framework of the FSVM is similar to that of the SVM. Consider a set of labeled training samples

(y_{1}, x_{1}), \dots, (y_{n}, x_{n})

(1)

Each training sample $x_{i} \in R^{N}$ belongs to one of two clusters, and $y_{i} \in \{- 1, 1\}$, for $i = 1, \dots, n$, is the cluster label. Lin and Wang (2002) associated every sample with a fuzzy membership $σ \leq s_{i} \leq 1$, for $i = 1, \dots, n$, where $σ > 0$ is a sufficiently small constant

(y_{1}, x_{1}, s_{1}), \dots, (y_{n}, x_{n}, s_{n})

(2)

Thus the classification problem can be treated as a quadratic programming problem as

\begin{array}{l} \text{Minimize } \frac{1}{2} W \cdot W + c \sum_{i = 1}^{n} s_{i} ξ_{i} \\ \text{Subject to } \begin{cases} y_{i} (W \cdot Z_{i} + b) \geq 1 - ξ_{i}, & i = 1, \dots, n \\ ξ_{i} \geq 0, & i = 1, \dots, n \end{cases} \end{array}

(3)

where c is a constant, $Z_{i} = φ (x_{i})$ denotes the feature space vector obtained through a mapping φ from $R^{N}$ to a feature space Z, $ξ_{i}$ is a measure of the error for the SVM, and $s_{i} ξ_{i}$ is a measure of the error with different weighting. A smaller $s_{i}$ reduces the influence of $ξ_{i}$ and hence the weight of the corresponding sample $x_{i}$, which weakens the effects of noise or outliers in the classification problem.
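The weighting role of $s_{i}$ in equation (3) can be illustrated with a small numerical sketch. It assumes a linear decision function (φ taken as the identity), a fixed and hypothetical choice of W and b, and made-up one-dimensional data; the function names are illustrative and not part of the original method.

```python
def hinge_slacks(w, b, xs, ys):
    """Slack xi_i = max(0, 1 - y_i (w . x_i + b)) for a linear decision function."""
    return [max(0.0, 1.0 - y * (sum(wk * xk for wk, xk in zip(w, x)) + b))
            for x, y in zip(xs, ys)]


def fsvm_objective(w, b, xs, ys, s, c):
    """Objective of equation (3): 0.5 * w.w + c * sum_i s_i * xi_i."""
    slacks = hinge_slacks(w, b, xs, ys)
    margin_term = 0.5 * sum(wk * wk for wk in w)
    return margin_term + c * sum(si * xi for si, xi in zip(s, slacks))


# Hypothetical 1-D data: the last sample carries a positive label
# but lies deep inside the negative cluster (an outlier).
xs = [(2.0,), (3.0,), (-2.0,), (-3.0,), (-2.5,)]
ys = [1, 1, -1, -1, 1]
w, b, c = (1.0,), 0.0, 10.0

plain = fsvm_objective(w, b, xs, ys, [1.0] * 5, c)                    # all s_i = 1
fuzzy = fsvm_objective(w, b, xs, ys, [1.0, 1.0, 1.0, 1.0, 0.1], c)    # outlier down-weighted
print(plain, fuzzy)
```

With all memberships equal to 1 the outlier dominates the penalty term; with $s_{i} = 0.1$ its contribution shrinks by a factor of ten, so the optimizer has far less incentive to move the boundary toward it.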

The problem is solved by introducing the Lagrangian function (from here on, the number of training samples n is written as l)

\begin{matrix} L (W, b, ξ, α, β) \\ = \frac{1}{2} W \cdot W + c \sum_{i = 1}^{l} s_{i} ξ_{i} \\ - \sum_{i = 1}^{l} α_{i} (y_{i} (W \cdot Z_{i} + b) - 1 + ξ_{i}) - \sum_{i = 1}^{l} β_{i} ξ_{i} \end{matrix}

(4)

where $α_{i} \geq 0$ and $β_{i} \geq 0$ are the Lagrange multipliers. The saddle point of $L (W, b, ξ, α, β)$ is found where the parameters satisfy the following conditions

\frac{\partial L (W, b, ξ, α, β)}{\partial W} = W - \sum_{i = 1}^{l} α_{i} y_{i} Z_{i} = 0

(5)

\frac{\partial L (W, b, ξ, α, β)}{\partial b} = - \sum_{i = 1}^{l} α_{i} y_{i} = 0

(6)

\frac{\partial L (W, b, ξ, α, β)}{\partial ξ_{i}} = s_{i} c - α_{i} - β_{i} = 0

(7)

Thus the quadratic programming problem in equation (3) can be transformed as

\begin{matrix} \text{maximize } W (α) = \sum_{i = 1}^{l} α_{i} - \frac{1}{2} \sum_{i = 1}^{l} \sum_{j = 1}^{l} α_{i} α_{j} y_{i} y_{j} K (x_{i}, x_{j}) \\ \text{subject to } \sum_{i = 1}^{l} y_{i} α_{i} = 0, \quad 0 \leq α_{i} \leq s_{i} c, \quad i = 1, \dots, l \end{matrix}

(8)
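The step from conditions (5)–(7) to the dual problem (8) can be sketched as follows. This is a brief outline under the usual assumption that the multipliers of inequality constraints are non-negative ($α_{i} \geq 0$, $β_{i} \geq 0$) and that the kernel equals the feature space dot product, $K (x_{i}, x_{j}) = Z_{i} \cdot Z_{j}$.

```latex
% Box constraint: equation (7) with \beta_i \geq 0 and \alpha_i \geq 0
\beta_{i} = s_{i} c - \alpha_{i} \geq 0
\;\Rightarrow\; 0 \leq \alpha_{i} \leq s_{i} c, \quad i = 1, \dots, l

% Objective: substitute W = \sum_i \alpha_i y_i Z_i from (5) into (4);
% the b-term vanishes by (6) and the \xi_i-terms vanish by (7)
L = \frac{1}{2} W \cdot W - W \cdot W + \sum_{i = 1}^{l} \alpha_{i}
  = \sum_{i = 1}^{l} \alpha_{i}
  - \frac{1}{2} \sum_{i = 1}^{l} \sum_{j = 1}^{l}
    \alpha_{i} \alpha_{j} y_{i} y_{j} \, Z_{i} \cdot Z_{j}
```

The only difference from the ordinary SVM dual is the upper bound: each multiplier is capped at $s_{i} c$ instead of c, so a sample with a small fuzzy membership can contribute only weakly to W.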

Here the kernel $K (x_{i}, x_{j})$ computes the dot product of the samples $x_{i}$ and $x_{j}$ in the feature space Z. Because the FSVM can reduce the effects of outliers or noise, it is regarded as a powerful fault classification method.

3. Fuzzy membership

3.1. The disadvantages of the existing fuzzy membership

The most challenging problem of the FSVM is how to define the fuzzy membership. In past decades, many criteria have been proposed (Zhang et al., 2006; Heo et al., 2010; Wei and Wu, 2012). However, a general criterion for the fuzzy membership is still lacking. The existing fuzzy memberships fall into two kinds. The first kind is based on the global distribution; its most representative form defines the fuzzy membership through the distance between the sample and the cluster center. The second kind is based on the local distribution.

For the fuzzy membership defined through the distance between a sample and its cluster center, the first step is to calculate the cluster centers $O^{+}$ and $O^{-}$ of the positive samples $x^{+}$ and the negative samples $x^{-}$. The radius of each cluster is then defined as

r^{+} = \max_{i : y_{i} = 1} ‖ O^{+} - x_{i} ‖

(9)

r^{-} = \max_{i : y_{i} = - 1} ‖ O^{-} - x_{i} ‖

(10)

The fuzzy membership $s_{i}$ is defined as

s_{i} = \begin{cases} 1 - ‖ O^{+} - x_{i} ‖ / (r^{+} + δ), & \text{if } y_{i} = 1 \\ 1 - ‖ O^{-} - x_{i} ‖ / (r^{-} + δ), & \text{if } y_{i} = - 1 \end{cases}

(11)

where $δ > 0$ is used to avoid the case $s_{i} = 0$.

Figures 1 and 2 present two different distributions. Comparing Figure 1 with Figure 2, the two sets of samples are identical except for one additional sample in Figure 2. Adding this sample shifts the cluster center and changes the radius. Therefore, the fuzzy memberships of all the samples change because of one additional sample, which is why the classification performance of an FSVM based on the global distribution is normally influenced by outliers or noise.
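The sensitivity described above can be reproduced with a few lines of code. The following is a minimal sketch of the center-distance membership of equation (11) for a single class, assuming the cluster center is the arithmetic mean of the samples; the point coordinates and the value of δ are hypothetical.

```python
import math


def center_membership(points, delta=0.1):
    """Equation (11) for one class: s_i = 1 - ||O - x_i|| / (r + delta)."""
    dim = len(points[0])
    # Assumption: the cluster center O is the mean of the class samples.
    center = [sum(p[k] for p in points) / len(points) for k in range(dim)]
    dists = [math.dist(center, p) for p in points]
    radius = max(dists)
    return [1.0 - d / (radius + delta) for d in dists]


cluster = [(0.0, 0.0), (1.0, 0.0), (-1.0, 0.0), (0.0, 1.0), (0.0, -1.0)]
before = center_membership(cluster)
# Same five samples plus one distant sample; keep only the original five.
after = center_membership(cluster + [(10.0, 0.0)])[:5]

for b, a in zip(before, after):
    print(f"{b:.3f} -> {a:.3f}")
```

Every one of the five original memberships changes once the distant sample is added, even though the five samples themselves have not moved.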

Figure 1.

A distribution without outliers.

Figure 2.

A distribution with one outlier.

Zhang et al. (2006) defined an affinity to improve the performances of the fuzzy memberships. They classify all the samples into two clusters: the fuzzy membership of the samples near the cluster center is more than 0.4, while the fuzzy membership of the samples far from the cluster center is less than 0.4. However, for their method, the additional sample in Figure 2 still influences the fuzzy memberships of some samples, which are very important samples for classification. Furthermore, this method may define some normal samples far from the cluster center as the outliers or noises, which may decrease the correct rate of the classification. Besides, there are several other methods based on the global distribution, but all of them still cannot effectively weaken the effects of the outliers or noises.

Fuzzy memberships based on the local distribution have been proposed to compensate for the shortcomings of those based on the global distribution. The best known is the KNN approach: find the k nearest neighbors of a sample (k is a parameter fixed before classification), and calculate the fuzzy membership from the ratio of the number of similar samples among them to k. An outlier or noise then influences only its k nearest samples, which improves the correct rate of the classification for normal samples far from the cluster center. However, the KNN may cause problems because it considers only the number of similar samples and not their distribution.

Figure 3 presents a special distribution in which the red sample is an outlier. With the KNN, this outlier receives a large fuzzy membership because its nearest samples are similar samples, even though the distances between the outlier and these samples are very large. Such outliers therefore degrade the accuracy of the FSVM significantly. As the distances between a sample far from the cluster center and its nearest samples grow, the neighborhood examined by the KNN reflects the global distribution rather than the local one. Some useful information about the distribution is thus ignored, because the KNN considers only the number of nearest samples in the same cluster.
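This failure mode is easy to demonstrate. The sketch below assumes the simplest ratio form of the KNN membership (number of same-class samples among the k nearest neighbors, divided by k); the exact formula used by Heo et al. (2010) may differ, and the data are hypothetical.

```python
import math


def knn_membership(points, labels, k):
    """s_i = (same-class samples among the k nearest neighbours of i) / k."""
    memberships = []
    for i, p in enumerate(points):
        # Sort all other samples by Euclidean distance to sample i.
        others = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        same = sum(1 for _, j in others[:k] if labels[j] == labels[i])
        memberships.append(same / k)
    return memberships


# Positive cluster near the origin, negative cluster near x = 10,
# and one positive-labelled outlier far from both at (0, 50).
points = [(0, 0), (1, 0), (0, 1), (1, 1),
          (10, 0), (11, 0), (10, 1), (11, 1),
          (0, 50)]
labels = [1, 1, 1, 1, -1, -1, -1, -1, 1]

s = knn_membership(points, labels, k=3)
print(s)
```

The isolated sample at (0, 50) receives the same membership as the samples inside the clusters, because its three nearest neighbors all share its label even though they are about 49 units away. Counting neighbors alone cannot tell the two situations apart.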

Figure 3.

The distributions of two samples with one outlier.

3.2. The environmental fuzzy membership

In this section, we propose a novel definition of the fuzzy membership, called the environmental fuzzy membership, which considers both the number of similar samples nearby and the distribution of the samples nearby. The definition of the environmental fuzzy membership S contains two parts

S = N \times D

(12)

where $N = \{n_{1}, n_{2}, \dots, n_{l}\}$ is determined by the numbers of similar and heterogeneous samples nearby, and

\begin{matrix} D = [\begin{matrix} d_{1} & 0 & \dots & 0 \\ 0 & d_{2} & \dots & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & \dots & d_{l} \end{matrix}] \end{matrix}

is determined by the mean distance of the similar samples nearby and the mean distance of the heterogeneous samples nearby. Unlike in the KNN, the number of samples considered is not a constant; it is the number of samples that fall within a range R of the given sample, where R can be adaptively optimized in the training step. All distances are measured in the feature space Z, and the distance $r_{ij}$ between sample i and sample j is computed from the kernel as the squared norm

r_{ij} = ‖ φ (x_{i}) - φ (x_{j}) ‖^{2} = K (x_{i}, x_{i}) - 2 K (x_{i}, x_{j}) + K (x_{j}, x_{j})

(13)
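The kernel expression in equation (13) can be evaluated without ever forming the feature vectors. The sketch below implements its right-hand side for two illustrative kernels; the Gaussian kernel and its parameter `gamma` are assumptions made for the example, since the kernel used in the experiments is not stated here.

```python
import math


def linear_kernel(x, y):
    """K(x, y) = x . y, for which the feature map phi is the identity."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian kernel exp(-gamma * ||x - y||^2); gamma is a hypothetical value."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def kernel_distance(x, y, kernel):
    """Right-hand side of equation (13): K(x, x) - 2 K(x, y) + K(y, y)."""
    return kernel(x, x) - 2.0 * kernel(x, y) + kernel(y, y)


a, b = (1.0, 2.0), (4.0, 6.0)
d_linear = kernel_distance(a, b, linear_kernel)
d_rbf = kernel_distance(a, b, rbf_kernel)
print(d_linear, d_rbf)
```

For the linear kernel the expression reduces to the squared Euclidean distance between the two points (here 3² + 4² = 25). For the Gaussian kernel, K(x, x) = 1, so the expression becomes 2 − 2K(x, y) and is bounded above by 2.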

With the distance $r_{ij}$ , we construct the distance matrix Q as follows

\begin{matrix} Q = [\begin{matrix} p_{11} r_{11} & p_{12} r_{12} & p_{13} r_{13} & \dots & p_{1 l} r_{1 l} \\ p_{21} r_{21} & p_{22} r_{22} & p_{23} r_{23} & \dots & p_{2 l} r_{2 l} \\ p_{31} r_{31} & p_{32} r_{32} & p_{33} r_{33} & \dots & p_{3 l} r_{3 l} \\ ⋮ & ⋮ & ⋮ & ⋱ & ⋮ \\ p_{l 1} r_{l 1} & p_{l 2} r_{l 2} & p_{l 3} r_{l 3} & \dots & p_{ll} r_{ll} \end{matrix}] \end{matrix}

(14)

where l is the number of samples, and $P = \{p_{ij}\}$ is an indicator matrix based on the distances between pairs of samples

\begin{matrix} P = [\begin{matrix} p_{11} & p_{12} & \dots & p_{1 l} \\ p_{21} & p_{22} & \dots & p_{2 l} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ p_{l 1} & p_{l 2} & \dots & p_{ll} \end{matrix}] \end{matrix}

(15)

p_{ij} = \begin{cases} 1, & r_{ij} \leq R \\ 0, & r_{ij} > R \end{cases} \quad i, j = 1, 2, \dots, l

(16)

The samples located in the range R can be found with the distance matrix Q. In general, the distribution of the samples can be classified into three kinds, which are shown in Figure 4(a), (b), and (c).
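As an illustration of equations (14)–(16), the following sketch builds the indicator matrix P and the masked distance matrix Q from a given matrix of pairwise distances. The distance values and the range R are hypothetical; in the method itself R is optimized during training.

```python
def neighbourhood_matrices(r, R):
    """Equations (14)-(16): p_ij = 1 if r_ij <= R else 0, and q_ij = p_ij * r_ij."""
    l = len(r)
    P = [[1 if r[i][j] <= R else 0 for j in range(l)] for i in range(l)]
    Q = [[P[i][j] * r[i][j] for j in range(l)] for i in range(l)]
    return P, Q


# Hypothetical pairwise distances for three samples.
r = [[0.0, 1.0, 4.0],
     [1.0, 0.0, 2.5],
     [4.0, 2.5, 0.0]]

P, Q = neighbourhood_matrices(r, R=3.0)
print(P)
print(Q)
```

Row i of P marks which samples lie within the range R of sample i (including sample i itself, since $r_{ii} = 0$), and row i of Q keeps the corresponding distances while zeroing out everything outside the range.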

Figure 4.

Different conditions of the outlier: (a) case 1; (b) case 2; (c) case 3.

In each of the three conditions shown in Figure 4, the blue sample in the center is the one whose fuzzy membership is being calculated. In Figure 4(a), the number of similar samples nearby is much larger than the number of heterogeneous samples nearby. This sample is therefore very likely a normal sample, and it is given a large fuzzy membership. In Figure 4(b), the number of similar samples nearby is much smaller than the number of heterogeneous samples nearby, so the sample is very likely an outlier or noise and is given a small fuzzy membership. In Figure 4(c), the numbers of similar and heterogeneous samples nearby are comparable, so the sample is very likely located near the margin and is given a fuzzy membership suited to a margin sample. Following this principle, the fuzzy memberships are determined from the environments of the given samples.

We also consider the mean distances of the similar samples nearby and the heterogeneous samples nearby, especially for the case shown in Figure 4(c). The samples are classified into three parts according to the difference between the mean distance of the similar samples nearby and the mean distance of the heterogeneous samples nearby. After these processes, we construct $N = \{n_{i}\}$ and $D = \operatorname{diag} (d_{i})$ as follows

n_{i} = \begin{cases} e^{- M (i)}, & M (i) \leq - t \text{ or } O (i) > d \\ \frac{1}{M (i)}, & | M (i) | < t \text{ and } | O (i) | < d \\ 1, & M (i) \geq t \text{ or } O (i) < - d \end{cases}

(17)

d_{i} = \begin{cases} 1 - \frac{1}{O_{1} (i)}, & | M (i) | > t \\ 1 - \frac{1}{O_{1} (i) - O_{2} (i)}, & | M (i) | \leq t \end{cases} \quad i = 1, 2, \dots, l

(18)

where M is a vector that denotes the difference between the number of similar samples nearby and the number of heterogeneous samples nearby, O is a vector that denotes the difference between the mean distance of the similar samples nearby and the mean distance of the heterogeneous samples nearby, and t and d are the thresholds of M and O, respectively, which can be adaptively determined in the training process. $O_{1}$ is a vector that denotes the mean distance of the similar samples nearby, and $O_{2}$ is a vector that denotes the mean distance of the heterogeneous samples nearby.

Thus, we have

M = P \times Y - 1

(19)

O = Q \times Y

(20)

O_{1} (i) = \frac{O (i) + \sum_{j = 1}^{l} y_{i} * Q (i, j)}{2 \times (M (i) + \sum_{j = 1}^{l} y_{i} * P (i, j) - 1)}

(21)

O_{2} (i) = \frac{O (i) - \sum_{j = 1}^{l} y_{i} * Q (i, j)}{2 \times (M (i) - \sum_{j = 1}^{l} y_{i} * P (i, j) + 1)}

(22)

where $Y = [y_{1}, y_{2}, \dots, y_{l}]^{T}$ is the vector of cluster labels.
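Equations (19) and (20) are plain matrix-vector products and can be evaluated directly. The sketch below computes M and O exactly as written, reading the subtraction of 1 in equation (19) as element-wise (an assumption); the matrices P and Q and the labels are hypothetical values for three samples.

```python
def count_and_distance_differences(P, Q, y):
    """Equations (19) and (20): M = P x Y - 1 and O = Q x Y, row by row."""
    l = len(y)
    M = [sum(P[i][j] * y[j] for j in range(l)) - 1 for i in range(l)]
    O = [sum(Q[i][j] * y[j] for j in range(l)) for i in range(l)]
    return M, O


# Hypothetical neighbourhood matrices for three samples and their labels.
P = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
Q = [[0.0, 1.0, 0.0], [1.0, 0.0, 2.5], [0.0, 2.5, 0.0]]
y = [1, 1, -1]

M, O = count_and_distance_differences(P, Q, y)
print(M, O)
```

As the formulas are written, each entry is a sum signed by the raw labels $y_{j}$ of the neighbors, so its interpretation relative to the class of sample i depends on the sign of $y_{i}$.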

The procedure for computing the environmental fuzzy membership is summarized in the flowchart of Figure 5. Because each of the three groups of samples is treated separately, every sample receives a fuzzy membership suited to its local environment, which can significantly improve the classification performance of the FSVM.

Figure 5.

The definition of the environmental fuzzy membership.

4. Experimental studies

In this section, we validate our method according to two numerical case studies and a breast cancer dataset from UCI machine learning database. For these cases, all the samples are classified by both FSVM based on KNN (KNNFSVM) and FSVM based on the environmental fuzzy membership (EFSVM), and the classification performances of the two methods are comparatively studied and discussed.

4.1. Numerical case studies

To examine how well the EFSVM weakens the effects of outliers or noise, we randomly generate two kinds of samples with some outliers, with no overlap between the two kinds. The dataset contains 600 samples in total: 80 samples are used to train the FSVM algorithm, and the other 520 samples are used to test the classification performance. The 80 training samples contain both kinds of samples, and each kind includes three outliers. In the training process, we apply a particle swarm optimization (PSO) algorithm to optimize the parameters c, d, t, and R.

The classification results of KNNFSVM and EFSVM are shown in Table 1. In Table 1, the KNNFSVM obtains classification correct rates of 22.12%, 72.88%, 88.85%, and 50% for k = 2, 3, 4, and 5, respectively, while the EFSVM classifies all the samples into the right clusters with correct rates of 100%, which indicates that the EFSVM has outstanding performance on weakening the effects of the outliers. Furthermore, the classification performance of the EFSVM is reliable and robust without a parameter optimization, while the classification performance of KNNFSVM is related to the parameter k, which indicates that the KNNFSVM is not as robust and reliable to the outliers as the EFSVM.

Table 1.

Correct rates of EFSVM and KNNFSVM for numerical case 1.

	EFSVM	KNNFSVM, k = 2	KNNFSVM, k = 3	KNNFSVM, k = 4	KNNFSVM, k = 5
Correct rates	100%	22.12%	72.88%	88.85%	50%

A second case is generated with an overlap between the two kinds of samples, and the classification performances of the KNNFSVM and the EFSVM for the overlapping samples are compared in Table 2. The correct rate of the EFSVM is 95.63%, while the correct rates of the KNNFSVM do not exceed 94.23% and fall to 19.03% for k = 5, which indicates that the EFSVM is more robust and reliable than the KNNFSVM for overlapping samples.

Table 2.

Correct rates of EFSVM and KNNFSVM for numerical case 2.

	EFSVM	KNNFSVM, k = 2	KNNFSVM, k = 3	KNNFSVM, k = 4	KNNFSVM, k = 5
Correct rates	95.63%	94.23%	93.46%	94.23%	19.03%

4.2. Experimental studies on a breast cancer dataset

In this section, the classification performances of the EFSVM and KNNFSVM are comparatively studied with a breast cancer dataset from the UCI machine learning database. The dataset was obtained from the University of Wisconsin Hospital at Madison by Kristin and Mangasarian (2002), and it contains 699 samples which have nine attributes. Among the 699 samples, 458 samples belong to the data of normal people, and 241 samples belong to the data of the patients. We choose 80 samples as the training samples (two kinds, each kind contains 40 samples), and the remaining 619 samples are used as the test samples.

Table 3 shows the correct rates of the KNNFSVM and the EFSVM for the breast cancer dataset. The correct rate of the EFSVM is 97.12% for the normal people and 96.01% for the patients, which gives a total correct rate of 96.76%. The per-class correct rates of the KNNFSVM lie between 82.09% and 97.85%, and its total correct rate is about 92%. However, the KNNFSVM does not classify the two clusters equally well: when the correct rate for the normal people is high (97.84% for k = 3), the correct rate for the patients is low (82.09% for k = 3), and a similar imbalance between the two clusters appears for each of k = 2, 3, 4, and 5. The reason for this phenomenon is that the outliers influence the fuzzy memberships of the KNNFSVM. As the fuzzy memberships of the outliers are enlarged, the classification boundary shifts toward one cluster. Therefore, the correct rate of this cluster is lower than that of the other one.

Table 3.

Correct rates of EFSVM and KNNFSVM for the breast cancer dataset.

	EFSVM	KNNFSVM, k = 2	KNNFSVM, k = 3	KNNFSVM, k = 4	KNNFSVM, k = 5
Total correct rates	96.76%	90.46%	92.73%	92.73%	93.53%
Correct rates of normal person	97.12%	87.32%	97.84%	97.85%	97.61%
Correct rates of patients	96.01%	97.01%	82.09%	82.09%	85.07%

These three case studies show that outliers or noise influence the classification significantly. The KNN fuzzy membership considers only the number of nearest samples of the same kind, while the environmental fuzzy membership considers both the number of all the samples nearby and their distribution. Therefore, the correct rates of the EFSVM are higher than those of the KNNFSVM in all three case studies, which confirms that the EFSVM is more robust and reliable than the KNNFSVM in weakening the effects of outliers or noise.

5. Application to motor fault classification

5.1. Introduction of the motor test bed

In this section, the proposed method is applied to a motor test bed which contains three parts: the transmission system with a shaft, a coupling, two pairs of rolling bearings, four rotors, a pulley, and a gear box; the horsepower variable frequency AC driver with a multi-featured front panel programmable controller that can control the motor rotational speed; and seven motors, one in normal condition and six with built-in faults. A photo of the test bed is shown in Figure 6, and the detailed information of the motors is shown in Table 4.

Figure 6.

The photo of the motor test bed.

Table 4.

Details of the motors.

Motors	Conditions
Motor 1	Normal AC motor with good condition
Motor 2	AC motor with built-in broken rotor bars
Motor 3	AC motor with built-in faulted bearing
Motor 4	AC motor with built-in unbalance rotor
Motor 5	AC motor with built-in bowed rotor
Motor 6	AC motor with built-in misalignment rotor
Motor 7	AC motor with built-in faulted winding

The locations of the sensors are shown in Figure 7: two sensors are located on the top and left sides of the motor casing to measure the vertical and horizontal vibrations, and another sensor is located at the shaft to measure the vibration in axial direction. The Sony EX series data acquisition and analysis system is applied to collect the vibration data, and the parameters of the measuring system are shown in Table 5.

Figure 7.

The locations of the sensors.

Table 5.

Parameters of the data acquisition system.

Parameters	Mode / Units
Sampling frequency	6400 Hz
Acquisition mode	Continuous acquisition
Acquisition channel	3

5.2. Motor fault classification

In the experiments, the rotational speed of the motors is set to 25 Hz, and three records, labeled data 1, data 2, and data 3, each with a length of 1 min, are collected on the top of the motor casing. The first 20 s of data 1 are used as the training samples, the first 20 s of data 2 are used as the testing samples to optimize the parameters of the FSVM, and the first 40 s of data 3 for all seven motors are then classified by both the EFSVM and the KNNFSVM. The waveforms of the vibration data for each motor are shown in Figure 8.

Figure 8.

The waveforms of the vibration data for each motor.

In the training process, each training sample has a data length of 1 s, and thus each training dataset has 20 samples. We extract 30 features for each sample: 8 frequency domain features, 6 time domain features, and 16 wavelet packet energy features obtained with a four-level decomposition. The detailed information of the time and frequency features is shown in Table 6.

Table 6.

The time and frequency features.

Time	$p_{1} = \frac{1}{N} \sum_{i = 1}^{N} \| x_{i} \|^{3}$	$p_{2} = \frac{1}{N} \sum_{i = 1}^{N} x_{i}^{4}$
	$p_{3} = \frac{max (x_{i})}{\sqrt{\frac{1}{N} \sum_{i = 1}^{N} x_{i}^{2}}}$	$p_{4} = \frac{max (x_{i})}{(\frac{1}{N} \sum_{i = 1}^{N} \sqrt{\| x_{i} \|})^{2}}$
	$p_{5} = \frac{max (x_{i})}{\frac{1}{N} \sum_{i = 1}^{N} \| x_{i} \|}$	$p_{6} = \frac{\sqrt{\frac{1}{N} \sum_{i = 1}^{N} x_{i}^{2}}}{\frac{1}{N} \sum_{i = 1}^{N} \| x_{i} \|}$
Frequency	$F_{1} = \frac{1}{N} \sum_{l = 1}^{N} y_{l}$	$F_{2} = \frac{1}{N - 1} \sum_{l = 1}^{N} (y_{l} - F_{1})^{2}$
	$F_{3} = \frac{\sum_{l = 1}^{N} (y_{l} - F_{1})^{3}}{N \times \sqrt{F_{2}^{3}}}$	$F_{4} = \frac{\sum_{l = 1}^{N} (y_{l} - F_{1})^{4}}{N \times F_{2}^{2}}$
	$F_{5} = \frac{\sum_{l = 1}^{N} f_{l} y_{l}}{\sum_{l = 1}^{N} y_{l}}$	$F_{6} = \sqrt{\frac{\sum_{l = 1}^{N} ((f_{l} - F_{5})^{2} \times y_{l})}{N}}$
	$F_{7} = \sqrt{\frac{\sum_{l = 1}^{N} (f_{l}^{2} \times y_{l})}{\sum_{l = 1}^{N} y_{l}}}$	$F_{8} = \sqrt{\frac{\sum_{l = 1}^{N} (f_{l}^{4} \times y_{l})}{\sum_{l = 1}^{N} (f_{l}^{2} \times y_{l})}}$
	$F_{9} = \frac{\sum_{l = 1}^{N} (f_{l}^{2} \times y_{l})}{\sqrt{\sum_{l = 1}^{N} y_{l} \times \sum_{l = 1}^{N} (f_{l}^{4} \times y_{l})}}$	$F_{10} = \frac{F_{6}}{F_{5}}$
	$F_{11} = \frac{\sum_{l = 1}^{N} ((f_{l} - F_{5})^{3} \times y_{l})}{N \times F_{6}^{3}}$	$F_{12} = \frac{\sum_{l = 1}^{N} ((f_{l} - F_{5})^{4} \times y_{l})}{N \times F_{6}^{4}}$
	$F_{13} = \frac{\sum_{l = 1}^{N} (\sqrt{\| f_{l} - F_{5} \|} \times y_{l})}{N \times \sqrt{F_{6}}}$
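The six time-domain features of Table 6 can be computed directly from a sample. The sketch below follows the formulas as printed, reading $\| x_{i} \|$ as the absolute value of the scalar $x_{i}$ (an assumption about the notation); the function name and the test signal are illustrative only.

```python
import math


def time_domain_features(x):
    """Time-domain features p1..p6 as written in Table 6."""
    n = len(x)
    mean_abs = sum(abs(v) for v in x) / n
    rms = math.sqrt(sum(v * v for v in x) / n)
    peak = max(x)  # Table 6 uses max(x_i), not max(|x_i|)
    p1 = sum(abs(v) ** 3 for v in x) / n
    p2 = sum(v ** 4 for v in x) / n
    p3 = peak / rms                                            # peak over RMS
    p4 = peak / (sum(math.sqrt(abs(v)) for v in x) / n) ** 2   # peak over squared mean root
    p5 = peak / mean_abs                                       # peak over mean absolute value
    p6 = rms / mean_abs                                        # RMS over mean absolute value
    return [p1, p2, p3, p4, p5, p6]


# Hypothetical four-point signal; a real sample here would be 1 s of vibration data.
features = time_domain_features([2.0, 0.0, -2.0, 0.0])
print(features)
```

The last four features are dimensionless ratios, so they describe the shape of the waveform rather than its amplitude, which is why they are commonly combined with the amplitude-dependent moments $p_{1}$ and $p_{2}$.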

The SVM is a two-category classifier, whereas our case is a multi-category classification problem. The classification strategy is chosen according to the distributions of the samples for the seven motors: the distributions for motor 1, motor 2, motor 4, and motor 6 overlap, while those for motor 3, motor 5, and motor 7 are far from each other. Therefore, we use a binary tree strategy for motor 1, motor 2, motor 4, and motor 6, and a directed acyclic graph (DAG) strategy for motor 3, motor 5, and motor 7. In the first case, the distributions of motor 2 and motor 4 are very close to each other, as are those of motor 1 and motor 6. Therefore, we first treat motor 2 and motor 4 as one kind and motor 1 and motor 6 as another kind. Our classification strategy is shown in Figure 9.

Figure 9.

The multi-classification strategy of the motor faults.

The PSO algorithm is also applied to optimize the parameters c, d, t and R of the FSVM. The correct rates for the motor fault classification of the EFSVM and the KNNFSVM are shown in Table 7 and Table 8, respectively. The row of the table represents the real categories of the motor, while the column of the table represents the categories classified by the FSVM, and thus the correct rates of the classification are shown in the diagonal of the tables.

Table 7.

The classification correct rates of the EFSVM.

	Motor 1	Motor 2	Motor 3	Motor 4	Motor 5	Motor 6	Motor 7
Motor 1	100%
Motor 2		85%		15%
Motor 3			100%
Motor 4				100%
Motor 5					100%
Motor 6						100%
Motor 7							100%

Table 8.

The classification correct rates of the KNNFSVM.

	Motor 1	Motor 2	Motor 3	Motor 4	Motor 5	Motor 6	Motor 7
Motor 1	70%		5%	25%
Motor 2		40%		60%
Motor 3	5%		90%				5%
Motor 4	10%			90%
Motor 5			5%		95%
Motor 6	5%		5%			90%
Motor 7							100%

Comparing Table 7 with Table 8, both the EFSVM and the KNNFSVM classify most samples correctly, with mean correct rates of 97.86% and 82.14%, respectively. The correct rates of the EFSVM for motors 1–7 are 100%, 85%, 100%, 100%, 100%, 100%, and 100%, whereas those of the KNNFSVM are 70%, 40%, 90%, 90%, 95%, 90%, and 100%. The EFSVM therefore matches or exceeds the KNNFSVM for every motor, with the largest gain for motor 2 (85% versus 40%). The reason is that the sample distributions of motor 1, motor 2, motor 4, and motor 6 are close to each other, and samples close to the classification boundary behave as outliers. Because the KNNFSVM is sensitive to outliers or noise, its correct rates for these motors are low, whereas the EFSVM obtains higher correct rates because it weakens the effects of the outliers or noise more effectively.
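The mean correct rates quoted above follow from the diagonals of Tables 7 and 8. A short check, with the diagonal values copied from the two tables and an unweighted mean over the seven motors assumed:

```python
# Diagonal entries (per-motor correct rates, in percent) from Tables 7 and 8.
efsvm_diagonal = [100, 85, 100, 100, 100, 100, 100]
knnfsvm_diagonal = [70, 40, 90, 90, 95, 90, 100]


def mean_rate(rates):
    """Unweighted mean of the per-motor correct rates."""
    return sum(rates) / len(rates)


efsvm_mean = round(mean_rate(efsvm_diagonal), 2)
knnfsvm_mean = round(mean_rate(knnfsvm_diagonal), 2)
print(efsvm_mean, knnfsvm_mean)
```

The unweighted mean is appropriate here only if every motor contributes the same number of test samples, which is the reading suggested by the experimental setup.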

6. Conclusions

In this paper, an FSVM based on environmental fuzzy membership is proposed to weaken the effects of outliers or noise in classification. The environmental fuzzy membership considers not only the number of similar samples nearby but also the distribution of the samples nearby. Unlike in the KNN, the number of samples considered is not a constant but a variable determined by a local area. By means of this definition, the effects of outliers or noise on the classification problem are clearly weakened, and thus the reliability and robustness of the FSVM are enhanced.

In the numerical case studies, the correct rates of the EFSVM are 100% and 95.63%, while the correct rates of the KNNFSVM are 22.12% for k = 2 and 19.03% for k = 5. This indicates that the classification performance of the KNNFSVM is influenced by the parameter k, while the proposed method is adaptive and robust in both cases. For the breast cancer data set, the boundary of the KNNFSVM is clearly influenced by outliers or noise, which causes some samples to be classified into the wrong clusters with high probability, while the proposed method overcomes this difficulty and obtains higher correct rates. The application to motor fault classification also indicates that the EFSVM performs well for all seven kinds of motors, with correct rates of at least 85%, while the KNNFSVM cannot classify some faults into the correct clusters, with a correct rate of only 40% for motor 2. Therefore, the EFSVM offers more robust and reliable classification performance, as it effectively weakens the effects of outliers and noise.

Footnotes

Declaration of Conflicting Interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China (grant numbers 51775407 and 51475356), Natural Science in Shaanxi Province (grant number 2015JQ5183), and the Fundamental Research Funds for the Central Universities.

References

Akar (2013) Detection of a static eccentricity fault in a closed loop driven induction motor by using the angular domain order tracking analysis method. Mechanical Systems and Signal Processing 34(1–2): 173–182.

Antonino-Daviu, Riera-Guasp, Pons-Llinares, et al. (2012) Toward condition monitoring of damper windings in synchronous motors via EMD analysis. IEEE Transactions on Energy Conversion 27(2): 432–439.

Arabaci H and Bilgin O (2009) Neural network classification and diagnosis of broken rotor bar faults by means of short time Fourier transform. In: IMECS 2009: international multi-conference of engineers and computer scientists (ed O Castillo, C Douglas, DD Feng, et al.), Hong Kong, 18–20 March 2009, pp.219–223. Hong Kong: IMECS.

Arabaci and Bilgin (2010) Automatic detection and classification of rotor cage faults in squirrel cage induction motor. Neural Computing & Applications 19(5): 713–723.

Ayaz E, Ozturk A and Seker S (2006) Continuous wavelet transform for bearing damage detection in electric motors. In: IEEE Mediterranean Electrotechnical Conference-MELECON 2006 (ed F Sandoval, C Camacho and A Puerta), Malaga, Spain, 16–19 May 2006. IEEE.

Banerjee and Das (2012) Multi-sensor data fusion using support vector machine for motor fault detection. Information Sciences 217: 96–107.

Bingol and Pacaci (2012) A virtual laboratory for neural network controlled DC motors based on a DC–DC buck converter. International Journal of Engineering Education 28(3): 713–723.

Cabal-Yepez, Garcia-Ramirez, Romero-Troncoso, et al. (2013) Reconfigurable monitoring system for time-frequency analysis on industrial equipment through STFT and DWT. IEEE Transactions on Industrial Informatics 9(2): 760–771.

Cheng, Lee, Zhang, et al. (2012a) Independent component analysis based source number estimation and its comparison for mechanical systems. Journal of Sound and Vibration 331(23): 5153–5167.

Cheng and Zhang (2011) Enhance the separation performance of ICA via clustering evaluation and its applications. Advanced Science Letters 4(6–7): 1951–1956.

Cheng, Zhang, Lee, et al. (2012b) Source contribution evaluation of mechanical vibration signals via enhanced independent component analysis. Journal of Manufacturing Science and Engineering 134(2): 021014.

Cheng, Zhang, Lee, et al. (2014) Investigations of denoising source separation technique and its application to source separation and identification of mechanical vibration signals. Journal of Vibration and Control 20(14): 2100–2117.

Ebrahimi and Faiz (2010) Feature extraction for short-circuit fault detection in permanent-magnet synchronous motors using stator-current monitoring. IEEE Transactions on Power Electronics 25(10): 2673–2682.

Gritli, Zarri, Rossi, et al. (2013) Advanced diagnosis of electrical faults in wound-rotor induction machines. IEEE Transactions on Industrial Electronics 60(9): 4012–4024.

Heo G, Klette R, Woo YW, et al. (2010) Fuzzy support vector machine with a fuzzy nearest neighbor classifier for insect footprint classification. In: 2010 IEEE international conference on fuzzy systems, Barcelona, Spain, 18–23 September 2010. New York: IEEE.

Kabla and Mokrani (2011) Marginal spectrum for bearing fault diagnosis in MCSA. International Journal of Research and Reviews in Artificial Intelligence 1(4): 76–80.

Keskes, Braham and Lachiri (2013) Broken rotor bar diagnosis in induction machines through stationary wavelet packet transform and multiclass wavelet SVM. Electric Power Systems Research 97: 151–157.

Kristin and Mangasarian (2002) Robust linear programming discrimination of two linearly inseparable sets. Optimization Methods & Software 1(1): 23–34.

Lin and Wang (2002) Fuzzy support vector machines. IEEE Transactions on Neural Networks 13(2): 464–471.

Liu, Chen, Zhao, et al. (2013) Internal model control of permanent magnet synchronous motor using support vector machine generalized inverse. IEEE Transactions on Industrial Informatics 9(2): 890–898.

Nandi, Ilamparithi, Bin Lee, et al. (2011) Detection of eccentricity faults in induction machines based on nameplate parameters. IEEE Transactions on Industrial Electronics 58(5): 1673–1683.

Pineda-Sanchez, Riera-Guasp, Antonino-Daviu, et al. (2009) Instantaneous frequency of the left sideband harmonic during the start-up transient: a new method for diagnosis of broken bars. IEEE Transactions on Industrial Electronics 56(11): 4557–4570.

Sabzekar, Yazdi and Naghibzadeh (2011) Relaxed constraints support vector machines for noisy data. Neural Computing & Applications 20(5): 671–685.

Torkaman, Afjei, Ravaud, et al. (2011) Misalignment fault analysis and diagnosis in switched reluctance motor. International Journal of Applied Electromagnetics and Mechanics 36(3): 253–265.

Vulli, Dunne, Potenza, et al. (2009) Time-frequency analysis of single-point engine-block vibration measurements for multiple excitation-event identification. Journal of Sound and Vibration 321(3–5): 1129–1143.

Wang, Tse and Tsui (2013a) An enhanced Kurtogram method for fault diagnosis of rolling element bearings. Mechanical Systems and Signal Processing 35(1–2): 176–199.

Wang, Yang, Zhang, et al. (2013b) Image denoising using SVM classification in nonsubsampled contourlet transform domain. Information Sciences 246: 155–176.

Wei (2012) A new fuzzy SVM based on the posterior probability weighting membership. Journal of Computers 7(6): 1385–1392.

Xu, Sun, et al. (2013) Improvement of the Hilbert method via ESPRIT for detecting rotor fault in induction motors at low slip. IEEE Transactions on Energy Conversion 28(1): 225–233.

Yaqub, Gondal and Kamruzzaman (2012) Inchoate fault detection framework: adaptive selection of wavelet nodes and cumulant orders. IEEE Transactions on Instrumentation and Measurement 61(3): 685–695.

Yaqub, Gondal and Kamruzzaman (2013) An adaptive self-configuration scheme for severity invariant machine fault diagnosis. IEEE Transactions on Reliability 62(1): 116–126.

Zhang QX, Li J, Li HB, et al. (2012) Motor broken-bar fault diagnosis based on park vector and wavelet neural network. In: 2nd International conference on advanced research on advanced structure, materials and engineering (ed H Zhang), Guangzhou, China, 13–14 April 2013, pp.163–166.

Zhang and Xiao (2006) Fuzzy support vector machine based on affinity among samples. Journal of Software 17(5): 951–958.