Abstract
Time-frequency representiation has been intensively employed for the analysis of biomedical signals. In order to extract discriminative information, time-frequency matrix is often transformed into a 1D vector followed by principal component analysis (PCA). This study contributes a two-directional two-dimensional principal component analysis (2D2PCA)-based technique for time-frequency feature extraction. The S transform, integrating the strengths of short time Fourier transform and wavelet transform, is applied to perform the time-frequency decomposition. Then, 2D2PCA is directly conducted on the time-frequency matrix rather than 1D vectors for feature extraction. The proposed method can significantly reduce the computational cost while capture the directions of maximal time-frequency matrix variance. The efficiency and effectiveness of the proposed method is demonstrated by classifying eight hand motions using 4-channel myoelectric signals recorded in health subjects and amputees.
Keywords
Introduction
Biomedical signal analysis has been broadly applied for robotics control, human/brain machine interface, disease diagnosis, wearable devices and rehabilitation programming. Most biomedical signals, for example myoelectric signal (MES), an electrical manifestation of skeletal muscle contractions, are typically nonlinear and nonstationary. Time-frequency (TF) analysis offers simultaneous interpretation of the biomedical signal in both the time and frequency domains, allowing the elucidation of local, transient or intermittent components at various scales (Clemson et al., 2016; Xie et al., 2009a). However, there are typically a large amount of TF coefficients generated from such a two-dimensional analysis. In addition, noise artifacts as well as redundant information may be present in these TF coefficients. Principal component analysis (PCA) decomposes the covariant structure of the dependent variables into orthogonal components by calculating the eigenvalues and eigenvectors of the data covariance matrix. It linearly projects the original data from a high-dimensional space to a set of uncorrelated components in a low-dimensional feature space, while simultaneously preserving the most original information. Therefore, TF analysis combined with PCA (TF-PCA) has been one of the most powerful approaches for simultaneously extracting discriminative features and reducing the dimension for biosignals classification tasks. The basic algorithm for this hybrid method consists of decomposing biomedical signals into the TF plane, re-arranging the TF elements into a row vector, and reducing the dimension using PCA. Martis et al. (2012) developed an approach for discriminating arrhythmia from normal sinus rhythm based on feature extraction using discrete Daubechies-4 wavelet transform (WT) as well as feature reduction using PCA. Korurek and Nizam (2010) employed the PCA compressed discrete Daubechies-5 wavelet coefficients to classify normal heart beats and five types of arrhythmia for the diagnosis of cardiovascular disease. Ghorbanian et al. (2012) conducted a similar study to analyse five symptoms of heart failure using a continuous Haar wavelet transform followed by PCA reduction and multi-layered perceptron neural network classification. They evaluated the proposed algorithm using the MIT-BIH database, resulting in 99.5% sensitivity, 99.66% positive predictive accuracy, and 99.17% total accuracy. More recently, Martis et al. (2013) further compared the effects of wavelet coefficient reduction by PCA, independent component analysis (ICA) and linear discriminant analysis (LDA) for automated diagnosis of five types of arrhythmia. Giri et al. (2013) carried out a similar comparative study for the automatic detection of coronary artery disease using PCA, ICA and LDA to reduce the number of discrete WT coefficients extracted from particular ECG signal sub-bands. A similar WTPCA approach was also applied to automatically detect electroencephalographic (EEG) analysis epileptic activity (Acharya et al., 2012) and classify right and left hand movement (Ince et al., 2006) for brain-machine interface control. Furthermore, atherosclerosis and four types of brain blood vessel disease were diagnosed by WTPCA from Doppler ultrasound signals by Kara and Dirgenali (2007) and Uguz (2012), respectively. Englehart et al. (1999, 2001), who decomposed four channels of transient MESs using short-time Fourier transform (STFT), WT and wavelet packet transform (WPT) methods to discriminate six hand motions for prosthetic hand control. They compared the performance of PCA feature reduction against the Euclidean distance class separability (CS) criterion. The results indicated TF-PCA was vastly superior to TF-CS in classification accuracy, as well as a significant improvement of all TF-based methods compared with time domain feature extraction when using a LDA classifier. The study of Khezri and Jahed (2007) using adaptive neuro-fuzzy inference system further confirmed the superiority of TF-PCA hybridization in MES-based hand motion pattern recognition. Unfortunately, in all of these TF-PCA-based signal representation and recognition methods, TF coefficients must first be transformed into a vector. However, concatenating TF coefficients at various scales into a 1D array often leads to a high-dimensional vector space, where it is difficult to evaluate the covariance matrix accurately owing to its large size and the relatively small number of training samples. Furthermore, computing the eigenvectors of a large size covariance matrix is very time-consuming, whilst the response time of many biomedical real-time control systems should not introduce a delay that is perceivable by the user (Xie et al., 2009a).
In recent years, there has increasing interests in developing matrix-based methods for image feature extraction and classification (Lu et al., 2011). The essence of these techniques is that the image matrix is not converted into vectors prior to dimensionality reduction. Among these newly developed approaches, two-dimensional principal component analysis (2DPCA) is an attractive technique, which has been widely used in face recognition and classification (Mashhoori and Jahromi, 2013; Nagabhushan et al., 2006; Yang and Liu, 2007; Yang et al., 2004). Compared with the conventional one-dimensional PCA, 2DPCA operates directly with the two-dimensional matrices, rather than 1D vectors. Thus, the matrix does not need to be transformed, which not only alleviate the computational burden but also preserves all spatial information of the original matrices.
Although 2DPCA is typically able to obtain higher recognition accuracy than PCA, a vital unresolved problem is that 2DPCA needs many more coefficients for image representation than PCA (Zhang and Zhou, 2005). Zhang and Zhou (2005) indicated that 2DPCA essentially operates along the row direction of the image matrix and, thus, proposed an alternative 2DPCA operating along the column direction. By simultaneously considering the row and column directions, they developed the two-directional two-dimensional principal component analysis (2D2PCA) for a more efficient image representation and recognition.
In fact, a two-dimensional TF plane can be regarded as an image. It is thus feasible to apply image processing techniques to indicate time-frequency matrix (TFM) characteristics. In a recent study, we have demonstrated the superiority of 2D2PCA over PCA to extract discriminant features form wavelet coefficients for high-density electrode array MES recognition (Xie et al., 2016). Despite the success of this recent 2D2PCA-based MES study, the involved wavelet transform exhibits some disadvantages, such as its complicated computation, sensitivity to noise level and the dependency of its accuracy on the chosen basis wavelet (Nguyen and Liao, 2009). The accuracy of the features extracted from noisy MES by using the wavelet transform is thus rather susceptible to the level of noise (Nguyen and Liao, 2009). In addition, two signals with different phase shifts can have significantly different energy distributions at their wavelet decomposition levels, which make the WT more difficult to distinguish changes in the MES energy distributions owing to varied motion patterns (Gargoom et al., 2008). S transform (ST), the “phase correction” of the wavelet transform or the variable window STFT, is capable of obtaining reasonably accurate amplitude and phase spectrum of the analyzed signals even at the presence of high level of noise (Nguyen and Liao, 2009; Stockwell et al., 1996). In this study, inspired by the success of 2D2PCA in imaging processing, we investigate the feasibility of using 2D2PCA to efficiently and effectively extract feature information from the signal TF representation. ST, integrating the strengths of both STFT and WT, is selected to decompose the discrete time signal into TFM. The key idea is that 2D2PCA is applied to reduce the dimension of ST coefficient matrix in a highly efficient manner for pattern classification. The method is, therefore, termed as S transform-based two-directional two-dimensional principal component analysis (ST2D2PCA). To evaluate the performance of the proposed method, results are presented on the recognition of eight hand motions from 4-channel MESs recorded in both health subjects and amputees, aiming for the prosthetic hand, robot, and human man interface controlling. The results obtained using ST2D2PCA are compared with WTPCA (Huang et al., 2012), WT2D2PCA (Xie et al. 2016), as well as the hybrid STFT and 2D2PCA (STFT2D2PCA) method.
Methods
ST
The ST established by Stockwell et al. (1996) can be regarded as an extension of the STFT and WT. With respect to a given signal
where
and
Based on equations (1) to (3), the ST can be formulated as
From the point of view of WT, the ST can also be represented as
where
On one hand, compared with STFT, the standard deviation
2D2PCA
Figure 1 is a schematic diagram of 2D2PCA. Without loss of generality, we consider an m by n TFM
where

Schematic diagram of 2D2PCA to obtain the reduced TFM F (right) from an input TFM A (left).
Considering the
which is an
Suppose that the training feature set is
Denoting the kth row vectors of
and
The horizontal covariance matrix can then be obtained from the outer product of these TFM row vectors
Similarly, for the
which is
TMFs and their average are now denoted by column vectors
where
Now, the vertical covariance matrix of equation (14) can be constructed from the outer products of column vectors:
Zhang and Zhou (2005) demonstrated that the optimal projection matrices
After obtaining the projection matrices
Using the above procedure, an
ST2D2PCA
In this section, we describe the ST2D2PCA algorithm for extracting discriminant feature information from these matrices as follows:
Multiple-channel signals are first segmented by a moving window with width d.
Set the frequency band width and the decomposition interval in time and frequency domain, the ST is then employed to decompose each time-segment of individual channels into TFM with size
2D2PCA is subsequently carried out on each of the
Since the discriminant abilities of principal components (PCs) at various scales are different, a simple distance-based technique is applied to re-order all PCs (Xie et al., 2009a).
The performance of the algorithm is evaluated by feeding the optimal PCs obtained into a classifier.
Experimental protocol and performance evaluation
The proposed algorithm was evaluated using the MESs collected from the following experiment. Eight distinct wrist and hand motions were used: grasp (GR), hand open (OP), wrist flexion (WF), wrist extension (WE), ulnar deviation (UD), radial deviation (RD), pinch (PN), and thumb flexion (TF), as depicted in Figure 2. These represent the commonly used wrist and hand movements in daily life. The initial purpose of this data collection was to recognize arm and hand movements from MES for prosthetic hand control to improve the hand function of amputees (Huang et al., 2012).

Eight classes of motion were used in the experiment.
In the experiment, the myoelectric data were collected from 10 health subjects and two amputees (eight males and four females,

The experimental setting to record MESs from an amputee.
Fifteen sessions were conducted for each subject. The first five sessions were used for the learning procedures, while the sixth to tenth sessions for the validation set, with the remaining sessions for performance evaluations. Each subject was asked to maintain a static contraction for each motion and to change the motions with a fixed movement velocity. For those specific tasks the amputees cannot perform, they tried to perform under the guidance. In every session, each motion was performed once for a duration of 5 s, then switched to another motion in random order.
The 4-channel myoelectric data were further segmented into a series of overlapping windows (window length: 256 ms, overlap step: 128 ms). The remaining procedures for ST2D2PCA described in Section 2.3 were employed to extract two-dimensional PCs. Support vector machine (SVM) (Xie et al., 2009b), a typical nonlinear MES classifier used in previous study, was employed to evaluate the classification performance of the proposed algorithm. After the classification, the accuracy was further improved by a post-processing procedure using majority vote (MV) (Huang et al., 2012). The WTPCA, WT2D2PCA, STFT2D2PCA algorithms to analyse the same data set was also conducted for comparison.
Results
Multi-scale muscle activity patterns
Using the proposed ST2D2PCA technique, the MES at each channel was first transformed into TFM. Figure 4 shows the typical contour plots for eight motions for subject 8, each row corresponding to a motion type. For each panel, the abscissa represents the time and the ordinate represents the frequency or the scale of S transform. With each intended motion, a significant difference between the intensity of the MESs over the upper limb muscles can be readily discerned in the first column contour plots. Similar to the panels in the first column, there was significant discrepancy in the intensity distributions of the remaining contour plots in remaining three columns, indicating useful discriminant information in the ST matrices.

Contour plots of ST matrices for 4-channel myoelectric traces of eight hand motions obtained from subject 8.
The 2D2PCA was then used to reduce the dimension of each matrix. Figure 5 shows the contour plots of each matrix in Figure 4 following dimension reduction using 2D2PCA when the energy conservation rate and total energy preserved were 98% and 90% respectively. The physical meanings of time at the abscissa and frequency at the ordinate have been lost in Figure 5 since 2D2PCA is a pure mathematical transform to reduce the size of the matrix. Compared with Figure 4, the intensity difference between certain sub-panels in Figure 5 is further enhanced, including, for example, those in the first row. More importantly, the matrix size at each channel was significantly decreased, which were

The contour plots of ST matrices reduced using 2D2PCA for 4-channel MESs of eight hand motions obtained from subject 8.
Effect of energy conservation rate
A large energy conservation rate results in more information loss, whilst a low rate increases the computational burden. To reach a trade-off between these two factors, three energy conservation rates of 97%, 98% and 99%, were employed to assess its effect on classification accuracy. Figure 6 shows the classification accuracy of subject 8 at these various energy conservation rates for the SVM classifier. With the increasing number of PCs, the accuracy of all three conservation rates initially increased and then entered a relatively flat range with moderate fluctuations. The optimal PCs to achieve the highest accuracy for three conservation rates were all in the range of 20 to 30. In addition, there was no significant difference between energy conservation rates (p<0.01). The effect of energy conservation rate on accuracy for other subject was similar to subject 8.

The effect of energy conservation rate of ST2D2PCA on the MESs classification for the subject 8.
Effect of total energy conserved
For PCA analysis, a typical recommendation is to set the threshold of total energy conserved between 0.8 and 0.95. Figure 7 shows the classification accuracy for the subject 8 for three threshold values of total energy conserved; that is, 95%, 90% and 85%. With the reduction in threshold, there was no significant difference in the accuracy for SVM. Findings on the effect of total energy conserved for the remaining subjects were similar. The insensitivity of SVM to the total energy preserved may be owing to its adaptive ability to map input features to high-dimensional feature space.

The effect of total energy conserved of ST2D2PCA on the MESs classification for the subject 8.
Recognition of intended motions
Pattern recognition was performed using the optimal number of PCs previously determined. Tables 1 and 2 summarize the subject-specific classification accuracy for all eight intended upper-limb motions using four methods before and after majority vote, respectively. An average classification accuracy above 94% could be achieved among all subjects after majority vote using ST2D2PCA. Across all subjects, there is significant difference between the accuracy of ST2D2PCA and WTPCA (p<0.05) with lower average accuracy for WTPCA in both cases of with or without majority vote. The PCs extracted from WT and STFT using 2D2PCA also achieve higher average accuracy than WTPCA, which demonstrate the superiority of 2D2PCA over PCA. For three 2D2PCA-based feature extraction methods, ST outperforms WT and the STFT has the lowest average accuracy regardless before or after majority vote. These results are consistent with previous studies to compare ST, WT and STFT using PCA or other feature extraction methods (Gargoom et al. 2008; Nguyen and Liao (2009). In regard to the comparison between the healthy subjects and amputees, owing to the fact that amputees can only perform grasping and opening based on imagination, the related muscle activities were not as strong as the health subjects. The accuracy for two amputees was much lower than the health subjects. As mentioned before, it should be emphasized that myoelectric activity is subject-dependent for both healthy subjects and amputees. Therefore, the structure and information distribution in the time-frequency matrices varied between subjects, which led to different reduced sizes with ST2D2PCA. Ultimately, this subject-specific TF distribution of myoelectric feature information led to inconsistent classification errors among different subjects. The subject-specific myoelectric activity and classification performance suggested that optimal myoelectric pattern-recognition control system parameters should be individually customized.
Classification results of all 12 subjects by ST2D2PCA-, WTPCA-, STFT2D2PCA- and WT2D2PCA-based feature subsets before majority vote.
Amputee.
Classification results of all 12 subjects by ST2D2PCA-, WTPCA-, STFT2D2PCA- and WT2D2PCA-based feature subsets after majority vote.
Amputee.
Discussion and conclusions
A novel ST2D2PCA for signal classification has been proposed and examined in this study. TF discriminant information can be effectively extracted and reduced using the proposed method. In essence, PCA aims to find the single best (in the sense of least-square error) subspace for a given dimension. In conventional array-based PCA, a 2D TF feature matrix is usually transformed into a 1D vector and modelled as a point in a high-dimensional vector space. However, this approach leads to several issues, such as the “curse of dimensionality” dilemma and the small sample size problem, leading to numerical instability, as well as high computational complexity and storage requirements in classification. Aimed at the problem of high volume feature space, this study has used 2D2PCA instead of PCA to find the optimal subspace. The size of individual row or column vectors in a matrix is much smaller than that of a single long vector transformed from the matrix, circumventing the high dimensionality. Furthermore, as the input feature vectors to be analysed are actually the rows or columns of the matrix, the feature set is significantly enlarged, avoiding the small sample size problem. However, the essence of 2D2PCA is similar to PCA, projecting the data along the directions of maximal variance. If we adopt an inverse operation, a matrix
The ST is a time–frequency tool generated by the combination of WT and STFT (Stockwell et al., 1996). It uniquely combines a frequency-dependent resolution that simultaneously localizes the real and imaginary spectra. Basis functions of the ST are Gaussian modulated cosinusoids, so that it is possible to use intuitive notions of cosinusoidal frequencies in interpreting and exploiting the resulting TF spectrum. With the advantage of fast lossless invertibility from time, to TF, and back to the time domain, the usage of the ST is very analogous to the Fourier transform. In the case of nonstationary interferences with noisy data, the ST provides patterns that closely resemble the interference type. Furthermore, the ST can be derived from the continuous WT choosing a specific mother wavelet and multiplying a phase correction factor. Thus, the ST can be interpreted as a phase-corrected continuous wavelet transform, possessing the merits of both wavelet multi-resolution analysis and STFT lossless invertibility. Though ST introduces some redundancy in the signal, there is no information lost. Therefore, ST provides better effects for denoising, feature extraction and localization of the signal. In order to handle the redundancy of ST, 2D2PCA is proposed to remove the redundant information and noise in the ST2D2PCA algorithm.
In order to test this approach, we used ST2D2PCA to extract and classify specific TF patterns in four-channel MESs from 10 health subjects and two amputees for identification of eight hand motions. The ST revealed subtle time-invariant pattern differences between movements, whilst 2D2PCA effectively solved the dilemma of high dimensionality in the subsequent classification. The segmented subspace and enlarged feature set are the major reasons for the optimal dimension of the proposed method being much lower than PCA on the same data set. In addition, each row of the vertical or horizontal scatter matrices captures the local information contained in the corresponding rows of the training matrix, characterizing the overall scatter of the corresponding local region for all training samples. Therefore, 2D2PCA is a feature extraction method combining local and global information simultaneously. This is the reason ST2D2PCA achieved higher accuracy than WTPCA. In regard to the comparison between ST, WT and STFT, ST integrates the strengths of both STF and WT and thus shows the best performance. The deficiencies such as fixed resolution, sensitivity to noise and severe frequency leakage make STFT yield the worst performance. Local muscle fatigue can significantly degrade the MES classification accuracy in prosthetic control and other MES-based human-mahchine interfaces (Xie et al., 2010). The effect of ST2D2PCA to classify MES recorded on fresh and fatigue muscles should be further investigated in the future study. In addition, the efficiency and effectiveness of the method can be further validated by using high-dimensional EEG, MEG, fMRI signals. Although the present study focuses on signal pattern classification, based on the PCs obtained from TFM, it is relatively straightforward to expand ST2D2PCA for signal compression, denoising, instantaneous frequency estimation and other related tasks.
Footnotes
Declaration of conflicting interest
The authors declare that there is no conflict of interest.
Funding
This research received no specific grant from any funding agency in the public, commercial or not-for-profit sectors.
