A novel data-driven method for structural health monitoring under ambient vibration and high-dimensional features by robust multidimensional scaling

Abstract

Dealing with the problem of large volumes of high-dimensional features and detecting damage under ambient vibration are critical to structural health monitoring. To address these challenges, this article proposes a novel data-driven method for early damage detection of civil engineering structures by robust multidimensional scaling. The proposed method consists of some simple but effective computational parts including a segmentation process, a pairwise distance calculation, an iterative algorithm regarding robust multidimensional scaling, a matrix vectorization procedure, and a Euclidean norm computation. AutoRegressive Moving Average models are fitted to vibration time-domain responses caused by ambient excitations to extract the model residuals as high-dimensional features. In order to increase the reliability of damage detection and avoid any false alarm, the extreme value theory is considered to determine a reliable threshold limit. However, the selection of an appropriate extreme value distribution is crucial and troublesome. To cope with this limitation, this article introduces the generalized extreme value distribution and its shape parameter for choosing the best extreme value model among Gumbel, Fréchet, and Weibull distributions. The main contributions of this article include developing a novel data-driven strategy for early damage detection and addressing the limitation of using high-dimensional features. Experimental data sets of two well-known civil structures are utilized to validate the proposed method along with some comparative studies. Results demonstrate that the proposed data-driven method in conjunction with the extreme value theory is highly able to detect damage under ambient vibration and high-dimensional features.

Keywords

Structural health monitoring, early damage detection, ambient vibration, time series analysis, high-dimensional features, robust multidimensional scaling

Introduction

Early damage detection is a necessity for many civil structures that play crucial roles in social life, transportation networks, and economics. In order to guarantee the safety and serviceability of important civil structures and avoid high costs of maintenance, rehabilitation, and reconstruction, it is essential to focus on structural health monitoring (SHM). Data-driven methods present practical approaches to SHM based on the statistical pattern recognition paradigm.¹ In comparison to the model-driven techniques, which generally require constructing an elaborate finite element model of the real structure and model updating,^2,3 the data-driven methods only consider raw measured data without any model construction or updating.

Due to directly applying measured data, feature extraction and feature classification are two key parts of any data-driven method. The process of feature extraction discovers meaningful information (features) from raw vibration measurements.¹ For the SHM applications, such features should be sensitive to damage; hence, those are usually known as damage-sensitive features (DSFs). Depending upon the nature and type of vibration data, advanced signal processing techniques provide powerful and effective approaches to extract diverse DSFs from vibration signals.⁴ Time series analysis by time-invariant linear representations is one of the robust feature extraction methods.⁵ Several research studies have utilized various time-invariant linear models for feature extraction such as AutoRegressive (AR),^6–8 AutoRegressive with eXogeneous input (ARX),⁹ AutoRegressive Moving Average (ARMA),^10,11 AutoRegressive-AutoRegressive with eXogeneous (AR-ARX),¹² and AutoRegressive Moving Average with eXogenous input (ARMAX).¹³ Using these models, it is possible to use the AR coefficients and the model residuals as the main DSFs.^6,14 However, the extraction of features sensitive to damage from measured vibration responses under unpredictable and unknown ambient excitations may be problematic.

The process of feature classification utilizes the DSFs extracted from the normal (known) and current (unknown) states of the structure so as to evaluate the global structural condition and detect any potential damage.^1,15 Hence, this process compares two different states of a structure in order to recognize the status of the current state, which can be either undamaged or damaged. Some effective and well-known feature classification methods include Mahalanobis distance (MD),^16–19 artificial neural networks,^20,21 and clustering.^23–26 One of the major challenges in feature classification is the emergence of high-dimensional DSFs or Big Data. The main limitation is that early damage detection with such features may be troublesome and time-consuming. Moreover, when the high-dimensional DSFs are obtained under unmeasurable and unknown ambient excitations, they may lead to unreliable damage detection results.

The initial solution is to reduce the size of data or DSFs by various dimensionality reduction techniques such as principal component analysis (PCA),^27,28 random projection,²⁹ deep autoencoders,^30,31 and so on to convert such features into low-dimensional spaces. However, the loss of important information is the main concern about dimensionality reduction techniques.³² Mujica et al.³³ compared four techniques including PCA, partial least square (PLS), and some extensions called multiway PCA and multiway PLS to reduce the dimension of data for damage identification. They concluded that the multiway approaches are very useful in systems that involve several sensors since those decrease the computation cost drastically. The other comparative study on the problem of dimensionality reduction in the context of SHM can be found in the article of Rebillat and Mechbal,³⁴ who applied several methods including simple direct regression, PCA, PLS, canonical correlation analysis, and autoencoders. They concluded that among the mentioned techniques, PCA, PLS, and canonical correlation analysis are all able to discover a low-dimensional space in their problem.

The other challenging issue in feature classification for any data-driven SHM method is to determine a reliable threshold limit that should be able to distinguish the damaged state from the normal condition. The threshold limit determination is a crucial process because an inaccurate choice increases false alarm (false positive) and false detection (false negative) errors that cause confusing results.³⁵ In reality, the information of the current state of the structure is usually unknown. Therefore, in most cases of SHM applications, the threshold limit determination is carried out by considering the probabilistic properties of samples or outputs (e.g. damage indices) associated with the normal or undamaged condition. For this purpose, it is necessary to assume a probability distribution model for such samples or outputs and calculate a statistical quantity (i.e. a percentile of the distribution of interest), which is incorporated as a threshold value.¹⁶ The simplest way is to consider the Gaussian or normal distribution and estimate a threshold limit based on a standard confidence interval (CI). Despite the simplicity and computational efficiency of this approach, Sarmadi and Karamodin¹⁶ demonstrated that it is not able to provide a correct and reliable threshold when the samples or outputs regarding the undamaged state are non-Gaussian or heavy-tailed. To address this problem, one can utilize extreme value (EV) statistics and obtain a threshold value by fitting a proper EV distribution.^36–38 However, the selection of an appropriate EV distribution among three potential models (i.e. Gumbel, Fréchet, and Weibull) is not trivial.

This article proposes a novel data-driven method for early damage detection under high-dimensional features and ambient vibration by robust multidimensional scaling (RMDS). The RMDS is an improvement on the traditional multidimensional scaling (MDS). Both techniques aim to create low-dimensional projections by retaining pairwise distances between data samples as much as possible. The major disadvantage of the MDS is that it may suffer from outliers and uncertainties.³⁹ In most real cases of SHM, the outliers and uncertainties may be noise in measurements, unknown ambient excitations, environmental variability, and so on that may seriously affect the performance of any feature classification method. On this basis, the RMDS is proposed to develop a novel data-driven SHM method. This method consists of some simple but effective computational steps including the segmentation of high-dimensional DSFs, the pairwise distance computation by the Euclidean-squared distance (ESD), the implementation of the RMDS iterative algorithm, the matrix vectorization process, and the Euclidean norm calculation. The process of feature extraction under ambient vibration is performed by fitting ARMA models to measured vibration responses and extracting their residuals as randomly high-dimensional DSFs. The major contribution of this article is to develop a novel data-driven strategy for detecting damage based on the output of the RMDS as a new method for dimensionality reduction. Dealing with the problem of using large volumes of high-dimensional DSFs for early damage detection is the main advantage of the proposed method. Concerning the challenge of the threshold limit determination based on the EV theory, this article introduces the generalized EV (GEV) distribution and its shape parameter to select the most appropriate EV model. 
The effectiveness and performance of the proposed RMDS-based method are validated by experimental data sets of two well-known benchmark models, the International Association for Structural Control–American Society of Civil Engineers (IASC-ASCE) structure⁴⁰ and the Tianjin-Yonghe Bridge in China.⁴¹ Comparative studies are also conducted to demonstrate the superiority of the proposed RMDS-based method along with the EV-based threshold determination over some existing techniques. Results demonstrate that the proposed RMDS-based method in conjunction with the EV statistics provides an influential and reliable tool for detecting damage under ambient vibration and high-dimensional DSFs.

Feature extraction by ARMA model

Time series analysis is a powerful statistical method for modeling vibration time-domain signals (e.g. acceleration time histories) and extracting meaningful information that can be interpreted as the DSFs. The type and nature of vibration signals are important factors for choosing an appropriate time series model. From a statistical viewpoint, time series is a sequence of values at a time interval that is observable as stationary or non-stationary, deterministic or random, linear or nonlinear, and so on. When time series data are linear and stationary, a suitable choice for time series modeling is to use time-invariant representations including AR, ARX, ARMA, AR-ARX and ARMAX.⁴²

Depending upon the source of excitation, the process of time series modeling is generally decomposed into input–output and output-only problems. For the input–output problem, it is necessary to have both the excitation (input) and response (output) signals and choose a time series model between ARX and ARMAX. When the input data are unknown, as is the case for ambient excitations, the process of time series modeling is an output-only problem, in which only the vibration responses of the structure are available. Under such circumstances, one can apply AR and ARMA for modeling the structural responses. An important note is that the vibration time-domain signals resulting from the ambient excitations conform to the ARMA model.¹¹ Therefore, it is important to use this representation for modeling the structural responses caused by ambient vibration.⁴³ Supposing that y(t) is a vibration signal (structural response) at time t, the ARMA model is expressed as

y (t) = \sum_{i = 1}^{n a} φ_{i} y (t - i) + \sum_{j = 1}^{n c} ψ_{j} e (t - j) + e (t)

(1)

where na and nc denote the orders of the AR and MA terms of the ARMA model; Φ = [φ₁ … φ_na], Ψ = [ψ₁ … ψ_nc] are the vectors of the AR and MA coefficients, respectively. In equation (1), e(t) is the residual of the model at time t, which can be written as follows

e (t) = y (t) - \hat{y} (t) = y (t) - (\sum_{i = 1}^{na} φ_{i} y (t - i) + \sum_{j = 1}^{nc} ψ_{j} e (t - j))

(2)

This value corresponds to the difference between the actual vibration signal y(t) and the predicted time series ŷ(t) obtained from the ARMA model. Determination of sufficient orders of time series representations is a crucial part of time series modeling, because a selection of appropriate orders guarantees the model accuracy and adequacy. In other words, a sufficient order is one that enables the time series model to generate uncorrelated residuals; otherwise, the order should be increased.⁶ Considering the equality of AR and MA orders (na = nc), it is possible to determine them by checking the uncorrelatedness of the ARMA model residuals through the Ljung–Box test. This is a statistical hypothesis test that assesses the correlation between the residual sequences of a time series model.⁴² Accordingly, if the p-value of the test is greater than a significance level, one can conclude that the residual sequences are uncorrelated. On this basis, the iterative order determination technique proposed by Entezami and Shariatmadar⁶ is exploited to choose the sufficient orders of the ARMA model. Although this technique was proposed to determine the order of the AR model, its great merit is that one can utilize it for other kinds of time series models.
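The uncorrelatedness check described above can be illustrated with a minimal sketch of the Ljung–Box statistic, Q = n(n + 2) Σ_k r_k²/(n − k), where r_k is the sample autocorrelation of the residuals at lag k. The function names and the toy sequence are illustrative assumptions, and the number of lags is not specified in the text. The p-value would then follow from a chi-square distribution (for example, through a statistics library), which is omitted here to keep the sketch dependency-free.

```python
def autocorrelation(x, lag):
    """Sample autocorrelation of the sequence x at a positive lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t - lag] - mean) for t in range(lag, n))
    return num / denom


def ljung_box_q(residuals, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k r_k^2 / (n - k), k = 1..max_lag."""
    n = len(residuals)
    return n * (n + 2) * sum(
        autocorrelation(residuals, k) ** 2 / (n - k) for k in range(1, max_lag + 1)
    )


# Toy example (hypothetical data): a strongly correlated, alternating sequence.
alternating = [1.0, -1.0] * 50
q_stat = ljung_box_q(alternating, max_lag=1)
print(q_stat)  # a large Q indicates correlated residuals, i.e. an insufficient order
```

In an iterative order search, the order would be increased until the p-value associated with Q exceeds the chosen significance level.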

The process of feature extraction via time series modeling is decomposed into the coefficient-based and residual-based approaches.⁶ For the ARMA model, the coefficient-based approach is intended to extract the AR coefficients, which are directly related to the structural properties. In this regard, it is necessary to determine the orders of the ARMA model from the vibration signals of the normal condition and then estimate the AR coefficients of the normal and damaged states by one of the computational techniques. To use the model residuals as the DSFs, one first obtains the orders and coefficients of the ARMA representation from the normal condition only and then extracts the model residuals. In the damaged state, the model information (the orders and coefficients) gained from the normal condition is reused to extract the residuals of the ARMA model. The central idea behind the residual-based approach is that the model obtained from the normal state cannot give a reliable goodness-of-fit for time series modeling in the damaged state, which leads to increases in the model residuals.⁶ Although both the coefficient-based and residual-based approaches are applicable to feature extraction, the latter does not require any order determination and parameter estimation in the damaged state. Therefore, the use of the model residuals is more advantageous to the process of feature extraction.
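As a minimal sketch of the residual-based approach, the recursion of equation (2) can be applied to any response signal with a fixed set of coefficients. The function names, the toy coefficients, and the zero initial conditions are assumptions made for illustration; the coefficient estimation itself is not shown.

```python
def arma_residuals(y, phi, psi):
    """Residuals e(t) = y(t) - sum_i phi_i y(t-i) - sum_j psi_j e(t-j), as in equation (2).

    Samples before the start of the record are treated as zero (an assumption).
    """
    e = []
    for t in range(len(y)):
        ar_part = sum(phi[i] * y[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma_part = sum(psi[j] * e[t - 1 - j] for j in range(len(psi)) if t - 1 - j >= 0)
        e.append(y[t] - ar_part - ma_part)
    return e


def simulate_arma(innovations, phi, psi):
    """Generate y(t) from equation (1) for a given innovation sequence (toy data only)."""
    y = []
    for t, e_t in enumerate(innovations):
        ar_part = sum(phi[i] * y[t - 1 - i] for i in range(len(phi)) if t - 1 - i >= 0)
        ma_part = sum(psi[j] * innovations[t - 1 - j] for j in range(len(psi)) if t - 1 - j >= 0)
        y.append(ar_part + ma_part + e_t)
    return y


# Hypothetical baseline model ARMA(1, 1) and a short innovation sequence.
phi, psi = [0.5], [0.2]
innovations = [0.3, -0.1, 0.4, -0.2, 0.05, 0.1]
response = simulate_arma(innovations, phi, psi)

# Applying the same (baseline) coefficients recovers the innovations as residuals.
recovered = arma_residuals(response, phi, psi)
```

For a response from a changed structure, the same call with the baseline `phi` and `psi` would return residuals that no longer match an uncorrelated innovation sequence, which is the effect the residual-based approach relies on.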

RMDS

The RMDS is a technique for reducing the dimension of sampling data and analyzing the dissimilarity on a set of samples by considering outliers and uncertainties in data. In fact, this technique is an improvement of the MDS,⁴⁴ which is sensitive to outliers and uncertainties.³⁹ The RMDS aims to establish a dissimilarity model that explicitly accounts for outliers and find an embedding matrix by solving the proposed dissimilarity model in an iterative manner.³⁹ Consider a multivariate data set of n vectors of m-dimensional samples X = [x₁, x₂, …, x_n] ∈ ℝ^n×m. Using the conventional ESD, the dissimilarity between the vectors x_h and x_r is given by

δ_{h, r} (X) = (x_{h} - x_{r}) {(x_{h} - x_{r})}^{T}

(3)

where h,r = 1, 2, …, n. The dissimilarity should satisfy δ_h,r(X) ≥ 0, δ_h,r(X) = δ_r,h(X), and δ_h,h(X) = δ_r,r(X) = 0. By calculating all dissimilarities, one can obtain the distance matrix D ∈ ℝ^n×n, which is used in the RMDS algorithm. Given D, this algorithm seeks an embedding of n vectors in a v-dimensional space (n > v), which leads to the embedding or configuration matrix U = [u₁ … u_n] ∈ ℝ^n×v so that the pairwise distance d_h,r(U) = ||u_h − u_r||₂ approximates δ_h,r(X). Note that ||.||₂ refers to the Euclidean norm. In order to take into account outliers and find U, the RMDS algorithm presents a dissimilarity model as follows

δ_{h, r} (X) = d_{h, r} (U) + o_{h, r} + ε_{h, r}

(4)

where o_h,r denotes an outlier variable of δ_h,r(X), and ε_h,r is an independent random value. All outlier variables are collected into the outlier matrix O ∈ ℝ^n×n. Hence, the estimation of O and U in the RMDS algorithm is performed by minimizing the following objective function

f (O, U) = \sum_{h < r} {(δ_{h, r} (X) - d_{h, r} (U) - o_{h, r})}^{2} + λ \sum_{h < r} {‖ o_{h, r} ‖}_{0}

(5)

where λ > 0 is a regularization value that controls the sparsity of O and corresponds to the l₀-norm (||.||₀) of the outlier matrix. The minimization of equation (5) in the RMDS algorithm is based on an iterative solver using a majorization–minimization (MM) approach.³⁹ For this purpose, one needs to expand f(O, U), which yields

\begin{matrix} f (O, U) = \frac{1}{2} ‖ D - O ‖_{F}^{2} \\ - 2 \sum_{h < r} (δ_{h, r} (X) - o_{h, r}) d_{h, r} (U) + Tr (UL U^{T}) + \frac{λ}{2} ‖ O ‖_{1} \end{matrix}

(6)

in which

‖ O ‖_{1} = \sum_{h < r} | o_{h, r} |

(7)

In equation (6), L is a matrix with diagonal and off-diagonal entries equal to 1 and −1, respectively. Furthermore, Tr refers to the trace operator of a matrix. Supposing that V is an arbitrary matrix, one can define the matrices A₁(O,V) and A₂(O,V) in the following forms

[A_{1} (O, V)]_{h, r} = \begin{cases} \frac{δ_{h, r} (X) - o_{h, r}}{d_{h, r} (V)} & \text{if } δ_{h, r} (X) > o_{h, r} \text{ and } d_{h, r} (V) > 0 \\ 0 & \text{otherwise} \end{cases}

(8)

[A_{2} (O, V)]_{h, r} = \begin{cases} - \frac{δ_{h, r} (X) - o_{h, r}}{d_{h, r} (V)} & \text{if } δ_{h, r} (X) \leq o_{h, r} \text{ and } d_{h, r} (V) > 0 \\ 0 & \text{otherwise} \end{cases}

(9)

Accordingly, their Laplacian matrices are given by

L_{1} (O, V) = diag (A_{1} (O, V) l) - A_{1} (O, V)

(10)

L_{2} (O, V) = diag (A_{2} (O, V) l) - A_{2} (O, V)

(11)

where l denotes an n×1 vector of all ones. Using the above-mentioned matrices, the majorizer of f(O, U) is given by

\begin{array}{l} g (O, U; V) = T r (U (L + L_{2} (O, U)) U^{T}) \\ + α (O, V) - 2 T r (U L_{1} (O, V) V^{T}) \\ + \frac{1}{2} {‖ D - O ‖}_{F}^{2} + \frac{λ}{2} {‖ O ‖}_{1} \end{array}

(12)

For δ_h,r(X) ≤ o_h,r and d_h,r(U) > 0, α(O,V) is expressed as

α (O, V) = \sum_{h < r} (δ_{h, r} (X) - o_{h, r}) d_{h, r} (V)

(13)

Therefore, the MM approach provides an iterative algorithm to minimize f(O, U) and obtain the matrices O and U at (t+1) iterations in the following forms

O_{(t + 1)} = \arg min g (O, U_{(t)}; U_{(t)})

(14)

U_{(t + 1)} = \arg min g (O_{(t + 1)}, U; U_{(t)})

(15)

For equation (14), each entry of O_(t+1) corresponds to the solution of

\begin{array}{l} \min {(δ_{h, r} (X) - d_{h, r} (U_{(t)}) - o_{h, r})}^{2} \\ + λ | o_{h, r} | \end{array}

(16)

which is a scalar Lasso problem, whose solution is expressible by using the operator S_λ as follows

S_{λ} (x) = sgn (x) β (x)

(17)

where “sgn” is the sign function and β(x) = max(0, |x|−λ/2). Using these expressions, the solution to equation (16) can be written as

o_{h, r}^{(t + 1)} = S_{λ} (δ_{h, r} (X) - d_{h, r} (U_{(t)}))

(18)

Since the majorizer g is defined via the matrices L₁ and L₂, the update of U in equation (15) can be expressed as

U_{(t + 1)} = U_{(t)} L_{1} (O_{(t + 1)}, U_{(t)}) L^{+}

(19)

where “+” denotes the Moore–Penrose pseudo-inverse. It is worth remarking that the arbitrary matrix V in equations (8) to (11) is the embedding matrix at the tth iteration; that is, V = U_(t). In order to terminate the iteration in the MM approach, one needs to define a stopping condition. For this aim, the RMDS method terminates when the relative error ||U_(t+1) − U_(t)||_F/||U_(t+1)||_F is smaller than a small positive tolerance.
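To make the outlier step concrete, the following minimal sketch evaluates the ESD of equation (3), the soft-thresholding operator of equation (17), the entry-wise outlier update of equation (18), and the relative-error stopping measure. It is not a full RMDS implementation: the embedding update of equation (19) is omitted, and the function names, toy samples, and value of λ are illustrative assumptions.

```python
import math


def squared_euclidean(x_h, x_r):
    """Equation (3): dissimilarity as the Euclidean-squared distance (ESD)."""
    return sum((a - b) ** 2 for a, b in zip(x_h, x_r))


def euclidean(u_h, u_r):
    """Pairwise distance d_{h,r}(U) = ||u_h - u_r||_2 between embedded points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u_h, u_r)))


def soft_threshold(x, lam):
    """Equation (17): S_lambda(x) = sgn(x) * max(0, |x| - lambda / 2)."""
    return math.copysign(max(0.0, abs(x) - lam / 2.0), x)


def update_outliers(X, U, lam):
    """Equation (18): o_{h,r} = S_lambda(delta_{h,r}(X) - d_{h,r}(U)) for every pair h < r."""
    n = len(X)
    O = [[0.0] * n for _ in range(n)]
    for h in range(n):
        for r in range(h + 1, n):
            gap = squared_euclidean(X[h], X[r]) - euclidean(U[h], U[r])
            O[h][r] = O[r][h] = soft_threshold(gap, lam)
    return O


def relative_change(U_new, U_old):
    """Stopping measure ||U_new - U_old||_F / ||U_new||_F."""
    num = math.sqrt(sum((a - b) ** 2 for ra, rb in zip(U_new, U_old) for a, b in zip(ra, rb)))
    den = math.sqrt(sum(a ** 2 for row in U_new for a in row))
    return num / den


# Toy data (hypothetical): three 2-D samples, a 1-D embedding, and lambda = 2.
X = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0]]
U = [[0.0], [1.0], [2.0]]
O = update_outliers(X, U, lam=2.0)
```

Pairs whose dissimilarity is already matched by the embedding receive a zero outlier entry, while poorly matched pairs receive a nonzero entry shrunk by λ/2, which is how λ controls the sparsity of O.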

Proposed data-driven method for SHM

Segmentation of high-dimensional DSFs

Assume that E and $\bar{E}$ ∈ ℝ^nd×ns are the residual data sets (matrices) of the ARMA models in the undamaged and damaged states, where nd and ns denote the numbers of residual samples and sensors (nd≫ns), respectively. Due to the high dimensionality of the residual matrices, it is necessary to divide them into several low-dimensional partitions with the same dimension as shown in Figure 1. In this regard, the matrices E and $\bar{E}$ can be segmented into p partitions with np samples; that is, $E_{1}^{*}, \dots, E_{p}^{*}$ and ${\bar{E}}_{1}^{*}, \dots, {\bar{E}}_{p}^{*}$ , where E^* and ${\bar{E}}^{*}$ ∈ ℝ^np×ns (nd > np≫ns). Based on the descriptions in the previous section, each of the residual matrices $E_{1}^{*}, \dots, E_{p}^{*}$ and ${\bar{E}}_{1}^{*}, \dots, {\bar{E}}_{p}^{*}$ plays the role of X, in which case np = n and ns = m.

Figure 1.

Graphical representation of the segmentation of the residual data sets of the normal and damaged states.
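The segmentation step can be sketched as follows. The sketch assumes contiguous, non-overlapping partitions and that nd is divisible by p; neither detail is stated explicitly in the text, and the function name and toy matrix are hypothetical.

```python
def segment_rows(E, n_partitions):
    """Split an nd x ns matrix (a list of rows) into p partitions of nd // p rows each."""
    nd = len(E)
    if nd % n_partitions != 0:
        raise ValueError("nd must be divisible by the number of partitions")
    rows_per_partition = nd // n_partitions
    return [
        E[k * rows_per_partition:(k + 1) * rows_per_partition]
        for k in range(n_partitions)
    ]


# Toy residual matrix (hypothetical): nd = 6 samples, ns = 2 sensors, p = 3 partitions.
E = [[i, 10 * i] for i in range(6)]
segments = segment_rows(E, n_partitions=3)
```

Each element of `segments` is then treated as a separate data set X with n = np rows and m = ns columns.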

Determination of embedding norm values as damage indicators

After the segmentation of the residual data sets of the normal and damaged states, one initially needs to compute the distance matrices of $E_{1}^{*}, \dots, E_{p}^{*}$ and ${\bar{E}}_{1}^{*}, \dots, {\bar{E}}_{p}^{*}$ by the ESD technique or equation (3). For the np vectors of the residual data sets, the distance matrices are of size np × np. Next, the RMDS algorithm based on the iterative MM approach is applied to obtain the embedding matrices of all segments for the normal and damaged states; that is, U₁, …, U_p and Ū₁, …, Ū_p, where U and Ū ∈ ℝ^np×v. When np is relatively large, it is not trivial to use the large embedding matrices of all partitions directly for feature classification. To address this limitation, the matrix vectorization technique is used to convert the embedding matrices into vectors u₁, …, u_p and ū₁, …, ū_p, each of which includes nv samples (nv = np × v).

To obtain a damage indicator, an embedding norm value is defined by calculating the Euclidean norm of each embedding vector of the undamaged and damaged states. Taking the norm values of all segments (i.e. ||u₁||₂, …, ||u_p||₂ and ||ū₁||₂, …, ||ū_p||₂) into consideration, one can combine them into the distance vector d as follows

d = [{‖ u_{1} ‖}_{2}, \dots, {‖ u_{p} ‖}_{2}, {‖ {\bar{u}}_{1} ‖}_{2}, \dots, {‖ {\bar{u}}_{p} ‖}_{2}] \in ℝ^{2 p}

(20)

where the first p embedding norm values belong to the normal condition and the remaining quantities are associated with the damaged state. In some cases, the measurement of vibration data is repeated several times in order to increase the reliability of data acquisition. Under such circumstances, there are large volumes of vibration data sets leading to massive DSFs. Assuming that nm refers to the number of test measurements, one can extract nm sets of the DSFs (i.e. the residual matrices E and $\bar{E}$ ). Therefore, the determination of embedding norm values in the normal and damaged conditions is repeated nm times, which leads to a larger distance vector d_T = [d₁ … d_nm] ∈ ℝ^(2p×nm) obtained from sequentially arranging the nm vectors of d. It is worth remarking that the first p × nm embedding norm values of d_T belong to the normal condition and the remaining quantities are associated with the current (possibly damaged) state. For the problem of early damage detection, it is necessary to define a threshold limit by using the first p × nm embedding norm values in d_T. On this basis, any deviation of the embedding norm value of the current state from the threshold limit is representative of the damage occurrence. For the sake of convenience, Figure 2 shows the flowchart of the proposed RMDS-based method.

Figure 2.

Flowchart of the proposed RMDS-based method for SHM.
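The vectorization and norm steps that produce equation (20) can be sketched as below, assuming the embedding matrices have already been obtained. The function names and toy embeddings are hypothetical. Since the Euclidean norm of a vectorized matrix equals its Frobenius norm, the ordering used during vectorization does not affect the result.

```python
import math


def embedding_norm(U):
    """Vectorize an np x v embedding matrix and return the Euclidean norm of the vector."""
    vector = [value for row in U for value in row]  # nv = np * v entries
    return math.sqrt(sum(value ** 2 for value in vector))


def distance_vector(normal_embeddings, current_embeddings):
    """Equation (20): norms of the p normal segments followed by those of the p current segments."""
    return [embedding_norm(U) for U in normal_embeddings] + [
        embedding_norm(U) for U in current_embeddings
    ]


# Toy embeddings (hypothetical): p = 2 segments per state, each of size 1 x 2.
normal = [[[3.0, 4.0]], [[0.0, 1.0]]]
current = [[[6.0, 8.0]], [[0.0, 2.0]]]
d = distance_vector(normal, current)
print(d)  # [5.0, 1.0, 10.0, 2.0]
```

The first p entries of `d` would be used to set the threshold, and the remaining p entries are compared against it.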

Threshold limit determination

EV statistics

In statistics and probability, the EV theory is a branch of order statistics that focuses on modeling the tails of probability distributions.^45,46 For independent and identically distributed data, the EV distributions can only take one of the three families including Gumbel (Type 1), Fréchet (Type 2), or Weibull (Type 3). Assume that H(x) is a non-degenerate limit function. On this basis, the Gumbel distribution model is expressed as

H_{1} (x) = \exp (- \exp (- \frac{x - μ}{σ}))

(21)

where −∞< x <∞ and σ > 0. Moreover, the Fréchet and Weibull distribution models are formulated in the following forms

H_{2} (x) = \begin{cases} \exp (- {(\frac{σ}{x - μ})}^{ξ}) & x \geq μ \\ 0 & x < μ \end{cases}

(22)

H_{3} (x) = \begin{cases} 1 & x \geq μ \\ \exp (- {(\frac{μ - x}{σ})}^{ξ}) & x < μ \end{cases}

(23)

In equations (21) to (23), μ, σ, and ξ are the parameters of the EV distributions known as the location, scale, and shape, respectively (the Gumbel distribution has no shape parameter).⁴⁶ Given the maximum values of a data set, it is possible to select a proper limit distribution among H₁(x), H₂(x), and H₃(x) and fit an EV distribution model to the maximum values. Once the type of EV distribution and its unknown parameters have been obtained, the threshold limit under a significance level (α) is computed by inverting the limit function of the selected EV distribution. In the Gumbel-type EV distribution, for example, one needs to invert the following equation

\exp (- \exp (- \frac{x - μ}{σ})) = 1 - \frac{α}{2}

(24)

which leads to the upper-limit threshold value (τ₁) as follows

τ_{1} = x = μ - σ \ln (- \ln (1 - \frac{α}{2}))

(25)

The same process can be implemented for the other EV distributions to define their threshold limits in the following forms

τ_{2} = x = μ + \frac{σ}{{(- \ln (1 - \frac{α}{2}))}^{1 / ξ}}

(26)

τ_{3} = x = μ - σ {(- \ln (1 - \frac{α}{2}))}^{1 / ξ}

(27)
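The three threshold expressions in equations (25) to (27) can be checked numerically by substituting each threshold back into the corresponding limit function of equations (21) to (23), which should return 1 − α/2. The sketch below does this for arbitrary, hypothetical parameter values; in practice the parameters would come from fitting the selected distribution.

```python
import math


def gumbel_threshold(mu, sigma, alpha):
    """Equation (25)."""
    return mu - sigma * math.log(-math.log(1.0 - alpha / 2.0))


def frechet_threshold(mu, sigma, xi, alpha):
    """Equation (26)."""
    return mu + sigma / (-math.log(1.0 - alpha / 2.0)) ** (1.0 / xi)


def weibull_threshold(mu, sigma, xi, alpha):
    """Equation (27)."""
    return mu - sigma * (-math.log(1.0 - alpha / 2.0)) ** (1.0 / xi)


def gumbel_cdf(x, mu, sigma):
    """Equation (21)."""
    return math.exp(-math.exp(-(x - mu) / sigma))


def frechet_cdf(x, mu, sigma, xi):
    """Equation (22); the value at or below the location is taken as zero."""
    return math.exp(-((sigma / (x - mu)) ** xi)) if x > mu else 0.0


def weibull_cdf(x, mu, sigma, xi):
    """Equation (23)."""
    return math.exp(-(((mu - x) / sigma) ** xi)) if x < mu else 1.0


# Hypothetical parameters, chosen only to exercise the formulas.
mu, sigma, xi, alpha = 1.0, 2.0, 3.0, 0.05
tau_1 = gumbel_threshold(mu, sigma, alpha)
tau_2 = frechet_threshold(mu, sigma, xi, alpha)
tau_3 = weibull_threshold(mu, sigma, xi, alpha)
```

With these definitions, `gumbel_cdf(tau_1, mu, sigma)`, `frechet_cdf(tau_2, mu, sigma, xi)`, and `weibull_cdf(tau_3, mu, sigma, xi)` all evaluate to 1 − α/2 up to rounding.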

Selection of an appropriate EV distribution by GEV

The process of threshold limit determination by the EV statistics depends strongly on the choice of an appropriate EV distribution among Gumbel, Fréchet, and Weibull families. To deal with this limitation, this article utilizes an effective and simple approach based on the GEV. It is a family of continuous probability distributions developed within the EV theory. The main objective of GEV is to integrate the three EV distributions into a single family of distribution as follows

H (x) = \exp (- {(1 + ξ_{G} (\frac{x - μ_{G}}{σ_{G}}))}^{- \frac{1}{ξ_{G}}})

(28)

where μ_G, σ_G, and ξ_G represent the location, scale, and shape of the GEV distribution model. The key characteristic of the GEV distribution is the ability to recognize the type of EV distribution. When ξ_G < 0, the GEV is the Weibull-type EV distribution. In contrast, the GEV conforms to the Fréchet-type EV distribution for ξ_G > 0. Finally, if ξ_G is identical to zero, one can realize that the GEV is the Gumbel-type EV distribution.⁴⁵ Therefore, it is only necessary to estimate the shape parameter of the GEV and then choose an EV distribution for the threshold limit determination. It is important to mention that the maximum-likelihood estimation (MLE) is applied to estimate the unknown parameters of the EV and GEV distributions.
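The selection rule based on the sign of ξ_G can be sketched as follows. Because an estimated shape parameter is never exactly zero, the sketch treats values within a small tolerance of zero as Gumbel; this tolerance, its default value, and the function names are assumptions for illustration and are not taken from the text. The parameter estimation by MLE is not shown. When a library is used for the fit, its sign convention for the shape parameter should be checked, since some implementations define the shape with the opposite sign to ξ_G in equation (28).

```python
import math


def gev_cdf(x, mu, sigma, xi):
    """Equation (28); for xi = 0 the Gumbel limit of equation (21) is used."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-z))
    base = 1.0 + xi * z
    if base <= 0.0:
        # Outside the support: below the lower bound (xi > 0) or above the upper bound (xi < 0).
        return 0.0 if xi > 0.0 else 1.0
    return math.exp(-(base ** (-1.0 / xi)))


def select_ev_family(xi_g, tol=1e-2):
    """Choose the EV family from the estimated GEV shape parameter (tol is an assumed tolerance)."""
    if abs(xi_g) <= tol:
        return "Gumbel"
    return "Frechet" if xi_g > 0.0 else "Weibull"


# Hypothetical shape estimates.
families = [select_ev_family(xi) for xi in (0.3, -0.3, 0.001)]
print(families)  # ['Frechet', 'Weibull', 'Gumbel']
```

The selected family then determines which of the threshold expressions in equations (25) to (27) is applied.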

Applications

In this section, the experimental data sets of two well-known benchmark structures in SHM are used to validate the effectiveness and reliability of the proposed methods. The first one is a model-scale four-story steel structure related to the second phase of the IASC-ASCE SHM problem.⁴⁰ Another structure is a full-scale cable-stayed bridge, which belongs to Structural Monitoring and Control (SMC) at the Harbin Institute of Technology in China.⁴¹ In both the structures, ambient vibrations are the main excitation sources.

The IASC-ASCE structure

The four-story structure of the IASC-ASCE problem consisted of two-bay-by-two-bay steel frames measuring 2.5 m × 2.5 m in plan and 3.6 m in height as shown in Figure 3(a). The members were hot-rolled grade 300 W steel with the nominal yield stress of 300 MPa. The columns and floor beams were constructed from B100 × 9 and S75 × 11 sections, respectively. In each bay, the bracing system included two 12.7 mm diameter threaded steel rods placed in parallel along the diagonal. To make a reasonably realistic mass distribution, there was one floor slab per bay per floor so that four 1000 kg slabs were placed at each of the first, second, and third levels and four 750 kg slabs on the fourth floor. On each floor, two of the masses were placed off-center to increase the degree of coupling between the translational motions of the structure. Figure 3(b) depicts the plan of the IASC-ASCE structure.

Figure 3.

(a) The IASC-ASCE structure and (b) the sensor numbers and locations.⁹

The structure was subjected to ambient vibrations, including environmental excitations due to the wind, pedestrians, and traffic. The vibration signals were acquired by 15 accelerometers with 5 V/g sensitivity distributed on the four stories and the base of the structure as shown in Figure 3(b). The force balance accelerometers (FBA) were located on the east and west frames to measure the acceleration time histories in the north–south direction (along the strong axis). The EPI accelerometers were installed near the center column of the structure to measure acceleration responses in the east–west direction (along the weak axis). It should be mentioned that the vibration responses of Sensors 1–3 mounted on the base do not provide relevant information about the dynamic behavior of the structure. For this reason, they are excluded from the process of feature extraction. The damage scenarios of the IASC-ASCE structure were simulated by removing some braces from the east, south-east, and north sides (the first pattern) and loosening bolts at the beam–column connections (the second pattern). This article considers the first damage pattern to evaluate the performance of the proposed methods for early damage detection. Table 1 lists the undamaged case and the four damaged cases resulting from the elimination of the bracing systems from the east and south-east sides.

Table 1.

The undamaged and damaged conditions of the IASC-ASCE structure.

Case no.	Structural condition	Description
1	Undamaged	Full-braced structural system
2	Damaged	Removing the braces of all floors from the east side
3	Damaged	Removing the braces of all floors from the south-east corner
4	Damaged	Removing the braces of the first and fourth floors from the south-east corner
5	Damaged	Removing the braces of the first floor from the south-east corner

IASC: International Association for Structural Control; ASCE: American Society of Civil Engineers.

Before performing the process of feature extraction, it is necessary to implement some signal pre-processing techniques such as data detrending (i.e. removing linear trends from time series) and standardizing (i.e. normalizing time series by its mean and standard deviation). After that, the initial step of response modeling via the ARMA representation is to determine the model orders (na and nc). Using the iterative order determination technique proposed by Entezami and Shariatmadar⁶ for the ARMA model, Table 2 presents the values of na and nc as well as the p-values of the Ljung–Box test under the 5% significance level. All p-values in Table 2 are larger than the significance level (0.05), in which case one can infer that the residuals of ARMA models for the undamaged condition are uncorrelated. As a sample, Figure 4 indicates the evolution of the p-values associated with the residuals of Sensor 15. It is clear that the p-value at the 54th iteration is greater than 0.05; hence, the appropriate value for na and nc is 54.

Table 2.

The orders of ARMA models and the p-values of the Ljung-Box test for the first case.

Sensor no.	na	nc	p-value
4	45	45	0.1422
5	42	42	0.7971
6	40	40	0.1775
7	61	61	0.1280
8	43	43	0.8807
9	39	39	0.0819
10	64	64	0.1652
11	48	48	0.2876
12	40	40	0.0853
13	47	47	0.2793
14	46	46	0.5056
15	54	54	0.0754

ARMA: AutoRegressive Moving Average.

Figure 4.

The evolution of p-values of the ARMA residuals at Sensor 15.
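For reference, the Ljung-Box statistic behind these p-values has the standard form below. The number of lags $h$ and the degrees of freedom used by the authors are not stated in this section, so the degrees-of-freedom correction shown is the textbook choice for residuals of a fitted ARMA(na, nc) model and should be read as an assumption.

$$
Q = n\,(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_{k}^{2}}{n-k}, \qquad p\text{-value} = P\!\left(\chi^{2}_{h - n_a - n_c} > Q\right),
$$

where $n$ is the number of residual samples and $\hat{\rho}_{k}$ is the sample autocorrelation of the residuals at lag $k$. A p-value above 0.05 means that the hypothesis of uncorrelated residuals is not rejected at the 5% significance level.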

Using the orders obtained by the iterative algorithm, the coefficients of the ARMA models of the normal condition are estimated by the prediction-error technique.⁴² The uncorrelated residuals at all sensors of the first case are then extracted as the random DSFs, which form the residual data set (matrix) of the normal condition E ∈ ℝ^60,000×12 (i.e. nd = 60,000 and ns = 12). Based on the residual-based feature extraction, the ARMA models (i.e. the orders and coefficients) obtained from the first case are used to extract the residuals of the damaged cases. For each case, one obtains a matrix of residual samples $\bar{E}$ ∈ ℝ^60,000×12. To demonstrate the sensitivity of the ARMA model residuals to damage, Figure 5 compares the residual samples of the normal condition with those of each damaged case at Sensor 15. Based on Figure 3(b) and the descriptions in Table 1, Sensor 15 (at the east side of the fourth floor) is near the damaged area of the structure in Cases 2–4 but not in Case 5. As can be seen in Figure 5(a) to (c), there are clear increases in the ARMA residuals of Cases 2–4 compared to Case 1. In contrast, since Sensor 15 is not located in the damaged area of Case 5, no increase in the residuals is observable in Figure 5(d) between Cases 1 and 5. Therefore, the observations in Figure 5 confirm the sensitivity of the ARMA residuals to damage.
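The residual-based feature extraction described above can be sketched as an inverse filter applied with fixed baseline coefficients. The sketch assumes the model convention A(q)y(t) = C(q)e(t) with zero initial conditions; the function names, the low-order coefficients, and the simulated signals are hypothetical and chosen only for illustration.

```python
import random


def arma_residuals(y, a, c):
    """Residuals e(t) of an ARMA model A(q) y(t) = C(q) e(t).

    a = [a_1, ..., a_na] and c = [c_1, ..., c_nc] are the coefficients after
    the leading 1 of each polynomial; samples before t = 0 are taken as zero.
    """
    e = []
    for t in range(len(y)):
        ar = sum(a[i] * y[t - 1 - i] for i in range(len(a)) if t - 1 - i >= 0)
        ma = sum(c[j] * e[t - 1 - j] for j in range(len(c)) if t - 1 - j >= 0)
        e.append(y[t] + ar - ma)
    return e


def simulate_arma(e, a, c):
    """Generate y(t) from innovations e(t) with the same model convention."""
    y = []
    for t in range(len(e)):
        ar = sum(a[i] * y[t - 1 - i] for i in range(len(a)) if t - 1 - i >= 0)
        ma = sum(c[j] * e[t - 1 - j] for j in range(len(c)) if t - 1 - j >= 0)
        y.append(e[t] - ar + ma)
    return y


rng = random.Random(0)
innovations = [rng.gauss(0.0, 1.0) for _ in range(2000)]

# Hypothetical baseline model (coefficients chosen only for illustration).
a_base, c_base = [-0.5, 0.2], [0.3]
y_normal = simulate_arma(innovations, a_base, c_base)

# Hypothetical changed dynamics standing in for a damaged state.
y_damaged = simulate_arma(innovations, [-0.3, 0.4], c_base)

e_normal = arma_residuals(y_normal, a_base, c_base)    # recovers the innovations
e_damaged = arma_residuals(y_damaged, a_base, c_base)  # baseline model no longer fits
```

When the baseline model matches the data, the filter returns the original innovations; when the dynamics change, the same filter leaves structure in the residuals, which is the effect the comparison at Sensor 15 relies on.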

Figure 5.

Comparison of the ARMA residuals between the normal and damaged cases at Sensor 15: (a) Case 1 versus Case 2, (b) Case 1 versus Case 3, (c) Case 1 versus Case 4, and (d) Case 1 versus Case 5.

In the proposed RMDS-based method, the first step is to divide the residual matrices E and $\bar{E}$ of the normal and damaged conditions into several partitions of the same dimension. In this regard, the number of partitions (p) is set to 60, in which case the number of data points (np) in each partition becomes 1000. Hence, the matrices E and $\bar{E}$ are decomposed into 60 smaller matrices ( $E_{1}^{*}, \dots, E_{60}^{*}$ and ${\bar{E}}_{1}^{*}, \dots, {\bar{E}}_{60}^{*}$ ), each consisting of 1000 rows and 12 columns. Subsequently, the distance matrices $D_{1}^{*}, \dots, D_{60}^{*}$ and ${\bar{D}}_{1}^{*}, \dots, {\bar{D}}_{60}^{*}$ (of size 1000×1000) are computed by using the ESD technique. The RMDS method based on the MM approach is applied to obtain the embedding matrices U₁, …, U₆₀ and Ū₁, …, Ū₆₀ of size 1000×2. The matrix vectorization technique is then utilized to form the vectors u₁, …, u₆₀ and ū₁, …, ū₆₀, each of which has 2000 embedding samples. Using the Euclidean norm, the vector d is determined for use in the process of early damage detection. Since the vibration data of the IASC-ASCE problem in the undamaged and damaged cases were acquired only once, there is a single set of vibration data and hence a single pair of residual matrices E and $\bar{E}$ for each comparison. In other words, the number of test measurements (nm) is equal to one, in which case d_T = d.
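The data-handling steps around the embedding (segmentation, pairwise distances, vectorization, and the Euclidean norm) can be sketched as below. The RMDS iteration itself is deliberately omitted. Reading ESD as the squared Euclidean distance between rows is an assumption, and all function names and toy matrices are hypothetical.

```python
import math


def partition_rows(matrix, p):
    """Split a row-major matrix into p consecutive blocks of equal row count."""
    n_rows = len(matrix)
    if n_rows % p != 0:
        raise ValueError("number of rows must be divisible by p")
    n_p = n_rows // p
    return [matrix[k * n_p:(k + 1) * n_p] for k in range(p)]


def squared_euclidean_distances(block):
    """Pairwise squared Euclidean distances between the rows of one block."""
    n = len(block)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = sum((x - y) ** 2 for x, y in zip(block[i], block[j]))
            dist[i][j] = dist[j][i] = d
    return dist


def vectorize(matrix):
    """Stack the columns of a matrix into one flat vector."""
    return [row[c] for c in range(len(matrix[0])) for row in matrix]


def euclidean_norm(vector):
    """Euclidean (L2) norm of a flat vector."""
    return math.sqrt(sum(v * v for v in vector))


# Toy residual matrix: 12 samples from 3 sensors, split into 4 partitions.
toy = [[float(r + c) for c in range(3)] for r in range(12)]
blocks = partition_rows(toy, 4)
dist = squared_euclidean_distances(blocks[0])

# Toy stand-in for an n_p x 2 embedding matrix returned by an MDS-type method.
embedding = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
norm_value = euclidean_norm(vectorize(embedding))
```

With the sizes quoted above, each partition has 60,000 / 60 = 1000 rows and each vectorized 1000×2 embedding has 2000 entries. The norm does not depend on the stacking order, so column-wise vectorization is only a convention here.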

For early damage detection, one initially needs to define an accurate and appropriate threshold limit via the EV statistics and the first 60 embedding norm values of d regarding the undamaged state. First, the GEV theory is applied to choose one of the Gumbel, Fréchet, and Weibull distributions. Based on the MLE technique, the estimated shape parameter of the GEV distribution is 0.1859. Since this value is positive, the Fréchet-type EV distribution is suitable for modeling the first 60 embedding norm values of d for the threshold limit determination. The shape, scale, and location of the Fréchet distribution correspond to 5.3787, 6.3256, and 1.9304, respectively, which lead to the threshold limit τ₂ = 16.5615 under the 5% significance level. Considering all embedding norm quantities of d and this threshold value, Figure 6 shows the results of damage detection of the IASC-ASCE structure in Cases 2–5. As can be observed, the first 60 embedding norm values associated with the normal condition of the structure are smaller than the threshold limit, without any false alarms (Type I errors). In contrast, the remaining 60 embedding norm values in Figure 6(a) to (c) are larger than the threshold limit, implying the occurrence of damage in Cases 2–4. A similar result is observed for Case 5 in Figure 6(d), where only three points are under the threshold limit, leading to a 5% Type II error. Therefore, one can conclude that the proposed RMDS-based method in conjunction with the residuals of the ARMA model and the EV theory is able to detect damage of different severities and distinguish the damaged state from the normal condition under ambient vibration.
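The selection rule based on the sign of the GEV shape parameter can be sketched as follows. The tolerance used to treat a shape estimate as zero, the function names, and the scale and location values in the example are assumptions; the quantile shown is the generic GEV quantile and is not claimed to reproduce the threshold equations of the article.

```python
import math


def select_ev_family(shape, tol=1e-6):
    """Pick the EV family from the sign of the GEV shape parameter."""
    if shape > tol:
        return "Frechet"
    if shape < -tol:
        return "Weibull"
    return "Gumbel"


def gev_quantile(prob, shape, scale, location):
    """Value x such that the GEV cumulative distribution equals prob."""
    if abs(shape) < 1e-12:
        return location - scale * math.log(-math.log(prob))
    return location + scale * ((-math.log(prob)) ** (-shape) - 1.0) / shape


def gev_cdf(x, shape, scale, location):
    """GEV cumulative distribution function."""
    z = (x - location) / scale
    if abs(shape) < 1e-12:
        return math.exp(-math.exp(-z))
    arg = 1.0 + shape * z
    if arg <= 0.0:
        # Outside the support: below it for shape > 0, above it for shape < 0.
        return 0.0 if shape > 0 else 1.0
    return math.exp(-arg ** (-1.0 / shape))


family_rmds = select_ev_family(0.1859)   # shape reported for the RMDS outputs
family_md = select_ev_family(-0.0814)    # shape reported for the MD outputs

# Hypothetical scale and location, used only to show a 95% quantile.
upper_95 = gev_quantile(0.95, 0.1859, 2.0, 10.0)
```

A positive shape estimate points to the heavy-tailed Fréchet type and a negative one to the bounded Weibull type, which matches the choices reported for the two feature classification methods.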

Figure 6.

Early damage detection of the IASC-ASCE structure by the proposed RMDS-based method and the EV statistic (Fréchet): (a) Case 2, (b) Case 3, (c) Case 4, and (d) Case 5.

To demonstrate the effect of the threshold limit on early damage detection, a comparative study is conducted by using the standard CI under the normality assumption for the embedding norm values of the undamaged condition. Using a 5% significance level, which leads to the 95% CI, the threshold value is equal to 9.4983. The results of damage detection by the standard CI are illustrated in Figure 7. All embedding norm values of Cases 2–5 exceed the threshold value without any Type II error. However, there are numerous false alarms in the embedding norm values of Case 1 (Type I = 35%). For more details, Figure 8 shows the normal probability plot of the first 60 embedding norm values as well as the comparison between the EV theory and the standard CI in terms of the false alarm rate. As Figure 8(a) indicates, there are clear deviations from the straight line, which means that the probability distribution of the first 60 embedding norm values of d, which are used to determine the threshold limit, is non-normal. For this reason, the standard CI based on the normality assumption does not yield an accurate threshold limit. Furthermore, Figure 8(b) reveals the superiority of the EV theory over the standard CI in terms of defining an accurate threshold value without any false alarm (Type I error). All norm values regarding the normal condition fall below the threshold obtained from the EV theory, implying no Type I error, whereas several norm values exceed the threshold obtained from the standard CI.
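A threshold under the normality assumption can be sketched as below. Whether the article uses a one-sided or a two-sided limit is not stated in this section, so the one-sided upper limit is an assumption, as are the function name and the sample values.

```python
from statistics import NormalDist, mean, stdev


def normal_upper_limit(values, significance=0.05):
    """One-sided upper limit mean + z * std under a normality assumption."""
    z = NormalDist().inv_cdf(1.0 - significance)
    return mean(values) + z * stdev(values)


# Hypothetical baseline outputs with a long right tail.
baseline = [1.0, 2.0, 3.0, 4.0, 5.0]
limit = normal_upper_limit(baseline)
```

Because this limit depends only on the mean and the standard deviation, it tends to sit too low when the baseline outputs are right-skewed, which is consistent with the false alarms reported for the standard CI.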

Figure 7.

Early damage detection of the IASC-ASCE structure by the proposed RMDS-based method and the standard CI: (a) Case 2, (b) Case 3, (c) Case 4, and (d) Case 5.

Figure 8.

(a) The normal probability plot of the first 60 embedding norm values of the vector d_T and (b) comparison between the EV theory and standard CI for the threshold limit determination.

As another comparative study, the proposed RMDS-based method is compared with the well-known MD technique, which is widely used in SHM applications. This statistical distance measures the dissimilarity between two multivariate data sets based on the correlation of data samples. For a reasonable comparison, the Euclidean norms of the residual matrices $E_{1}^{*}, \dots, E_{60}^{*}$ and ${\bar{E}}_{1}^{*}, \dots, {\bar{E}}_{60}^{*}$ at all sensors are calculated to create a multivariate feature data set of size 1000×60 for each of Cases 1–5. On this basis, there are 1000 MD values for each case. The EV theory is then applied to estimate the threshold value based on the distance quantities of the normal condition. The shape of the GEV distribution is equal to −0.0814, which is negative and therefore refers to the Weibull-type EV distribution. Using the MLE technique, the shape, scale, and location of the Weibull distribution are estimated and used in equation (27) to determine τ₃ = 12.5673 as the threshold limit for damage detection via the MD technique. Figure 9 shows the results of damage detection in Cases 2–5, where the first 1000 MD values belong to the undamaged case and the remaining distance quantities are associated with the damaged cases. The first 1000 distance quantities of the normal condition are below the threshold limit without any Type I error. This result confirms the ability of the EV statistics to estimate an accurate and reliable threshold limit. Moreover, the remaining distance quantities of Cases 2–4 in Figure 9(a) to (c) exceed the threshold limit, indicating the occurrence of damage with no Type II error. However, there are numerous false negatives (Type II = 46.8%) in the distance values of Case 5. Therefore, the comparison between the proposed RMDS-based method and the conventional MD technique demonstrates that although both of them are successful in detecting damage and distinguishing the damaged state from the normal condition, the proposed method outperforms the MD technique with fewer Type II errors.
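For readers unfamiliar with the MD metric, a minimal dependency-free sketch is given below. It learns the mean vector and the sample covariance matrix from baseline features and evaluates the distance of a test sample; the function names and the tiny two-dimensional data set are hypothetical, and some SHM studies work with the squared distance instead.

```python
import math


def mean_vector(rows):
    """Column-wise mean of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows, mu):
    """Sample covariance matrix (divisor n - 1) of the feature vectors."""
    n, m = len(rows), len(mu)
    cov = [[0.0] * m for _ in range(m)]
    for r in rows:
        d = [r[k] - mu[k] for k in range(m)]
        for i in range(m):
            for j in range(m):
                cov[i][j] += d[i] * d[j] / (n - 1)
    return cov


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    m = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(m)]
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, m):
            f = aug[r][col] / aug[col][col]
            for k in range(col, m + 1):
                aug[r][k] -= f * aug[col][k]
    x = [0.0] * m
    for i in range(m - 1, -1, -1):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, m))
        x[i] = (aug[i][m] - s) / aug[i][i]
    return x


def mahalanobis(x, mu, cov):
    """Mahalanobis distance of x from a distribution with mean mu and covariance cov."""
    d = [xi - mi for xi, mi in zip(x, mu)]
    w = solve(cov, d)  # w = cov^{-1} d without forming the inverse explicitly
    return math.sqrt(sum(di * wi for di, wi in zip(d, w)))


# Hypothetical baseline features (undamaged state) in two dimensions.
baseline = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
mu = mean_vector(baseline)
cov = covariance_matrix(baseline, mu)
md_test = mahalanobis([2.0, 2.0], mu, cov)
```

Each feature vector yields one distance value, which is why the MD technique produces as many outputs as there are samples in the feature set.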

Figure 9.

Early damage detection of the IASC-ASCE structure by the conventional MD method and the EV statistic (Weibull): (a) Case 2, (b) Case 3, (c) Case 4, and (d) Case 5.

Finally, Table 3 presents the numbers and percentages of Type I, Type II, and total errors in detecting early damage of the IASC-ASCE structure in all damaged cases by the proposed RMDS-based and the classical MD methods. In this comparison, the two threshold determination approaches (i.e. the EV statistics and the standard CI) are considered as well. As the data in Table 3 show, the proposed RMDS-based method in conjunction with the EV statistic (Fréchet) yields the best performance in terms of the smallest rates of the three error types, except for the Type II error in Case 5, where only 3 of the 60 norm values fall below the threshold, as shown in Figure 6(d). The MD technique with the EV statistic (Weibull) performs equally well in Cases 2–4. However, this feature classification approach suffers from an extremely large Type II error in Case 5, which leads to a considerable total error as well. This result demonstrates the superiority of the proposed method over the classical MD technique. In contrast, the numerical comparison between the EV statistics (i.e. Fréchet and Weibull) and the standard CI reveals that the latter, despite its negligible Type II errors, is not a reliable approach to estimating an accurate threshold for early damage detection owing to large Type I and total errors in both feature classification methods. It is worth remarking that the good performance of the EV statistics depends directly on the use of a robust feature classification technique (e.g. the proposed RMDS-based method) with high damage detectability.

Table 3.

Performance evaluation of the feature classification methods using the EV statistics and standard CI for the threshold limit determination.

Case	Method	Threshold	Type I	Type II	Total
2	RMDS	EV	0 (0%)	0 (0%)	0 (0%)
	RMDS	Standard CI	21 (35%)	0 (0%)	21 (17.5%)
	MD	EV	0 (0%)	0 (0%)	0 (0%)
	MD	Standard CI	40 (4%)	0 (0%)	40 (2%)
3	RMDS	EV	0 (0%)	0 (0%)	0 (0%)
	RMDS	Standard CI	21 (35%)	0 (0%)	21 (17.5%)
	MD	EV	0 (0%)	0 (0%)	0 (0%)
	MD	Standard CI	40 (4%)	0 (0%)	40 (2%)
4	RMDS	EV	0 (0%)	0 (0%)	0 (0%)
	RMDS	Standard CI	21 (35%)	0 (0%)	21 (17.5%)
	MD	EV	0 (0%)	0 (0%)	0 (0%)
	MD	Standard CI	40 (4%)	0 (0%)	40 (2%)
5	RMDS	EV	0 (0%)	3 (5%)	3 (2.5%)
	RMDS	Standard CI	21 (35%)	0 (0%)	21 (17.5%)
	MD	EV	0 (0%)	468 (46.8%)	468 (23.4%)
	MD	Standard CI	40 (4%)	64 (6.4%)	104 (5.2%)

EV: extreme value; CI: confidence interval; RMDS: robust multidimensional scaling; MD: Mahalanobis distance.
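The error counts in Table 3 follow from a simple comparison against the threshold, sketched below. The function name and the synthetic norm values are hypothetical; they are chosen only to reproduce the counts reported for Case 5 with the RMDS method and the EV threshold, not the measured values themselves.

```python
def detection_errors(normal_values, damaged_values, threshold):
    """Count Type I, Type II, and total errors with their percentages."""
    type_i = sum(1 for v in normal_values if v > threshold)      # false alarms
    type_ii = sum(1 for v in damaged_values if v <= threshold)   # missed damage
    total = type_i + type_ii
    n_all = len(normal_values) + len(damaged_values)
    return {
        "type_i": (type_i, 100.0 * type_i / len(normal_values)),
        "type_ii": (type_ii, 100.0 * type_ii / len(damaged_values)),
        "total": (total, 100.0 * total / n_all),
    }


# Synthetic stand-ins: 60 baseline outputs below the threshold, 57 damaged
# outputs above it, and 3 damaged outputs below it.
normal = [10.0] * 60
damaged = [20.0] * 57 + [15.0] * 3
errors = detection_errors(normal, damaged, threshold=16.5615)
```

Type I and Type II percentages are taken relative to the number of undamaged and damaged outputs, respectively, whereas the total error is taken relative to all outputs, which is how 3 missed points out of 60 become 5% and 2.5%.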

The cable-stayed bridge

The Tianjin-Yonghe Bridge is one of the earliest cable-stayed bridges with a continuous pre-stressed box girder constructed in China and was opened to traffic in December 1987. It has a total length of 514.4 m, including the main span of 260 m and two side spans of 25.15 and 99.85 m, as depicted in Figure 10. The bridge is 510 m long and 11 m wide, including 9 m for vehicles and 2×1 m for pedestrians. The concrete towers, connected by two transverse beams, are 60.5 m high. More details of the bridge are available in the study by Li et al.⁴¹ In 2005, after 19 years of operation, some serious cracks were found at the bottom of a girder segment over the mid-span. In addition, some cables near the anchors were severely corroded. After a major rehabilitation program between 2005 and 2007 for replacing the damaged girder segment and all the cables, a sophisticated SHM system organized by the Center of SMC at the Harbin Institute of Technology in China was installed to monitor the bridge in 2007. During a routine inspection in August 2008, new damage patterns were found in the girders of the bridge. Since acceleration time histories are available for the healthy state of the bridge on 17 January 2008 and for the damaged state on 31 July 2008, it is possible to evaluate the performance of the proposed methods for early damage detection.⁴¹

Figure 10.

The Tianjin-Yonghe Bridge:⁴¹ (a) general view and (b) the dimensions and sensors.

The acceleration time histories of the normal and damaged conditions were acquired from 14 single-axis accelerometers over 24 h (nm = 24) at a sampling frequency of 100 Hz (i.e. a time interval of 0.01 s). On this basis, 360,000 samples (nd) were measured by each accelerometer in each hour. An initial analysis of the acceleration data sets shows that the 10th accelerometer was faulty, as it recorded meaningless measurement samples. Hence, the acceleration time series of the remaining 13 accelerometers (ns = 13) are used to fit ARMA models to the vibration responses and extract the model residuals of the normal and damaged states as the high-dimensional DSFs. Similar to the previous application case, the signal pre-processing techniques, including data detrending and standardizing, are carried out to prepare the time series data sets for ARMA modeling. Using the iterative order determination algorithm based on the Ljung-Box test under the 5% significance level, Table 4 lists the orders of the ARMA models of accelerometers 1–9 and 11–14 along with their p-values for the first test measurement on 17 January 2008 (the healthy state). As can be observed, all p-values are larger than 0.05, which means that the ARMA residuals of all accelerometers are uncorrelated. Furthermore, Figure 11 illustrates the evolution of the p-values regarding the ARMA residuals of the sixth accelerometer over 23 iterations.

Table 4.

The orders of ARMA models and the p-values of the Ljung-Box test in the first test measurement on 17 January 2008.

Sensor no.	na	nc	p-value
1	30	30	0.1337
2	19	19	0.7840
3	31	31	0.9340
4	24	24	0.1158
5	16	16	0.3234
6	23	23	0.0956
7	25	25	0.8181
8	26	26	0.6483
9	15	15	0.8207
11	21	21	0.0820
12	30	30	0.4263
13	20	20	0.2804
14	19	19	0.6573

ARMA: AutoRegressive Moving Average.

Figure 11.

The evolution of the p-values of the ARMA residuals regarding the sixth accelerometer.

Having considered the ARMA orders, the model coefficients are estimated by the prediction-error technique. Subsequently, the uncorrelated residuals of the accelerometers 1–9 and 11–14 on 17 January 2008, are extracted as the DSFs of the normal condition leading to the residual matrix E ∈ ℝ^360,000×13 for each test measurement. Finally, the ARMA models including their orders and coefficients obtained from the normal condition are utilized to extract the model residuals of the accelerometers 1–9 and 11–14 on 31 July 2008, which makes the residual matrix $\bar{E}$ ∈ ℝ^360,000×13 for each test measurement. Note that there are 24 sets of the residual matrix for the normal and damaged conditions based on the number of test measurements. As a sample, Figure 12 indicates the comparison between the residual samples at the ninth accelerometer regarding the normal and damaged states in the first test measurement.

Figure 12.

The residual samples of ARMA models at the ninth accelerometer regarding the normal and damaged states in the first test measurement.

The increases in the residuals of the ARMA model on 31 July 2008 indicate the occurrence of damage. The same conclusion can be drawn for the other accelerometers and other test measurements. However, the direct comparison of the high-dimensional residual samples at each sensor and each hour is complex and time-consuming. This limitation suggests the necessity of using a robust and efficient approach such as the proposed RMDS-based method for damage detection. Based on the first step of the proposed method, the residual matrices E and $\bar{E}$ of each test measurement are divided into 200 partitions (p = 200); that is, $E_{1}^{*}, \dots, E_{200}^{*}$ and ${\bar{E}}_{1}^{*}, \dots, {\bar{E}}_{200}^{*}$ , each of which consists of 1800 rows (np = 1800) and 13 columns. Using the ESD technique, their distance matrices are $D_{1}^{*}, \dots, D_{200}^{*}$ and ${\bar{D}}_{1}^{*}, \dots, {\bar{D}}_{200}^{*}$ of size 1800×1800. Subsequently, the RMDS method based on the MM approach is applied to obtain the embedding matrices U₁, …, U₂₀₀ and Ū₁, …, Ū₂₀₀ of size 1800×2. The matrix vectorization technique is utilized to construct the vectors u₁, …, u₂₀₀ and ū₁, …, ū₂₀₀, each with 3600 embedding samples. This process is repeated for all test measurements, which yields 24 sets of the vectors u₁, …, u₂₀₀ and ū₁, …, ū₂₀₀. Finally, the vector d_T is determined by calculating the Euclidean norms of the 24 sets of embedding vectors. On this basis, this vector includes 9600 embedding norm values, where the first 4800 quantities belong to the healthy state and the remaining norm values are related to the damaged condition.
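The sizes quoted in this paragraph follow directly from the stated quantities, as the short check below shows. The notation $\dim(\cdot)$ for the length of a vector is introduced here only for this check.

$$
n_p = \frac{n_d}{p} = \frac{360{,}000}{200} = 1800, \qquad \dim(u_i) = 2\,n_p = 3600, \qquad \dim(d_T) = 2\,p\,n_m = 2 \times 200 \times 24 = 9600 .
$$

The factor 2 in $\dim(u_i)$ comes from the two columns of each embedding matrix, while the factor 2 in $\dim(d_T)$ accounts for the healthy and damaged states, each contributing $p\,n_m = 4800$ norm values.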

The GEV distribution is initially fitted to the first 4800 embedding norm values of d_T to estimate the shape parameter by the MLE technique and choose an appropriate EV distribution for the threshold limit determination. The shape of the GEV distribution is 0.2338, which is positive and therefore refers to the Fréchet-type EV distribution. By estimating the shape, scale, and location parameters of the Fréchet distribution, the threshold limit based on the 5% significance level is obtained from equation (26) as τ₂ = 73.6991. For a comparative study, another threshold limit under the normality assumption of the same embedding norm values is determined by the standard CI. Applying the 5% significance level, the threshold limit based on the 95% CI is equal to 43.5110. Figures 13 and 14 illustrate the results of early damage detection based on the proposed RMDS-based method in conjunction with the EV statistic (Fréchet) and the standard CI, respectively. In Figure 13, the first 4800 embedding norm values do not exceed the threshold limit obtained from the EV statistic, implying no false alarm. Moreover, the vast majority of the embedding norm quantities regarding the damaged state are larger than the threshold value, indicating the occurrence of damage on 31 July. Only eight norm values are under the threshold limit, which leads to a 0.17% Type II error. Although Figure 14 shows no false negatives in the embedding norm values (4801–9600) of 31 July 2008, there are numerous false alarms in the first 4800 embedding norm values (Type I = 56.60%).

Figure 13.

Early damage detection of the Tianjin-Yonghe Bridge by the proposed RMDS-based method and the EV statistic (Fréchet).

Figure 14.

Early damage detection of the Tianjin-Yonghe Bridge by the proposed RMDS-based method and the standard CI.

For further evaluation, Figure 15 shows the normal probability plot of the embedding norm values 1–4800 along with the comparison between the EV theory and the standard CI in terms of the false alarm rate. It is apparent from Figure 15(a) that the probability distribution of the embedding norm quantities is non-normal, as indicated by the large deviations from the straight line. Therefore, it can be expected that the standard CI based on the normality assumption is not able to estimate an accurate and reliable threshold limit. In this regard, one can observe in Figure 15(b) that the use of the EV theory gives no false alarm, whereas the standard CI suffers from a high rate of Type I error.

Figure 15.

(a) The normal probability plot of the first 4800 embedding norm values of d_T and (b) comparison between the EV theory and standard CI for the threshold limit determination.

Similar to the preceding application case, the performance of the proposed RMDS-based method is compared with the classical MD technique. For this purpose, the Euclidean norms of the residual matrices $E_{1}^{*}, \dots, E_{200}^{*}$ and ${\bar{E}}_{1}^{*}, \dots, {\bar{E}}_{200}^{*}$ at all sensors and all measurements are calculated and collected to generate two multivariate feature data sets regarding the undamaged (17 January) and damaged (31 July) states. Both data sets are matrices of size 360,000×24, and the feature set of the undamaged state is used to estimate the 24-dimensional mean vector and the covariance matrix of size 24×24 needed for the MD metric. Using the 360,000 MD values of the undamaged condition, the EV statistics are considered to determine a threshold limit. On this basis, the shape of the GEV distribution is −0.0524, implying the Weibull-type EV distribution. Under the 5% significance level, the threshold value from equation (27) is τ₃ = 10.6540. Figure 16 illustrates the result of early damage detection in the bridge via the MD metric, where the first and second 360,000 samples pertain to the undamaged and damaged states, respectively. All the MD values regarding 17 January (the normal condition) fall below the threshold line without any Type I error. However, a number of MD quantities of 31 July (the damaged condition) are under the threshold value, which causes Type II errors. Apart from these observations, one of the limitations of utilizing the MD metric in this application case is related to its high-dimensional outputs. Unlike the IASC-ASCE structure (see Figure 9), there are 720,000 MD values for feature classification. In comparison with the result of early damage detection obtained by the proposed RMDS-based method (see Figure 13, where there are 9600 outputs for decision-making), one can conclude that the proposed RMDS-based method is more efficient than the classical MD technique.

Figure 16.

Early damage detection in the Tianjin-Yonghe Bridge by the classical MD technique and the EV statistic (Weibull).

The final comparison concerns the numerical evaluation of the performance of the proposed RMDS-based and classical MD methods based on Type I, Type II, and total errors. Using the two threshold determination approaches, Table 5 gives the numbers and percentages of these errors. As can be seen, the smallest number of total errors, with no Type I error, belongs to the proposed RMDS-based method in conjunction with the EV statistic (Fréchet). Furthermore, the EV theory is highly superior to the standard CI owing to smaller Type I and total errors. Although the standard CI provides smaller numbers and percentages of Type II error, the corresponding values for the proposed RMDS-based method are negligible (i.e. only 8 of the 4800 norm values fall below the threshold limit obtained from the EV statistic). However, this conclusion is not valid for the outputs of the MD metric based on the EV theory. In summary, the numerical evaluations in Table 5 confirm that the proposed RMDS-based method outperforms the classical MD technique and that it is preferable to use the EV statistics for the threshold limit determination.

Table 5.

Performance evaluation of the feature classification methods using the EV statistics and standard CI for the threshold limit determination.

Method	Threshold	Type I	Type II	Total
RMDS	EV	0 (0%)	8 (0.17%)	8 (0.08%)
RMDS	Standard CI	2717 (56.60%)	0 (0%)	2717 (28.30%)
MD	EV	0 (0%)	241 (0.06%)	241 (0.03%)
MD	Standard CI	13654 (3.79%)	3 (~0%)	13657 (1.90%)

EV: extreme value; CI: confidence interval; RMDS: robust multidimensional scaling; MD: Mahalanobis distance.

Conclusion

This article has proposed a novel data-driven method for SHM under ambient vibration and high-dimensional features. The proposed method has been developed from the RMDS algorithm and some computational approaches to deal with the main limitation of using high-dimensional DSFs for early damage detection. Feature extraction has been performed by ARMA modeling, and the model residuals of the normal and damaged conditions have been extracted as the main DSFs. To detect damage accurately without any false alarm, the EV theory has been used to determine a reliable threshold value. The shape of the GEV distribution has also been considered to choose an appropriate EV distribution for the threshold limit determination. Finally, the experimental data sets of the IASC-ASCE benchmark problem and the Tianjin-Yonghe Bridge have been used, along with some comparative studies, to verify the accuracy and reliability of the proposed methods.

The results have shown that the proposed RMDS-based method with the aid of the ARMA residuals and the EV theory is able to detect early damage without any false alarm. It has been observed that the distribution of the embedding norm values of the normal condition used in the threshold limit determination is not normal, in which case the standard CI based on the normality assumption suffers from numerous Type I errors. Therefore, it is recommended to utilize the EV statistics in the threshold limit determination when the outputs of the feature classification method regarding the undamaged condition are non-normal. The comparison between the proposed RMDS-based method and the well-known MD technique has demonstrated that the former is superior to the latter in detecting small damage. Furthermore, the proposed RMDS-based method is more efficient than the MD technique because it provides far fewer outputs for feature classification and decision-making.

Footnotes

Acknowledgements

The authors express their sincere gratitude to the IASC-ASCE Structural Health Monitoring Task Group and the Center of Structural Monitoring and Control at the Harbin Institute of Technology in China for providing access to their experimental data sets.

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) received no financial support for the research, authorship, and/or publication of this article.

ORCID iDs

Alireza Entezami

Hassan Sarmadi

Ali Nadir Arslan

References

Farrar

Worden

. Structural health monitoring: a machine learning perspective. Chichester: John Wiley & Sons, 2013.

Sarmadi

Entezami

Ghalehnovi

(2020) On model-based damage detection by an enhanced sensitivity function of modal flexibility and LSMR-Tikhonov method under incomplete noisy modal data. Eng Comput. doi:10.1007/s00366-020-01041-8.

Rezaiee-Pajand

Sarmadi

Entezami

(2021) A hybrid sensitivity function and Lanczos bidiagonalization-Tikhonov method for structural model updating: Application to a full-scale bridge structure. Appl Math Model 89:860-884. doi: https://doi.org/10.1016/j.apm.2020.07.044

Amezquita-Sanchez

Adeli

. Signal processing techniques for vibration-based health monitoring of smart structures. Arch Comput Methods Eng 2016; 23: 1–15.

Avendaño-Valencia

Chatzi

Koo

, et al. Gaussian process time-series models for structures under operational variability. Front Built Environ 2017; 3: 69.

Entezami

Shariatmadar

. An unsupervised learning approach by novel damage indices in structural health monitoring for damage localization and quantification. Struct Health Monit 2018; 17: 325–345.

Liu

Wang

Bornn

, et al. Robust structural health monitoring under environmental and operational uncertainty with switching state-space autoregressive models. Struct Health Monit 2019; 18: 435–453.

Entezami

Sarmadi

Behkamal

, et al. Health Monitoring of Large-Scale Civil Structures: An Approach Based on Data Partitioning and Classical Multidimensional Scaling. Sensors 2021; 21: 1646. DOI: 10.3390/s21051646

Poulimenos

Sakellariou

. A transmittance-based methodology for damage detection under uncertainty: an application to a set of composite beams with manufacturing variability subject to impact damage and varying operating conditions. Struct Health Monit 2019; 18: 318–333.

10.

Entezami

Sarmadi

Behkamal

, et al. Big data analytics and structural health monitoring: a statistical pattern recognition-based approach. Sensors 2020; 20: 2328.

11.

Entezami

Shariatmadar

. Damage localization under ambient excitations and non-stationary vibration signals by a new hybrid algorithm for feature extraction and multivariate distance correlation methods. Struct Health Monit 2019; 18: 347–375.

12.

Wang

. Structural damage identification based on self-fitting ARMAX model and multi-sensor data fusion. Struct Health Monit 2014; 13: 445–460.

13.

Döhler

Hille

Mevel

, et al. Structural health monitoring with statistical methods during progressive damage test of S101 Bridge. Eng Struct 2014; 69: 183–193.

14.

Entezami

. Feature Extraction in Time Domain for Stationary Data. In: Structural Health Monitoring by Time Series Analysis and Statistical Distance Measures. Springer 2021; DOI: 10.1007/978-3-030-66259-2_2

15.

Entezami

Shariatmadar

Mariani

(2020) Early damage assessment in large-scale structures by innovative statistical pattern recognition methods based on time series modeling and novelty detection. Adv Eng Software 150:102923. doi:https://doi.org/10.1016/j.advengsoft.2020.102923

16.

Sarmadi

Karamodin

. A novel anomaly detection method based on adaptive Mahalanobis-squared distance and one-class kNN rule for structural health monitoring under environmental effects. Mech Syst Sig Process 2020; 140: 106495.

17.

Sarmadi

Entezami

Daneshvar Khorram

. Energy-based damage localization under ambient vibration and non-stationary signals by ensemble empirical mode decomposition and Mahalanobis-squared distance. J Vibrat Control 2020; 26: 1012–1027.

18.

Yeager

Gregory

Key

, et al. On using robust Mahalanobis distance estimations for feature discrimination in a damage detection scenario. Struct Health Monit 2019; 18: 245–253.

19.

Sarmadi

Entezami

Saeedi Razavi

, et al. Ensemble learning-based structural health monitoring by Mahalanobis distance metrics. Struct Contr Health Monit. 2020; e2663. DOI: 10.1002/stc.2663

20.

Modarres

Astorga

Droguett

, et al. Convolutional neural networks for automated damage recognition and damage type identification. Struct Contr Health Monit 2018; 25: e2230.

21.

Verstraete

Droguett

Meruane

, et al. Deep semi-supervised generative adversarial fault diagnostics of rolling element bearings. Struct Health Monit 2020; 19: 390–411.

22.

Sarmadi

Entezami

Salar

, et al. Bridge health monitoring in environmental variability by new clustering and threshold estimation methods. J Civ Struct Health Monit 2021; DOI: 10.1007/s13349-021-00472-1

23.

Chen

Zhou

Chen

, et al. Detection of double defects for plate-like structures based on a fuzzy c-means clustering algorithm. Struct Health Monit 2019; 18: 757–766.

24. Sarmadi, Entezami, Saeedi Razavi, Yuen K-V. Ensemble learning-based structural health monitoring by Mahalanobis distance metrics. Struct Contr Health Monit. 2020; e2663. https://doi.org/10.1002/stc.2663.

25. Demarie, Sabia. A machine learning approach for the automatic long-term structural health monitoring. Struct Health Monit 2019; 18: 819–837.

26. Entezami, Sarmadi, Saeedi Razavi. An innovative hybrid strategy for structural health monitoring by modal flexibility and clustering methods. J Civ Struct Health Monit 2020; 10: 845–859.

27. Tibaduiza, Mujica, Rodellar. Damage classification in structural health monitoring using principal component analysis and self-organizing maps. Struct Contr Health Monit 2013; 20: 1303–1316.

28. Meruane, Espinoza, Droguett, et al. Impact identification using nonlinear dimensionality reduction and supervised learning. Smart Mater Struct 2019; 28: 115005.

29. Khoa, Zhang, Wang, et al. Robust dimensionality reduction and damage detection approaches in structural health monitoring. Struct Health Monit 2014; 13: 406–417.

30. San Martin, López Droguett, Meruane, et al. Deep variational auto-encoders: a promising tool for dimensionality reduction and ball bearing elements fault diagnosis. Struct Health Monit 2019; 18: 1092–1128.

31. Mylonas, Abdallah, Chatzi. Deep unsupervised learning for condition monitoring and prediction of high dimensional data with application on windfarm SCADA data. In: IMAC-XXXVII, a conference and exposition on structural dynamics, Orlando, FL, 19–22 February 2019, pp. 189–196. New York: Springer.

32. Lopez, Sarigul-Klijn. Distance similarity matrix using ensemble of dimensional data reduction techniques: vibration and aerocoustic case studies. Mech Syst Sig Process 2009; 23: 2287–2300.

33. Mujica, Vehí, Ruiz, et al. Multivariate statistics process control for dimensionality reduction in structural assessment. Mech Syst Sig Process 2008; 22: 155–171.

34. Rebillat, Mechbal. Dimension reduction algorithms in the damage indexes space for damage size quantification in aeronautic composite structures. In: Proceedings of the twelfth international workshop on structural health monitoring, Stanford, CA, 10–12 September 2019.

35. Sarmadi, Entezami. Application of supervised learning to validation of damage detection. Arch Appl Mech. Epub ahead of print 23 September 2020. DOI: 10.1007/s00419-020.

36. Entezami. Statistical Decision-Making by Distance Measures. In: Structural Health Monitoring by Time Series Analysis and Statistical Distance Measures. Springer 2021; DOI: 10.1007/978-3-030-66259-2_4

37. Deraemaeker, Worden. A comparison of linear approaches to filter out environmental effects in structural health monitoring. Mech Syst Sig Process 2018; 105: 1–15.

38. Rébillat, Hmad, Kadri, et al. Peaks over threshold–based detector design for structural health monitoring: application to aerospace structures. Struct Health Monit 2018; 17: 91–107.

39. Forero, Giannakis. Robust multi-dimensional scaling via outlier-sparsity control. In: 2011 conference record of the forty fifth Asilomar conference on signals, systems and computers (ASILOMAR), Pacific Grove, CA, 6–9 November 2011, pp. 1183–1187. New York: IEEE.

40. Dyke, Bernal, Beck, et al. Experimental phase II of the structural health monitoring benchmark problem. In: Proceedings of the 16th ASCE engineering mechanics conference, Seattle, WA, 16–18 July 2003. New York: American Society of Civil Engineers.

41. Liu, et al. SMC structural health monitoring benchmark problem using monitored data from an actual cable-stayed bridge. Struct Contr Health Monit 2014; 21: 156–172.

42. Box, Jenkins, Reinsel, et al. Time series analysis: forecasting and control. 5th ed. Chichester: John Wiley & Sons, 2015.

43. Carden, Brownjohn. ARMA modelled time-series classification for Structural Health Monitoring of civil infrastructure. Mech Syst Sig Process 2008; 22: 295–314.

44. Borg, Groenen. Modern multidimensional scaling: theory and applications. New York: Springer, 2005.

45. Coles, Bawa, Trenner, et al. An introduction to statistical modeling of extreme values. New York: Springer, 2001.

46. Salvadori, De Michele, Kottegoda, et al. Extremes in nature: an approach using copulas. New York: Springer, 2007.