Bolt damage identification based on orientation-aware center point estimation network

Abstract

With the development of deep learning, object detection algorithms based on horizontal box are widely used in the field of damage identification. However, damages can be in any direction and position, and they are not necessarily horizontal or vertical. This article proposes a bolt damage identification network, namely, orientation-aware center point estimation network, which models a damage as a center point of its rotated bounding box. The proposed orientation-aware center point estimation network uses deep layer aggregation network to search center points and regress to all other damage properties, such as size and angle. A loss function is designed to improve the optimization efficiency of network. Orientation-aware center point estimation network is applied to bolt damage detection, and comparison with the well-known Faster Region-Convolutional Neural Network (a benchmark using horizontal bounding box) demonstrates the accuracy of the proposed method. Finally, videos were utilized to verify the capability of the proposed orientation-aware center point estimation network in real-time detection of bolt damages.

Keywords

Bolt damage rotated bounding box deep learning image

Introduction

As important connecting components, bolts are widely used in various types of composite structures. Compared with other methods such as welding, bolt has the advantages of easy installation and high flexibility. However, composite structures are affected by loading, environment, and other factors, resulting in bolt damage, and even serious disasters. For example, more than 1800 safety incidents were caused by loose bolts with official records in the United Kingdom. Within 2 years, 25% of the bolts on the rail network need to be re-tightened to avoid bolt failure.¹ Therefore, it is necessary to monitor the condition of bolts. Structural health monitoring mainly relies on advanced sensor technology to identify various types of damages. These sensors provide technical support for structural health monitoring and examples include accelerometers, acoustic emission, ultrasound, and so on. They can collect the dynamic response from structures, identify the physical parameters of the structure, and finally achieve the purpose of damage detection. Many researchers have developed various types of smart sensors to detect damages.^2–6 Huo et al.⁷ proposed an electrical mechanical impedance method to investigate the preload condition of bolted connections. The experimental results showed that as the bolt preload decreased, the resonant frequency of the bolted connection decreased due to the influence on the impedance of the piezoelectric ceramic switch. Oregui et al.⁸ proposed a bolt looseness monitoring method using axle box acceleration measurements. Vehicle-borne monitoring systems may be used to automatically detect and assess the tightness condition of bolts at rail joints. Jiang et al.⁹ developed a stress wave–based active sensing method, which uses piezoelectric ceramic transducer to detect the damages caused by loose bolts in steel truss arch structure. When initial looseness damages occurred, the overall structural stiffness did not decrease significantly. However, the wavelet packet energy changed significantly, indicating that the method was superior to the stiffness-based method. Wang et al.¹⁰ developed a new entropy-based damage index using the multi-scale permutation entropy algorithm, which eliminates the saturation phenomenon of the current damage index and enhances the performance of bolt early looseness monitoring. Du et al.¹¹ proposed a guided wave method based on virtual time reversal and tightness index for bolt preload monitoring. The experimental results showed that the tightness index increased almost linearly as the bolt preload decreased. Tola et al.¹² proposed a bolt looseness damage detection method based on the ultrasonic wavefield energy in Lamb wave. Ho et al.¹³ used fiber Bragg grating sensors to monitor rock bolt loads. However, not all structures are equipped with sensor systems. Taking bridge as an example, most of the small- and medium-sized bridges are not installed with structural health monitoring systems. As a result, current structural safety inspections rely heavily on manual inspections, which require experience and is inefficient and costly.

Manual inspection relies mainly on percussion and observation to detect and record the condition of bolts in critical areas. For percussion, it mainly relies on the auditory sense to determine bolt damage states. The data set was constructed by capturing audio signals from different bolt damage states, and bolt damages were classified and identified using algorithms such as support-vector machines and neural networks.^14–17 For observation, the damage state of bolts was detected by vision. Bolts tend to be found in large-scale concentrations in structures. In order to improve the efficiency and accuracy of manual inspection, it is a feasible solution to use machine vision instead of human eyes for damage detection.^18–23 Therefore, machine vision–based structural damage detection methods have received substantial attention. Cha et al. proposed a bolt looseness damage detection method. Sliding windows and support-vector machines have been used to detect bolt large looseness damage.^24,25 Park et al.²⁶ extracted the edge information of the bolt using the Hough transform. According to the change of edge information, the looseness angle within 60° could be recognized. The difficulty with traditional machine vision methods is that it is necessary to choose the important features in each given image. As the number of classes to classify increases, feature extraction becomes more and more cumbersome.²⁷

With the rapid development of deep learning, convolutional neural networks (CNNs) have achieved remarkable results in the field of image processing. CNNs autonomously learn the feature information of targets in images through multiple convolutional layers, and realize the task of target recognition. Deep CNNs have substantially reduced the difficulty of machine vision task and improved the accuracy of target recognition. Therefore, deep CNNs have been widely used in many fields.^28–33 Zhang et al.³⁴ proposed a bolt looseness damage detection method based on the Faster Region-Convolutional Neural Network (Faster R-CNN), which can automatically identify bolt damage states. Wang et al.³⁵ used Faster R-CNN and You Only Look Once (YOLO) to detect bolt. Considering the purpose of the abnormal detection of bolts is to confirm whether the bolts are lost. Zhao et al.³⁶ proposed a bolt looseness angle detection method based on single shot multi-box detector (SSD), which can identify two classes. According to the localization information, the looseness angle within 180° could be recognized. Huynh et al.³⁷ used deep learning to identify the bolts in images, and then extracted the bolt edges using tilt correction and Hough transform. The bolt looseness angle detection could be achieved according to the change of edge information. Xie and Sun³⁸ used CNNs to locate the target regions. Three-dimensional (3D) reconstruction was performed to detect bolt loosening faults. The measurement repeatability and precision are less than 0.03 and 0.09 mm, respectively. Pham et al.³⁹ used 3D simulation software to create images of bolts, and trained the rendered images to generate a bolt detection model. Sun et al.⁴⁰ used CNNs to segment sub-pixel edges of bolt cap and local area of mounting surface, respectively. The 3D data of these edges were obtained by stereo matching and binocular vision model, and the distance between bolt cap and mounting surface was calculated. All the aforementioned identified bounding box–based bolt looseness recognition methods use horizontal detection algorithms for damage detection. However, the bolts may be in any position in the structure as a connecting member. Therefore, the assumption of horizontal or vertical bolts does not always hold in practice. The looseness bolts always have large aspect ratios; the horizontal bounding box usually contains more background and is not conducive to quantifying the bolt looseness damages.

In order to detect and locate multiple bolt damages, this article proposes a bolt damage identification method with consideration of arbitrary orientation of the bolts. Although the semantic segmentation algorithm can identify bolts at the pixel level, the training cost of semantic segmentation is high and the detection speed is slow. Identification of bounding box which is vigorously promoted by the industry is the most popular damage detection method. However, most target detection algorithms use horizontal bounding box for damage location.⁴¹ For large aspect ratio and densely distributed damages, positioning based on horizontal bounding box is not reliable. Therefore, this article proposes a bolt damage identification network, namely, orientation-aware center point estimation network (OA-CenterNet), which is used to detect arbitrarily oriented bolt damages. The proposed OA-CenterNet uses damage feature which is extracted and fused by a deep layer aggregation (DLA) structure to find center points and regress to all other damage properties, such as size and angle. A loss function is designed to improve the optimization efficiency of network. Section “Overall bolt damage identification network architecture” describes the overall procedure of the proposed bolt damage identification network. Section “Bolt damage detection model” shows the training and evaluation results of the model. Section “Bolt damage detection” presents the verification and results of the real-time detection system. Section “Discussion and conclusion” discusses and concludes the article.

Overall bolt damage identification network architecture

Machine vision–based approach can be used to locate and identify bolt damages quickly and accurately. Some large-scale structures have many bolts with wide distribution in various positions. When camera or unmanned aerial vehicle (UAV) is used to collect images of bolts, there may be multiple orientations of bolts in an image. Object recognition can be divided into horizontal bounding box detection and rotated bounding box detection, as shown in Figure 1. In horizontal bounding box detection, the bounding box is always horizontal with the X-axis of an image. It can be seen from this figure that the rotated bounding box can be more accurately fitted to the bolt region. Meanwhile, the height of the rotated bounding box can reflect the looseness degree of bolts.

Figure 1.

Identified bounding box: (a) horizontal bounding box and (b) rotated bounding box.

In order to identify and locate bolt damages more accurately, this article proposes a bolt damage identification network using OA-CenterNet, which is used to identify and locate the three states (loose, tight, and fell off) of bolts. The proposed OA-CenterNet models a damage as a center point of its rotated bounding box. It uses deep CNNs to search the center points and regress to all other damage properties, such as size and angle. According to the damage information, the damages can be accurately identified and located by the rotated bounding box. The framework of the proposed method is shown in Figure 2. First, a bolt damage data set is created. Then, the rotation detection algorithm is used to train a detection model. The damage images are fed into the detection model, which can estimate the center point, height, width, and angle of each damage in an image. In addition, the aspect ratio (v) of the rotated bounding box can be used as a damage parameter to quantify the bolt looseness damage. The quantification of bolt looseness damages serves as an indicator for rapid inspection and safety assessment

v = \frac{h}{w}

(1)

where h is the height of rotated bounding box, and w is the width of rotated bounding box.

Figure 2.

The framework of the proposed method.

The proposed OA-CenterNet mainly includes two modules: feature extraction and fusion and the generation of rotated bounding box. Feature map is the basis of classification network. In this article, a DLA structure is used to extract and fuse features of damages. Then, the final fusion features are fed into four convolution layers to obtain the center point, size, and angle of the damage, respectively. According to the damage information, its rotated bounding box can be drawn in an image. Moreover, OA-CenterNet is an anchor-free algorithm, which represents the target by its center point and then regresses the height, width, and other information about the target. It transforms the target detection problem into a central point estimation problem; its overall idea is very concise. In addition, a specific loss function is designed to improve the training efficiency of the model. The details of the OA-CenterNet are given below.

Feature extraction and fusion

Object detection can simultaneously complete the recognition and location of objects in an image. Therefore, it is necessary to extract both the feature and location information of the object. CNN can extract features of images autonomously. With the increase in the network layers, the higher-level features are extracted more abundantly. However, with the increase in network layers, the spatial information of the object is extremely compressed, resulting in poorer target localization. To resolve this problem, many pyramidal and deep aggregation networks have been proposed.^42–45 For example, skip connections can combine different scales and resolution information from shallow to deep, but these connections are linear and shallow. In this article, a DLA structure is used to iteratively fuse deeper, larger-scale information, as shown in Figure 3.

Figure 3.

Deep layer aggregation structure.

The images are resized from 960 × 544 × 3 to 512 × 512 × 3, and then images are fed into the DLA structure. The backbone network of this structure contains four convolutional groups. Each convolutional group includes a convolutional (Conv) layer, a batch normalization (BN) layer, and an activation layer (ReLU). Each convolutional group can generate a feature map, and their dimensions are 128 × 128, 64 × 64, 32 × 32, 16 × 16, respectively. As shown in Figure 3, the four feature maps are iteratively fused. The final output of DLA is a feature map with the downsampling coefficient of 4. The solid lines in the figure represent the downsampling layers, the black dashed lines in the figure represent the fusion layers, and the red dashed lines in the figure represent the upsampling layers. The parameters in the downsampling layer are initialized using the parameters from the ImageNet-based pre-training model, while the parameters in the upsampling layer are randomly initialized.

Rotated bounding box

Images are processed by the DLA structure to obtain a feature map of size 128 × 128 × 64, which is fed into four output modules, namely, heatmap, offset, size, and angle. The output modules are shown in Figure 4. Each module contains two convolutional groups. The first convolutional group includes a convolutional (Conv) layer and an activation layer (ReLU). The second convolutional group includes only a convolutional (Conv) layer.

Figure 4.

Output modules.

For each module, the feature map is processed with two convolutional layers. Finally, the heatmap module outputs a feature map of size 128 × 128 × 3, where 3 represents the number of classes. The offset module outputs a feature map of size 128 × 128 × 2, where 2 represents the bias in both directions. The size module outputs a feature map of size 128 × 128 × 2, where 2 represents the width and height of an object. The angle module outputs a feature map of size 128 × 128 × 1, where 1 represents the rotated angle of a horizontal box.

The network can estimate the center point coordinates, width, height, and rotated angle of damages in an image. According to these four parameters, the rotated bounding box of each category can be drawn in an image. An example of the drawing process is schematically shown in Figure 5.

Figure 5.

Drawing process of rotated bounding box.

In Figure 5, the X-axis is horizontal to the right and the Y-axis is to the vertical to the bottom. (x, y) is the center point of the horizontal box, w is the width of the horizontal box, h is the height of horizontal box, and α is the rotated angle of the horizontal box. According to the four parameters, the four corner coordinates of the horizontal box can be calculated, and they are given as follows

{\begin{matrix} (x_{1}, y_{1}) = (x - \frac{w}{2}, y - \frac{h}{2}) \\ (x_{2}, y_{2}) = (x + \frac{w}{2}, y - \frac{h}{2}) \\ (x_{3}, y_{3}) = (x - \frac{w}{2}, y + \frac{h}{2}) \\ (x_{4}, y_{4}) = (x + \frac{w}{2}, y + \frac{h}{2}) \end{matrix}

(2)

where (x₁, y₁) are the upper-left corner coordinates of the horizontal box, (x₂, y₂) are the upper-right corner coordinates of the horizontal box, (x₃, y₃) are the lower-left corner coordinates of the horizontal box, and (x₄, y₄) are the lower-right corner coordinates of the horizontal box.

The horizontal box rotated α clockwise around the center point is the final identified bounding box. When the rotated angle (α) is greater than 180°, α should be subtracted by 180°. It is because a rectangle box rotated 180° around the center point is itself. This method can also reduce the amount of network computing. Finally, the four corner coordinates of the rotated box can be readily obtained

\begin{matrix} {\begin{matrix} m_{i} = (x_{i} - x) \times \cos (α) - (y_{i} - y) \times \sin (α) + x \\ n_{i} = (x_{i} - x) \times \sin (α) + (y_{i} - y) \times \cos (α) + y \end{matrix}, & i = 1, 2, 3, 4 \end{matrix}

(3)

where (x_i, y_i), i = 1, 2, 3, and 4, are the four corner points of the horizontal box, and (m_i, n_i), i = 1, 2, 3, and 4, are the four corner points of the rotated box.

Loss function

The loss function is used to describe the difference between the predicted value of a model and the actual value. It also affects the update of the network parameters. Therefore, the loss function is an important component of deep learning. The OA-CenterNet has four output modules, namely, heatmap, size, offset, and angle. The heatmap reflects the position and number of keypoints in an image. The size includes the width and height of a target. The offset includes the bias of the predicted center point in the width and height directions. The angle is the rotated angle of the horizontal box. Each module has its own loss function, which will be given as follows.

The heatmap module produces a keypoint heatmap Y′∈ [0,1]^{128 × 128 × c}, where 128 × 128 is the size of the heatmap, and c is the number of keypoint types. Keypoint types include loose, tight, and fell off. During the training process, all the ground truth keypoints are filtered by a Gaussian filter to obtain the ground truth heatmap. The training is realized through penalty-reduced pixelwise logistic regression with loss function similar as focal loss

L_{heatmap} = {\begin{matrix} \frac{- 1}{N} \sum_{xyc} {(1 - {Y^{'}}_{xyc})}^{α} \log ({Y^{'}}_{xyc}) & Y_{xyc} = 1 \\ \frac{- 1}{N} \sum_{xyc} {(1 - Y_{xyc})}^{β} {({Y^{'}}_{xyc})}^{α} \log (1 - {Y^{'}}_{xyc}) & otherwise \end{matrix}

(4)

where N is the number of keypoints in an image, Y_xyc is the ground truth heatmap, and $Y {^{'}}_{xyc}$ is the estimation heatmap. When $Y {^{'}}_{xyc} = 1$ for a point, it implies that the point is the center point of the target, where x is the horizontal coordinate of the point and y is the vertical coordinate of the point. While $Y {^{'}}_{xyc} = 0$ for a point, it implies that the point belongs to the background. α and β are the hyperparameters, and 2 and 4 are used in this network, respectively.

The spatial resolution of the final feature map is one-fourth of the input image, that is, one pixel in the feature map corresponds to 16 pixels in the original image; this will lead to target positioning deviation. Therefore, the bias value is introduced into the network. During training process, the bias value is trained using the mean absolute error, and the bias loss is

L_{offset} = \frac{1}{N} \sum_{p} | {O^{'}}_{p^{'}} - (\frac{p}{R} - p^{'}) |

(5)

where N is the number of keypoints in an image, p represents the ground truth keypoint, R represents the downsampling coefficient, and p′ is the equivalent point of ground truth keypoint at low resolution. (p/R − p′) is the actual bias value, and $O {^{'}}_{p^{'}}$ is the estimation bias value of the network.

The rectangle box is used to label the objects in an image. For an object of category k, the two corner points of rectangle box are (x₁, y₁) and (x₂, y₂), respectively. The width–height information of the object is denoted by s_k=(x₂ − x₁, y₂ − y₁). The OA-CenterNet is used to estimate the center point of all objects. The width–height information is trained using the mean absolute error, and the size loss function is

L_{size} = \frac{1}{N} \sum_{k = 1}^{N} | {S^{'}}_{k} - s_{k} |

(6)

where N is the number of keypoints in an image, s_k is the actual width–height of an object, $S {^{'}}_{k}$ is the estimated width–height information of an object, and k is the kth keypoint.

In the rotation detection, the angle of the target box is already included in the labeling information. This is also the main difference between OA-CenterNet and CenterNet. The absolute error is used for angle regression, and the loss function of angle is given by

L_{angle} = {\begin{matrix} 0.5 {({θ^{'}}_{k} - θ_{k})}^{2} & | {θ^{'}}_{k} - θ_{k} | < 1 \\ | {θ^{'}}_{k} - θ_{k} | - 0.5 & otherwise \end{matrix}

(7)

where the estimated angle of the network is $θ {^{'}}_{k}$ , the actual angle is θ_k, and k is the kth keypoint.

The total loss function is the sum of the above four loss functions, and different weights are assigned. Compared with CenterNet, the total loss function of OA-CenterNet has an additional angle loss value. The total loss function is

L_{total} = λ_{heatmap} L_{heatmap} + λ_{size} L_{size} + λ_{offset} L_{offset} + λ_{angle} L_{angle}

(8)

where λ_heatmap is the weight of heatmap loss, λ_size is the weight of size loss, λ_offset is the weight of offset loss, and λ_angle is the weight of angle loss. It is found that the orders of magnitude of their initial value (L_heatmap, L_size, L_offset, and L_angle) are 10⁰, 10⁰, 10⁻¹, and 10¹, respectively. To ensure that these four modules are trained evenly, the weightings are selected to be inversely proportional to their orders: λ_heatmap and λ_size are set to 1, λ_offset is set to 10, and λ_angle is set to 0.1.

Evaluation indicators

For the object detection task, the mAP, F₁-score, and mean intersection over union (mIoU) are widely used to evaluate the performance of a method. mAP is the mean average precision of all categories

mAP = \frac{\sum_{i = 1}^{K} A P_{i}}{K}

(9)

where AP_i is the average precision of class i, and K is the number of classes.

F₁-score, also known as balanced score, is used to measure the accuracy of classification model. It considers the precision and recall of classification model simultaneously

F_{1} = 2 \times \frac{precision \times recall}{precision + recall}

(10)

where precision quantifies the number of positive class predictions that actually belong to the positive class, and recall quantifies the number of positive class predictions made out of all positive examples in the data set.

The network relies on the identified bounding box to locate the target. The smaller the difference between the identified bounding box and the actual box, the better the localization performance of model. Therefore, IoU which is the ratio of intersection to union is an important evaluation indicator

IoU = \frac{intersection}{union}

(11)

where intersection is the area of overlap of the identified bounding box and the actual box, and union is the area of union of the identified bounding box and the actual box, as shown in Figure 6. Furthermore, mIoU is the mean IoU of all categories.

Figure 6.

IoU diagram of the rotated bounding box.

Bolt damage detection model

Data set

The images used in this article were collected from a laboratory structure. The structure is a wooden structure with 12 bolt holes. Nine of the holes were randomly selected to install M18 bolts. The nine bolts were randomly set to “loose” and “tight.” The other three holes represented “fell off.” A total of 330 images with the size of 960 × 544 pixels were collected by smartphone. A labeling tool, roLabelImg (https://github.com/cgvict/roLabelImg), was used to label the three types of bolt states in images. We use this tool to draw bounding boxes in an image; these bounding boxes can be adjusted in angle and size in order to accurately locate the damages. This tool generates label data based on artificially drawn bounding boxes and categories. As a supervised learning method, the labeling process of data all needs to be carried out manually. When the data are wrongly marked, the learning effect of the network will be affected and the final recognition accuracy will be reduced. These images and annotations construct a bolt damage data set, which includes three categories: loose, tight, and fell off. The labeling process of data is shown in Figure 7. In horizontal target detection, the label data for the identified bounding box are the coordinates of the two corner points (upper-left and lower-right). In the case of rotation detection, the label data for the identified bounding box are midpoint coordinates, width, height, rotated angle, and category. 300 images from the data set were randomly selected as the training set and validation set, and 30 images as the testing set.

Figure 7.

Labeling process: (a) original image and (b) label rotated box.

Model training and evaluation

The proposed OA-CenterNet is used to train the data set. The main parameters are as follows: the learning rate is 0.001, the weight decrease is 1e−4, the batch size is 3, and the number of epochs is 100. The training loss and the validation loss are shown in Figure 8. As can be seen from Figure 8(a), the training loss of total, heatmap, size, offset, and angle gradually decreases and tends to be stable. It is noteworthy that the angle loss value is much larger than other loss values. This is because smooth absolute error is used for angle regression, while mean absolute error is used for heatmap, size, and offset regression. Angle regression is challenging and smooth absolute error can ensure the stability of angle regression. As can be seen from Figure 8(b), the validation loss of total, size, offset, and angle gradually decreases and tends to be stable. It is remarkable that the heatmap loss value first decreased, then increased, and finally reached a stable state. This phenomenon also appears in the training process based on CenterNet. However, the loss does not correspond to the final evaluation metric (mAP). The last model gives better mAP than the lowest-loss model. No overfitting was observed on the data set. Thus, the last model was chosen as the detection model, which was evaluated through three indicators. The evaluation result of the detection model is shown in Table 1. It can be seen from the table that the three indicators are very high, indicating that the model has a high capability for bolt damage detection.

Figure 8.

Loss curves: (a) training loss and (b) validation loss.

Table 1.

Model evaluation result.

Evaluation indicators	Value
mAP	0.9650
F₁-score	0.9471
mIoU	0.8536

mAP: mean average precision; mIoU: mean intersection over union.

According to the drawing method of rotated box, the identified bounding boxes of the model are drawn on the image. The identified bounding boxes for the three categories (loose, tight, and fell off) are set to blue, green, and red, respectively. The identification results of some samples are shown in Figure 9. The rotated bounding boxes in Figure 9(a) to (d) are the detection results of the proposed OA-CenterNet, and the horizontal bounding boxes in Figure 9(e) to (h) are the detection results of Faster R-CNN. The Faster R-CNN was trained with an initial learning rate of 0.0001, momentum of 0.9, batch size of 1, and iterations of 5000. The average precision of Faster R-CNN is 0.9315, which is slightly lower than OA-CenterNet’s average precision of 0.9650. Therefore, the identification accuracy of OA-CenterNet can meet the requirements of bolt damage detection. Compared with horizontal bounding box, rotated bounding box can be more accurately fitted to the target area. It can be seen from the figure that the proposed OA-CenterNet can accurately identify and locate three types of bolt states. For the loose bolts, no matter which orientation they are in, the rotated box generated by the model can accurately locate them. In the field of machine vision, occlusion is always a difficult problem to solve. The influence of occlusion can be minimized by multi-direction detection.

Figure 9.

Identification results of some samples.

Bolt damage detection

Influence of different illuminations

Bolts, as joints of the composite structure, can be installed in various positions of a structure. The lighting conditions in different positions are also different. In the field of machine vision, illumination variation which may affect the performance of deep learning model is a very important factor. In order to verify the stability of deep learning model, the images of bolts under darkness, unilateral illumination, and shadow were collected. The identification results of these images are shown in Figure 10. As can be seen from Figure 10(a), the targets in the back row were more ambiguous in darkness, and the individual bolt damage was poorly identified. However, most of the damage can still be identified and located with high precision. Figure 10(b) shows the identification results of bolt damages under unilateral illumination. The results showed that the two “tight” and one “fell off” were not identified, and the shadow of a bolt was misidentified as “fell off.” Under unilateral illumination, the shadow of the bolt complicates the background, and thus affects the recognition ability of the deep learning model. Figure 10(c) shows the identification results of bolt damages under shadow. The results showed that three “tight” were not identified, and no target was misidentified. In order to quantify the influence of different illuminations on bolt damage identification, the correct detection rate, the missed detection rate, and the false detection rate of bolt damage under three types of lighting conditions are calculated, respectively. The correct detection rate of bolt damage under normal lighting conditions is 100%. As can be seen from Table 2, lighting conditions can reduce the recognition accuracy of deep learning model. However, some bolt damages were miss detected mainly because of occlusion. In the field of machine vision, occlusion still has a significant impact on target identification. Hence, multiple cameras can be used to reduce the chance of objects being occluded.

Figure 10.

Identification results of bolts under different illuminations: (a) darkness, (b) unilateral illumination, and (c) shadow.

Table 2.

Bolt damage detection accuracy under different illuminations.

Lighting conditions	Correct detection rate	Missed detection rate	False detection rate
Darkness	79.17%	16.67%	4.17%
Unilateral illumination	83.33%	16.67%	4.17%
Shadow	87.50%	12.50%	0%

Quantification of looseness damage

The evaluation results showed that the model can accurately identify and locate the “loose,”“tight,” and “fell off.” In this article, the extension of screw is regarded as a type of bolt damage. However, the different extension lengths of screw represent different degrees of looseness damage. Therefore, in order to further quantify the degree of bolt looseness damage, a damage parameter is proposed. As the distance between the camera and the bolt changes, so does the size of the bolt in the image. Thus, the size of bolts should not be selected as the damage parameter. It is appropriate to use the aspect ratio of the identified bounding box as the damage parameter.

The aspect ratio of the identified bounding box is shown directly on the image for easy viewing by the inspector. The detection accuracy of the damage parameter is further illustrated by the validation experiment. First, four bolts with different looseness degrees were selected randomly, and their aspect ratios were manually measured using a ruler. Then, multiple images were taken from different orientations using a smartphone, and the identification results are shown in Figure 11. It can be seen from the figure that the model can complete the bolt damage identification with high precision, and output the damage parameters of bolt looseness damages in real time. Table 3 shows the damage parameters of bolt looseness damages in the four images. T is the actual value; P is the estimated value. As a vision-based approach, the proposed method will be affected by the sharp camera angles, whereas the identification results of bolts, especially those with small aspect ratio, are sensitive to low vertical angles. Thus, the error of No. 2 bolt damage parameter is larger than others. This is also the limitation of the proposed method. The average identification error of the other three bolt damage parameters is less than 10%. The bolt damage identified in this article is large looseness damage. The aim is to complete rapid macroscopic damage location and identification instead of manual inspection. The damage parameters of bolt looseness damages will be affected by the elevation angle, which will reduce the detection accuracy of damage parameters. For macroscopic damage, it is sufficient to determine whether it belongs to damage. This article attempts to further quantify the characterization of looseness damage, in order to provide reference for infrastructure management and maintenance.

Figure 11.

Quantitative results of looseness damage.

Table 3.

Measurement error of looseness damage parameter: the bolts with looseness damage are numbered from left to right as 1, 2, 3, 4.

Number	T	(a)		(b)		(c)		(d)
		P	Error	P	Error	P	Error	P	Error
1	1.97	1.82	7.79%	1.85	6.27%	1.85	6.27%	1.77	10.32%
2	1.45	1.52	5.02%	1.73	19.53%	1.61	11.24%	1.69	16.76%
3	1.87	1.7	9.01%	1.75	6.34%	1.79	4.20%	1.89	1.15%
4	2.24	2.08	7.01%	1.96	12.38%	2.16	3.44%	2.03	9.25%

Real-time damage monitoring

When vision-based damage detection method is used instead of manual inspection, the detection accuracy and the speed are important factors to be considered. A real-time bolt damage detection system is established in this article. The system transmits the video captured by the camera into the deep learning model, which can accurately identify damages in real time. The detection results are saved in the form of images. Figure 12 shows the bolt damage identification results of 10 consecutive frames. The results showed that the model still has high identification accuracy when the video frames are processed continuously. In the process of real-time detection, detection time is an important performance parameter. Generally, frames per second (FPS) is used to measure the real-time performance of the model. On an Nvidia Quadro P4000 GPU with CUDA^® Deep Neural Network library (cuDNN) acceleration, this model runs at 66.67 ms per frame, or 15 FPS. Thus, the deep learning–based damage identification method can meet the needs of real-time detection.

Figure 12.

Identification results of 10 consecutive frames.

Discussion and conclusion

In this article, a bolt damage identification using OA-CenterNet is proposed. This method overcomes the problem of inaccurate positioning of non-vertical bolts in horizontal box detection. Although the semantic segmentation algorithm can identify targets at the pixel level, semantic segmentation data sets are extremely insufficient and the models are difficult to train. Therefore, rotated bounding box not only can locate the target more accurately but also does not substantially increase the labeling cost of data set. The proposed method uses machine vision to complete the real-time monitoring of bolt damages. However, it also has some limitations and disadvantages. First, when the aspect ratio of bolts may be only minor variation or even unvaried, the bolts in images are almost unchanged. As a vision-based approach, the proposed method can only identify bolt damages through the changes of bolts in images. Therefore, the proposed method cannot detect the micro looseness damage of bolts. Second, vision-based approach will be affected by the sharp camera angles. Thus, a varying horizontal angle was considered to overcome this issue. Third, occlusion is the basic element that limits the information in an image. Objects that are heavily occluded are more difficult for a classification algorithm than unobstructed objects. In the field of machine vision, occlusion still has a significant impact on target identification. Hence, multiple cameras can be used to reduce the chance of objects being occluded.

The accuracy of the proposed method was validated by a bolt damage data set collected in the laboratory. The experimental results showed that it can accurately identify and locate three types of bolt states—mAP is 0.9650, F₁-score is 0.9471, and mIoU is 0.8536. The stability of the model was verified under different lighting conditions. However, deep learning can improve detection performance by learning a large number of sample data. When the diversity of samples in the data set increases, the identification and generalization ability of deep learning model will be further improved. To refine the detection results, the aspect ratio of the rotated bounding box is used as the damage parameter of the bolt looseness damage. Finally, the videos recorded by the camera were fed into the detection model to complete the real-time bolt damage detection. The proposed method has the potential to replace manual inspection and quickly count the number and status of bolts in a structure, thus providing data support for subsequent structural maintenance.

Footnotes

Declaration of conflicting interests

The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding

The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research was supported by the Science and Technology Development Fund of the Macau SAR under research grant SKL-IOTSC-2018-2020.

ORCID iDs

Yang Zhang

Ka-Veng Yuen

References

Stephen

JT.

Condition monitoring of bolted joints. Doctoral Dissertation, The University of Sheffield, Sheffield, 2015.

Bao

Sun

, et al. Compressive sampling–based data loss recovery for wireless sensor networks used in civil structural health monitoring. Struct Health Monit 2013; 12(1): 78–95.

Hasni

Jiao

Alavi

, et al. Structural health monitoring of steel frames using a network of self-powered strain and acceleration sensors: a numerical study. Automat Constr 2018; 85: 344–357.

Spencer

Jr Park

Mechitov

, et al. Next generation wireless smart sensors toward sustainable civil infrastructure. Procedia Engineer 2017; 171: 5–13.

Chang

Flatau

Liu

SC.

Health monitoring of civil infrastructure. Struct Health Monit 2003; 2(3): 257–267.

Yin

Wang

Yang

, et al. A smart washer for bolt looseness monitoring based on piezoelectric active sensing method. Appl Sci 2016; 6(11): 320.

Huo

Chen

Liang

, et al. Impedance based bolt pre-load monitoring using piezoceramic smart washer. Smart Mater Struct 2017; 26(5): 057004.

Oregui

E-BM

Nunez

VAA

, et al. Monitoring bolt tightness of rail joints using axle box acceleration measurements. Struct Control Hlth 2017; 24(2): e1848.

Jiang

Wang

, et al. Monitoring of bolt looseness-induced damage in steel truss arch structure using piezoceramic transducers. IEEE Sens J 2018; 18(16): 6677–6685.

10.

Wang

SCM

Song

Monitoring of early looseness of multi-bolt connection: a new entropy-based active sensing method without saturation. Smart Mater Struct 2019; 28(10): 10LT01.

11.

Zhang

A bolt preload monitoring method based on the refocusing capability of virtual time reversal. Struct Control Hlth 2019; 26(8): e2370.

12.

Tola

Lee

Park

, et al. Bolt looseness detection based on ultrasonic wavefield energy analysis using an Nd:YAG pulsed laser scanning system. Struct Control Hlth 2020; 27(9): e2590.

13.

SCM

Wang

, et al. A load measuring anchor plate for rock bolt using fiber optic sensor. Smart Mater Struct 2017; 26(5): 057003.

14.

Zhang

Zhao

Sun

, et al. Bolt loosening detection based on audio classification. Adv Struct Eng 2019; 22(13): 2882–2891.

15.

Kong

Zhu

SCM

, et al. Tapping and listening: a new approach to bolt looseness monitoring. Smart Mater Struct 2018; 27(7): 07LT02.

16.

Yuan

Kong

, et al. Percussion-based bolt looseness monitoring using intrinsic multiscale entropy analysis and BP neural network. Smart Mater Struct 2019; 28(12): 125001.

17.

Wang

SCM

Song

Modeling and analysis of an impact-acoustic method for bolt looseness identification. Mech Syst Signal Pr 2019; 133: 106249.

18.

Feng

MQ.

Computer vision for SHM of civil infrastructure: from dynamic response measurement to damage detection—a review. Eng Struct 2018; 156: 105–117.

19.

Davoudi

Miller

Kutz

JN.

Data-driven vision-based inspection for reinforced concrete beams and slabs: quantitative damage and load estimation. Automat Constr 2018; 96: 292–309.

20.

Davoudi

Miller

Calvi

, et al. Computer vision–based damage and stress state estimation for reinforced concrete and steel fiber–reinforced concrete panels. Struct Health Monit 2020; 19(6): 1645–1665.

21.

Cha

Chen

Büyüköztürk

Output-only computer vision based damage detection using phase-based optical flow and unscented Kalman filters. Eng Struct 2017; 132: 300–313.

22.

Dworakowski

Kohut

Gallina

, et al. Vision-based algorithms for damage detection and localization in structural health monitoring. Struct Control Hlth 2016; 23(1): 35–50.

23.

Shi

Wang

, et al. Beam damage detection using computer vision technology. Nondestruct Test Eva 2010; 25(3): 189–204.

24.

Ramana

Choi

Cha

YJ.

Fully automated vision-based loosened bolt detection using the Viola–Jones algorithm. Struct Health Monit 2019; 18(2): 422–434.

25.

Cha

You

Choi

Vision-based detection of loosened bolts using the Hough transform and support vector machines. Automat Constr 2016; 71: 181–188.

26.

Park

Huynh

Choi

, et al. Vision-based technique for bolt-loosening detection in wind turbine tower. Wind Struct 2015; 21(6): 709–726.

27.

O’Mahony

Campbell

Carvalho

, et al. Deep learning vs. traditional computer vision. In: Proceedings of the science and information conference, Las Vegas, NV, 25–26 April 2019, pp. 128–144. Cham: Springer.

28.

Kim

Huh

JH.

Consistency of medical data using intelligent neuron faster R-CNN algorithm for smart health care application. Healthcare 2020; 8(2): 185.

29.

Kamnitsas

Ledig

Newcombe

, et al. Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Med Image Anal 2017; 36: 61–78.

30.

Mortazi

Bagci

. Automatically designing CNN architectures for medical image segmentation. In: Proceedings of the international workshop on machine learning in medical imaging, Granada, 16 September 2018, pp. 98–106. Cham: Springer.

31.

Kisilev

Sason

Barkan

, et al. Medical image description using multi-task-loss CNN. In: Carneiro

Mateus

Loïc

, et al. (eds) Deep learning and data labeling for medical applications. Cham: Springer, 2016, pp. 121–129.

32.

Dai

, et al. Learning traffic as images: a deep convolutional neural network for large-scale transportation network speed prediction. Sensors 2017; 17(4): 818.

33.

Faro

Giordano

Spampinato

Evaluation of the traffic parameters in a metropolitan area by fusing visual perceptions and CNN processing of webcam images. IEEE T Neural Networ 2008; 19(6): 1108–1129.

34.

Zhang

Sun

Loh

, et al. Autonomous bolt loosening detection using deep learning. Struct Health Monit 2020; 19(1): 105–122.

35.

Wang

Gao

, et al. Bolt detection of key component for high-speed trains based on deep learning. In: Proceedings of the 2019 far east NDT new technology & application forum (FENDT), Qingdao, China, 24–27 June 2019, pp. 192–196. New York: IEEE.

36.

Zhao

Zhang

Wang

Bolt loosening angle detection technology using deep learning. Struct Control Hlth 2019; 26(1): e2292.

37.

Huynh

Park

Jung

, et al. Quasi-autonomous bolt-loosening detection method using vision-based deep learning and image processing. Automat Constr 2019; 105: 102844.

38.

Xie

Sun

. On-line bolt-loosening detection method of key components of running trains using binocular vision. In: Proceedings of the LIDAR imaging detection and target recognition, Changchun, China, 23–25 July 2017, vol. 10605, p. 1060513. Bellingham, WA: SPIE, the International Society for Optics and Photonics.

39.

Pham

Kim

, et al. Bolt-loosening monitoring framework using an image-based deep learning and graphical model. Sensors 2020; 20(12): 3382.

40.

Sun

Xie

Cheng

A fast bolt-loosening detection method of running train’s key components based on binocular vision. IEEE Access 2019; 7: 32227–32239.

41.

Zhou

Wang

Krähenbühl

Objects as points, 2019, https://arxiv.org/abs/1904.07850

42.

Lin

Dollár

Girshick

, et al. Feature pyramid networks for object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Honolulu, HI, 21–26 July 2017, pp. 2117–2125. New York: IEEE.

43.

Redmon

Divvala

Girshick

, et al. You only look once: unified, real-time object detection. In: Proceedings of the IEEE conference on computer vision and pattern recognition, Las Vegas, NV, 27–30 June 2016, pp. 779–788. New York: IEEE.

44.

Ren

Girshick

, et al. Faster R-CNN: towards real-time object detection with region proposal networks. IEEE Transactions on Pattern Analysis & Machine Intelligence 2017; 39(6): 1137–1149.

45.

Wang

Shelhamer

, et al. Deep layer aggregation. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, Salt Lake City, UT, 18–23 June 2018, pp. 2403–2412. New York: IEEE.