1. State Key Laboratory of Hydroscience and Engineering, Tsinghua University, Beijing 100084, China
2. China Renewable Energy Engineering Institute, Beijing 100120, China
3. State Key Laboratory of Stimulation and Regulation of Water Cycles in River Basins, China Institute of Water Resources and Hydropower Research, Beijing 100038, China
4. Hunan Provincial Water Resources Development & Investment Co., Ltd., Changsha 410007, China
liuyaoru@tsinghua.edu.cn
Received: 2023-02-09
Accepted: 2023-07-25
Published: 2024-05-15
Issue Date: 2024-05-24
Abstract
Regular detection and repair of lining cracks are necessary to guarantee the safety and stability of tunnels. The development of computer vision has greatly promoted structural health monitoring. This study proposes a novel encoder–decoder structure, CrackRecNet, for semantic segmentation of lining segment cracks by integrating an improved VGG-19 into the U-Net architecture. Image acquisition equipment was designed based on a camera, a 3-dimensional printing (3DP) bracket and two laser rangefinders. A tunnel concrete structure crack (TCSC) image data set, containing images collected from a double-shield tunnel boring machine (TBM) tunnel in China, was established. Through data preprocessing operations, such as brightness adjustment, pixel resolution adjustment, flipping, splitting and annotation, 2880 image samples with a pixel resolution of 448 × 448 were prepared. The model was implemented with PyTorch in PyCharm and run on 4 NVIDIA TITAN V GPUs. In the experiments, the proposed CrackRecNet showed better prediction performance than U-Net, TernausNet, and ResU-Net. This paper also discusses the GPU parallel acceleration effect and the quantification of maximum crack width.
1 Introduction
Currently, many projects have successively transitioned from the construction phase to the operation and maintenance phase [1]. Affected by factors such as construction flaws and concrete deterioration, different types of defects, including cracking, spalling, dislocation, and exposed rebar, may appear in concrete structures in operation [2]. Among them, cracks are the most typical defect: they occur in most concrete structures, including pavements, buildings, bridges, and tunnels, and they may lead to serious accidents unless repaired in a timely way [3]. Therefore, regular structural health monitoring (SHM) and maintenance are necessary during the operation period to guarantee the long-term stability of the engineering structure [4].
Tunnel boring machines (TBMs) are advanced construction equipment and are widely used in the excavation of long and large-scale tunnels [5]. In TBM tunnels, the surrounding rock is usually supported by lining segments. Currently, human-based visual inspection is still the prevalent SHM method for detecting tunnel lining cracks worldwide [6]. However, in traditional manual inspection, qualified inspectors are required to manually check the lining segment condition and record the defect types and locations. This method is labor-intensive and inefficient, and the detection accuracy largely relies on the experience of the inspectors. Additionally, because of the low efficiency, inspectors usually need to stay inside the tunnel for a long time, which carries safety risks. To solve these problems, various computer vision (CV) technologies have been widely used for SHM in recent years. In order of development, crack detection methods can be grouped into digital image processing (DIP) technologies, traditional machine learning (ML) methods and deep learning (DL) methods.
DIP technology utilizes the greyscale difference between pixels to distinguish between cracks and backgrounds [7]. Classic DIP technologies include bounded histogram, threshold segmentation, edge detection and wavelet transform [8,9,10]. For example, Abdel-Qader et al. [11] implemented crack detection for concrete bridges using a fast Haar transform algorithm. Oliveira and Correia [12] used a first dynamic thresholding (DT) to obtain the threshold image for entropy computation, and then realized the crack segmentation by applying a second process of DT to the resulting entropy blocks matrix. The effects of DIP technology rely on hand-engineered features and parameters [13]. However, owing to the complex surface conditions of lining segments, achieving satisfying crack detection and segmentation effects is difficult in some cases.
To overcome the shortcomings of DIP technology and improve prediction accuracy and robustness, many classical ML models, e.g., supervised learning and unsupervised clustering, have been introduced in the field of crack detection [14]. Shi et al. [15] applied the random structured forests model to realize road crack segmentation. Noh et al. [16] performed bridge crack segmentation using fuzzy c-means clustering and applied mask filtering of three sizes to remove three types of noise. Li et al. [17] realized the recognition of bridge cracks based on a support vector machine optimized by a greedy search algorithm. However, traditional ML models still require hand-engineered features, which depend on the experience of technicians and often have poor characterization ability.
With the development of DL methods, especially the breakthrough of convolutional neural networks (CNNs), the accuracy of detection and segmentation of cracks and other defects has been greatly improved [18,19]. CNNs have strong feature extraction and learning ability, and are thus capable of effectively extracting features of cracks or other defects from images with complex backgrounds [20,21]. Therefore, various CNNs and their variants have been proposed for SHM [22–27]. Zhang et al. [28] established a deep CNN model, which demonstrated better crack detection performance than existing manual methods. Zou et al. [29] proposed DeepCrack, which fuses deep and shallow features to improve crack segmentation accuracy. Bang et al. [30] established an encoder–decoder network for crack detection, where the encoder is composed of convolution (conv) layers containing residual units, and the decoder is composed of deconvolution layers to locate the pixel position of cracks in an input image. O'Brien et al. [31] developed a CNN model incorporating deep transfer learning approaches to detect tunnel lining cracks. Dang et al. [32] established a tunnel lining crack segmentation model by replacing the encoder of U-Net with the ResNet-152 model. Feng et al. [33] developed 40 leakage segmentation models by coupling U-Net and U-Net++ with six types of CNN models. Xue et al. [34] adopted Mask R-CNN as the baseline to establish a leakage defect recognition model, and improved the model performance by introducing various optimization strategies. Manjunatha et al. [35] proposed CrackDenseLinkNet for semantic segmentation of concrete surface cracks. Que et al. [36] established an improved VGG model to realize crack classification. Currently, various improved CNN models are under development to achieve stronger prediction capacity.
A representative data set is of great importance for DL models to achieve a good learning effect. Currently, the common open-source data sets of concrete structure crack defects are mostly for roads, pavements and bridges [37–39]. However, the data available for lining segment crack defects in TBM tunnels are relatively limited. Although transfer learning [40] is a feasible solution to this problem, the discrepancy between common data sets and real data sets will also influence model performance to a certain extent. Furthermore, compared with the environments of roads, pavements, buildings, and bridges, the internal environment of tunnels is complex, and image data acquisition is relatively difficult. From the perspective of the model application scenario, a lining segment surface has a high level of noise interference [41], arising from features such as bolt holes, joints, segment marks, grout holes and artificial marks. From the perspective of data acquisition, tunnels constructed by TBM are often located in deep mountains, and the lighting conditions inside are generally poor. Moreover, tunnels are mostly used for traffic and water diversion, and the time window for defect detection is short. Therefore, a specialized data set of lining segment cracks is required for meaningful and high-accuracy segmentation of tunnel lining cracks.
To facilitate rapid and effective structural health monitoring, various automatic inspection devices for tunnel linings have been developed [42–44]. Ai and Yuan [45] used a cart equipped with non-metric cameras to acquire lining surface images of metro tunnels. Zhou et al. [46] developed an inspection device to collect point cloud data for tunnel linings. Currently, most existing studies use the track in the tunnel to move the inspection devices. However, automatic inspection devices suitable for tunnels without tracks remain to be developed.
To handle the special application scenario of TBM tunnels, a CNN-based model is required to have stronger feature extraction ability, enabling effective crack feature extraction in the presence of complex noise interference. This study proposes an encoder–decoder network, i.e., CrackRecNet, to realize pixel-level crack segmentation of tunnel lining segments. The feature extraction ability is strengthened by integrating an improved VGG-19 into the encoder of U-Net. To address the aforementioned lack of data sets of tunnel lining segment cracks in current studies, image acquisition equipment is designed based on a camera, a 3-dimensional printing (3DP) bracket and two laser rangefinders. On this basis, a tunnel concrete structure crack (TCSC) image data set, containing images of lining segments collected from the DXL tunnel project in western China, is presented. Through data preprocessing operations such as brightness adjustment, pixel resolution adjustment, flipping and splitting, 2880 tunnel lining crack images, all with the same size of 448 × 448, were obtained. Subsequently, LabelMe was used to perform binary annotation of the preprocessed images. The training, validation and test sets were divided by random sampling with a ratio of 6:1:1. In addition to the proposed CrackRecNet, three other models were established for comparison, including U-Net, TernausNet [47], and ResU-Net [48]. All models were implemented with PyTorch in PyCharm and run on 4 NVIDIA TITAN V GPUs. Additionally, the quantification of crack width and the limitations of this study are discussed.
2 Methodology
2.1 Overall architecture of CrackRecNet
In this study, inspired by U-Net, an encoder–decoder network called CrackRecNet is proposed for crack semantic segmentation. Fig.1 shows the overall structure of CrackRecNet, which is mainly composed of an encoder (encoding path) and a decoder (decoding path). Detailed parameters of CrackRecNet are listed in Tab.1. The core modification of the proposed network structure is the integration of the improved VGG-19 (with the last three fully connected layers removed) into the encoding path of U-Net to obtain stronger feature extraction ability.
2.2 Encoder and decoder
The encoder structure of CrackRecNet is shown in the left of Fig.1. Its function is to perform the down-sampling operation for the input image. Through feature extraction by multiple convolution operations, the encoder gradually obtains the deep image features. Meanwhile, max pooling operations are adopted to compress the extracted features, which can simplify model parameters and improve training efficiency.
Many studies have shown that feature extraction ability affects the performance of CNNs to a great extent [49]. VGGNet is a CNN with strong feature extraction ability, proposed by scholars at Oxford University in 2014 [50]. In VGG-19, the convolution kernel depth increases from 64 to 512, and this improves the extraction of the image feature vector. Therefore, the encoder of the CrackRecNet model uses the main structure of VGG-19. Because most model parameters of VGG-19 originate from the last 3 fully connected layers (FCLs), these FCLs are removed to improve the model training speed. Furthermore, the rectified linear unit (ReLU) [51] is adopted as the activation function. The decoder structure is shown on the right of Fig.1.
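To make the encoder description more concrete, the sketch below traces how channel depth and feature-map size would evolve for a 448 × 448 input through the five convolutional blocks of the standard VGG-19 configuration. It assumes 3 × 3 convolutions with padding 1 and 2 × 2 max pooling after every block; the actual layer layout of CrackRecNet is given in Tab.1 and may differ, and the names `VGG19_BLOCKS` and `trace_encoder` are illustrative only.

```python
# Standard VGG-19 convolutional configuration: (number of conv layers, channels)
# per block. The three fully connected layers are omitted, as described in the text.
VGG19_BLOCKS = [(2, 64), (2, 128), (4, 256), (4, 512), (4, 512)]


def trace_encoder(input_size=448, blocks=VGG19_BLOCKS):
    """Return (channels, spatial size) after each block's pooling step.

    Assumption: padded 3x3 convolutions keep the spatial size, and every
    block ends with a 2x2 max pooling that halves height and width.
    """
    size = input_size
    stages = []
    for _num_convs, channels in blocks:
        size //= 2  # effect of 2x2 max pooling
        stages.append((channels, size))
    return stages


stages = trace_encoder()
for channels, size in stages:
    print(f"{channels:>3} channels, {size} x {size}")
```

Under these assumptions the spatial size shrinks by a factor of two per block while the channel depth grows from 64 to 512, which is the trade-off the encoder relies on.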
2.3 Normalization layer
Normalization layers can scale widely distributed and scattered training data to a smaller range, which reduces the influence of extreme samples (also called outliers) in the data set on the prediction results and thereby improves the model training effect [52]. Batch normalization (BatchNorm or BN) is an effective component in DL [53]. However, in the calculation process, each image (RGB with three channels) is represented as three large two-dimensional matrices, which occupy a large portion of system memory and therefore limit the batch size. The Group Normalization (GroupNorm or GN) method performs better than the BN method when the batch size is small [54]. Therefore, GN is adopted in each convolution block. Fig.2 shows the composition of each convolution block in the encoder.
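The reason GN is insensitive to batch size is that its statistics are computed within each sample, over groups of channels, rather than across the batch. The following is a minimal pure-Python sketch of that computation for a single sample; it omits the learnable scale and shift parameters, and the function name `group_norm` and the toy data are illustrative assumptions rather than the paper's implementation.

```python
import math


def group_norm(channels, num_groups, eps=1e-5):
    """Normalize one sample given as a list of channels (flat pixel lists).

    Channels are divided into `num_groups` consecutive groups; each group is
    normalized with its own mean and variance, independent of other samples.
    """
    if len(channels) % num_groups != 0:
        raise ValueError("channel count must be divisible by num_groups")
    per_group = len(channels) // num_groups
    out = []
    for g in range(num_groups):
        group = channels[g * per_group:(g + 1) * per_group]
        values = [v for ch in group for v in ch]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        scale = 1.0 / math.sqrt(var + eps)
        out.extend([[(v - mean) * scale for v in ch] for ch in group])
    return out


# Four channels with two pixels each, split into two groups of two channels.
sample = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
normalized = group_norm(sample, num_groups=2)
```

Because no batch dimension appears anywhere in the computation, the result for a sample is the same whether the batch holds 4 images or 64.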
2.4 Copy and concatenate
Due to the irreversibility of the convolution operation, data information loss is inevitable in the upsampling process [55]. Multi-level feature fusion is a good solution to this problem [20]. Therefore, the CrackRecNet model adopts the copy and concatenate operation. Fig.3 shows the schematic diagram of copy and concatenate. This operation copies the deep image information extracted in the downsampling process and combines it with the shallow information obtained from the transpose convolution layer so as to improve the semantic segmentation accuracy.
2.5 Output layer and loss function
The output layer of CrackRecNet adopts a 1 × 1 convolution kernel, which can accept an input image of arbitrary size, providing great convenience for the subsequent testing and application of the model. Additionally, the cross-entropy (CE) loss function [56,57] is selected for model training, and mini-batch training is adopted. The corresponding CE loss function is defined in terms of the following quantities:
n is the pixel number of the input image, b is the size of the mini-batch (i.e., the batch size), y is the real value of a pixel point, and ŷ is the predicted value of the pixel point corresponding to y.
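For reference, a common binary form of the mini-batch CE loss that is consistent with the quantities defined above can be written as follows, with ŷ denoting the predicted value for the pixel whose real value is y. The averaging over n and b and the indices i (pixel) and j (sample) are assumptions; the paper's own expression is not reproduced in this text and may be normalized differently.

```latex
L_{\mathrm{CE}} = -\frac{1}{b\,n} \sum_{j=1}^{b} \sum_{i=1}^{n}
  \left[ y_{ij} \log \hat{y}_{ij} + \left(1 - y_{ij}\right) \log\left(1 - \hat{y}_{ij}\right) \right]
```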
3 Data set preparation and preprocessing
3.1 Project description
The DXL tunnel is a highway tunnel in western China, with a maximum buried depth of 820 m and a total length of 4789 m. About 94% (4489 m) of the tunnel is the TBM construction section, with an excavation diameter of 9.13 m. TBM tunnelling started on May 2, 2016, and reached the scheduled end location on August 26, 2017. The rock mass class of the site is mainly III, with relatively small amounts of class IV and class V in some areas. In the TBM construction section, each ring of tunnel lining is assembled from multiple precast concrete segments, and rings are connected in a staggered way [58]. Fig.4 shows photographs of lining segments at the project site. The thickness and length of each lining segment are 35 cm and 1.8 m, respectively.
3.2 Data set acquisition of lining segment cracks
In concrete structures, various forms of defects, such as cracking, spalling, dislocation, exposed rebar, and abrasion, may occur [3,20]. Cracking is the main form of defect in the DXL tunnel. Fig.5 shows the cracking defects and noise interference in tunnel lining segments. In addition to cracks, several sources of noise interference are present on the surface of lining segments, including bolt holes, joints, segment marks, artificial marks, and grout holes. The resulting noise interference will affect the prediction outcome of classifiers.
Currently, there are many studies and open-source data sets on the crack semantic segmentation of concrete structures, most of which concern roads and bridges [20,59]. However, there are relatively few open-source data sets about precast lining segment cracks in tunnels. Therefore, to make the model learn more accurately about applicable image features, a targeted image data set of tunnel lining segment cracks should be established.
In this study, images of lining segment cracks were collected in the DXL tunnel. To realize the image acquisition and non-contact crack width measurement, image acquisition equipment was designed based on a camera (Nikon D90 with 18–105 mm focal length), 3DP bracket and two laser rangefinders. Fig.6 shows the image acquisition equipment and image acquisition photograph at the tunnel project site. The detailed non-contact measurement principle can be seen in Subsection 5.1. As the DXL tunnel had not been opened to traffic and no lamp had been installed inside, the light in the tunnel was very dim. Therefore, the designed image acquisition equipment, camera tripod, and flashlight were used to complete the on-site image acquisition of lining segment cracks. At the same crack location, 2–3 images were taken with different shooting angles. After eliminating blurred and repeated images, a total of 580 original crack images and 580 original non-crack images were collected, each with a size of 4288 × 2848 pixels. The images contained various noise interference features common at the project site. Finally, the TCSC image data set was established. In addition to the lining segment crack images of the DXL tunnel, several crack images of dams, galleries, and roads were also collected in the TCSC data set to further expand the application of the data set in subsequent research.
3.3 Data preprocessing
The pixel resolution of the acquired images was 4288 × 2848, which is relatively large. If original images of this size were directly input into the CNN model, system memory overflow could easily occur. Therefore, the original image sizes were reduced before model training. The large initial images were first downscaled to a size of 1372 × 911 by reducing the pixel resolution, and were then split into images with a size of 448 × 448 by cropping.
A sufficient number of samples is important for model training [60]. The tunnel lining segments with crack defects were relatively few. Data augmentation is an effective procedure to increase the diversity and quantity of samples, and can form a more comprehensive set of possible data points [61]. Herein, the operations of brightness adjustment, flipping, and splitting were used for data augmentation. Fig.7 shows the data preprocessing process of the lining segment crack images. First, the brightness and pixel resolutions of the original images were adjusted. Second, more images were obtained by flipping horizontally or vertically. Third, several sliding windows with a pixel resolution of 448 × 448 were used to split the images. The crack pixel number in a sliding window determines the label of the split image: a window that includes crack pixels yields a split image that contains cracks; otherwise, the split image does not contain cracks. Fig.8 shows an example of data preprocessing for a lining segment crack image. Based on the 1160 collected original images, 2880 preprocessed images with a size of 448 × 448 were obtained by the operations of brightness adjustment, horizontal and vertical flipping, and sliding windows. The ratio of images with cracks to images without cracks was 1:1.
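The sliding-window split and the crack/no-crack labeling can be sketched as follows. The stride, the rule that shifts the last window back to the image border, and the `min_crack_pixels` threshold are all assumptions made for illustration; the paper does not state the window placement or the exact pixel threshold in this text.

```python
def window_starts(length, window=448, stride=448):
    """Start offsets of sliding windows covering `length` pixels.

    Assumption: windows advance by `stride`, and one extra window is placed
    flush with the far border if the regular grid does not reach it.
    """
    if length < window:
        raise ValueError("image is smaller than the window")
    starts = list(range(0, length - window + 1, stride))
    if starts[-1] != length - window:
        starts.append(length - window)
    return starts


def split_mask(mask, window=448, stride=448, min_crack_pixels=1):
    """Return (row, col, has_crack) for each window of a binary mask.

    `mask` is a list of rows; non-zero entries are crack pixels.
    `min_crack_pixels` is a hypothetical threshold on the crack pixel number.
    """
    height, width = len(mask), len(mask[0])
    labels = []
    for r in window_starts(height, window, stride):
        for c in window_starts(width, window, stride):
            crack_pixels = sum(
                1 for row in mask[r:r + window] for v in row[c:c + window] if v
            )
            labels.append((r, c, crack_pixels >= min_crack_pixels))
    return labels


# Window offsets for a 1372 x 911 image (width, height) under these assumptions.
cols = window_starts(1372)
rows = window_starts(911)

# Tiny demonstration mask: 4 x 6 pixels with a single crack pixel.
demo = [[0] * 6 for _ in range(4)]
demo[0][5] = 255
demo_labels = split_mask(demo, window=2, stride=2)
```

In the demonstration only the window starting at row 0, column 4 is labeled as containing a crack; the other five windows are labeled as background.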
3.4 Image annotation
Image annotation is an important task in the application of DL models [62]. For a semantic segmentation task, pixel-level annotation is needed, and the annotation accuracy affects the learning effect of image features. Fig.9 shows the tunnel lining crack image annotation process. Using LabelMe, the annotated RGB image with three channels was first obtained. Then, the RGB images were converted to binary images (i.e., grayscale images with only two gray values) by retaining one channel and setting the gray value of annotated crack pixels to 255 (white) and the gray value of annotated background pixels to 0 (black). Fig.10 shows several examples of the image annotation.
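The conversion from an annotated RGB image to a binary mask can be illustrated with a few lines of Python. The sketch assumes that crack pixels have a non-zero value in the retained channel and that background pixels are black; the color (128, 0, 0) used in the toy input is an assumed label color, and `to_binary_mask` is a hypothetical helper name.

```python
def to_binary_mask(rgb_rows, channel=0):
    """Keep one channel of an annotated RGB image and binarize it to 0/255.

    Assumption: annotated crack pixels are non-zero in `channel`, and
    background pixels are (0, 0, 0).
    """
    return [[255 if pixel[channel] > 0 else 0 for pixel in row] for row in rgb_rows]


# 2 x 2 toy annotation: two crack pixels on a black background.
annotated = [
    [(0, 0, 0), (128, 0, 0)],
    [(128, 0, 0), (0, 0, 0)],
]
mask = to_binary_mask(annotated)
print(mask)  # [[0, 255], [255, 0]]
```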
4 Experiments and results
4.1 Data set partition and model implementation
According to Section 3, after the preprocessing of the original images, 2880 tunnel lining segment images with the same size of 448 × 448 were finally obtained. Subsequently, through random sampling, the preprocessed images were divided into training, validation, and test sets with the ratio of 6:1:1. Additionally, the proportion of images with cracks and without cracks was 1:1.
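A 6:1:1 random partition of 2880 samples gives 2160 training, 360 validation, and 360 test samples. The sketch below shows one way to perform such a split with a fixed seed for reproducibility; the seed and the function name `split_dataset` are assumptions, and this plain shuffle does not by itself enforce the 1:1 crack/no-crack proportion within each subset.

```python
import random


def split_dataset(items, ratios=(6, 1, 1), seed=0):
    """Shuffle `items` and split them into train/validation/test subsets."""
    items = list(items)
    random.Random(seed).shuffle(items)  # seeded for reproducibility
    total = sum(ratios)
    n_train = len(items) * ratios[0] // total
    n_val = len(items) * ratios[1] // total
    train_set = items[:n_train]
    val_set = items[n_train:n_train + n_val]
    test_set = items[n_train + n_val:]
    return train_set, val_set, test_set


train_set, val_set, test_set = split_dataset(range(2880))
print(len(train_set), len(val_set), len(test_set))  # 2160 360 360
```

To keep the class proportion fixed in every subset, the same function could be applied separately to the crack and non-crack image lists and the results merged.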
Tab.2 lists the computing environment of CrackRecNet in the experiment. The operating system was Windows 10 Pro for Workstations. The DL framework was PyTorch, which is popular in the CV field. The corresponding program was written in Python using several PyTorch packages.
The Adam optimizer was adopted to update the CrackRecNet parameters in the training process. By continuously adjusting the hyperparameters and verifying the prediction effect on the validation set, the optimal hyperparameters of the CrackRecNet model were finally obtained, as shown in Tab.3. The learning rate was set to 0.1 with an attenuation coefficient of 0.1. The batch size was set to 16, and the number of images allocated to each graphics processing unit (GPU) in each training step was 4. To control model complexity and reduce the risk of over-fitting, L2 regularization [63] was used in CrackRecNet with a weight decay of 10⁻⁴.
4.2 Accuracy evaluation metrics
In the ML field, the performance of classifiers is usually evaluated by a confusion matrix (CM) [64], as shown in Tab.4. Using basic CM, several metrics were proposed for accurate evaluation [65,66], including recall (Recall), precision (Precision), F1-score (F1), pixel accuracy (PA), Dice coefficient (Dice), and intersection over union (IoU). The calculation formulas of the above metrics are as follows:
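The metric formulas are not reproduced in this text, so the sketch below uses the standard confusion-matrix definitions: Recall = TP/(TP+FN), Precision = TP/(TP+FP), F1 = 2·Precision·Recall/(Precision+Recall), PA = (TP+TN)/(TP+TN+FP+FN), Dice = 2TP/(2TP+FP+FN), and IoU = TP/(TP+FP+FN). These are assumed to match the paper's intent but may differ in how values are aggregated; in particular, for pooled counts Dice and F1 coincide, whereas Tab.6 reports distinct values, which suggests a different aggregation (for example, per-image averaging) that is not specified here.

```python
def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN for two binary masks given as lists of rows."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def _ratio(num, den):
    """Safe division that returns 0.0 for an empty denominator."""
    return num / den if den else 0.0


def segmentation_metrics(tp, fp, fn, tn):
    """Standard pixel-level metrics computed from confusion-matrix counts."""
    recall = _ratio(tp, tp + fn)
    precision = _ratio(tp, tp + fp)
    return {
        "Recall": recall,
        "Precision": precision,
        "F1": _ratio(2 * precision * recall, precision + recall),
        "PA": _ratio(tp + tn, tp + fp + fn + tn),
        "Dice": _ratio(2 * tp, 2 * tp + fp + fn),
        "IoU": _ratio(tp, tp + fp + fn),
    }


# Toy 2 x 4 prediction and ground truth (1 = crack, 0 = background).
pred = [[1, 1, 0, 0], [0, 1, 0, 0]]
truth = [[1, 0, 0, 0], [0, 1, 1, 0]]
counts = confusion_counts(pred, truth)
metrics = segmentation_metrics(*counts)
```

For the toy masks the counts are TP = 2, FP = 1, FN = 1, TN = 4, giving PA = 0.75 and IoU = 0.5.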
4.3 Experiment results
The proposed CrackRecNet was trained on the 2160 preprocessed training samples. During model training, the hyperparameters were tuned based on the 360 validation samples. Subsequently, the optimal model hyperparameter combination was determined. Fig.11 shows the variation of loss with training epochs. After about 22 training epochs, the training loss and validation loss converged to a small value. After model training, the optimal model was selected and tested with the test set.
The semantic segmentation results of several images in the test set, with cracks and without cracks, are shown in Fig.12 and Fig.13, respectively. In the figures, several samples of the input images and corresponding prediction masks under different noise interference and shooting conditions are shown. The experimental results showed the following.
1) CrackRecNet demonstrated good semantic segmentation performance for tunnel lining cracks. The prediction masks under different light conditions, shooting angle, and shooting distance had good consistency with the actual cracks, indicating that the proposed model had strong generalization and robustness.
2) The CrackRecNet model had strong anti-interference ability for different backgrounds and most noises in tunnel lining images. It could accurately identify cracks from complex backgrounds and realize the effective pixel-level segmentation. For a tunnel lining image without cracks, the CrackRecNet model could accurately identify various noise interferences, including bolt holes, joints, segment marks, artificial marks, and grout holes, without mistakenly identifying them as cracks.
3) For a crack image with a complete shape and clear visibility, the prediction masks were continuous and accurate. As for a crack image with a broken shape and complex background, the prediction masks could clearly show the fracture condition and distribution of the crack. Additionally, for small and irregular cracks, the prediction mask may have been broken at some positions along the crack path, resulting in discontinuous prediction results. However, the prediction masks could still clearly show the general shape and location of cracks, retaining effectiveness for the detection of lining cracks.
The CrackRecNet model had good mask prediction performance on most test image samples. However, the segmentation effect was still poor on some image samples. Fig.14 shows several output examples with poor crack segmentation. As shown in Fig.14(a), the CrackRecNet model may have incorrectly identified residual construction concrete on the lining surface as cracks, indicating that the model was still affected by some types of noise. Fig.14(b) shows that the CrackRecNet model could not completely identify all pixels of the crack in blurred images. As shown in Fig.14(c), the CrackRecNet model may have incorrectly identified the shaded part of a bulge on the lining surface as a crack. Additionally, as shown in Fig.14(d), for some images with cracks, the prediction mask was incomplete and could not completely cover the cracks. In addition, broken joints between the precast segments were also incorrectly identified as cracks. These false identifications can be addressed by collecting and adding more accurately annotated images to the current data set or by further optimizing the proposed CrackRecNet model.
4.4 Analysis of graphics processing unit parallel acceleration effect
In terms of specific applications, DL technology involves a dense stack of matrix operations, and its computational efficiency is limited by computer hardware [67]. Even a small-scale network sometimes needs a long training process. Thus, training a network with a complex architecture on a big data set is time-consuming. A GPU is a special processor for image creation and processing on computers, workstations, and other devices. Compared with CPU, GPU (especially multi-GPU) is outstanding in parallel computing and can greatly improve the training and testing efficiencies of deep neural networks [68]. In this study, four NVIDIA TITAN V GPUs (multi-GPU) were used for parallel training and testing of image data. Additionally, as a comparison, the processing speed of CPU and single GPU were tested. Tab.5 shows the training time cost under different processor configurations.
During CrackRecNet training, the speedup ratio of single-GPU relative to CPU is 5.85, and the processing time is reduced from 1.55 to 0.226 s per image. The speedup ratio of multi-GPU relative to CPU is 14.50, which shows that multi-GPU parallel operation can effectively reduce the training time. It should be noted that the processing speed with four GPUs is about 2.26 times that with a single GPU, rather than 4 times. The reason is that distributing the training samples to each GPU takes additional time.
4.5 Comparison with other models
In addition to CrackRecNet, three other models, i.e., the classic U-Net model and two very recent crack semantic segmentation models, TernausNet [47] and ResU-Net [48], were applied for comparison. The three models were trained, and their hyperparameters were set the same as the proposed CrackRecNet. Also, the models were tested with the same data set as CrackRecNet. Tab.6 lists the evaluation metric statistics of four established models.
The following can be concluded from Tab.6.
1) Compared with the classic U-Net model, the three improved models (i.e., TernausNet, ResU-Net, and the proposed CrackRecNet) all had significantly better crack segmentation performance. The value of each evaluation metric of U-Net was less than that of the other three improved models. This indicates that adding residual units or integrating the module with strong feature extraction ability into the encoding path are both helpful for improving the crack segmentation performance.
2) The PA values of TernausNet, ResU-Net, and CrackRecNet were basically the same (98.68%, 98.67%, and 98.66%, respectively). The hierarchical relationship of the Recall value was: TernausNet > ResU-Net > CrackRecNet. For the relationship of Precision value the hierarchy was: TernausNet < ResU-Net < CrackRecNet. The F1 values of TernausNet, ResU-Net, and CrackRecNet were relatively close, at 72.65%, 72.58% and 73.01%, respectively. The above results show that the three improved U-Net models all had strong crack recognition ability, which could effectively distinguish cracks from different sources of noise interference.
3) Dice and IoU are commonly used metrics for semantic segmentation tasks, and reflect the pixel-level recognition accuracy of crack location. The higher the values of the two metrics, the more accurate the pixel position recognition of cracks. In this study, the Dice values for TernausNet, ResU-Net, and CrackRecNet were 68.51%, 68.09%, and 69.85%, respectively. The IoU values for TernausNet, ResU-Net, and CrackRecNet were 56.71%, 56.45%, and 58.21%, respectively. The IoU and Dice of CrackRecNet were higher than those of TernausNet and ResU-Net, indicating that CrackRecNet was more sensitive to crack pixel position and achieved better pixel-level segmentation of tunnel lining cracks.
It should be noted that, because each model may reach a local optimum during training, the prediction performance of the models may not be fairly compared based only on the trials and evaluation metrics shown in Tab.6. More crack images should be acquired for comparing model performance in future work.
CrackRecNet and TernausNet are both improved variants of U-Net. Therefore, the classic U-Net can be regarded as the baseline model for comparison. Fig.15 shows the comparison of several test samples using U-Net, TernausNet, and CrackRecNet.
As shown in Fig.15(a), part of the edge of a grout hole was incorrectly identified as a crack by U-Net, whereas TernausNet and the proposed CrackRecNet showed more accurate crack segmentation. For small cracks in a slightly blurred image (Fig.15(b)) or a dim image (Fig.15(c)), the performance of CrackRecNet was slightly better than that of TernausNet and significantly better than that of the baseline model. In images with crack-like interference (Fig.15(d)–Fig.15(g)), the proposed CrackRecNet showed strong feature extraction and anti-interference ability. As shown in Fig.15(e), all three models mistakenly identified segment joints as cracks to varying degrees, but the proposed CrackRecNet misjudged only a small region. It can be seen from Fig.15 that the disturbances influenced the prediction accuracy to some extent. The classic U-Net had the most incorrectly predicted pixels, followed by TernausNet, and the proposed CrackRecNet had the fewest. The above analysis shows that, among the compared models, the proposed CrackRecNet had the highest accuracy in feature extraction and crack segmentation. Overall, the proposed CrackRecNet was more broadly effective on various crack defects in the presence of complex backgrounds.
5 Discussion
5.1 Discussion on quantification of crack width
The quantification of the crack size is important for evaluation of the concrete structure risk [69]. Maximum allowable crack width is specified in various technical standards in China for concrete structures [70]. First, the binary segmentation image is obtained by CrackRecNet. Subsequently, the medial axis transform algorithm [71] is used to extract crack skeleton with a single pixel width. The crack pixel width is obtained by integrating in the direction perpendicular to the skeleton, from which the maximum value is selected as the maximum pixel width. Furthermore, the one-side Hausdorff distance method is used to remove the burr noise [72]. Fig.16 provides a schematic diagram of the crack skeleton. Fig.17 shows an example of the crack skeleton extraction process of a lining segment crack image.
After the crack skeleton is extracted, the maximum pixel width is taken as the largest value of dl over all skeleton points, where dl is the pixel width measured perpendicular to the direction of the skeleton and (x, y) are the coordinates of the pixel points.
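The width measurement described above can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: instead of extracting a skeleton and integrating perpendicular to it, it uses a brute-force distance transform, relying on the fact that the largest inscribed disc of a shape is centred on its medial axis. The function name `max_pixel_width` and the list-of-lists mask format are assumptions made for illustration, and the estimate should be expected to be accurate only to about one pixel.

```python
import math


def max_pixel_width(mask):
    """Approximate the maximum crack width, in pixels, of a binary mask.

    mask: list of rows, each a list of 0/1 values (1 = crack pixel).
    Stand-in technique (assumption): largest inscribed disc via a
    brute-force Euclidean distance transform, not the skeleton-based
    integration described in the text.
    """
    crack = [(r, c) for r, row in enumerate(mask)
             for c, value in enumerate(row) if value]
    background = [(r, c) for r, row in enumerate(mask)
                  for c, value in enumerate(row) if not value]
    if not crack:
        return 0.0  # no crack pixels, so no width to report
    if not background:
        raise ValueError("mask contains no background pixels")

    # For every crack pixel, find the distance to the nearest background
    # pixel, then keep the largest of these distances. That pixel lies on
    # the medial axis at the widest point of the crack.
    d_max = max(
        min(math.hypot(r - br, c - bc) for br, bc in background)
        for r, c in crack
    )
    # A centre-to-centre distance of d_max to the nearest background pixel
    # corresponds to (2 * d_max - 1) crack pixels across the widest section.
    return 2.0 * d_max - 1.0


# Hypothetical example: a horizontal band three pixels thick.
band = [[1 if 2 <= r <= 4 else 0 for _ in range(9)] for r in range(7)]
print(max_pixel_width(band))
```

The brute-force search is quadratic in the number of pixels, so for 448 × 448 samples a library distance transform or medial axis routine (for example `skimage.morphology.medial_axis` with `return_distance=True`) would be the practical choice.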
Fig.18 shows a schematic diagram of the principle for calculating the maximum actual crack width. The maximum actual crack width is converted from the maximum pixel width using the camera parameters and the object distance, which is obtained indirectly from two laser rangefinders.
The quantities involved in the conversion are as follows: k is the scaling factor, which represents the actual size of a unit pixel; u is the object distance of the camera, determined from the readings of the two laser rangefinders; s is the distance between the laser rangefinder reference points (see A1 and B1 in Fig.18) and the lens plane; v is the image distance; f is the focal distance; d is the sensor size (transverse or longitudinal); and p is the camera resolution (horizontal or vertical). The maximum actual width of the crack is the maximum pixel width multiplied by k.
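The conversion from pixel width to actual width can be sketched as follows. This is a reconstruction under stated assumptions rather than the paper's exact formulas: the object distance is assumed to be the mean of the two rangefinder readings plus the offset s (the sign of s depends on where the reference points sit relative to the lens plane), the thin-lens relation 1/f = 1/u + 1/v is assumed to link u, v and f, and the names `l1`, `l2`, `pixel_scale` and `actual_width` are hypothetical.

```python
def pixel_scale(l1, l2, s, f, d, p):
    """Return k, the actual size of one pixel, in the length unit of the inputs.

    l1, l2: readings of the two laser rangefinders (hypothetical names)
    s: distance between rangefinder reference points and the lens plane
    f: focal distance
    d: sensor size along one axis
    p: camera resolution (pixel count) along the same axis as d
    """
    # ASSUMPTION: object distance = mean reading + offset to the lens plane.
    u = 0.5 * (l1 + l2) + s
    if u <= f:
        raise ValueError("object distance must exceed the focal distance")
    # ASSUMPTION: thin-lens equation 1/f = 1/u + 1/v, solved for v.
    v = u * f / (u - f)
    # Pixel pitch on the sensor (d / p) divided by the magnification (v / u).
    return (d / p) * (u / v)


def actual_width(pixel_width, k):
    """Scale a width measured in pixels to an actual width."""
    return k * pixel_width


# Hypothetical numbers in millimetres, not taken from the paper.
k = pixel_scale(l1=1000.0, l2=1000.0, s=50.0, f=50.0, d=36.0, p=6000)
print(k, actual_width(5, k))
```

Averaging the two readings only approximates the object distance at the crack when the lens plane is close to parallel with the lining surface.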
Herein, 15 lining segment crack instances at the project site were used to study the effectiveness of the above width quantification method. Fig.19 shows the calculated width value, actual width value, and absolute relative error of 15 lining segment crack instances. The calculated width values of the 15 instances are consistent with the actual width values. The absolute relative errors (AREs) between the calculated and actual widths are all less than 10%, and the mean ARE is approximately 6.91%, which is acceptable for the crack defect detection of tunnel projects. The analysis results above show that the proposed crack width quantification method is effective.
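The error measure used above can be computed as in the short sketch below. The function names are hypothetical and the sample values are made up for illustration; they are not the 15 measurements reported in Fig.19.

```python
def absolute_relative_errors(calculated, actual):
    """Return the absolute relative error (in percent) for each crack instance."""
    if len(calculated) != len(actual):
        raise ValueError("both sequences must have the same length")
    return [abs(c - a) / a * 100.0 for c, a in zip(calculated, actual)]


def mean_are(calculated, actual):
    """Return the mean absolute relative error in percent."""
    errors = absolute_relative_errors(calculated, actual)
    return sum(errors) / len(errors)


# Hypothetical widths in millimetres (calculated vs. measured on site).
calculated_widths = [1.1, 1.8]
actual_widths = [1.0, 2.0]
print(absolute_relative_errors(calculated_widths, actual_widths))
print(mean_are(calculated_widths, actual_widths))
```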
The method described above requires the lens plane to be as parallel as possible to the lining surface at the crack position. The accuracy of the object distance measurement also affects the crack width quantification performance. Moreover, the length direction of the crack to be measured should be approximately parallel to the line between the two laser points (see A2 and B2 in Fig.18). In addition to crack width, crack length is a key parameter reflecting the crack risk level. The prediction accuracy of crack length is more sensitive to an inappropriate shooting angle than that of crack width.
5.2 Limitations and future work
In this study, a TCSC image data set was constructed and an encoder–decoder network, CrackRecNet, was established. The network performs well on the semantic segmentation of tunnel lining cracks. The limitations and future work are as follows.
1) Further improvement of CrackRecNet. Although the proposed CrackRecNet shows good crack segmentation accuracy, prediction errors still appear in some images with complex noise. Pixel-level imbalance is also a limiting factor for prediction performance, and improving the loss function is a feasible way to overcome this problem. Future research can improve the model in terms of network structure, regularization method, hyper-parameter optimization, loss function, and activation function.
2) Further expansion and enrichment of the image data set. In this study, the image data set was collected from only one tunnel project, which is insufficient for widespread application. In future work, tunnel lining crack images from more project types, such as highways, hydraulic engineering, mines, and subway tunnels, can be collected to increase the diversity of the data set.
3) Design and development of automatic measuring equipment and systems for tunnel lining defects. Unmanned defect detection using robots and other devices is safer and more efficient, and will become a future trend.
4) Transfer learning on small specialized data sets. A representative and specialized data set helps to achieve better segmentation results. However, when the specialized data set is small, overfitting easily occurs. Transfer learning is a strategy for achieving good segmentation accuracy with a small data set [40]. As mentioned in Subsection 3.2, besides lining segment crack images, the TCSC data set also includes crack images of dams, galleries, and roads, and it will be updated continuously. Based on the established TCSC data set, the effectiveness of transfer learning should be further investigated.
5) Study of the influence of coupled defects (such as crack and spalling, crack and water leakage, crack and corrosion) on the model prediction performance. Several studies [13,73] on tunnel surface defect detection have shown that coupled damage (e.g., leakage and crack, leakage and spalling) influences the model prediction accuracy to some extent. However, in this study, the concrete defects produced in the tunnel construction process were mainly cracks, with almost no other defects. Therefore, research on the impact of coupled defects will be conducted in future work.
6 Conclusions
This study proposes a novel CrackRecNet for crack segmentation of tunnel lining segments. Through data preprocessing and annotation, 2880 lining segment crack image samples were prepared for model training and testing. In the experiment, the F1, PA, IoU, and Dice of the proposed CrackRecNet were 73.01%, 98.66%, 58.21%, and 69.85%, respectively, showing better accuracy than three other semantic segmentation models, namely U-Net, TernausNet, and ResU-Net. Additionally, the GPU parallel acceleration effect, the quantification of maximum crack width, the limitations, and the future work are discussed. The specific contributions are as follows.
1) The CrackRecNet model integrates the improved VGG-19 into the encoding path of U-Net, effectively enhancing the ability to extract image features under complex backgrounds.
2) To realize image acquisition and non-contact crack width measurement, image acquisition equipment was designed based on a camera, a 3DP bracket, and two laser rangefinders that indirectly determine the object distance.
3) In view of the lack of open-source data sets in tunnel lining crack detection, a specialized TCSC image data set of lining segment cracks was collected and constructed.
References
[1]
Lei M F, Liu L H, Shi C H, Tan Y, Lin Y X, Wang W D. A novel tunnel-lining crack recognition system based on digital image technology. Tunnelling and Underground Space Technology, 2021, 108: 103724
[2]
Yang Y, Wang L F, Zhang Y F, Han X J. Multi-feature fusion based classification algorithm of surface disease image of concrete structure. Journal of Chang’an University: Natural Science Edition, 2021, 41(3): 64–74 (in Chinese)
[3]
Zhao S, Shadabfar M, Zhang D M, Chen J Y, Huang H W. Deep learning-based classification and instance segmentation of leakage-area and scaling images of shield tunnel linings. Structural Control and Health Monitoring, 2021, 28(6): e2732
[4]
Yao Y, Tung S T E, Glisic B. Crack detection and characterization techniques—An overview. Structural Control and Health Monitoring, 2014, 21(12): 1387–1413
[5]
Liu Y R, Hou S K, Li C Y, Zhou H W, Jin F, Qin P X, Yang Q. Study on support time in double-shield TBM tunnel based on self-compacting concrete backfilling material. Tunnelling and Underground Space Technology, 2020, 96: 103212
[6]
Koch C, Georgieva K, Kasireddy V, Akinci B, Fieguth P. A review on computer vision based defect detection and condition assessment of concrete and asphalt civil infrastructure. Advanced Engineering Informatics, 2015, 29(2): 196–210
[7]
Liu J, Yang X, Lau S, Wang X, Luo S, Lee V C S, Ding L. Automated pavement crack detection and segmentation based on two-step convolutional neural network. Computer-Aided Civil and Infrastructure Engineering, 2020, 35(11): 1291–1305
[8]
Valença J, Julio E. MCrack-Dam: the scale-up of a method to assess cracks on concrete dams by image processing. The case study of Itaipu Dam, at the Brazil–Paraguay border. Journal of Civil Structural Health Monitoring, 2018, 8(5): 857–866
[9]
Wang Y, Zhang J Y, Liu J X, Zhang Y, Chen Z P, Li C G, Yan R B. Research on crack detection algorithm of the concrete bridge based on image processing. Procedia Computer Science, 2019, 154: 610–616
[10]
Ying L, Salari E. Beamlet transform-based technique for pavement crack detection and classification. Computer-Aided Civil and Infrastructure Engineering, 2010, 25(8): 572–580
[11]
Abdel-Qader I, Abudayyeh O, Kelly M E. Analysis of edge-detection techniques for crack identification in bridges. Journal of Computing in Civil Engineering, 2003, 17(4): 255–263
[12]
Oliveira H, Correia P L. Automatic road crack segmentation using entropy and image dynamic thresholding. In: Proceedings of the 17th European Signal Processing Conference. Glasgow: IEEE, 2009: 622–626
[13]
Xu Y, Li D, Xie Q, Wu Q, Wang J. Automatic defect detection and segmentation of tunnel surface using modified Mask R-CNN. Measurement, 2021, 178: 109316
[14]
Feng C, Zhang H, Wang H, Wang S, Li Y. Automatic pixel-level crack detection on dam surface using deep convolutional network. Sensors, 2020, 20(7): 2069
[15]
Shi Y, Cui L, Qi Z, Meng F, Chen Z. Automatic road crack detection using random structured forests. IEEE Transactions on Intelligent Transportation Systems, 2016, 17(12): 3434–3445
[16]
Noh Y, Koo D, Kang Y M, Park D, Lee D. Automatic crack detection on concrete images using segmentation via fuzzy C-means clustering. In: Proceedings of the International Conference on Applied System Innovation (ICASI). Sapporo: IEEE, 2017: 877–880
[17]
Li G, Zhao X, Du K, Ru F, Zhang Y. Recognition and evaluation of bridge cracks with modified active contour model and greedy search-based support vector machine. Automation in Construction, 2017, 78: 51–61
[18]
Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Li F F. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 2015, 115(3): 211–252
[19]
Park S, Bang S, Kim H, Kim H. Patch-based crack detection in black box images using convolutional neural networks. Journal of Computing in Civil Engineering, 2019, 33(3): 04019017
[20]
Zhang C, Chang C C, Jamshidi M. Simultaneous pixel-level concrete defect detection and grouping using a fully convolutional model. Structural Health Monitoring, 2021, 20(4): 147592172098543
[21]
Cha Y J, Choi W, Suh G, Mahmoudkhani S, Büyüköztürk O. Autonomous structural visual inspection using region-based deep learning for detecting multiple damage types. Computer-Aided Civil and Infrastructure Engineering, 2018, 33(9): 731–747
[22]
LeCun Y, Bengio Y, Hinton G. Deep learning. Nature, 2015, 521(7553): 436–444
[23]
Chen J, He Y. A novel U-shaped encoder–decoder network with attention mechanism for detection and evaluation of road cracks at pixel level. Computer-Aided Civil and Infrastructure Engineering, 2022, 37(13): 1721–1736
[24]
Zhou Z, Zhang J, Gong C. Hybrid semantic segmentation for tunnel lining cracks based on Swin Transformer and convolutional neural network. Computer-Aided Civil and Infrastructure Engineering, 2023, 38(17): 2491–2510
[25]
Xu Y, Fan Y L, Li H. Lightweight semantic segmentation of complex structural damage recognition for actual bridges. Structural Health Monitoring, 2023, 22(5): 14759217221147015
[26]
Arafin P, Billah A M, Issa A. Deep learning-based concrete defects classification and detection using semantic segmentation. Structural Health Monitoring, 2023: 14759217231168212
[27]
Chen L, Yao H, Fu J, Ng C T. The classification and localization of crack using lightweight convolutional neural network with CBAM. Engineering Structures, 2023, 275: 115291
[28]
Zhang L, Yang F, Zhang Y D, Zhu Y J. Road crack detection using deep convolutional neural network. In: Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP). Phoenix: IEEE, 2016: 3708–3712
[29]
Zou Q, Zhang Z, Li Q, Qi X, Wang Q, Wang S. Deepcrack: Learning hierarchical convolutional features for crack detection. IEEE Transactions on Image Processing, 2018, 28(3): 1498–1512
[30]
Bang S, Park S, Kim H, Kim H. Encoder-decoder network for pixel-level road crack detection in black-box images. Computer-Aided Civil and Infrastructure Engineering, 2019, 34(8): 713–727
[31]
O’Brien D, Osborne J A, Perez-Duenas E, Cunningham R, Li Z L. Automated crack classification for the CERN underground tunnel infrastructure using deep learning. Tunnelling and Underground Space Technology, 2023, 131: 104668
[32]
Dang L M, Wang H X, Li Y F, Park Y, Oh C, Nguyen T N, Moon H. Automatic tunnel lining crack evaluation and measurement using deep learning. Tunnelling and Underground Space Technology, 2022, 124: 104472
[33]
Feng S J, Feng Y, Zhang X L, Chen Y H. Deep learning with visual explanations for leakage defect segmentation of metro shield tunnel. Tunnelling and Underground Space Technology, 2023, 136: 105107
[34]
Xue Y D, Jia F, Cai X Y, Shadabfar M, Huang H W. An optimization strategy to improve the deep learning-based recognition model of leakage in shield tunnels. Computer-Aided Civil and Infrastructure Engineering, 2022, 37(3): 386–402
[35]
Manjunatha P, Masri S F, Nakano A, Wellford L C. CrackDenseLinkNet: A deep convolutional neural network for semantic segmentation of cracks on concrete surface images. Structural Health Monitoring, 2023: 14759217231173305
[36]
Que Y, Dai Y, Ji X, Leung A K, Chen Z, Jiang Z L, Tang Y C. Automatic classification of asphalt pavement cracks using a novel integrated generative adversarial networks and improved VGG model. Engineering Structures, 2023, 277: 115406
[37]
Guo L, Li R, Jiang B, Shen X. Automatic crack distress classification from concrete surface images using a novel deep-width network architecture. Neurocomputing, 2020, 397: 383–392
[38]
Xu H, Su X, Wang Y, Cai H, Cui K, Chen X. Automatic bridge crack detection using a convolutional neural network. Applied Sciences, 2019, 9(14): 2867
[39]
Huyan J, Li W, Tighe S, Xu Z C, Zhai J Z. CrackU-net: A novel deep convolutional neural network for pixelwise pavement crack detection. Structural Control and Health Monitoring, 2020, 27(8): e2551
[40]
Zhang K, Cheng H D, Zhang B. Unified approach to pavement crack and sealed crack detection using preclassification based on transfer learning. Journal of Computing in Civil Engineering, 2018, 32(2): 04018001
[41]
Hou S K, Ou Z G, Qin P X, Wang Y L, Liu Y R. Image-based crack recognition of tunnel lining using residual U-Net convolutional neural network. In: Proceedings of the IOP Conference Series: Earth and Environmental Science. Jakarta: IOP Publishing, 2021: 072001
[42]
Attard L, Debono C J, Valentino G, Di Castro M. Tunnel inspection using photogrammetric techniques and image processing: A review. ISPRS Journal of Photogrammetry and Remote Sensing, 2018, 144: 180–188
[43]
Ai Q, Yuan Y, Bi X L. Acquiring sectional profile of metro tunnels using charge-coupled device cameras. Structure and Infrastructure Engineering, 2016, 12(9): 1065–1075
[44]
Zhao S, Zhang D M, Huang H W. Deep learning-based image instance segmentation for moisture marks of shield tunnel lining. Tunnelling and Underground Space Technology, 2020, 95: 103156
[45]
Ai Q, Yuan Y. Rapid acquisition and identification of structural defects of metro tunnel. Sensors, 2019, 19(19): 4278
[46]
Zhou M L, Cheng W, Huang H W, Chen J Y. A novel approach to automated 3D spalling defects inspection in railway tunnel linings using laser intensity and depth information. Sensors, 2021, 21(17): 5725
[47]
Iglovikov V, Shvets A. TernausNet: U-Net with VGG11 encoder pre-trained on ImageNet for image segmentation. 2018, arXiv:1801.05746
[48]
Benz C, Debus P, Ha H K, Rodehorst V. Crack segmentation on UAS-based imagery using transfer learning. In: Proceedings of the 2019 International Conference on Image and Vision Computing New Zealand (IVCNZ). Dunedin: IEEE, 2019: 1–6
[49]
Jogin M, Madhulika M S, Divya G D, Meghana R K, Apoorva S. Feature extraction using convolution neural networks (CNN) and deep learning. In: Proceedings of the 2018 3rd IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology (RTEICT). Bangalore: IEEE, 2018: 2319–2323
[50]
Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. 2014, arXiv:1409.1556
[51]
Nair V, Hinton G E. Rectified linear units improve restricted Boltzmann machines. In: Proceedings of the 27th International Conference on Machine Learning (ICML-10). Madison, WI: ACM, 2010: 807–814
[52]
Swiderska-Chadaj Z, de Bel T, Blanchet L, Baidoshvili A, Vossen D, van der Laak J, Litjens G. Impact of rescanning and normalization on convolutional neural network performance in multi-center, whole-slide classification of prostate cancer. Scientific Reports, 2020, 10(1): 1–14
[53]
Ioffe S, Szegedy C. Batch normalization: Accelerating deep network training by reducing internal covariate shift. In: Proceedings of the International Conference on Machine Learning. Stockholm: ICML, 2015: 448–456
[54]
Wu Y, He K. Group normalization. In: Proceedings of the European Conference on Computer Vision (ECCV). Munich: Springer, 2018: 3–19
[55]
Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation. In: Proceedings of the 18th International Conference on Medical Image Computing and Computer-Assisted Intervention. Munich: Springer, 2015: 234–241
[56]
Deng L Y. The Cross-Entropy method: a unified approach to combinatorial optimization, Monte-Carlo simulation, and machine learning. Technometrics, 2006, 48(1): 147–148
[57]
Qu Z, Mei J, Liu L, Zhou D Y. Crack detection of concrete pavement with cross-entropy loss function and improved VGG16 network model. IEEE Access, 2020, 8: 54564–54573
[58]
Li C Y, Hou S K, Liu Y R, Qin P X, Yang Q. Analysis on the crown convergence deformation of surrounding rock for double-shield TBM tunnel based on advance borehole monitoring and inversion analysis. Tunnelling and Underground Space Technology, 2020, 103: 103513
[59]
Wang W J, Su C. Semi-supervised semantic segmentation network for surface crack detection. Automation in Construction, 2021, 128: 103786
[60]
Bejani M M, Ghatee M. A systematic review on overfitting control in shallow and deep neural networks. Artificial Intelligence Review, 2021, 54(8): 1–48
[61]
Shorten C, Khoshgoftaar T M. A survey on image data augmentation for deep learning. Journal of Big Data, 2019, 6(1): 1–48
[62]
Yang X B, Chen R, Zhang F Q, Zhang L, Fan X J, Ye Q L, Fu L Y. Pixel-level automatic annotation for forest fire image. Engineering Applications of Artificial Intelligence, 2021, 104: 104353
[63]
Coelho F, Neto J P. A method for regularization of evolutionary polynomial regression. Applied Soft Computing, 2017, 59: 223–228
[64]
Ohsaki M, Wang P, Matsuda K, Katagiri S, Watanabe H, Ralescu A. Confusion-matrix-based kernel logistic regression for imbalanced data classification. IEEE Transactions on Knowledge and Data Engineering, 2017, 29(9): 1806–1819
[65]
Ren Y, Huang J, Hong Z, Lu W, Yin J, Zou L, Shen X. Image-based concrete crack detection in tunnels using deep fully convolutional networks. Construction & Building Materials, 2020, 234: 117367
[66]
Sokolova M, Lapalme G. A systematic analysis of performance measures for classification tasks. Information Processing & Management, 2009, 45(4): 427–437
[67]
Liu Z Q, Wang Y Z, Hua X G, Zhu H P, Zhu Z W. Optimization of wind turbine TMD under real wind distribution countering wake effects using GPU acceleration and machine learning technologies. Journal of Wind Engineering and Industrial Aerodynamics, 2021, 208: 104436
[68]
Dong S, Zhao P, Lin X, Kaeli D. Exploring GPU acceleration of Deep Neural Networks using Block Circulant Matrices. Parallel Computing, 2020, 100: 102701
[69]
Wang J J, Liu Y F, Nie X, Mo Y L. Deep convolutional neural networks for semantic segmentation of cracks. Structural Control and Health Monitoring, 2022, 29(1): e2850
[70]
Rong Y. The research on the key technology of the crack controlling of reinforced concrete lining of undersea tunnel. Dissertation for the Doctoral Degree. Shanghai: Tongji University, 2007 (in Chinese)
[71]
Yang X C, Li H, Yu Y T, Luo X C, Huang T, Yang X. Automatic pixel-level crack detection and measurement using fully convolutional network. Computer-Aided Civil and Infrastructure Engineering, 2018, 33(12): 1090–1109
[72]
Sun F, Choi Y K, Yu Y, Wang W. Medial meshes for volume approximation. 2013, arXiv:1308.3917
[73]
Huang H W, Li Q T, Zhang D M. Deep learning based image recognition for crack and leakage defects of metro shield tunnel. Tunnelling and Underground Space Technology, 2018, 77: 166–176
RIGHTS & PERMISSIONS
Higher Education Press