Transfer learning-based encoder-decoder model with visual explanations for infrastructure crack segmentation: New open database and comprehensive evaluation

Fangyu Liu; Wenqi Ding; Yafei Qiao; Linbing Wang

doi:10.1016/j.undsp.2023.09.012

Underground Space, 2024, Vol. 17, Issue 4: 60-81. DOI: 10.1016/j.undsp.2023.09.012

Research article

Abstract

Critical infrastructure such as tunnels and pavements requires fast and accurate crack detection. This study proposes a transfer learning-based encoder-decoder method with visual explanations for infrastructure crack segmentation. First, a dataset of 7089 images was developed, covering diverse conditions: simple and complex crack patterns as well as clean and rough backgrounds. Second, using transfer learning, an encoder-decoder model with visual explanations was built, with various pre-trained convolutional neural networks (CNNs) serving as the encoder. Visual explanations were generated through gradient-weighted class activation mapping (Grad-CAM) to interpret the CNN segmentation model. Third, accuracy, complexity (computational and model), and memory usage were used to assess the feasibility of each CNN for practical engineering, and model performance was further examined through predictions and visual explanations. The investigation covered hyperparameters, data augmentation, deep learning from scratch versus transfer learning, segmentation model architectures, segmentation model encoders, and encoder pre-training strategies. The results showed that transfer learning yields higher crack segmentation accuracy than deep learning from scratch. Notably, the classification accuracy of an encoder showed no significant correlation with the segmentation accuracy of the resulting CNN. Among all tested models, UNet-EfficientNet_B7 performed best for crack segmentation, balancing accuracy, complexity, memory usage, prediction, and visual explanation.
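For readers unfamiliar with Grad-CAM, the core combination step can be sketched in a few lines of plain Python. This is a minimal illustration of the general Grad-CAM recipe (average each gradient map to get a channel weight, take the weighted sum of the feature maps, then apply ReLU), not the authors' implementation. The function name `grad_cam` and the toy values are hypothetical. For a segmentation model, the differentiated score is assumed here to be an aggregate of the crack-class logits over the pixels of interest; in practice the activations and gradients would come from a deep learning framework.

```python
def grad_cam(activations, gradients):
    """Combine K feature maps into one heatmap.

    activations, gradients: lists of K maps, each an H x W list of floats.
    gradients[k][i][j] is d(score) / d(activations[k][i][j]).
    """
    height = len(activations[0])
    width = len(activations[0][0])
    # Channel weights: global average of each gradient map.
    weights = [sum(sum(row) for row in g) / (height * width) for g in gradients]
    # Weighted sum of the feature maps.
    heatmap = [[0.0] * width for _ in range(height)]
    for w, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += w * fmap[i][j]
    # ReLU keeps only locations with positive evidence for the score.
    return [[max(0.0, v) for v in row] for row in heatmap]


# Toy inputs (hypothetical values, not from the paper): two 2x2 feature maps.
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],      # channel weight 0.5
    [[-1.0, -1.0], [-1.0, -1.0]],  # channel weight -1.0
]
heatmap = grad_cam(activations, gradients)
print(heatmap)  # [[0.5, 0.0], [0.0, 1.0]]
```

In this toy case the second channel has a negative weight, so the locations where only that channel is active are zeroed out by the ReLU, leaving a heatmap that highlights the first channel's responses.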

Keywords

Crack segmentation / Transfer learning / Visual explanation / Infrastructure / Database

Cite this article

Fangyu Liu, Wenqi Ding, Yafei Qiao, Linbing Wang. Transfer learning-based encoder-decoder model with visual explanations for infrastructure crack segmentation: New open database and comprehensive evaluation. Underground Space, 2024, 17(4): 60-81. DOI: 10.1016/j.undsp.2023.09.012

Declaration of competing interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgement

The authors would like to acknowledge the National Natural Science Foundation of China (Grant Nos. 52090083 and 52378405) and Key Technology R&D Plan of Yunnan Provincial Department of Science and Technology (Grant No. 202303AA080003) for their financial support.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.
