Abstract
Existing semantic segmentation methods lose pixel features and carry too many parameters, which leads to unsatisfactory segmentation results and an excessive time cost. To address this, this paper proposes a lightweight semantic segmentation algorithm based on the fusion of multiple modules. The algorithm builds on the pyramid scene parsing network (PSPNet). First, MobileNetV2 is chosen as the feature extraction network to construct a lightweight network structure. During training, a freeze-and-thaw method is used, and the Focal Loss function is added to balance the proportion of positive and negative samples. Next, spatial and channel reconstruction convolution (SCConv) is introduced into the pyramid pooling module to reduce the computational cost of redundant feature extraction in the segmentation task. Finally, coordinate attention (CA) and the efficient channel attention network (ECA-Net) are incorporated so that the modules work together to enhance salient features and improve segmentation accuracy. In ablation and comparison experiments, the mean pixel accuracy on the PASCAL VOC 2012 dataset reaches 85.23%, the computation amount is reduced by 45.79%, and the training speed is improved by 68.69%. On the Cityscapes dataset, the mean pixel accuracy reaches 86.75% and the mean intersection over union (mIoU) reaches 73.86%. The interaction of the modules improves and optimizes the algorithm, effectively addressing its low segmentation accuracy and slow training speed. The algorithm has a significant accuracy advantage among lightweight models and can generally improve the efficiency of image semantic segmentation.
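As an illustration of how Focal Loss rebalances positive and negative samples, the following is a minimal sketch of the binary form, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function names are hypothetical, and the values gamma = 2 and alpha = 0.25 are the commonly used defaults from the original Focal Loss work. The abstract does not state which settings this paper uses, so they are assumptions here, not the authors' configuration.

```python
import math


def focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Binary focal loss for one predicted probability p and a label y in {0, 1}.

    gamma and alpha are assumed defaults, not values taken from the paper.
    """
    p = min(max(p, eps), 1.0 - eps)            # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability assigned to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha # class-balancing weight
    # (1 - p_t)^gamma shrinks the loss of well-classified (easy) samples
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def mean_focal_loss(probs, labels, gamma=2.0, alpha=0.25):
    """Average focal loss over paired probabilities and labels."""
    losses = [focal_loss(p, y, gamma, alpha) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)
```

With gamma = 0 and alpha = 0.5 the expression reduces to half of the ordinary cross-entropy. Increasing gamma shifts the loss toward hard, misclassified pixels.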
Zhihao Guo, Dongmei Ma, Xiaoyun Luo.
A lightweight semantic segmentation algorithm integrating CA and ECA-Net modules.
Optoelectronics Letters, 2024, 20(9): 568-576 DOI:10.1007/s11801-024-3241-z