Semantic segmentation of urban street scene images based on improved U-Net network

Fuzhen Zhu; Jingyi Cui; Bing Zhu; Huiling Li; Yan Liu

doi:10.1007/s11801-023-2128-8

Optoelectronics Letters, 2023, Vol. 19, Issue 3: 179-185. DOI: 10.1007/s11801-023-2128-8

Article


Abstract

To balance speed and accuracy in semantic segmentation of urban street images for autonomous driving, we propose an improved U-Net network. First, to improve the model's representation capability, the improved U-Net is divided into three parts: a shallow layer, an intermediate layer and a deep layer. A different attention mechanism is applied to each part according to its feature extraction characteristics: a spatial attention module in the shallow layers, a dual attention module in the intermediate layers and a channel attention module in the deep layers. In addition, traditional convolution is replaced by depthwise separable convolution in all three parts, which greatly reduces the number of network parameters and increases the network's operation speed. Experimental results on three datasets show that the improved U-Net achieves better segmentation accuracy and speed on street images. The average mean intersection over union (MIoU) is 68.8%, an increase of 9.2%, and the computation time is about 38 ms/frame. The model can segment 27 frames per second, which meets the real-time and accuracy requirements for semantic segmentation of urban street images.
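To illustrate why replacing traditional convolution with depthwise separable convolution reduces the parameter count, the following minimal sketch compares the two for a single layer. The layer sizes (64 input channels, 128 output channels, 3×3 kernel) and the function names are assumptions chosen for illustration; they are not taken from the paper's architecture.

```python
def standard_conv_params(c_in, c_out, k, bias=False):
    """Parameters of a standard k x k convolution: every output channel
    has its own k x k filter over every input channel."""
    params = k * k * c_in * c_out
    return params + (c_out if bias else 0)


def depthwise_separable_params(c_in, c_out, k, bias=False):
    """Parameters of a depthwise separable convolution: one k x k filter
    per input channel (depthwise), then a 1 x 1 convolution that mixes
    channels (pointwise)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    extra = (c_in + c_out) if bias else 0
    return depthwise + pointwise + extra


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3
std = standard_conv_params(c_in, c_out, k)
sep = depthwise_separable_params(c_in, c_out, k)
print(f"standard: {std}, separable: {sep}, reduction: {std / sep:.2f}x")
```

For these assumed sizes the standard layer has 73,728 weights and the separable one 8,768, roughly an 8.4-fold reduction. The actual saving in the proposed network depends on its channel widths and kernel sizes, which the abstract does not specify.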

Cite this article


Fuzhen Zhu, Jingyi Cui, Bing Zhu, Huiling Li, Yan Liu. Semantic segmentation of urban street scene images based on improved U-Net network. Optoelectronics Letters, 2023, 19(3): 179-185. DOI: 10.1007/s11801-023-2128-8

