SDFSeg: multiscale perception and deformable feature fusion for coastal ecosystem

Xinjing Wang; Ziying Wu; Yuwen Wang; Haomiao Zhang; Shiyi Han; Ying Gao

doi:10.1007/s44295-025-00074-3

Intelligent Marine Technology and Systems, 2025, Vol. 3, Issue 1. DOI: 10.1007/s44295-025-00074-3

Research Paper

Abstract

Monitoring coastal ecosystems is essential for mitigating pollution, preserving biodiversity, and understanding the impacts of climate change. However, existing approaches, such as fully convolutional networks (FCNs) and Transformer-based models, often struggle with low-class variance, small-target detection, and loss of boundary information. To handle large variations in target scale, we propose SDFSeg, a semantic segmentation framework that integrates three key modules: the scale aware conv, the dynamic deformable sample, and the fusion perceiver. The scale aware conv improves multiscale feature extraction by incorporating convolutional layers with varying dilation rates; the dynamic deformable sample aligns target boundaries precisely, focuses on small features, and samples adaptively to improve small-target detection and boundary segmentation; and the fusion perceiver fuses local and global information. Extensive experiments on benchmark datasets demonstrate that our method achieves superior performance while reducing computational overhead, confirming its practical applicability.
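
The abstract does not specify how the scale aware conv is built. As a rough illustration of the general idea it names, convolutional branches with varying dilation rates, the following pure-Python sketch applies one 3-tap kernel to a 1-D impulse at several dilation rates. The function names, the kernel, and the rates (1, 2, 4) are hypothetical choices for illustration and are not taken from SDFSeg.

```python
# Illustrative sketch only: parallel dilated 1-D convolutions.
# Branch layout, kernel, and dilation rates are assumptions, not SDFSeg's design.


def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1-D cross-correlation with a dilated kernel."""
    k = len(kernel)
    span = (k - 1) * dilation            # distance between first and last tap
    pad = span // 2
    padded = [0.0] * pad + list(signal) + [0.0] * (span - pad)
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def effective_kernel_size(kernel_size, dilation):
    """Extent covered by one dilated kernel: k + (k - 1)(d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def multiscale_features(signal, kernel, dilations=(1, 2, 4)):
    """Run one branch per dilation rate and return the outputs keyed by rate."""
    return {d: dilated_conv1d(signal, kernel, d) for d in dilations}


if __name__ == "__main__":
    impulse = [0.0] * 9
    impulse[4] = 1.0                     # single active sample in the middle
    kernel = [1.0, 1.0, 1.0]
    for d, feat in multiscale_features(impulse, kernel).items():
        print(d, effective_kernel_size(len(kernel), d), feat)
```

With the same three weights, the branch with dilation 1 responds at positions 3 to 5, dilation 2 at positions 2, 4, and 6, and dilation 4 at positions 0, 4, and 8. The covered extent grows from 3 to 5 to 9 samples without adding parameters, which is why varying the dilation rate is a common way to capture context at several scales.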

Keywords

Semantic segmentation / Multiscale feature extraction / Coastal ecosystem monitoring / Boundary segmentation

Cite this article

Xinjing Wang, Ziying Wu, Yuwen Wang, Haomiao Zhang, Shiyi Han, Ying Gao. SDFSeg: multiscale perception and deformable feature fusion for coastal ecosystem. Intelligent Marine Technology and Systems, 2025, 3(1). DOI: 10.1007/s44295-025-00074-3
