Underwater image enhancement based on transformer and structure guidance

Xue Chen; Quanxiang Jiang; Haohao Zhang; Yuting Yang

doi:10.1007/s44295-026-00095-6

Intelligent Marine Technology and Systems, 2026, Vol. 4, Issue 1: 5. DOI: 10.1007/s44295-026-00095-6

Research Paper

Abstract

Underwater images play an increasingly important role in scientific research and in industrial fields such as marine military applications, marine environmental protection, and marine engineering. However, owing to nonuniform lighting conditions, underwater images often suffer from severe color distortion and loss of detail. Although traditional underwater image enhancement methods have advanced, they remain limited by scarce and low-quality samples, which makes satisfactory results difficult to achieve. In this study, a novel network based on a structure-guided transformer is proposed to address the challenges of color correction and illumination enhancement in underwater images. The proposed cross-axial compression transformer block preserves the global modeling capacity of transformers while significantly enhancing local feature extraction. In addition, structural prior information both guides the feature reconstruction process in the decoder and helps correct color casts and enhance high-frequency details. Comparative and ablation experiments on publicly available datasets validate the effectiveness of the proposed method.

Keywords

Underwater image enhancement / Transformer / Structural-guided reconstruction

Cite this article

Xue Chen, Quanxiang Jiang, Haohao Zhang, Yuting Yang. Underwater image enhancement based on transformer and structure guidance. Intelligent Marine Technology and Systems, 2026, 4(1): 5. DOI: 10.1007/s44295-026-00095-6
