Underwater image enhancement based on transformer and structure guidance
Xue Chen, Quanxiang Jiang, Haohao Zhang, Yuting Yang
Intelligent Marine Technology and Systems, 2026, Vol. 4, Issue (1): 5
Underwater images play an increasingly important role in scientific research and in industrial fields such as marine military applications, marine environmental protection, and marine engineering. However, owing to nonuniform lighting conditions, underwater image quality is often degraded by severe color distortion and loss of detail. Although existing traditional underwater image enhancement methods have advanced, they remain limited by scarce and low-quality samples, which makes satisfactory results difficult to achieve. In this study, a novel network based on the structure-guided former is proposed to address the challenges of color correction and illumination enhancement in underwater images. The proposed cross-axial compression transformer block preserves the global modeling capacity of transformers while significantly enhancing local feature extraction. In addition, the introduced structural prior information guides the feature reconstruction process in the decoder and helps correct color casts and enhance high-frequency details. Comparative and ablation experiments on publicly available datasets validate the effectiveness of the proposed method.
Underwater image enhancement / Transformer / Structural-guided reconstruction