Fusion-oriented registration of UAV-acquired RGB–infrared images for maritime target detection and segmentation
Zhenyi Li, Xiaogang Yang, Tianxu Zhao, Shengke Wang
Intelligent Marine Technology and Systems, 2026, Vol. 4, Issue 1: 11
This study introduces a novel target-aware registration framework for infrared and visible images acquired by unmanned aerial vehicles (UAVs), designed to improve feature-matching accuracy and robustness. The proposed method first segments the images to remove irrelevant background regions, retaining only the target objects that require registration and thereby suppressing background interference during matching. The segmentation stage is guided by bounding boxes generated by a target-detection model, which improves the accuracy and stability of the segmentation results. In addition, we propose a new strategy for evaluating infrared–visible registration performance: both the original and registered images are segmented, and the mean intersection over union (mIoU) is computed between the segmented regions and the original bounding boxes. We further incorporate image-fusion metrics from downstream post-registration tasks to provide a more comprehensive assessment of registration quality. Extensive experiments demonstrate that the proposed method outperforms existing approaches in both registration accuracy and stability, providing a robust solution for infrared–visible image alignment.
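The mIoU-based evaluation described above can be sketched as follows. This is a minimal illustration, not the authors' released code: the helper names (`box_to_mask`, `mask_iou`, `registration_miou`) and the assumption that each target contributes one binary segmentation mask paired with one detector box are ours.

```python
import numpy as np

def box_to_mask(box, shape):
    """Rasterize an (x1, y1, x2, y2) bounding box into a binary mask."""
    mask = np.zeros(shape, dtype=bool)
    x1, y1, x2, y2 = box
    mask[y1:y2, x1:x2] = True
    return mask

def mask_iou(a, b):
    """Intersection over union of two binary masks."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def registration_miou(seg_masks, boxes, image_shape):
    """Mean IoU between per-target segmentation masks (e.g., from the
    registered image) and the original detector bounding boxes."""
    ious = [mask_iou(m, box_to_mask(b, image_shape))
            for m, b in zip(seg_masks, boxes)]
    return float(np.mean(ious)) if ious else 0.0
```

A well-registered image pair should yield segmentation masks that still fall inside the original boxes, driving the score toward 1; misalignment shifts the masks out of the boxes and lowers it.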
UAV multimodal registration / Maritime surveillance / Target-aware segmentation / Image fusion / Deep learning