Marine organism classification method based on hierarchical multi-scale attention mechanism

Haotian Xu, Yuanzhi Cheng, Dong Zhao, Peidong Xie

Optoelectronics Letters, 2025, Vol. 21, Issue 6: 354-361. DOI: 10.1007/s11801-025-4076-y

Abstract

To address the low accuracy of existing marine organism image classification methods and the inefficiency of manual classification, we propose a model based on a hierarchical multi-scale attention mechanism. Firstly, a hierarchical efficient multi-scale attention (H-EMA) module is designed for lightweight feature extraction, achieving strong performance at relatively low cost. Secondly, an improved EfficientNetV2 block is used to better integrate information across scales and to enhance inter-layer message passing. Furthermore, the convolutional block attention module (CBAM) is introduced to strengthen the model's perception of critical features and improve its generalization ability. Lastly, Focal Loss is introduced to adjust the weighting of hard samples, addressing class imbalance in the dataset and further improving performance. The model achieves 96.11% accuracy on the intertidal marine organism dataset of the Nanji Islands and 84.78% accuracy on the CIFAR-100 dataset, demonstrating generalization ability sufficient for marine organism image classification.
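Two of the components named above, CBAM and Focal Loss, follow well-known formulations. The PyTorch sketch below is a minimal illustration of those standard formulations only; it is not the authors' code, the hyperparameters (gamma, alpha, reduction ratio, kernel size) are assumed defaults rather than the paper's settings, and the H-EMA module and improved EfficientNetV2 block described in the paper are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FocalLoss(nn.Module):
    """Standard multi-class Focal Loss: FL(p_t) = -alpha * (1 - p_t)^gamma * log(p_t).

    Down-weights well-classified samples so training focuses on hard ones.
    gamma and alpha are illustrative defaults, not the paper's settings.
    """

    def __init__(self, gamma: float = 2.0, alpha: float = 0.25):
        super().__init__()
        self.gamma = gamma
        self.alpha = alpha

    def forward(self, logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
        # logits: (N, C); targets: (N,) integer class indices
        log_pt = F.log_softmax(logits, dim=-1).gather(1, targets.unsqueeze(1)).squeeze(1)
        pt = log_pt.exp()
        return (-self.alpha * (1.0 - pt) ** self.gamma * log_pt).mean()


class ChannelAttention(nn.Module):
    """Channel branch of CBAM: shared MLP over global average- and max-pooled features."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)
        self.mlp = nn.Sequential(
            nn.Linear(channels, hidden), nn.ReLU(inplace=True), nn.Linear(hidden, channels)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scale = torch.sigmoid(self.mlp(x.mean(dim=(2, 3))) + self.mlp(x.amax(dim=(2, 3))))
        return x * scale[:, :, None, None]


class SpatialAttention(nn.Module):
    """Spatial branch of CBAM: convolution over channel-wise average and max maps."""

    def __init__(self, kernel_size: int = 7):
        super().__init__()
        self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled = torch.cat([x.mean(dim=1, keepdim=True), x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.conv(pooled))


class CBAM(nn.Module):
    """Convolutional block attention module: channel attention followed by spatial attention."""

    def __init__(self, channels: int, reduction: int = 16, kernel_size: int = 7):
        super().__init__()
        self.ca = ChannelAttention(channels, reduction)
        self.sa = SpatialAttention(kernel_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.sa(self.ca(x))
```

In such a pipeline, a CBAM block would typically be inserted after convolutional stages to reweight feature maps, while FocalLoss would replace the cross-entropy criterion during training to handle the class imbalance the abstract mentions.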

Cite this article

Haotian Xu, Yuanzhi Cheng, Dong Zhao, Peidong Xie. Marine organism classification method based on hierarchical multi-scale attention mechanism. Optoelectronics Letters, 2025, 21(6): 354-361. DOI: 10.1007/s11801-025-4076-y



Rights & permissions: Tianjin University of Technology
