Visual feature inter-learning for sign language recognition in emergency medicine

Chao Wei, Yunpeng Li, Jingze Liu

Optoelectronics Letters, 2025, Vol. 21, Issue (10): 619-625. DOI: 10.1007/s11801-025-4214-6
Original Paper


Abstract

Accessible communication based on sign language recognition (SLR) is key to emergency medical assistance for the hearing-impaired community. Balancing the capture of local and global information in SLR for emergency medicine poses a significant challenge. To address this, we propose a novel approach based on the inter-learning of visual features between global and local information. Specifically, our method enhances the perception capability of the visual feature extractor by strategically leveraging the strengths of convolutional neural networks (CNNs), which are adept at capturing local features, and visual transformers, which excel at perceiving global features. Furthermore, to mitigate overfitting caused by the limited availability of sign language data for emergency medical applications, we introduce an enhanced short temporal module that augments training with additional subsequences. Experimental results on three publicly available sign language datasets demonstrate the efficacy of the proposed approach.
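The abstract describes two components: a visual feature extractor in which a CNN branch (local features) and a transformer branch (global features) learn from each other, and a short temporal module that augments training with extra subsequences. The PyTorch sketch below is only a minimal illustration of these two ideas; the module names, dimensions, exchange scheme, and sampling strategy are illustrative assumptions, not the architecture actually proposed in the paper.

```python
# Minimal sketch (not the paper's architecture): a CNN branch extracts local
# features, a transformer branch models global context over the CNN tokens,
# and the two branches exchange features before fusion ("inter-learning").
# All sizes and layer choices below are illustrative assumptions.
import torch
import torch.nn as nn


class LocalGlobalInterLearning(nn.Module):
    def __init__(self, channels: int = 64, num_heads: int = 4):
        super().__init__()
        # Local branch: a small per-frame CNN.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # Global branch: a transformer encoder over the CNN's patch tokens.
        layer = nn.TransformerEncoderLayer(d_model=channels, nhead=num_heads,
                                           batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        # Inter-learning: each branch's features are projected into the other.
        self.local_to_global = nn.Linear(channels, channels)
        self.global_to_local = nn.Linear(channels, channels)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, 3, H, W), i.e. one video frame per sample.
        local = self.cnn(frames)                   # (B, C, h, w)
        tokens = local.flatten(2).transpose(1, 2)  # (B, h*w, C) local tokens
        global_feat = self.transformer(tokens)     # (B, h*w, C) global tokens
        # Bidirectional exchange: each branch is refined by the other one.
        local_refined = tokens + self.global_to_local(global_feat)
        global_refined = global_feat + self.local_to_global(tokens)
        fused = local_refined + global_refined     # simple additive fusion
        return fused.mean(dim=1)                   # (B, C) per-frame feature


def sample_subsequences(video: torch.Tensor, num_subseqs: int = 2,
                        ratio: float = 0.8) -> list:
    """Illustrative temporal augmentation: draw random shorter subsequences
    from a video of shape (T, C, H, W), loosely mirroring the idea of
    training on additional subsequences."""
    t = video.shape[0]
    length = max(1, int(t * ratio))
    starts = torch.randint(0, t - length + 1, (num_subseqs,)).tolist()
    return [video[s:s + length] for s in starts]


if __name__ == "__main__":
    model = LocalGlobalInterLearning()
    print(model(torch.randn(2, 3, 112, 112)).shape)  # torch.Size([2, 64])
    print([clip.shape for clip in
           sample_subsequences(torch.randn(16, 3, 112, 112))])
```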


Cite this article

Chao Wei, Yunpeng Li, Jingze Liu. Visual feature inter-learning for sign language recognition in emergency medicine. Optoelectronics Letters, 2025, 21(10): 619-625. DOI: 10.1007/s11801-025-4214-6



RIGHTS & PERMISSIONS

Tianjin University of Technology
