Monocular 3D gaze estimation using feature discretization and attention mechanism

Tong Sha, Jinglin Sun, Siohang Pun, Yu Liu

Optoelectronics Letters ›› 2023, Vol. 19 ›› Issue (5): 301-306. DOI: 10.1007/s11801-023-2203-1

Abstract

Gaze estimation has become an important field of image and information processing. Estimating gaze from full-face images with convolutional neural networks (CNNs) has achieved good accuracy. Estimating gaze from eye images alone, however, is far more challenging, since eye images contain less information than full-face images; it nevertheless remains important, because eye-image-based methods have wider applications. In this paper, we propose the discretization-gaze network (DGaze-Net), which improves monocular three-dimensional (3D) gaze estimation accuracy through feature discretization and an attention mechanism. The gaze predictor of DGaze-Net is optimized by feature discretization: the gaze angle is discretized into K bins, adding a classification constraint to the predictor, so each gaze angle is first classified into a bin before being regressed against the ground-truth angle, which improves estimation accuracy. In addition, an attention mechanism is applied to the backbone to strengthen the extraction of gaze-relevant eye features. The proposed method is validated on three gaze datasets and achieves encouraging gaze estimation accuracy.
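
To make the two ideas in the abstract concrete, the sketch below shows how a binned classification-plus-regression gaze head and a channel-attention module might look in PyTorch. This is a minimal illustration, not the paper's implementation: the names ChannelAttention, BinnedGazeHead, and gaze_loss are hypothetical, the CBAM-style channel attention is only one plausible instantiation of "the attention mechanism", and the 180° angle range with 90 bins is an assumed discretization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelAttention(nn.Module):
    """CBAM-style channel attention (one plausible choice of attention):
    re-weights feature channels so the backbone emphasizes gaze-relevant cues."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
        )

    def forward(self, x):                           # x: (B, C, H, W)
        avg = self.mlp(x.mean(dim=(2, 3)))          # global average pooling
        mx = self.mlp(x.amax(dim=(2, 3)))           # global max pooling
        scale = torch.sigmoid(avg + mx)[:, :, None, None]
        return x * scale                            # channel-wise re-weighting

class BinnedGazeHead(nn.Module):
    """Discretizes one gaze angle (e.g., pitch or yaw) into num_bins classes,
    then recovers a continuous angle as the softmax expectation over bin centers."""
    def __init__(self, in_features: int, num_bins: int = 90,
                 angle_range_deg: float = 180.0):   # assumed discretization
        super().__init__()
        self.fc = nn.Linear(in_features, num_bins)
        step = angle_range_deg / num_bins
        centers = torch.arange(num_bins) * step - angle_range_deg / 2 + step / 2
        self.register_buffer("centers", centers)    # bin-center angles in degrees

    def forward(self, feat):                        # feat: (B, in_features)
        logits = self.fc(feat)                      # (B, num_bins) bin scores
        probs = F.softmax(logits, dim=1)
        angle = (probs * self.centers).sum(dim=1)   # continuous angle estimate
        return logits, angle

def gaze_loss(logits, angle, bin_label, angle_label, alpha=1.0):
    """Classification constraint on the bins plus regression on the angle."""
    return F.cross_entropy(logits, bin_label) + alpha * F.l1_loss(angle, angle_label)
```

In this sketch, the cross-entropy term pushes each prediction toward the correct coarse bin while the expectation over bin centers keeps the output continuous for regression; one such head would be instantiated per gaze component (pitch and yaw).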

Cite this article

Tong Sha, Jinglin Sun, Siohang Pun, Yu Liu. Monocular 3D gaze estimation using feature discretization and attention mechanism. Optoelectronics Letters, 2023, 19(5): 301-306. DOI: 10.1007/s11801-023-2203-1


