Fine-grained sequence-to-sequence lip reading based on self-attention and self-distillation
Junxiao XUE, Shibo HUANG, Huawei SONG, Lei SHI