Evaluating quality of motion for unsupervised video object segmentation

Guanjun Cheng, Huihui Song

Optoelectronics Letters, 2024, Vol. 20, Issue 6: 379-384. DOI: 10.1007/s11801-024-3207-1
Article

Abstract

Current mainstream unsupervised video object segmentation (UVOS) approaches typically incorporate optical flow as motion information to locate the primary objects in coherent video frames. However, they fuse appearance and motion information without evaluating the quality of the optical flow. When poor-quality optical flow interacts with the appearance information, it introduces significant noise and degrades overall performance. To alleviate this issue, we first employ a quality evaluation module (QEM) to score the optical flow. We then select only high-quality optical flow as motion cues to fuse with the appearance information, preventing poor-quality optical flow from diverting the network's attention. Moreover, we design an appearance-guided fusion module (AGFM) to better integrate appearance and motion information. Extensive experiments on several widely used datasets, including DAVIS-16, FBMS-59, and YouTube-Objects, demonstrate that the proposed method outperforms existing methods.
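The abstract does not specify how the QEM scores optical flow or how the AGFM performs fusion, so the following is only a rough illustration of the quality-gating idea: score the flow, and let motion cues contribute to the fusion only when the score clears a threshold. All function names, the variance-free magnitude heuristic, and the linear blend below are assumptions for illustration, not the authors' method (which uses learned modules).

```python
# Hypothetical sketch of quality-gated appearance/motion fusion.
# A real QEM/AGFM would be learned networks operating on feature maps;
# here flows are toy lists of (dx, dy) vectors and features are 1-D lists.

def flow_quality(flow):
    """Toy quality score in [0, 1]: penalize near-zero (degenerate) flow.

    `flow` is a list of (dx, dy) vectors. This heuristic stands in for
    the paper's learned quality evaluation module (QEM).
    """
    if not flow:
        return 0.0
    mean_mag = sum((dx * dx + dy * dy) ** 0.5 for dx, dy in flow) / len(flow)
    return min(mean_mag, 1.0)  # clamp to [0, 1]


def fuse(appearance, motion, quality, threshold=0.5):
    """Quality-gated fusion: use motion cues only when their quality is high.

    Below the threshold, the motion branch is dropped and the appearance
    features pass through unchanged; above it, motion is blended in with
    a weight equal to its quality score.
    """
    if quality < threshold:
        return list(appearance)  # discard poor-quality motion cues
    w = quality
    return [(1 - w) * a + w * m for a, m in zip(appearance, motion)]


good_flow = [(1.0, 0.5), (0.8, 0.6)]   # coherent, large displacements
bad_flow = [(0.01, 0.0), (0.0, 0.02)]  # near-zero, uninformative flow
app = [0.2, 0.4]
mot = [1.0, 1.0]

print(fuse(app, mot, flow_quality(good_flow)))  # motion contributes
print(fuse(app, mot, flow_quality(bad_flow)))   # falls back to appearance
```

The design point the sketch captures is the gating itself: rather than always fusing both streams, a low-quality motion signal is excluded entirely so it cannot inject noise into the appearance features.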


