Evaluating quality of motion for unsupervised video object segmentation

Guanjun Cheng, Huihui Song

Optoelectronics Letters, 2024, Vol. 20, Issue 6: 379-384. DOI: 10.1007/s11801-024-3207-1
Article

Abstract

Current mainstream unsupervised video object segmentation (UVOS) approaches typically incorporate optical flow as motion information to locate the primary objects in coherent video frames. However, they fuse appearance and motion information without evaluating the quality of the optical flow; when poor-quality optical flow interacts with the appearance information, it introduces significant noise and degrades overall performance. To alleviate this issue, we first employ a quality evaluation module (QEM) to score the optical flow, and then select only high-quality optical flow as the motion cue to fuse with the appearance information, preventing poor-quality optical flow from diverting the network's attention. Moreover, we design an appearance-guided fusion module (AGFM) to better integrate appearance and motion information. Extensive experiments on several widely used datasets, including DAVIS-16, FBMS-59, and YouTube-Objects, demonstrate that the proposed method outperforms existing methods.
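
The abstract outlines the pipeline (score the optical flow, gate it by quality, then fuse under appearance guidance) without giving implementation details. The PyTorch sketch below is one plausible reading, not the authors' implementation: the layer choices, the quality threshold tau, and all tensor shapes are assumptions made for illustration; only the module names QEM and AGFM and their roles come from the abstract.

import torch
import torch.nn as nn


class QualityEvaluationModule(nn.Module):
    """Sketch of a QEM: predicts a scalar quality score in [0, 1] for a
    batch of optical-flow feature maps. Hypothetical internals; the
    abstract does not specify the module's layers."""

    def __init__(self, channels: int):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.score = nn.Sequential(
            nn.Linear(channels, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, 1),
            nn.Sigmoid(),
        )

    def forward(self, flow_feat: torch.Tensor) -> torch.Tensor:
        # flow_feat: (B, C, H, W) features from an optical-flow encoder
        pooled = self.pool(flow_feat).flatten(1)  # (B, C)
        return self.score(pooled)                 # (B, 1) quality score


class AppearanceGuidedFusionModule(nn.Module):
    """Sketch of an AGFM: appearance features produce a spatial gate that
    modulates the quality-gated motion features before the two streams
    are merged."""

    def __init__(self, channels: int, tau: float = 0.5):
        super().__init__()
        self.tau = tau  # quality threshold (assumed hyperparameter)
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1)

    def forward(self, app_feat, flow_feat, quality):
        # Drop motion cues whose predicted quality falls below tau, so
        # poor optical flow cannot divert the appearance branch.
        keep = (quality > self.tau).float().view(-1, 1, 1, 1)
        gated_flow = flow_feat * keep * self.gate(app_feat)
        return self.merge(torch.cat([app_feat, gated_flow], dim=1))


if __name__ == "__main__":
    qem = QualityEvaluationModule(256)
    agfm = AppearanceGuidedFusionModule(256)
    app = torch.randn(2, 256, 64, 64)   # appearance (RGB) features
    flow = torch.randn(2, 256, 64, 64)  # motion (optical-flow) features
    fused = agfm(app, flow, qem(flow))
    print(fused.shape)  # torch.Size([2, 256, 64, 64])

A soft variant would weight flow_feat by the continuous quality score rather than a hard keep mask; the abstract's wording ("select high-quality optical flow") suggests a hard selection, but either reading is consistent with it.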

Cite this article

Guanjun Cheng, Huihui Song. Evaluating quality of motion for unsupervised video object segmentation. Optoelectronics Letters, 2024, 20(6): 379-384. https://doi.org/10.1007/s11801-024-3207-1
