Evaluating quality of motion for unsupervised video object segmentation

Guanjun Cheng , Huihui Song

Optoelectronics Letters ›› 2024, Vol. 20 ›› Issue (6) : 379 -384.



Abstract

Current mainstream unsupervised video object segmentation (UVOS) approaches typically incorporate optical flow as motion information to locate the primary objects in coherent video frames. However, they fuse appearance and motion information without evaluating the quality of the optical flow. When poor-quality optical flow interacts with the appearance information, it introduces significant noise and degrades overall performance. To alleviate this issue, we first employ a quality evaluation module (QEM) to assess the optical flow. We then select high-quality optical flow as the motion cue to fuse with the appearance information, which prevents poor-quality optical flow from diverting the network's attention. Moreover, we design an appearance-guided fusion module (AGFM) to better integrate appearance and motion information. Extensive experiments on several widely used datasets, including DAVIS-16, FBMS-59, and YouTube-Objects, demonstrate that the proposed method outperforms existing methods.
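The gating idea described above can be illustrated with a minimal sketch. This is not the authors' QEM/AGFM architecture (the paper's modules are learned networks); it is a hypothetical toy in which flow "quality" is approximated by spatial smoothness, and motion features are blended with appearance features only when that score clears a threshold. All function names, the smoothness proxy, and the threshold value are assumptions made for illustration.

```python
import numpy as np

def flow_quality_score(flow):
    """Toy proxy for optical-flow quality: heavily oscillating flow
    fields (large spatial gradients) are treated as noisy.
    Returns a scalar in (0, 1]; 1.0 means perfectly smooth flow."""
    gy, gx = np.gradient(flow[..., 0])   # gradients of horizontal component
    hy, hx = np.gradient(flow[..., 1])   # gradients of vertical component
    roughness = np.mean(np.abs(gx) + np.abs(gy) + np.abs(hx) + np.abs(hy))
    return 1.0 / (1.0 + roughness)

def fuse(appearance_feat, motion_feat, quality, threshold=0.5):
    """Quality-gated fusion: below the threshold, fall back to
    appearance alone; above it, blend motion in, weighted by quality."""
    if quality < threshold:
        return appearance_feat
    return (1.0 - quality) * appearance_feat + quality * motion_feat

# A constant flow field scores high; a random field scores low.
H, W = 16, 16
smooth_flow = np.ones((H, W, 2))
noisy_flow = np.random.default_rng(0).normal(size=(H, W, 2)) * 5.0

app = np.zeros((H, W))   # stand-in appearance features
mot = np.ones((H, W))    # stand-in motion features

q_smooth = flow_quality_score(smooth_flow)  # == 1.0 (zero roughness)
q_noisy = flow_quality_score(noisy_flow)    # well below q_smooth
fused = fuse(app, mot, q_smooth)            # motion dominates the blend
```

The key design point mirrored here is that the gate sits *before* fusion, so unreliable motion cues never contaminate the appearance stream; the paper's learned QEM plays the role of `flow_quality_score`, and its AGFM replaces the naive weighted blend.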

Cite this article

Guanjun Cheng, Huihui Song. Evaluating quality of motion for unsupervised video object segmentation. Optoelectronics Letters, 2024, 20(6): 379-384. DOI: 10.1007/s11801-024-3207-1


