Video summarization via global feature difference optimization

Yunzuo Zhang, Yameng Liu

Optoelectronics Letters, 2023, Vol. 19, Issue 9: 570-576. DOI: 10.1007/s11801-023-2212-0

Abstract

Video summarization aims to select the most valuable clips of a video so that it can be browsed efficiently. Previous approaches typically focus on aggregating temporal features and overlook the role that visual representations can play in summarization. In this paper, we present a global difference-aware network (GDANet) that uses the feature difference between frame and video as guidance to enhance visual features. First, a difference optimization module (DOM) enhances the discriminability of visual features, which in turn improves the accuracy of temporal cue aggregation. Next, a dual-scale attention module (DSAM) captures informative contextual information. Finally, an adaptive feature fusion module (AFFM) lets the network learn context representations adaptively and fuse features effectively. Experiments on benchmark datasets demonstrate the effectiveness of the proposed framework.
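To make the idea of a frame-to-video feature difference concrete, the following is a minimal sketch and not the authors' DOM. It assumes the video-level feature is the mean of the frame features and that enhancement adds a scaled difference back to each frame. The function names `global_feature_difference` and `enhance` and the parameter `alpha` are hypothetical and do not come from the paper.

```python
import math

def global_feature_difference(frames):
    """Return the video-level mean feature and each frame's difference from it.

    Assumption: the video-level feature is the plain mean over frames.
    """
    if not frames:
        raise ValueError("frames must be non-empty")
    n = len(frames)
    dim = len(frames[0])
    video_mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    diffs = [[f[d] - video_mean[d] for d in range(dim)] for f in frames]
    return video_mean, diffs

def enhance(frames, alpha=1.0):
    """Add the scaled difference back to each frame feature.

    Hypothetical enhancement rule used only for illustration.
    """
    _, diffs = global_feature_difference(frames)
    return [[x + alpha * dx for x, dx in zip(f, df)]
            for f, df in zip(frames, diffs)]

# Three toy frame features; the third differs from the other two.
frames = [[1.0, 0.0], [1.0, 0.0], [4.0, 3.0]]
video_mean, diffs = global_feature_difference(frames)

# The L2 norm of each difference indicates how far a frame is from the video mean.
norms = [math.sqrt(sum(x * x for x in d)) for d in diffs]
most_distinct = norms.index(max(norms))
enhanced = enhance(frames)
```

In this toy example the third frame has the largest difference norm, so adding the difference back pushes it further from the two near-identical frames. The actual module is presumably learned end to end rather than fixed as here.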

Cite this article

Yunzuo Zhang, Yameng Liu. Video summarization via global feature difference optimization. Optoelectronics Letters, 2023, 19(9): 570-576. DOI: 10.1007/s11801-023-2212-0


