Sales Forecasting of New Clothing Products Based on Hierarchical Multi-Modal Attention Recurrent Neural Network

Wenda SHI, Jinsong DU, Dichucheng LI

Journal of Donghua University(English Edition) ›› 2024, Vol. 41 ›› Issue (01) : 21-27.

DOI: 10.19884/j.1672-5220.202302017
Special Topic: Artificial Intelligence on Fashion and Textiles

Abstract

In sales forecasting for new clothing products, the lack of historical sales data often means that data from other modalities must be fully exploited as a supplement. However, multi-modal clothing data are usually redundant and heterogeneous. To address these problems, a hierarchical multi-modal attention-based recurrent neural network (HMA-RNN) with three main elements is proposed. First, the hierarchical structure separates high-level semantic information from low-level semantic information to avoid information redundancy. Second, multi-modal attention (MMA) is introduced in the fusion stage to mitigate inherent data non-alignment. Third, a shared attention mechanism is used to build dependencies across the multi-modal data. Experimental results on the Visuelle 2.0 dataset show that the proposed approach achieves a weighted average percentage error (WAPE) of 72.07 and a mean absolute error (MAE) of 0.80, significantly outperforming existing works and indicating the effectiveness of the proposed approach.
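For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how WAPE and MAE are commonly computed. It assumes the usual definitions (WAPE as the sum of absolute errors divided by the sum of absolute actual values, expressed as a percentage; MAE as the mean of absolute errors). The abstract does not state the exact formulas or the scale of the sales values, so the function names and the sample numbers below are illustrative only and are not taken from the paper.

```python
from typing import Sequence


def _check_inputs(actual: Sequence[float], forecast: Sequence[float]) -> None:
    """Reject empty or mismatched series before computing a metric."""
    if len(actual) == 0 or len(actual) != len(forecast):
        raise ValueError("actual and forecast must be non-empty and equal in length")


def mae(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute error: average of |actual - forecast|."""
    _check_inputs(actual, forecast)
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def wape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Weighted average percentage error, in percent (assumed definition):
    100 * sum(|actual - forecast|) / sum(|actual|)."""
    _check_inputs(actual, forecast)
    total_actual = sum(abs(a) for a in actual)
    if total_actual == 0:
        raise ValueError("WAPE is undefined when all actual values are zero")
    total_error = sum(abs(a - f) for a, f in zip(actual, forecast))
    return 100.0 * total_error / total_actual


# Hypothetical weekly sales for one product (not data from the paper).
actual_sales = [10.0, 20.0, 30.0, 40.0]
forecast_sales = [12.0, 18.0, 33.0, 37.0]

mae_value = mae(actual_sales, forecast_sales)
wape_value = wape(actual_sales, forecast_sales)
print(f"MAE = {mae_value:.2f}, WAPE = {wape_value:.2f}")
```

Under these definitions, WAPE is independent of the scale of the sales series, whereas MAE is expressed in the same units as the sales values, so the two reported figures are not directly comparable to each other.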

Keywords

clothing sales forecasting / multi-modal learning / deep learning / attention mechanism

Cite this article

Wenda SHI, Jinsong DU, Dichucheng LI. Sales Forecasting of New Clothing Products Based on Hierarchical Multi-Modal Attention Recurrent Neural Network. Journal of Donghua University (English Edition), 2024, 41(01): 21-27. https://doi.org/10.19884/j.1672-5220.202302017
