Learned distributed image compression with decoder side information

Yin Yankai, Sun Zhe, Ruan Peiying, Li Ruidong, Duan Feng

2025, Vol. 11, Issue (2): 349-358.

DOI: 10.1016/j.dcan.2024.06.001
Original article

Abstract

With the rapid development of digital communication and the widespread use of the Internet of Things, multi-view image compression has attracted increasing attention as a fundamental technology for image data communication. Multi-view image compression improves compression efficiency by exploiting correlations between images. However, requiring synchronization and inter-image communication at the encoder poses significant challenges, especially for resource-constrained devices. In this study, we introduce a novel attention-based distributed image compression model for the setting in which side information is available only at the decoder. Our model integrates an encoder network, a quantization module, and a decoder network to achieve both high compression performance and high-quality image reconstruction. The encoder uses a deep Convolutional Neural Network (CNN) to extract high-level features from the input image; these features are quantized and then losslessly entropy coded. The decoder consists of three main components that together exploit the information within and between images on the decoder side. First, a channel-spatial attention module captures and refines information within individual image feature maps. Second, a semi-coupled convolution module extracts both shared and image-specific information. Finally, a cross-attention module fuses the mutual information extracted from the side information. We validate the model on several datasets, including KITTI Stereo and Cityscapes. The results show that our method achieves better compression performance than state-of-the-art techniques.
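
The decoder-side fusion step can be illustrated with a minimal sketch of generic scaled dot-product cross-attention. This is not the authors' module: the function names `softmax` and `cross_attention`, the single-head form without learned projections, and the residual addition are all assumptions made for illustration. Feature vectors of the image being decoded act as queries, and feature vectors of the side information act as keys and values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(main_feats, side_feats):
    """Fuse side-information features into main-image features.

    main_feats: feature vectors of the image being decoded (queries).
    side_feats: feature vectors of the side information (keys and values).
    Hypothetical simplification: single head, no learned projections.
    """
    dim = len(main_feats[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for query in main_feats:
        # Similarity of this main-image position to every side-information position.
        scores = [
            scale * sum(q * k for q, k in zip(query, key))
            for key in side_feats
        ]
        weights = softmax(scores)
        # Attention-weighted sum of the side-information vectors.
        attended = [
            sum(w * value[d] for w, value in zip(weights, side_feats))
            for d in range(dim)
        ]
        # Residual addition (an assumption) keeps the original main-image features.
        fused.append([m + a for m, a in zip(query, attended)])
    return fused


main = [[1.0, 0.0], [0.0, 1.0]]
side = [[10.0, 0.0], [0.0, 10.0]]
fused = cross_attention(main, side)
```

In this toy input, each main-image vector attends most strongly to the side-information vector it is aligned with, so the fused output is dominated by the matching side feature. A trained model would typically learn the query, key, and value projections rather than use the raw features.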

Keywords

Digital communication / Image compression / Side information / Channel-spatial attention module / Cross-attention module

Cite this article

Yin Yankai, Sun Zhe, Ruan Peiying, Li Ruidong, Duan Feng. Learned distributed image compression with decoder side information. 2025, 11(2): 349-358. DOI: 10.1016/j.dcan.2024.06.001

CRediT authorship contribution statement

Yankai Yin: Investigation. Zhe Sun: Resources. Peiying Ruan: Writing - review & editing. Ruidong Li: Project administration. Feng Duan: Supervision.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgements

This research was supported by the National Natural Science Foundation of China (Key Program) (No. 11932013) and the Tianjin Science and Technology Plan Project (No. 22PTZWHZ00040).

