Expressive diffusion network: a novel approach to grayscale image colorization using diffusion models

WANG Xingshuo; WANG Tong

doi:10.19884/j.1672-5220.202412012

Journal of Donghua University (English Edition), 2026, Vol. 43, Issue 2: 103-111.

Information Technology and Artificial Intelligence

Abstract

Image colorization has attracted considerable research interest over the past few decades. However, current methods often offer limited flexibility in local colorization and produce unnatural colors, primarily because they lack a comprehensive understanding of color perception. In this work, we propose an expressive diffusion network (EDN) that leverages a robust diffusion network to enhance both colorization accuracy and diversity. The EDN consists of two main components: a pre-trained latent diffusion model and a perceptual luminance model based on VQ-Diffusion. These components work together to generate rich and vibrant colors while maintaining high fidelity to the structural features of the original grayscale image. The EDN also incorporates controllable creative diffusion (CCD) to direct the color generation process toward more realistic outcomes. Extensive experiments demonstrate that the EDN outperforms existing methods in perceptual quality, with notable improvements in visual realism and vibrancy across various scenes. In particular, the EDN shows significant improvements over ChromaGAN and InstColor, confirming its robustness in both simple and complex scenarios.

Keywords

diffusion model / image colorization / expressive diffusion network (EDN) / controllable creative diffusion (CCD) / guided model

Cite this article

WANG Xingshuo, WANG Tong. Expressive diffusion network: a novel approach to grayscale image colorization using diffusion models. Journal of Donghua University (English Edition), 2026, 43(2): 103-111. DOI: 10.19884/j.1672-5220.202412012

References

[1]	LIU H Y, XING J B, XIE M S, et al. Improved diffusion-based image colorization via piggybacked models[EB/OL]. (2023-04-21)[2024-12-11]. https://arxiv.org/abs/2304.11105.

[2]	ZHANG W D, ZHUANG P X, SUN H H, et al. Underwater image enhancement via minimal color loss and locally adaptive contrast enhancement[J]. IEEE Transactions on Image Processing, 2022, 31: 3997-4010.

[3]	LI Y Y, WANG H, JIN Q, et al. SnapFusion: text-to-image diffusion model on mobile devices within two seconds[J]. Advances in Neural Information Processing Systems, 2024, 36: 1-12.

[4]	FEI B, LYU Z Y, PAN L, et al. Generative diffusion prior for unified image restoration and enhancement[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2023: 9935-9946.

[5]	ZHANG Y Z, ZHANG C Y, ZHANG T, et al. Self-attention guidance and multiscale feature fusion-based UAV image object detection[J]. IEEE Geoscience and Remote Sensing Letters, 2023, 20: 1-5.

[6]	FOSTER D H. Color constancy[J]. Vision Research, 2011, 51(7): 674-700.

[7]	ZHANG C S, ZHANG C N, ZHANG M C, et al. Text-to-image diffusion models in generative AI: a survey[EB/OL]. (2023-03-14)[2024-12-11]. https://arxiv.org/abs/2303.07909.

[8]	LESTER B, AL-RFOU R, CONSTANT N. The power of scale for parameter-efficient prompt tuning[C]// Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing. Stroudsburg: ACL, 2021: 3045-3059.

[9]	AN Z F, XU Z F, FAN E, et al. Enhancing visual realism: fine-tuning InstructPix2Pix for advanced image colorization[EB/OL]. (2023-12-08)[2024-12-11]. https://arxiv.org/abs/2312.04780.

[10]	HUANG Y, HUANG J C, LIU Y F, et al. Diffusion model-based image editing: a survey[EB/OL]. (2024-02-27)[2024-12-11]. https://arxiv.org/abs/2402.17525.

[11]	SAHARIA C, CHAN W, SAXENA S, et al. Photorealistic text-to-image diffusion models with deep language understanding[J]. Advances in Neural Information Processing Systems, 2022, 35: 36479-36494.

[12]	SONG Y, DHARIWAL P, CHEN M, et al. Consistency models[EB/OL]. (2023-03-02)[2024-12-11]. https://arxiv.org/abs/2303.01469.

[13]	ROMBACH R, BLATTMANN A, LORENZ D, et al. High-resolution image synthesis with latent diffusion models[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2022: 10674-10685.

[14]	LIN T Y, MAIRE M, BELONGIE S, et al. Microsoft COCO: common objects in context[C]// ECCV 2014: 13th European Conference. Berlin: Springer, 2014.

[15]	DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database[C]// Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2009: 248-255.

[16]	ZHOU J C, LI B S, ZHANG D H, et al. UGIF-net: an efficient fully guided information flow network for underwater image enhancement[J]. IEEE Transactions on Geoscience and Remote Sensing, 2023, 61: 1-17.

[17]	TANG Z C, GU S Y, BAO J M, et al. Improved vector quantized diffusion models[EB/OL]. (2022-05-31)[2024-12-11]. https://arxiv.org/abs/2205.16007.

[18]	ZHANG R, ISOLA P, EFROS A A. Colorful image colorization[C]// ECCV 2016: 14th European Conference. Berlin: Springer, 2016.

[19]	GAO D W, FENG Q, WEI Q F, et al. Dyeing of modal fiber in supercritical carbon dioxide using disperse dye CI (color index) disperse yellow 54[J]. Journal of Fiber Bioengineering and Informatics, 2010, 3(3): 148-152.

[20]	ZHANG R, ZHU J Y, ISOLA P, et al. Real-time user-guided image colorization with learned deep priors[J]. ACM Transactions on Graphics, 2017, 36(4): 1-11.

[21]	SALMONA A, BOUZA L, DELON J. DeOldify: a review and implementation of an automatic colorization method[J]. Image Processing on Line, 2022, 12: 347-368.

[22]	SU J W, CHU H K, HUANG J B. Instance-aware image colorization[C]// Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. New York: IEEE, 2020: 7965-7974.

[23]	VITORIA P, RAAD L, BALLESTER C. ChromaGAN: adversarial picture colorization with semantic class distribution[C]// Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. New York: IEEE, 2020: 2434-2443.