A Survey of Edge-side Text-to-image Diffusion Models

Zi-Hao Qiu; Wenhao Yang; Lijun Zhang

doi:10.1007/s11704-026-52197-3

Front. Comput. Sci., DOI: 10.1007/s11704-026-52197-3

REVIEW ARTICLE

Abstract

Diffusion models have emerged as a central focus in generative artificial intelligence due to their robust theoretical foundations and exceptional generation capabilities. Edge-side text-to-image diffusion models represent a critical application area, targeting efficient, low-latency image generation on resource-constrained platforms while preserving user privacy and data security. This survey systematically reviews recent advances in edge-side text-to-image diffusion models across theoretical foundations, algorithmic improvements, model architectures, and deployment optimization. We first analyze the core mathematical frameworks, namely Denoising Diffusion Probabilistic Models (DDPM), Score-based Generative Models (SGM), and Score-based Stochastic Differential Equations, together with the mathematical basis for conditional generation. We then examine key improvements, including latent space modeling, likelihood estimation optimization, efficient sampling algorithms, and consistency-based and flow matching methods. Subsequently, we explore the design, training, and evaluation of large-scale text-to-image models, emphasizing mainstream architectures and their trade-offs. Finally, we summarize essential edge deployment techniques (model quantization, lightweight architecture design, knowledge and step distillation, and computational optimizations) and provide practical strategies for efficient inference on constrained hardware. This review aims to serve as a comprehensive reference for researchers that bridges theory and practice, and to promote the implementation of and innovation in efficient text-to-image diffusion models in real-world scenarios such as mobile deployment.