15 December 2026, Volume 20 Issue 12

  • REVIEW ARTICLE
    Cong JIN, Jingru FAN, Jinfa HUANG, Jinyuan FU, Tao MEI, Li YUAN, Jiebo LUO

    Multimodal Foundation Models (MFMs), including diffusion models and multimodal large language models, have attracted significant interest owing to their scalable capabilities in vision and vision-language understanding and generation tasks. Despite the growing body of research on MFMs’ advancements, a comprehensive review of their applications in the text-to-media domain is still lacking. This review aims to bridge that gap by offering a thorough overview of MFMs’ development within the text-to-media landscape. We focus on four popular text prompt-based AI-generated content (AIGC) tasks: text-to-image, text-to-video, text-to-music, and text-to-motion generation. We delve into the fundamental concepts, model architectures, training strategies, and dataset settings for each task. This work serves as a crucial resource for researchers aiming to develop text-to-media models using MFMs tailored to specific requirements. Moreover, we trace the evolution from traditional AIGC to Next-Gen AIGC by discussing the adaptation of advanced MFMs for text-to-media innovations. We identify existing challenges and outline future directions to help researchers gain deeper insights into the future trajectory of AIGC development and deployment. In summary, our review provides a comprehensive account of how MFMs are advancing text-to-media applications, which may have a far-reaching impact on the community.

  • LETTER
    Donglin ZHOU, Weike PAN, Zhong MING
  • LETTER
    Junyi LI, Chenweinan JIANG, Daixin WANG, Guo YE, Libang ZHANG, Huimei HE, Binbin HU, Zhiqiang ZHANG, Fuzhen ZHUANG
  • LETTER
    Chen CHEN, Yunchun LI, Mingyuan XIA, Wei LI
  • REVIEW ARTICLE
    Zheng LI, Wentai ZHU, Haohui HUANG, Yu WANG, Linzhang WANG

    Automated driving systems (ADSs) have made significant strides in recent years through the combined efforts of academia and industry. A typical ADS is composed of various complex modules, including perception, planning, and control. As emerging and complex computer programs, ADSs inevitably contain flaws, making it crucial to ensure their safety since any unsafe behavior can result in catastrophic outcomes. Testing is widely recognized as a key approach to ensuring ADS safety by uncovering unsafe behaviors. However, designing effective testing techniques for ADSs is exceptionally challenging due to the high complexity and multidisciplinary nature of these systems. Although an extensive body of literature focuses on ADS testing and several surveys summarizing technical advancements have been published, most concentrate on system-level testing performed within software simulators. Consequently, they often overlook the distinct characteristics, testing requirements, and datasets associated with various ADS modules. In this paper, we present a comprehensive survey of existing ADS testing literature. We begin by investigating the testing infrastructure for ADSs, including available datasets and tools, detailing their capabilities and characteristics. We then survey testing techniques for individual ADS modules (e.g., AI-based modules and firmware) and the integrated system, highlighting technical differences between validation layers. Finally, based on our findings, we identify key challenges and outline potential research opportunities in the field.

  • LETTER
    Shaorong XIE, Han ZHANG, Xiangfeng LUO, Zhenyu ZHANG, Mengke WANG, Hang YU
Publication frequency: Monthly
Journal Impact Factor (JCR): 4.6 (2024)
Submission to first decision: 40 days
Editorial contact: zhangdf@hep.com.cn
ISSN 2095-2228 (Print)
ISSN 2095-2236 (Online)
CN 10-1014/TP