PMMTD: Towards Proactive Multimodal Mixed-Type Dialogues

Hongfei XIA, Yuhang GUO, Bao CHEN, Linhao ZHENG, Zeming LIU, Haifeng WANG

Front. Comput. Sci. DOI: 10.1007/s11704-026-51035-w
RESEARCH ARTICLE

Abstract

Mixed-type dialogue systems aim to handle complex conversations by integrating multiple dialogue types within a single interaction. However, existing approaches are predominantly text-based and cannot proactively guide the conversation, which significantly limits their effectiveness in real-world scenarios. For instance, when a user is unfamiliar with a concept under discussion, a more natural and effective system response is to proactively present an image rather than continue with further text-based explanation. In this paper, we formally identify this limitation and define the challenge of building a proactive multimodal mixed-type dialogue system capable of handling realistic, dynamic dialogue situations. To address this challenge, we propose a new task and introduce a novel Proactive Multimodal Mixed-Type Dialogue dataset, PMMTD, which spans four dialogue types: conversational recommendation, task-oriented dialogue, Q&A, and chitchat. Each dialogue in PMMTD involves multimodal information and multiple dialogue types with natural topic transitions. We also propose a proactive multimodal mixed-type dialogue generation framework with a novel Composite Structure-Guiding mechanism, termed CSG, and build baselines for PMMTD. Experimental results show the effectiveness of CSG. We will open-source PMMTD and CSG at https://github.com/BITHLP/PMMTD.

Keywords

Multimodal mixed-type dialogues / Mixed-type dialogues / Dialogue systems

Cite this article

Hongfei XIA, Yuhang GUO, Bao CHEN, Linhao ZHENG, Zeming LIU, Haifeng WANG. PMMTD: Towards Proactive Multimodal Mixed-Type Dialogues. Front. Comput. Sci. DOI:10.1007/s11704-026-51035-w

RIGHTS & PERMISSIONS

Higher Education Press 2026
