MPFToD: a modularized pre-training framework for consistency identification in task-oriented dialogue

Libo QIN, Shijue HUANG, Qiguang CHEN, Qian LIU, Wanxiang CHE, Ruifeng XU

Front. Comput. Sci., 2025, 19(10): 1910351
DOI: 10.1007/s11704-024-3778-9
Artificial Intelligence
RESEARCH ARTICLE


Abstract

Consistency identification in task-oriented dialogue (CI-ToD) can prevent inconsistent dialogue response generation, and it has recently emerged as an important and growing research area. This paper takes the first step toward exploring a pre-training paradigm for CI-ToD. Pre-training for CI-ToD is non-trivial, however, because it requires large amounts of multi-turn KB-grounded dialogues, which are extremely hard to collect. To alleviate this data scarcity problem, we introduce a modularized pre-training framework (MPFToD) that can exploit large amounts of KB-free dialogues. Specifically, the modularization allows us to decouple CI-ToD into three sub-modules and to propose three pre-training tasks: (i) query response matching pre-training; (ii) dialogue history consistency identification pre-training; and (iii) KB mask language modeling, each of which enhances a different ability of the CI-ToD model. Because the sub-tasks are solved separately, each module of MPFToD can learn from large amounts of KB-free dialogues, which are much easier to obtain. Results on the CI-ToD benchmark show that MPFToD pushes the state-of-the-art performance from 56.3% to 61.0%. Furthermore, we show its transferability, with promising performance on other downstream tasks (i.e., dialog act recognition, sentiment classification and table fact checking).
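To make the three pre-training tasks concrete, the sketch below shows one plausible way to construct their training examples from KB-free dialogues. It is a minimal illustration under our own assumptions: the function names (qrm_pairs, hci_pairs, kb_mlm), the negative-sampling constructions, and the label conventions are all hypothetical, not taken from the paper.

    # Illustrative sketch only: every name and label convention here is an
    # assumption, not the paper's actual implementation.
    import random

    def qrm_pairs(dialogues):
        """Query response matching: label 1 for a query paired with its true
        response, 0 for the same query paired with a response drawn from a
        different dialogue."""
        pairs = []
        for i, (query, response) in enumerate(dialogues):
            pairs.append((query, response, 1))  # positive pair
            other = random.choice([d for j, d in enumerate(dialogues) if j != i])
            pairs.append((query, other[1], 0))  # mismatched negative pair
        return pairs

    def hci_pairs(histories):
        """Dialogue history consistency identification: label 1 for an intact
        multi-turn history, 0 after replacing its last turn with a turn taken
        from another dialogue."""
        pairs = []
        for i, turns in enumerate(histories):
            pairs.append((turns, 1))  # consistent history
            foreign = random.choice([h for j, h in enumerate(histories) if j != i])
            pairs.append((turns[:-1] + [random.choice(foreign)], 0))  # corrupted
        return pairs

    def kb_mlm(tokens, kb_entities, mask_prob=0.15, mask_token="[MASK]"):
        """KB mask language modeling: preferentially mask tokens that also occur
        in the knowledge base, so the encoder must recover KB-grounded values."""
        masked, labels = [], []
        for tok in tokens:
            if tok in kb_entities and random.random() < mask_prob:
                masked.append(mask_token)
                labels.append(tok)   # target: the original KB token
            else:
                masked.append(tok)
                labels.append(None)  # position ignored by the MLM loss
        return masked, labels

Each constructor consumes only ordinary (KB-free) dialogues, which is the point of the modularization: the three abilities can be pre-trained separately on data that is far easier to collect than multi-turn KB-grounded dialogues.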

Keywords

task-oriented dialogue / consistency identification / modularized pre-training framework

Cite this article

Libo QIN, Shijue HUANG, Qiguang CHEN, Qian LIU, Wanxiang CHE, Ruifeng XU. MPFToD: a modularized pre-training framework for consistency identification in task-oriented dialogue. Front. Comput. Sci., 2025, 19(10): 1910351. DOI: 10.1007/s11704-024-3778-9



RIGHTS & PERMISSIONS

Higher Education Press
