
LLM-Driven Cognitive Diagnosis with SOLO Taxonomy: A Model-Agnostic Framework
Zhiang Dong, Jingyuan Chen, Fei Wu
Frontiers of Digital Education, 2025, Vol. 2, Issue 2: 20.
With the development of the Internet and intelligent education systems, the significance of cognitive diagnosis has become increasingly acknowledged. Cognitive diagnosis models (CDMs) aim to characterize learners’ cognitive states based on their responses to a series of exercises. However, conventional CDMs often struggle with less frequently observed learners and items, primarily due to limited prior knowledge. Recent advancements in large language models (LLMs) offer a promising avenue for infusing rich domain information into CDMs, but integrating LLMs directly into CDMs poses significant challenges. While LLMs excel in semantic comprehension, they are less adept at capturing the fine-grained, interactive behaviours central to cognitive diagnosis. Moreover, the inherent difference between LLMs’ semantic representations and CDMs’ behavioural feature spaces hinders their seamless integration. To address these issues, this research proposes a model-agnostic framework that enriches CDMs with the extensive knowledge of LLMs. The framework enhances various CDM architectures by leveraging LLM-derived domain knowledge and the structure of the observed learning outcome (SOLO) taxonomy. It operates in two stages: first, LLM diagnosis, which simultaneously assesses learners via educational techniques to establish a richer and more comprehensive knowledge representation; second, cognitive level alignment, which reconciles the LLM’s semantic space with the CDM’s behavioural domain through contrastive learning and mask-reconstruction learning. Empirical evaluations on multiple real-world datasets demonstrate that the proposed framework significantly improves diagnostic accuracy, underscoring the value of integrating LLM-driven semantic knowledge into traditional cognitive diagnosis paradigms.
large language models / cognitive diagnosis models / intelligent education system / SOLO taxonomy / knowledge representation
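The abstract names two alignment objectives, contrastive learning and mask-reconstruction learning, without giving their form. The following is a minimal illustrative sketch of what such losses commonly look like, not the authors' implementation. It assumes that the LLM-side and CDM-side vectors have already been projected to a shared dimension, uses InfoNCE as one common choice of contrastive objective, and uses mean squared error on masked positions for reconstruction. All function names, the temperature value, and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(llm_vecs, cdm_vecs, temperature=0.1):
    """Mean InfoNCE loss. Row i of each list is assumed to describe the
    same learner (or item), so (i, i) is the positive pair and every
    other row acts as a negative."""
    total = 0.0
    for i, anchor in enumerate(llm_vecs):
        logits = [cosine(anchor, c) / temperature for c in cdm_vecs]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(llm_vecs)


def masked_mse(original, reconstructed, mask):
    """Mean squared error over masked positions only
    (mask[k] is True where the original entry was hidden)."""
    errors = [(o - r) ** 2 for o, r, m in zip(original, reconstructed, mask) if m]
    return sum(errors) / len(errors)


# Toy data (hypothetical): two learners, two-dimensional shared space.
llm = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[0.9, 0.1], [0.1, 0.9]]   # CDM vectors that match the LLM rows
swapped = [[0.1, 0.9], [0.9, 0.1]]   # same vectors paired with the wrong rows

loss_aligned = info_nce(llm, aligned)
loss_swapped = info_nce(llm, swapped)
recon = masked_mse([1.0, 2.0, 3.0], [1.0, 2.5, 2.0], [False, True, True])
```

In this sketch the contrastive loss is small when matching rows point in similar directions and large when the pairing is swapped, which is the behaviour an alignment stage would rely on. How the two losses are weighted and combined with the CDM's own training objective is not stated in the abstract and is left open here.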