Selecting text classification model through maximizing posterior evidence over informative sub-space

Zhiwei SUN, Jun BAI, Zhuofan CHEN, Chen LI, Wenge RONG, Zhang XIONG

Front. Comput. Sci. ›› 2025, Vol. 19 ›› Issue (12) : 1912377 DOI: 10.1007/s11704-025-41380-7

Artificial Intelligence
RESEARCH ARTICLE


Abstract

Text classification is a pivotal task in natural language understanding, and its performance has seen remarkable advancements with the rise of Pre-trained Language Models (PLMs). Recently, the proliferation of PLMs has made it increasingly challenging to choose the most suitable model for a given dataset. Since fine-tuning such a large number of candidate models is impractical, Transferability Estimation (TE) has become a promising solution for efficient model selection. Unlike current TE methods, which rely solely on fixed, hard class assignments to evaluate the quality of model-encoded features, our approach further takes into account the inter-sample and inter-model variations represented by soft class assignments. We achieve this by utilizing class embeddings to predict posterior class assignments, with the logarithm of the maximum posterior evidence serving as the transferability score. Moreover, we find that an informative sub-space of the dataset leads to more accurate calculation of soft class assignments, and we annotate informative samples efficiently by eliciting the powerful judging ability of large language models. The resulting posterior evidence over the informative sub-space, LogIPE, enables us to capture subtle differences between models and enhances the accuracy of model selection, as validated by extensive experiments conducted on a wide range of text classification datasets and candidate PLMs.
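
To make the abstract's idea concrete, the sketch below shows one way a posterior-evidence transferability score of this general kind can be computed: the log marginal evidence of a Bayesian linear head fitted on frozen PLM features (in the spirit of LogME), evaluated only on a subset of samples flagged as informative. This is a minimal illustration under stated assumptions, not the paper's exact LogIPE formulation; the helper `select_informative_samples` (e.g., an LLM-as-judge filter) and the fixed `alpha`/`beta` priors are hypothetical simplifications.

```python
import numpy as np

def log_evidence(features: np.ndarray, labels: np.ndarray,
                 alpha: float = 1.0, beta: float = 1.0) -> float:
    """Log marginal evidence of a Bayesian linear head y = F w + noise
    fitted on frozen PLM features F (one-vs-rest over classes).

    Single-pass evidence evaluation with fixed prior/noise precisions
    alpha and beta; meant only to illustrate scoring a model by posterior
    evidence, not the paper's exact LogIPE.
    """
    n, d = features.shape
    classes = np.unique(labels)
    # one-vs-rest targets: each class is treated as a separate regression problem
    targets = (labels[:, None] == classes[None, :]).astype(float)

    # SVD of the feature matrix is shared across all classes
    u, s, vt = np.linalg.svd(features, full_matrices=False)
    sigma = s ** 2                                       # eigenvalues of F^T F

    score = 0.0
    for k in range(targets.shape[1]):
        y = targets[:, k]
        precision = alpha + beta * sigma                 # posterior precision (eigenbasis)
        m = beta * vt.T @ ((u.T @ y) * s / precision)    # posterior mean of the weights
        residual = np.sum((y - features @ m) ** 2)
        logdet = np.sum(np.log(precision)) + (d - len(sigma)) * np.log(alpha)
        evidence = (0.5 * d * np.log(alpha) + 0.5 * n * np.log(beta)
                    - 0.5 * n * np.log(2 * np.pi)
                    - 0.5 * beta * residual
                    - 0.5 * alpha * np.sum(m ** 2)
                    - 0.5 * logdet)
        score += evidence / n
    return score / targets.shape[1]


# Hypothetical usage: rank candidate PLMs by evidence computed only on an
# LLM-selected informative subset of the dataset.
# idx = select_informative_samples(texts)               # e.g., LLM-as-judge filter (hypothetical)
# score = log_evidence(features[idx], labels[idx])
```

Restricting the evidence to the informative subset is the key design choice the abstract motivates: samples whose class assignments are ambiguous or noisy dilute the evidence, so scoring only the informative sub-space should separate candidate models more sharply.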

Keywords

text classification / model selection / posterior evidence / informative sub-space

Cite this article

Zhiwei SUN, Jun BAI, Zhuofan CHEN, Chen LI, Wenge RONG, Zhang XIONG. Selecting text classification model through maximizing posterior evidence over informative sub-space. Front. Comput. Sci., 2025, 19(12): 1912377 DOI: 10.1007/s11704-025-41380-7



RIGHTS & PERMISSIONS

Higher Education Press
