HPClas: A data-driven approach for identifying halophilic proteins based on catBoost

Shantong Hu, Xiaoyu Wang, Zhikang Wang, Menghan Jiang, Shihui Wang, Wenya Wang, Jiangning Song, Guimin Zhang

mLife, 2024, Vol. 3, Issue 4: 515-526.

DOI: 10.1002/mlf2.12125
METHOD


Abstract

Halophilic proteins possess unique structural properties and show high stability under extreme conditions. These characteristics make them invaluable for applications in areas such as bioenergy, pharmaceuticals, environmental clean-up, and energy production. Halophilic proteins are generally discovered and characterized through labor-intensive and time-consuming wet-lab experiments. In this study, we introduce the Halophilic Protein Classifier (HPClas), a machine learning-based classifier developed using the catBoost ensemble learning technique to identify halophilic proteins. Extensive in silico calculations were conducted on a large public dataset of 12,574 samples, and HPClas achieved an area under the receiver operating characteristic curve (AUROC) of 0.844 on an independent test set of 200 samples. The source code and curated dataset of HPClas are publicly available at https://github.com/Showmake2/HPClas. In conclusion, HPClas is a promising tool to aid in the identification of halophilic proteins and accelerate their application in different fields.
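For readers unfamiliar with the AUROC metric reported above, the following is a minimal sketch of how it can be computed from binary labels and classifier scores. It is not the HPClas evaluation code; the function name `auroc` and the toy labels and scores are assumptions made up purely for illustration. The sketch uses the pairwise (Mann-Whitney) definition: the fraction of positive/negative pairs in which the positive sample receives the higher score, with ties counted as half.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUROC: chance that a random positive outranks a random negative."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration (1 = halophilic, 0 = non-halophilic).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
toy_auc = auroc(toy_labels, toy_scores)
print(f"AUROC = {toy_auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation. The quadratic loop is adequate for a test set of a few hundred samples; a rank-based implementation would be preferable for large datasets.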

Keywords

feature engineering / halophilic protein / machine learning

Cite this article

Shantong Hu, Xiaoyu Wang, Zhikang Wang, Menghan Jiang, Shihui Wang, Wenya Wang, Jiangning Song, Guimin Zhang. HPClas: A data-driven approach for identifying halophilic proteins based on catBoost. mLife, 2024, 3(4): 515-526. DOI: 10.1002/mlf2.12125


RIGHTS & PERMISSIONS

2024 The Author(s). mLife published by John Wiley & Sons Australia, Ltd on behalf of Institute of Microbiology, Chinese Academy of Sciences.
