Protein acetylation sites with complex-valued polynomial model

Wenzheng BAO, Bin YANG

Front. Comput. Sci. ›› 2024, Vol. 18 ›› Issue (3) : 183904

DOI: 10.1007/s11704-023-2640-9
Interdisciplinary
RESEARCH ARTICLE

Abstract

Protein acetylation is the addition of acetyl groups (CH3CO-) to lysine residues on protein chains. As one of the most common protein post-translational modifications, lysine acetylation plays an important role in many organisms. In this study, we developed a human-specific prediction method that uses a cascade classifier of complex-valued polynomial models (CVPM), combined with sequence and structural feature descriptors, to address the imbalance between positive and negative samples. Complex-valued gene expression programming and differential evolution are used to search for the optimal CVPM. We also carried out a systematic and comprehensive analysis of the acetylation data and the prediction results. The proposed method achieves a specificity (Sp) of 79.15%, a sensitivity (Sn) of 78.17%, an accuracy (ACC) of 78.66%, an F1 score of 78.76%, and a Matthews correlation coefficient (MCC) of 0.5733, outperforming other state-of-the-art methods.
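
For readers unfamiliar with the five reported measures, the following is a minimal sketch of how Sn, Sp, ACC, F1, and MCC are conventionally computed from a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the authors' exact evaluation code is not shown on this page.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    Hypothetical helper for illustration; assumes every denominator
    except the MCC one is non-zero.
    """
    sn = tp / (tp + fn)                    # sensitivity (recall on positives)
    sp = tn / (tn + fp)                    # specificity (recall on negatives)
    acc = (tp + tn) / (tp + fp + tn + fn)  # overall accuracy
    precision = tp / (tp + fp)
    f1 = 2 * precision * sn / (precision + sn)
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"Sn": sn, "Sp": sp, "ACC": acc, "F1": f1, "MCC": mcc}


# Illustrative counts only (not the paper's data).
m = binary_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
```

MCC is often reported alongside ACC for site-prediction tasks because it uses all four cells of the confusion matrix and is therefore less sensitive to class imbalance than accuracy alone.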

Keywords

protein acetylation / complex-valued polynomial model / machine learning

Cite this article

Wenzheng BAO, Bin YANG. Protein acetylation sites with complex-valued polynomial model. Front. Comput. Sci., 2024, 18(3): 183904 DOI:10.1007/s11704-023-2640-9

RIGHTS & PERMISSIONS

Higher Education Press

Supplementary files

FCS-22640-OF-WB_suppl_1
