Accurate prediction of protein dihedral angles through conditional random field
Shesheng ZHANG, Shengping JIN, Bin XUE
Identifying local conformational changes induced by subtle differences in amino acid sequences is critical to exploring the functional variations of proteins. In this study, we designed a computational scheme that uses a conditional random field to predict dihedral angle variations for different amino acid sequences. In 10-fold cross-validation on a large data set, this computational tool achieved accuracies of 87% and 84% for ϕ and ψ, respectively. The prediction accuracies of ϕ and ψ are positively correlated with each other for most of the 20 amino acid types. Helical amino acids generally achieve higher prediction accuracy, while amino acids in beta sheets have higher accuracy in specific angular regions. The prediction accuracy of ϕ is negatively correlated with amino acid flexibility as represented by the Vihinen index. The prediction accuracy of ϕ can also be negatively correlated with angle distribution dispersion.
conditional random field / flexibility / angle distribution dispersion
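The abstract does not state how angle distribution dispersion is measured. As a minimal sketch, assuming circular variance as the dispersion measure (an assumption for illustration, not necessarily the authors' definition), the dispersion of a set of ϕ or ψ angles could be computed as follows. The function name `circular_variance` and the toy angles are hypothetical.

```python
import math


def circular_variance(angles_deg):
    """Return 1 - R, where R is the mean resultant length of the angles.

    Angles are in degrees. The result lies in [0, 1]: 0 means all angles
    coincide, values near 1 mean the angles are spread around the circle.
    Circular variance is an assumed dispersion measure, not one taken
    from the paper.
    """
    if not angles_deg:
        raise ValueError("need at least one angle")
    radians = [math.radians(a) for a in angles_deg]
    mean_cos = sum(math.cos(r) for r in radians) / len(radians)
    mean_sin = sum(math.sin(r) for r in radians) / len(radians)
    return 1.0 - math.hypot(mean_cos, mean_sin)


# Toy values only: tightly clustered angles versus widely spread angles.
clustered = [-62.0, -60.0, -58.0]
spread = [-150.0, -60.0, 60.0, 150.0]
print(circular_variance(clustered) < circular_variance(spread))
```

A circular statistic is used here because dihedral angles wrap around at ±180°, so an ordinary standard deviation would overstate the spread of angles such as 179° and -179°.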