IRIS: A method for predicting in vivo RNA secondary structures using PARIS data
Jianyu Zhou, Pan Li, Wanwen Zeng, Wenxiu Ma, Zhipeng Lu, Rui Jiang, Qiangfeng Cliff Zhang, Tao Jiang
Background: RNA secondary structures play a pivotal role in posttranscriptional regulation and in the functions of non-coding RNAs, yet in vivo RNA secondary structures remain poorly understood. PARIS (Psoralen Analysis of RNA Interactions and Structures) is a recently developed high-throughput sequencing-based approach that enables direct capture of RNA duplex structures in vivo. However, PARIS data contain incompatible and fuzzy pairing information, which hinders their integration with existing tools for reconstructing RNA secondary structure models at single-base resolution.
Methods: We introduce IRIS, a method for predicting RNA secondary structure ensembles based on PARIS data. IRIS generates a large set of candidate RNA secondary structure models under the guidance of redistributed PARIS reads and then uses a Bayesian model to identify the optimal ensemble, according to both thermodynamic principles and PARIS data.
Results: The RNA structure ensembles predicted by IRIS were verified using evolutionary conservation information and their consistency with other experimental RNA structural data. IRIS is implemented in Python and is freely available at http://iris.zhanglab.net.
Conclusion: IRIS capitalizes on PARIS data to improve the prediction of in vivo RNA secondary structure ensembles. We expect that IRIS will broaden the application of the PARIS technology and provide further insight into in vivo RNA secondary structures.
Keywords: RNA secondary structure; PARIS data; in vivo; structure ensembles; incompatible reads