Predicting enhancer-promoter interaction from genomic sequence with deep neural networks

Shashank Singh, Yang Yang, Barnabás Póczos, Jian Ma

Quant. Biol., 2019, Vol. 7, Issue 2: 122-137. DOI: 10.1007/s40484-019-0154-0

RESEARCH ARTICLE


Abstract

Background: In the human genome, distal enhancers are involved in regulating target genes through proximal promoters by forming enhancer-promoter interactions. Although recently developed high-throughput experimental approaches have allowed us to recognize potential enhancer-promoter interactions genome-wide, it is still largely unclear to what extent the sequence-level information encoded in our genome helps guide such interactions.

Methods: Here we report a new computational method (named “SPEID”) using deep learning models to predict enhancer-promoter interactions based on sequence-based features only, when the locations of putative enhancers and promoters in a particular cell type are given.

Results: Our results across six different cell types demonstrate that SPEID is effective in predicting enhancer-promoter interactions compared with state-of-the-art methods that only use information from a single cell type. As a proof of principle, we also applied SPEID to identify somatic non-coding mutations in melanoma samples that may have reduced enhancer-promoter interactions in tumor genomes.

Conclusions: This work demonstrates that deep learning models can help reveal that sequence-based features alone are sufficient to reliably predict enhancer-promoter interactions genome-wide.
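The abstract states that SPEID works from sequence alone once the enhancer and promoter locations are given, but it does not describe the input representation. As a minimal sketch only, assuming the one-hot encoding that sequence-based deep learning models commonly use, the snippet below turns one candidate enhancer-promoter pair into two fixed-size matrices. The function names, the padding scheme, and the toy lengths are hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: one-hot encoding is an assumed input format,
# and every name and length below is hypothetical, not SPEID's actual code.
ALPHABET = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows (A, C, G, T).

    Bases outside the alphabet, such as N, become all-zero rows.
    """
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    return rows


def pad_or_trim(seq, length):
    """Force a fixed length by trimming the right end or padding with N."""
    return seq[:length].ljust(length, "N")


def encode_pair(enhancer, promoter, enh_len=8, prom_len=6):
    """Return fixed-size one-hot matrices for one enhancer-promoter pair."""
    enh = one_hot(pad_or_trim(enhancer, enh_len))
    prom = one_hot(pad_or_trim(promoter, prom_len))
    return enh, prom


# Toy sequences; real regulatory elements are far longer.
enh_matrix, prom_matrix = encode_pair("ACGTN", "ttgacagg")
```

In a setup like this, the two matrices would feed separate input branches of a network whose output is the predicted probability that the pair interacts. Padding with N keeps every example the same shape without adding sequence signal.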

Keywords

chromatin interaction / enhancer-promoter interaction / deep neural network

Cite this article

Shashank Singh, Yang Yang, Barnabás Póczos, Jian Ma. Predicting enhancer-promoter interaction from genomic sequence with deep neural networks. Quant. Biol., 2019, 7(2): 122-137. DOI: 10.1007/s40484-019-0154-0



RIGHTS & PERMISSIONS

Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature
