Predicting enhancer-promoter interaction from genomic sequence with deep neural networks

Shashank Singh, Yang Yang, Barnabás Póczos, Jian Ma

Quant. Biol., 2019, Vol. 7, Issue 2: 122-137. DOI: 10.1007/s40484-019-0154-0
RESEARCH ARTICLE

Abstract

Background: In the human genome, distal enhancers are involved in regulating target genes through proximal promoters by forming enhancer-promoter interactions. Although recently developed high-throughput experimental approaches have allowed us to recognize potential enhancer-promoter interactions genome-wide, it is still largely unclear to what extent the sequence-level information encoded in our genome helps guide such interactions.

Methods: Here we report a new computational method (named “SPEID”) using deep learning models to predict enhancer-promoter interactions based on sequence-based features only, when the locations of putative enhancers and promoters in a particular cell type are given.

Results: Our results across six different cell types demonstrate that SPEID is effective in predicting enhancer-promoter interactions as compared to state-of-the-art methods that only use information from a single cell type. As a proof-of-principle, we also applied SPEID to identify somatic non-coding mutations in melanoma samples that may have reduced enhancer-promoter interactions in tumor genomes.

Conclusions: This work demonstrates that deep learning models can help reveal that sequence-based features alone are sufficient to reliably predict enhancer-promoter interactions genome-wide.
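
To make the idea of "sequence-based features only" concrete, the following is a minimal sketch of how an enhancer sequence and a promoter sequence could be turned into numeric inputs for a neural network. The abstract does not specify the encoding, so the one-hot scheme here is an assumption (it is a common choice for sequence-based deep learning), and the names `one_hot` and `encode_pair` are hypothetical helpers for illustration, not part of SPEID.

```python
# Illustrative sketch only: one-hot encoding is an assumed input format,
# and these helper names are hypothetical (not taken from the article).
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows, one row per base.

    Columns follow the order A, C, G, T. Bases outside ACGT (for example N)
    are encoded as an all-zero row.
    """
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        idx = BASES.find(base)
        if idx >= 0:
            row[idx] = 1
        rows.append(row)
    return rows


def encode_pair(enhancer_seq, promoter_seq):
    """Encode an enhancer/promoter pair as two separate one-hot matrices."""
    return one_hot(enhancer_seq), one_hot(promoter_seq)


# Toy sequences; real enhancer and promoter regions are far longer.
enh, prom = encode_pair("ACGTN", "ttga")
```

A model of the kind described above would take the two encoded matrices as its inputs and output a score for whether the pair interacts; the network itself is not sketched here because its architecture is not described in the abstract.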

Keywords

chromatin interaction / enhancer-promoter interaction / deep neural network

Cite this article

Shashank Singh, Yang Yang, Barnabás Póczos, Jian Ma. Predicting enhancer-promoter interaction from genomic sequence with deep neural networks. Quant. Biol., 2019, 7(2): 122‒137 https://doi.org/10.1007/s40484-019-0154-0

SUPPLEMENTARY MATERIALS

The supplementary materials can be found online with this article at https://doi.org/10.1007/s40484-019-0154-0.

ACKNOWLEDGEMENTS

We thank the members of the Ma lab, especially Yang Zhang, Yuchuan Wang, Ruochi Zhang, and Dechao Tian, for helpful discussions. We also thank Yihang Shen for technical assistance. This work was supported in part by the National Science Foundation (1252522 to Shashank Singh, 1054309 and 1262575 to Jian Ma) and the National Institutes of Health (HG007352 and DK107965 to Jian Ma).

COMPLIANCE WITH ETHICS GUIDELINES

The authors Shashank Singh, Yang Yang, Barnabás Póczos and Jian Ma declare that they have no conflicts of interest.
This article does not contain any studies with human or animal subjects performed by any of the authors.

RIGHTS & PERMISSIONS

2019 Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature