Transformer-based DNA methylation detection on ionic signals from Oxford Nanopore sequencing data

Xiuquan Wang , Mian Umair Ahsan , Yunyun Zhou , Kai Wang

Quant. Biol., 2023, Vol. 11, Issue (3): 287–296. DOI: 10.15302/J-QB-022-0323

RESEARCH ARTICLE

Abstract

Background: Oxford Nanopore long-read sequencing technology addresses limitations for DNA methylation detection that are inherent in short-read bisulfite sequencing and methylation microarrays. A number of analytical tools, such as Nanopolish, Guppy/Tombo and DeepMod, have been developed to detect DNA methylation on Nanopore data. However, additional improvements can be made in computational efficiency, prediction accuracy, and contextual interpretation in complex genomic regions (such as repetitive regions and regions of low GC density).

Method: In the current study, we apply the Transformer architecture to detect DNA methylation from ionic signals in Oxford Nanopore sequencing data. The Transformer is a neural network architecture that adopts self-attention and has been widely used in natural language processing.

Results: Compared to traditional deep-learning methods such as convolutional neural networks (CNN) and recurrent neural networks (RNN), Transformers may have specific advantages in DNA methylation detection, because the self-attention mechanism can help detect relationships between bases that are far apart and pay more attention to important bases that carry characteristic methylation-specific signals within a specific sequence context.

Conclusion: We demonstrated the ability of Transformers to detect methylation on ionic signal data.

Graphical abstract

Keywords

Nanopore / long-read sequencing / deep learning / Transformer model / DNA methylation

Cite this article

Xiuquan Wang, Mian Umair Ahsan, Yunyun Zhou, Kai Wang. Transformer-based DNA methylation detection on ionic signals from Oxford Nanopore sequencing data. Quant. Biol., 2023, 11(3): 287-296 DOI:10.15302/J-QB-022-0323


1 INTRODUCTION

The complexity of the human genome and transcriptome lies not only in the composition of 3 billion base pairs, but also in the chemical modifications that make it interpretable to enzymes (writers, erasers, readers) through epigenetic regulation [1]. Genome-wide epigenetic change, such as DNA 5-methylcytosine (5mC), is a hallmark of cancer [2–4]. It is also widely known that 5mC methylation plays important roles in brain development and function [5–9]. In addition to being a diagnostic biomarker in many diseases, DNA methylation is now a therapeutic target for cancer, with several drugs being tested or approved by the US Food and Drug Administration [10,11]. For example, 5-Aza-2′-deoxycytidine was among the first methylation inhibitors used in cancer clinical trials [12], and we demonstrated that, in addition to de-methylation, it leads to isoform switching and exon skipping in genes such as EZH2 [13]. Similarly, anti-psychotic treatments have been linked to alterations of DNA methylation [14], suggesting the potential of differential DNA methylation profiles as predictors of antipsychotic response.

Existing technologies, such as whole-genome bisulfite sequencing and PacBio long-read sequencing, have inherent limitations for detecting DNA modifications, including the inability to detect modifications in complex repetitive regions, biases from incomplete and context-dependent enzymatic conversion, and low signal-to-noise ratios [15,16]. Instead, Oxford Nanopore Technologies (ONT) long-read sequencing, which measures ionic current signals as DNA molecules translocate through pores, may be a better option for the detection of DNA methylation. Recently, several analytic tools have been developed for DNA methylation detection on ONT long-read sequencing data. These methods can be generally classified into two types. One type compares raw signals of methylated DNA copies with signals of un-methylated DNA copies at specific genomic positions, as done by Tombo/Nanoraw and NanoMod [17]. However, this approach requires that both methylated and un-methylated samples are available at the same time. The other type directly calls DNA modifications from ONT raw signals using machine learning approaches, as in Nanopolish [18], Megalodon [19], DeepSignal [20], Guppy [21], METEORE [22], and DeepMod [23]. For example, METEORE uses random forest and multiple linear regression models, Nanopolish uses a hidden Markov model, and DeepSignal uses a convolutional neural network (CNN).

We previously developed DeepMod [23], which adopts a recurrent neural network (RNN) with a bidirectional LSTM architecture to detect DNA methylation from ionic signal data generated by ONT. We also recently released DeepMod2, which can handle move tables generated by Guppy-basecalled or Tombo re-squiggled data, since DeepMod requires the event tables used in older generations of data. In DeepMod, raw signals of each read are first translated into nucleotide sequences (basecalling). Signals are then aligned to the corresponding reference nucleotides. After that, the target motif (e.g., CpG) and its signals in a window of fixed length are transformed into event-based features that serve as the input of methylation callers. Typical event-based features include the signal mean, signal standard deviation, event length, and nucleotide information, which encodes the reference base as a one-hot encoding of the ACGT bases. The LSTM model reads the given nucleotide events sequentially in both directions (left-to-right and right-to-left). With DeepMod, we have released several pre-trained models built from different types of datasets, including Escherichia coli, Chlamydomonas reinhardtii and human samples, and demonstrated that it achieves good performance on these datasets. We note that Liu et al. comprehensively summarized current methods and compared DeepMod with other tools such as Nanopolish and Tombo for human DNA methylation detection [24]. However, DeepMod's DNA methylation detection was based on older basecallers (such as Metrichor), whereas the other tools were based on Guppy-basecalled data, so the results are not directly comparable; DeepMod2 should be used in this case instead.

Although DeepMod achieves good performance on different types of datasets according to several studies [22,25–27], we believe that the base modification problem is fundamentally a language translation problem and can be further improved. Since Transformers have been shown to perform better than RNNs and CNNs in language modeling, here we leverage Transformers to identify modified DNA bases from signal data. Unlike LSTMs, Transformer algorithms, such as BERT [28], do not necessarily process the input data in sequential order. Indeed, using DeepMod's signal pre-processing features, Zhang et al. proposed MethBERT [25], which utilizes a refined BERT method to detect DNA modification on ONT long-read sequencing data.

Transformers adopt the mechanism of self-attention, differentially weighting the significance of each part of the input data. The essence of the self-attention mechanism is to extract the most useful information from many pieces of information. Much like the human visual system, which scans the entire field of view, quickly locates the area of interest, and then devotes more attention to that area to obtain detailed information about the target while ignoring less useful information, self-attention is a strategy for using limited attention resources to quickly focus on high-value information within a large amount of input. Multi-head attention allows the model to jointly attend to information from different representation subspaces at different positions. The Transformer's self-attention mechanism greatly improves the efficiency and accuracy of information processing.
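To make the self-attention computation concrete, the following minimal PyTorch sketch (illustrative only, not the implementation used in this study) shows scaled dot-product attention over a window of event vectors, together with the stock multi-head attention module; the tensor sizes (21 events, 16 features, 2 heads) simply mirror the settings discussed later.

```python
import torch
import torch.nn.functional as F

def self_attention(x):
    # x: (batch, window_len, d_model); every event attends to every other event
    d_model = x.size(-1)
    q, k, v = x, x, x                                   # learned projections omitted
    scores = q @ k.transpose(-2, -1) / d_model ** 0.5   # pairwise event similarities
    weights = F.softmax(scores, dim=-1)                 # attention weights per event
    return weights @ v                                  # contextualized representations

# Multi-head attention as provided by PyTorch (e.g., 2 heads over 16-dim event vectors)
mha = torch.nn.MultiheadAttention(embed_dim=16, num_heads=2, batch_first=True)
events = torch.randn(8, 21, 16)                         # 8 windows of 21 events
out, attn_weights = mha(events, events, events)
```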

In this project, we explored the use of a Transformer-based BERT to further improve methylation detection, and performed several experiments to answer the following questions: (1) how the signal distribution differs in windows of varying sizes (such as 21 bp) surrounding methylated versus unmethylated cytosines, or in regions with enriched methylation (such as CpG islands) versus isolated methylation sites; (2) how the contextual signal changes with the length of the event window; (3) whether the number of heads in the Transformer model influences the accuracy of methylation detection.

2 RESULTS

2.1 Comparison of signal features for methylated and unmethylated sites in human genome

The general framework of our method is shown in Fig.1. Our method can directly detect DNA modification from ionic signals in ONT long-read sequencing data based on the signal difference between methylated and unmethylated DNA. As a visual demonstration, Fig.2 shows the raw contextual signal observed over a 21-bp window at methylated and unmethylated locations for the CpG site at chr2:159219855 (GRCh38) in NA12878. Fig.2 shows that the mean signal differs between methylated and un-methylated locations, and that the CpG signal is influenced by its surrounding nucleotides.

2.2 Fine-tuning of model parameters

To search for the best hyperparameters and optimize the model performance, we fine-tuned the number of attention heads, the window size of events, and the feature size of each event. Results of these hyperparameter experiments on the human genome NA12878 are shown in Fig.3. Detailed explanations of these experiments are given below.

2.2.1 Performance with different position embedding (PE) strategies: summation vs. concatenation

The first experiment checked the performance of the model with different position embedding methods. We compared two ways of combining the data feature vector with the position embedding, i.e., summation vs. concatenation, and ran 50 epochs on our validation data. We found that the concatenation method performed much better than summation (Fig.3).
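The shape consequence of the two strategies can be sketched as follows (a hedged PyTorch illustration with placeholder PE values, not our training code): summation requires the PE to match the 7 event features, whereas concatenation appends extra PE dimensions to each event vector.

```python
import torch

events = torch.randn(21, 7)                    # a window of 21 events, 7 features each

# Summation: the PE must have the same width as the event vector, so d_model stays 7
pe_sum = torch.randn(21, 7)                    # placeholder PE of matching width
x_sum = events + pe_sum                        # shape (21, 7)

# Concatenation: extra PE dimensions are appended, e.g., 9 more, giving d_model = 16
pe_cat = torch.randn(21, 9)                    # placeholder PE of width 9
x_cat = torch.cat([events, pe_cat], dim=-1)    # shape (21, 16)
```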

2.2.2 Performance with different numbers of attention heads

We chose three attention head counts (2, 4, and 8) to evaluate our model's performance (Fig.3). Here, we set the d_model size equal to 16. We found that with 2 attention heads, our model has the best performance. This is probably because 16 features represent a relatively small feature size; adding more attention heads may result in overfitting.

2.2.3 Performance with different sizes of d_model

As mentioned above, when we implemented the PE method, we found that concatenating the PE vector with the event vector performs better than direct summation. However, the difference between these two methods may also be influenced by the size of d_model, i.e., the feature size of each event vector. Therefore, we checked how the feature size of each event affects the performance of our model. We started from a feature size of 16 (a 9-feature position embedding concatenated to the original 7-feature event) and increased the position embedding size by 16 each time. We found that the performance improved significantly during the first 15 epochs and then converged to similar values at 50 epochs (Fig.3).

2.2.4 Performance with different window sizes

The last experiment checked the model performance when more neighboring events are added on both sides of the event of interest, i.e., providing more contextual content by changing the window size. We found that 21 is the best window size for our model (Fig.3).

After fine-tuning the parameters to optimize the performance of the model, we suggest the parameter values listed in Tab.1.

2.3 Comparison of embedding patterns before and after concatenating PE

To assess how concatenating PE vectors influences the performance, we examined the embedding patterns before and after PE concatenation. When we chose an embedding vector size of 8 (Fig.4), the pattern of the embedding was not clear: after concatenating this embedding vector to the event features, the order of the events and the contextual relationship were not clearly visible. When we raised the embedding dimension to 48, we could easily see an increasing order of events from the pattern in the hidden dimension (Fig.4). Once the hidden dimension is large enough to show the order of events, increasing the dimension further does not improve the model performance.

2.4 Performance comparison across human genome and bacteria species

We also performed cross-species evaluation, testing our model on the human genome NA12878 and E. coli by training on one genome and testing on the other. Fig.5 shows the ROC curves for 5mC detection on both species.

As we can see from Fig.5, our method achieves good performance on both E. coli (AUC = 0.99) and NA12878 (AUC = 0.95). We further compared the performance with DeepMod and DeepSignal, and found that DeepMod achieves higher F1 scores on both the human and bacterial genomes than the other two methods, as shown in Tab.2.

3 CONCLUSION AND DISCUSSION

In the current study, we propose a Transformer-based method for detecting DNA methylation from ionic signal data generated by Nanopore sequencing. We began with the hypothesis that the performance of 5mC prediction may be improved by a more sophisticated deep neural network, the Transformer. By modifying the encoding part of the Transformer, we can capture the differences in properties between methylated and un-methylated bases. Our results demonstrate preliminary success of Transformers in detecting methylation, but the current model does not yet perform optimally (for example, DeepMod outperformed the current Transformer model). We stress that this is an exploratory study to see whether Transformers may work for methylation detection, since the MethBERT study did not comprehensively evaluate different model architectures (such as position embedding strategies).

We believe that DNA methylation detection closely resembles language translation because of both short-distance and long-distance dependencies. When a DNA strand translocates through the nanopore, approximately 7 nucleotides are covered within the pore, so they have the greatest contribution to the signal patterns. However, adjacent nucleotides before or after translocation, as well as the sequence context (such as the formation of secondary structure or location within a specific sequence motif), also determine the signal patterns in Nanopore sequencing. Therefore, such relationships closely resemble language translation, which we believe can be addressed by employing methods used in NLP tasks.

In the past few years, the field of NLP has been revolutionized by the use of various Transformer models. Transformers, such as BERT, are able to achieve better performance than RNNs [29] and enable parallel computing with faster running speed. Parallel processing is particularly useful for processing Nanopore signals, because it can speed up the prediction of modifications on large datasets from PromethION flowcells (>1 TB/flowcell). Additionally, self-attention in Transformers can efficiently capture long-range dependencies, a critical issue that RNNs may not address well.

To the best of our knowledge, only one study (MethBERT) has used a BERT model to detect DNA methylation from ONT data. MethBERT uses the DeepMod framework to perform the same pre-processing of the raw electrical signal data and then uses a refined BERT model as the core (instead of an LSTM) to detect DNA methylation. The refined BERT uses a learnable PE and relative position representation. The learnable PE treats the positional embedding vectors as parameters, which are updated during the learning process. Their experiments show that the refined BERT can achieve competitive and even better results than the state-of-the-art bidirectional recurrent neural network (bi-RNN) model on a set of 5mC and 6mA benchmark datasets, while model inference is about 6x faster.

Similar to DeepMod's bi-RNN approach, Transformers also implement two-way signal processing, but are more amenable to parallelization. Based on our experiments, the Transformer-based method did not significantly outperform DeepMod's LSTM method on either the human or the bacterial genome. We note that the performance of a model depends on the task and the data, and these factors should be taken into consideration when choosing a model. Additional improvements in this field require innovations that directly assay signals (without basecalling), rather than two-step deep-learning models. In the future, we plan to improve the model from the following perspectives: (1) computational speed; (2) how we embed the signal; and (3) generating more training samples.

4 METHODS

We developed a Transformer-based method for the detection of DNA modifications from Nanopore sequencing data. The Nanopore sequencing datasets we used for training and testing 5mC detection include E. coli data and the human genome NA12878 sequenced by Simpson et al. [18]. The human genome NA12878 has been well studied with various sequencing data, including Nanopore, PacBio, Illumina bisulfite sequencing with two replicates, and methylation microarrays.

As shown in Fig.1, the framework of our method consists of four steps: anchoring signals to reference positions after read alignment, feature generation, modification prediction via the Transformer model, and genome-scale modification summary.

Step 1: The input includes a reference genome and FAST5 files containing raw signals and events (nucleotides A, C, G or T), which were generated by Nanopore sequencers with base calls. Each event is encoded in one-hot form in the order (A, C, G, T) and represented by a 4-feature vector (e.g., C = $\langle 0, 1, 0, 0 \rangle$). In this 4-feature vector, 1 indicates that the mapped reference base is the specific nucleotide type (here C), whereas 0 means otherwise. Raw signals for all aligned bases in a long read were normalized and rescaled to the range from −5 to 5. Then the signal mean, standard deviation, and the number of signals associated with an event were extracted, yielding the 7-feature vector $x_i = \langle f_m, f_{sd}, f_l, f_A, f_C, f_G, f_T \rangle$.
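As a hedged illustration of this featurization (the function and variable names below are ours, not part of any released code), the 7-feature vector for one reference position can be assembled as:

```python
import numpy as np

BASES = "ACGT"

def event_features(ref_base, signals):
    # signals: raw current values assigned to this reference position, already
    # normalized and rescaled to the range [-5, 5]
    signals = np.asarray(signals, dtype=float)
    one_hot = [1.0 if ref_base == b else 0.0 for b in BASES]   # one-hot ACGT encoding
    return np.array([signals.mean(), signals.std(), len(signals), *one_hot])

x_i = event_features("C", [-0.8, -0.5, -0.6, -0.7])
# -> [f_mean, f_sd, f_len, f_A, f_C, f_G, f_T]
```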

The self-attention heads (attention modules) can learn contextual relations between nucleotides. Therefore, for the event of interest $x_i$, we take its $w/2$ upstream events and $w/2$ downstream events into consideration as context, i.e.,

$$x = \langle x_{i-w/2}, \ldots, x_i, \ldots, x_{i+w/2} \rangle$$

By default, w = 21, but this is a parameter that can be changed by users.
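A minimal sketch of this windowing step is shown below; how events near the ends of a read are padded is our assumption, since it is not specified above.

```python
import torch

def context_window(events, i, w=21):
    # events: (num_events, 7) per-event feature matrix for one long read
    half = w // 2
    lo, hi = max(0, i - half), min(len(events), i + half + 1)
    window = events[lo:hi]
    # pad with zeros at read ends so every window has exactly w events (an assumption)
    pad_left = torch.zeros(max(0, half - i), events.size(1))
    pad_right = torch.zeros(w - window.size(0) - pad_left.size(0), events.size(1))
    return torch.cat([pad_left, window, pad_right], dim=0)    # shape (w, 7)

read_events = torch.randn(500, 7)          # 500 events from one long read
x = context_window(read_events, i=250)     # (21, 7) window centred on event 250
```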

The problem with the event sequence x is that it only records the base type and some basic statistical information for each event, but not the position of these events in the sequence. When the same event appears at different positions in the sequence, its function or signal characteristics may be completely different. Therefore, we should also record the position information of each event in the sequence, which is the goal of position embedding.

Step 2: Position embedding was first proposed by Vaswani et al. in 2017 [29]. It was originally used in NLP, added after the word vector layer to supplement position information. In our project, we compared two embedding methods, summation vs. concatenation, and chose the concatenation method based on the performance in our experiments. That is, to capture the position information we let $x_i = \langle f_m, f_{sd}, f_l, f_A, f_C, f_G, f_T, f_{pe_1}, \ldots, f_{pe_n} \rangle$, where $f_{pe_j}, j = 1, \ldots, n$ are position features created with the sinusoidal PE (Eq. (1))

$$x_{(i,2j)} = \sin\!\left(i \cdot w_0^{2j/d_{mod}}\right), \quad x_{(i,2j+1)} = \cos\!\left(i \cdot w_0^{2j/d_{mod}}\right), \qquad (1)$$

where $i$ is the position of the event and $w_0 = 1/10{,}000$ is the minimum frequency of the embedding. $d_{mod} = n + 7$, the dimension of $x_i$ (also known as the dimension of the model), must be an even integer and a multiple of the number of attention heads.
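A possible implementation of Eq. (1) and the concatenation step is sketched below in PyTorch; the exact way the n PE features are taken from the sinusoid is our assumption rather than the paper's released code.

```python
import torch

def sinusoidal_pe(length, n_pe, d_mod, w0=1.0 / 10000):
    # Even PE columns use sin and odd columns cos; each pair of columns shares the
    # frequency w0 ** (2j / d_mod). d_mod = n_pe + 7 must be even and divisible by
    # the number of attention heads.
    pos = torch.arange(length, dtype=torch.float32).unsqueeze(1)   # event positions i
    j = torch.arange(n_pe)
    freq = w0 ** (2 * (j // 2).float() / d_mod)
    angle = pos * freq                                             # (length, n_pe)
    return torch.where(j % 2 == 0, torch.sin(angle), torch.cos(angle))

w, n_pe = 21, 9                                # 21-event window, 9 PE features
events = torch.randn(w, 7)                     # 7 signal/base features per event
pe = sinusoidal_pe(w, n_pe, d_mod=n_pe + 7)
x = torch.cat([events, pe], dim=-1)            # concatenated input, shape (21, 16)
```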

This feature-based event vector of size $[w \times (n+7)]$ was used as the input of the following Transformer encoder module to predict whether the center event of the $w$ events is generated from a modified base.

Step 3: The Transformer encoder module follows the overall encoder layer of the Transformer architecture, using stacked self-attention and point-wise, fully connected layers to capture the complicated relationship between the signals and the modification prediction target.

Given a long read with sequential events, an event of interest $x_i$ with dimension $(n+7)$ and its neighborhood $\langle x_{i-w/2}, \ldots, x_i, \ldots, x_{i+w/2} \rangle$ are used as the input of this Transformer encoder. We then pass these vectors to the self-attention layer. Self-attention uses the attention mechanism to calculate the association between each event and all other events to obtain attention scores. Using these attention scores, a weighted representation can be obtained, which is an equivalent list of $(n+7)$-dimensional vectors; it is then fed into the feedforward neural network to obtain a new representation that takes contextual information into account. The output of the feedforward neural network is also a list of $(n+7)$-dimensional vectors, which is passed up to the next encoder layer. This process can be repeated for the events of interest in a long read, and then for all long reads available for analysis. Each encoder layer has two sub-layers: the first is a multi-head self-attention mechanism, and the second is a simple, position-wise fully connected feedforward network.
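For illustration, such an encoder stage can be assembled from stock PyTorch Transformer encoder layers as sketched below; the layer count and feedforward size are assumptions, not the exact settings of our model.

```python
import torch
import torch.nn as nn

d_model, n_heads, n_layers, window = 16, 2, 2, 21
layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads,
                                   dim_feedforward=64, batch_first=True)
encoder = nn.TransformerEncoder(layer, num_layers=n_layers)   # stacked encoder layers

x = torch.randn(32, window, d_model)     # 32 windows of 21 events, 16 features each
h = encoder(x)                           # same shape: contextualized event vectors
```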

Step 4: The output of the Transformer encoder is a list of vectors of floats. The modification status of the event of interest is predicted by the final fully connected network, ReLU activation function, and softmax layer. Suppose our model is trained on a total of 10,000 events of interest from the training dataset. The corresponding output vectors from the Transformer encoder are sent to the final fully connected layer to generate a vector with 10,000 dimensions, each representing the score of one event of interest. After the linear layer, a softmax layer converts these scores into probabilities. We then call events of interest with probability higher than 0.5 as methylated nucleotides and events with probability equal to or lower than 0.5 as unmethylated nucleotides, and use these predictions as the output for downstream analysis.
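A hedged sketch of one possible prediction head consistent with this description is given below; the hidden layer size and the per-event two-class softmax are our reading of Step 4, not the released code.

```python
import torch
import torch.nn as nn

d_model, window = 16, 21
head = nn.Sequential(
    nn.Flatten(),                        # (batch, 21, 16) -> (batch, 336)
    nn.Linear(window * d_model, 64),     # hidden size 64 is an arbitrary assumption
    nn.ReLU(),
    nn.Linear(64, 2),                    # scores for unmethylated vs. methylated
)

h = torch.randn(100, window, d_model)             # encoder output for 100 events of interest
probs = torch.softmax(head(h), dim=-1)[:, 1]      # P(methylated) for each event
calls = probs > 0.5                               # True = methylated, False = unmethylated
```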

4.1 Training and testing process

We train the model on the E. coli datasets and then test it on the human genome, in order to demonstrate the feasibility of applying our method across species, as we did for DeepMod. In detail, given a training set $D = \{(x_i, y_i) \mid x_i \in \mathbb{R}^{d_{mod}}, y_i \in \{0, 1\}\}_{i=1}^{N_{total}}$, we let $y_i = 1$ for methylated bases and $y_i = 0$ for un-methylated bases. For each input event $x_i$, the model outputs a predicted value $H(x_i)$. Comparing the predicted value with the actual label $y_i$ gives the following four situations:

• the prediction is positive, and the actual is also positive, we call it true positive (TP),

• the prediction is positive, the actual is negative, we call it false positive (FP),

• the prediction is negative and the actual is positive, which is called false negative (FN).

• the prediction is negative and the actual is also negative, which is called true negative (TN).

If, in a test set, the number of completely modified bases is P and the number of completely unmodified bases (or motifs of interest) is N, then Accuracy, Precision, Recall, F1 and AUC are used to evaluate the performance:

$$\mathrm{Accuracy} = \frac{TP + TN}{P + N}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP}$$

$$\mathrm{Recall} = \frac{TP}{P}$$

$$F1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
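These metrics can be computed directly from the confusion-matrix counts, as in the short NumPy sketch below (the toy labels and probabilities are illustrative only); AUC is computed from the predicted probabilities rather than the thresholded calls.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([1, 1, 0, 0, 1, 0, 1, 0])            # 1 = methylated, 0 = unmethylated
y_pred = np.array([1, 0, 0, 1, 1, 0, 1, 0])            # thresholded calls
y_prob = np.array([0.9, 0.4, 0.2, 0.7, 0.8, 0.1, 0.95, 0.3])   # predicted probabilities

TP = int(np.sum((y_pred == 1) & (y_true == 1)))
FP = int(np.sum((y_pred == 1) & (y_true == 0)))
FN = int(np.sum((y_pred == 0) & (y_true == 1)))
TN = int(np.sum((y_pred == 0) & (y_true == 0)))
P, N = TP + FN, TN + FP                                 # totals of positive/negative sites

accuracy = (TP + TN) / (P + N)
precision = TP / (TP + FP)
recall = TP / P
f1 = 2 * precision * recall / (precision + recall)
auc = roc_auc_score(y_true, y_prob)                     # AUC uses probabilities, not calls
```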

References

[1]

Bell,J. (2021). Genetic impacts on DNA methylation: research findings and future perspectives. Genome Biol., 22: 127

[2]

Kulis,M. (2010). DNA methylation and cancer. Adv. Genet., 70: 27–56

[3]

Jin,B. Robertson,K. (2013). DNA methyltransferases, DNA damage repair, and cancer. Adv. Exp. Med. Biol., 754: 3–29

[4]

Bernstein,C., Nfonsam,V., Prasad,A. R. (2013). Epigenetic field defects in progression to cancer. World J. Gastrointest. Oncol., 5: 43–49

[5]

Martínez-Iglesias, O., Carrera, I., Carril, J. C., Fernández-Novoa, L., Cacabelos, N. (2020). DNA methylation in neurodegenerative and cerebrovascular disorders. Int. J. Mol. Sci., 21: 2220

[6]

Jeong,H., Mendizabal,I., Berto,S., Chatterjee,P., Layman,T., Usui,N., Toriumi,K., Douglas,C., Singh,D., Huh,I. . (2021). Evolution of DNA methylation in the human brain. Nat. Commun., 12: 2021

[7]

Jobe,E. M. (2017). DNA methylation and adult neurogenesis. Brain Plast., 3: 5–26

[8]

Tognini,P., Napoli,D. (2015). Dynamic DNA methylation in the brain: a new epigenetic mark for experience-dependent plasticity. Front. Cell. Neurosci., 9: 331

[9]

McCoy,C. R., Glover,M. E., Flynn,L. T., Simmons,R. K., Cohen,J. L., Ptacek,T., Lefkowitz,E. J., Jackson,N. L., Akil,H., Wu,X. . (2019). Altered DNA methylation in the developing brains of rats genetically prone to high versus low anxiety. J. Neurosci., 39: 3144–3158

[10]

Jones,P. A., Issa,J. P. (2016). Targeting the cancer epigenome for therapy. Nat. Rev. Genet., 17: 630–641

[11]

Mani,S. (2010). DNA demethylating agents and epigenetic therapy of cancer. Adv. Genet., 70: 327–340

[12]

Issa,J. P., Garcia-Manero,G., Giles,F. J., Mannari,R., Thomas,D., Faderl,S., Bayar,E., Lyons,J., Rosenfeld,C. S., Cortes,J. . (2004). Phase 1 study of low-dose prolonged exposure schedules of the hypomethylating agent 5-aza-2′-deoxycytidine (decitabine) in hematopoietic malignancies. Blood, 103: 1635–1640

[13]

Ding,X. L., Yang,X., Liang,G. (2016). Isoform switching and exon skipping induced by the DNA methylation inhibitor 5-Aza-2′-deoxycytidine. Sci. Rep., 6: 24545

[14]

Ovenden,E. S., McGregor,N. W., Emsley,R. A. (2018). DNA methylation and antipsychotic treatment mechanisms in schizophrenia: progress and future directions. Prog. Neuropsychopharmacol. Biol. Psychiatry, 81: 38–49

[15]

Clark,T. A., Lu,X., Luong,K., Dai,Q., Boitano,M., Turner,S. W., He,C. (2013). Enhanced 5-methylcytosine detection in single-molecule, real-time sequencing via Tet1 oxidation. BMC Biol., 11: 4

[16]

Beaulaurier,J., Zhang,X. Zhu,S., Sebra,R., Rosenbluh,C., Deikus,G., Shen,N., Munera,D., Waldor,M. K., Chess,A. . (2015). Single molecule-level detection and long read-based phasing of epigenetic variations in bacterial methylomes. Nat. Commun., 6: 7438

[17]

Liu,Q., Georgieva,D. C., Egli,D. (2019). NanoMod: a computational tool to detect DNA modifications using Nanopore long-read sequencing data. BMC Genomics, 20: 78

[18]

Simpson,J. T., Workman,R. E., Zuzarte,P. C., David,M., Dursi,L. J. (2017). Detecting DNA cytosine methylation using Nanopore sequencing. Nat. Methods, 14: 407–410

[19]

Pimiento,C., Ehret,D. J., Macfadden,B. J. (2010). Ancient nursery area for the extinct giant shark megalodon from the Miocene of Panama. PLoS One, 5: e10552

[20]

Ni,P., Huang,N., Zhang,Z., Wang,D. Liang,F., Miao,Y., Xiao,C. Luo,F. (2019). DeepSignal: detecting DNA methylation state from Nanopore sequencing reads using deep-learning. Bioinformatics, 35: 4586–4595

[21]

Weirather,J. L., de Cesare,M., Wang,Y., Piazza,P., Sebastiano,V., Wang,X. Buck,D. Au,K. (2017). Comprehensive comparison of Pacific Biosciences and Oxford Nanopore Technologies and their applications to transcriptome analysis. F1000 Res., 6: 100

[22]

Yuen,Z. W. Srivastava,A., Daniel,R., McNevin,D., Jack,C. (2021). Systematic benchmarking of tools for CpG methylation detection from Nanopore sequencing. Nat. Commun., 12: 3438

[23]

Liu,Q., Fang,L., Yu,G., Wang,D., Xiao,C. (2019). Detection of DNA base modifications by deep recurrent neural network on Oxford Nanopore sequencing data. Nat. Commun., 10: 2449

[24]

Liu,Y., Rosikiewicz,W., Pan,Z., Jillette,N., Wang,P., Taghbalout,A., Foox,J., Mason,C., Carroll,M., Cheng,A. . (2021). DNA methylation-calling tools for Oxford Nanopore sequencing: a survey and human epigenome-wide evaluation. Genome Biol., 22: 295

[25]

Zhang, Y., Yamaguchi, K., Hatakeyama, S., Furukawa, Y., Miyano, S., Yamaguchi, R. (2021). On the application of BERT models for Nanopore methylation detection. In: 2021 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), pp. 320–327

[26]

Jiao,L., Zhang,F., Liu,F., Yang,S., Li,L., Feng,Z. (2019). A survey of deep learning-based object detection. IEEE Access, 7: 128837–128868

[27]

Amarasinghe,S. L., Su,S., Dong,X., Zappia,L., Ritchie,M. E. (2020). Opportunities and challenges in long-read sequencing data analysis. Genome Biol., 21: 30

[28]

Devlin, J., Chang, M., Lee, K. (2018). BERT: pre-training of deep bidirectional transformers for language understanding. arXiv, 1810.04805

[29]

Vaswani, A., Shazeer, N. M., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L. (2017). Attention is all you need. arXiv, 1706.03762

RIGHTS & PERMISSIONS

The Author(s). Published by Higher Education Press.
