WaveNano: a signal-level nanopore base-caller via simultaneous prediction of nucleotide labels and move labels through bi-directional WaveNets

Sheng Wang, Zhen Li, Yizhou Yu, Xin Gao

Quant. Biol. 2018, Vol. 6, Issue (4): 359-368. DOI: 10.1007/s40484-018-0155-4

METHODOLOGY ARTICLE

Abstract

Background: The Oxford MinION nanopore sequencer is an appealing third-generation genome sequencing device that is portable and no larger than a cellphone. Although MinION can sequence ultra-long reads in real time, the high error rate of existing base-calling methods, especially in the form of indels (insertions and deletions), prevents its use in a variety of applications.

Methods: In this paper, we show that such indel errors are largely due to the segmentation process applied to the input electrical current signal from MinION. All existing methods perform segmentation and nucleotide label prediction sequentially, so errors accumulated in the first step irreversibly affect the final base-calling. We further show that the indel problem can be significantly reduced by accurately labeling nucleotides and moves directly on the raw signal; both label types can then be learned efficiently and simultaneously by a bi-directional WaveNet model through feature sharing. Our bi-directional WaveNet model, built with residual blocks and skip connections, is able to capture extremely long-range dependencies in the raw signal. Taking the predicted moves as segmentation guidance, we employ Viterbi decoding to obtain the final base-calling results from the smoothed nucleotide probability matrix.

Results: Our proposed base-caller, WaveNano, achieves good performance on real MinION sequencing data from Lambda phage.

Conclusions: The signal-level nanopore base-caller WaveNano can obtain higher base-calling accuracy, and generate fewer insertions/deletions in the base-called sequences.
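
To make the role of the move labels in the Methods summary more concrete, the following is a minimal sketch of how per-sample nucleotide probabilities and per-sample move flags could be turned into a base sequence. It is a greedy, segment-wise stand-in and not the Viterbi decoding used by WaveNano. The function names (smooth, decode_with_moves), the moving-average smoothing, the half_window parameter, and the convention that a move flag of 1 marks the first sample of a new nucleotide are all assumptions made for illustration.

```python
from typing import List, Sequence

BASES = "ACGT"


def smooth(probs: Sequence[Sequence[float]], half_window: int = 1) -> List[List[float]]:
    """Moving-average smoothing of a (time x 4) probability matrix along time."""
    n = len(probs)
    out = []
    for t in range(n):
        lo, hi = max(0, t - half_window), min(n, t + half_window + 1)
        out.append([sum(probs[u][b] for u in range(lo, hi)) / (hi - lo)
                    for b in range(len(BASES))])
    return out


def decode_with_moves(probs: Sequence[Sequence[float]],
                      moves: Sequence[int],
                      half_window: int = 1) -> str:
    """Greedy segment-wise base calling guided by move flags.

    probs: per-sample probabilities over A, C, G, T (hypothetical network output).
    moves: per-sample 0/1 flags; 1 is assumed to mark the start of a new nucleotide.
    This is an illustrative simplification, not the paper's Viterbi decoder.
    """
    if len(probs) != len(moves):
        raise ValueError("probs and moves must have the same length")
    if not probs:
        return ""
    smoothed = smooth(probs, half_window)
    # The first sample always opens a segment; every later move flag opens another.
    starts = [0] + [t for t in range(1, len(moves)) if moves[t] == 1]
    ends = starts[1:] + [len(moves)]
    called = []
    for s, e in zip(starts, ends):
        # Sum smoothed probabilities over the segment and call the best-scoring base.
        totals = [sum(smoothed[t][b] for t in range(s, e)) for b in range(len(BASES))]
        called.append(BASES[totals.index(max(totals))])
    return "".join(called)


# Toy per-sample distributions (made-up numbers, not real MinION data).
A = [0.90, 0.05, 0.03, 0.02]
C = [0.05, 0.90, 0.03, 0.02]

print(decode_with_moves([A, A, A, C, C, C], [1, 0, 0, 1, 0, 0]))  # AC
print(decode_with_moves([A, A, A, A], [1, 0, 1, 0]))              # AA
```

The second call shows why move labels matter for indels: the nucleotide labels alone are identical across all four samples, so collapsing repeated labels would give a single A (a deletion), whereas the move flag at the third sample recovers the homopolymer AA.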

Keywords

nanopore sequencing / bi-directional WaveNets / base-calling / third generation sequencing / deep learning

Cite this article

Sheng Wang, Zhen Li, Yizhou Yu, Xin Gao. WaveNano: a signal-level nanopore base-caller via simultaneous prediction of nucleotide labels and move labels through bi-directional WaveNets. Quant. Biol., 2018, 6(4): 359-368. DOI: 10.1007/s40484-018-0155-4

RIGHTS & PERMISSIONS

Higher Education Press and Springer-Verlag GmbH Germany, part of Springer Nature
