Detecting differential expression from RNA-seq data with expression measurement uncertainty

Li ZHANG, Songcan CHEN, Xuejun LIU

PDF(709 KB)
PDF(709 KB)
Front. Comput. Sci. ›› 2015, Vol. 9 ›› Issue (4) : 652-663. DOI: 10.1007/s11704-015-4308-6
RESEARCH ARTICLE

Detecting differential expression from RNA-seq data with expression measurement uncertainty

Author information +
History +

Abstract

High-throughput RNA sequencing (RNA-seq) has emerged as a revolutionary and powerful technology for expression profiling. Most proposed methods for detecting differentially expressed (DE) genes from RNA-seq are based on statistics that compare normalized read counts between conditions. However, there are few methods considering the expression measurement uncertainty into DE detection. Moreover, most methods are only capable of detecting DE genes, and few methods are available for detecting DE isoforms. In this paper, a Bayesian framework (BDSeq) is proposed to detect DE genes and isoforms with consideration of expression measurement uncertainty. This expression measurement uncertainty provides useful information which can help to improve the performance of DE detection. Three real RAN-seq data sets are used to evaluate the performance of BDSeq and results show that the inclusion of expression measurement uncertainty improves accuracy in detection of DE genes and isoforms. Finally, we develop a GamSeq-BDSeq RNA-seq analysis pipeline to facilitate users.

Keywords

RNA-seq / Bayesian method / differentially expressed genes/isoforms / expression measurement uncertainty / analysis pipeline

Cite this article

Download citation ▾
Li ZHANG, Songcan CHEN, Xuejun LIU. Detecting differential expression from RNA-seq data with expression measurement uncertainty. Front. Comput. Sci., 2015, 9(4): 652‒663 https://doi.org/10.1007/s11704-015-4308-6

References

[1]
Mortazavi A, Williams A, McCue K, Schaeffer L, Wold B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods, 2008, 5(7): 621―628
CrossRef Google scholar
[2]
Marioni J, Mason C, Mane S, Stephens M, Gilad Y. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Research, 2008, 18: 1509―1517
CrossRef Google scholar
[3]
Marguerat S, Bähler J. RNA-seq: from technology to biology. Cellular and Molecular Life Sciences, 2010, 67(4): 569―579
CrossRef Google scholar
[4]
Rapaport F, Khanin R, Liang Y, Pirun M, Krek A, Zumbo P, Mason C E, Socci N D, Betel D. Comprehensive evaluation of differential gene expression analysis methods for RNA-seq data. Genome Biology, 2013, 14(9): R95
CrossRef Google scholar
[5]
Zhang Z H, Jhaveri D J, Marshall VM, Bauer D C, Edson J, Narayanan R K, Zhao Q. A comparative study of techniques for differential expression analysis on RNA-Seq data. PLoS ONE, 2014, 9: e103207
CrossRef Google scholar
[6]
Ozsolak F, Milos P. RNA sequencing: advances, challenges and opportunities. Nature Reviews Genetics, 2011, 12(2): 87―98
CrossRef Google scholar
[7]
Soneson C, Delorenzi M. A comparison of methods for differential expression analysis of RNA-seq data. BMC Bioinformatics, 2013, 14(1): 9
CrossRef Google scholar
[8]
Kvam V, Lu P, Si Y. A comparison of statistical methods for detecting differentially expressed genes from Rna-Seq data. American Journal of Botany, 2012, 99(2): 248―256
CrossRef Google scholar
[9]
Seyednasrollah F, Laiho A, Elo L L. Comparison of software packages for detecting differential expression in RNA-seq studies. Briefings in bioinformatics, 2013, bbt086
[10]
Anders S, McCarthy D J, Chen Y, Okoniewski M, Smyth G K, Huber W, Robinson M D. Count-based differential expression analysis of RNA sequencing data using R and Bioconductor. Nature Protocols, 2013, 8(9): 1765―1786
CrossRef Google scholar
[11]
Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biology, 2010, 11(10): R106
CrossRef Google scholar
[12]
Hardcastle T, Kelly K. baySeq: empirical Bayesian methods for identifying differential expression in sequence count data. BMC Bioinformatics, 2010, 11(1): 422
CrossRef Google scholar
[13]
Di Y, Schafer D, Cumbie J, Chang J. The NBP negative binomial model for assessing differential gene expression from RNA-Seq. Statistical Applications in Genetics and Molecular Biology, 2011, 10(1): 1―28
CrossRef Google scholar
[14]
Yu D, Huber W, Vitek O. Shrinkage estimation of dispersion in negative binomial models for RNA-seq experiments with small sample size. Bioinformatics, 2013, 29(10): 1275―1282
CrossRef Google scholar
[15]
Robinson M, Smyth G. Moderated statistical tests for assessing differences in tag abundance. Bioinformatics, 2007, 23(21): 2881―2887
CrossRef Google scholar
[16]
Wu H, Wang C, Wu Z. A new shrinkage estimator for dispersion improves differential expression detection in RNA-seq data. Biostatistics, 2013, 14(2): 232―243
CrossRef Google scholar
[17]
Law CW, Chen Y, Shi W, Smyth G K. Voom: precision weights unlock linear model analysis tools for RNA-seq read counts. Genome Biology, 2014, 15: R29
CrossRef Google scholar
[18]
Bi Y, Davuluri R V. NPEBseq: nonparametric empirical bayesianbased procedure for differential expression analysis of RNA-seq data. BMC bioinformatics, 2013, 14(1): 262
CrossRef Google scholar
[19]
Sandmann T, Vogg M, Owlarn S, Boutros M, Bartscherer K. The headregeneration transcriptome of the planarian Schmidtea mediterranea. Genome Biol, 2011, 12(8): R76
CrossRef Google scholar
[20]
Jiang H, Wong W. Statistical inferences for isoform expression in RNA-Seq. Bioinformatics, 2009, 25(8): 1026―1032
CrossRef Google scholar
[21]
Li B, Dewey C. RSEM: accurate transcript quantification from RNASeq data with or without a reference genome. BMC Bioinformatics, 2011, 12(1): 323
CrossRef Google scholar
[22]
Trapnell C, Williams B, Pertea G, Mortazavi A, Kwan G, Baren M, Salzberg S, Wold B, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nature Biotechnology, 2010, 28(5): 211―215
CrossRef Google scholar
[23]
Glaus P, Honkela A, Rattray M. Identifying differentially expressed transcripts from RNA-seq data with biological variation. Bioinformatics, 2011, 28(13): 1721―1728
CrossRef Google scholar
[24]
Leng N, Dawson J, Thomson A, Ruotti V, Rissman A, Smits B M G, Haag J D, Gould M N, Stewart R M, Kendziorski C. EBSeq: an empirical Bayes hierarchical model for inference in RNA-seq experiments. Bioinformatics, 2013, 29(8): 1035―1043
CrossRef Google scholar
[25]
Trapnell C, Roberts A, Goff L, Pertea G, Kim D, Kelley D, Pimentel H, Salzberg S L, Rinn J L, Pachter L. Differential gene and transcript expression analysis of RNA-seq experiments with TopHat and Cufflinks. Nature Protocols, 2012, 7(3): 562―578
CrossRef Google scholar
[26]
Hein A, Richardson S, Causton H, Ambler G, Green P. BGX: a fully Bayesian integrated approach to the analysis of Affymetrix GeneChip data. Biostatistics, 2005, 6(3): 349―373
CrossRef Google scholar
[27]
Liu X, Milo M, Lawrence D, Rattray M. Probe-level measurement error improv<?Pub Caret?>es accuracy in detecting differential gene expression. Bioinformatics, 2006, 22(17): 2107&horbar;2113
CrossRef Google scholar
[28]
Zhang L, Liu X. An improved probabilistic model for finding differential gene expression. In: Proceedings of the 2nd International Conference on Biomedical Engineering and Informatics. 2009, 1-4: 1566&horbar;1571
[29]
Zhang L, Liu X. A Gamma-based method of RNA-seq analysis. Journal of Nanjing University (Natural Sciences), 2013, 49: 465&horbar;474(in Chinese)
[30]
Jordan M, Ghahramani Z, Jaakkola T, Saul L. An introduction to variational methods for graphical models. Machine Learning, 1999, 37(2): 183&horbar;233
CrossRef Google scholar
[31]
Sun J, Kaban A. A fast algorithm for robust mixtures in the presence of measurement errors. IEEE Transactions on Neural Networks, 2010, 21(8): 1206&horbar;1220
CrossRef Google scholar
[32]
MAQC Consortium. TheMicroArray Quality Control (MAQC) project shows inter- and intraplatform reproducibility of gene expression measurements. Nature Biotechnology, 2006, 24(9): 1151&horbar;1161
CrossRef Google scholar
[33]
Canales R D, Luo Y L, Willey J C, Austermiller B, Barbacioru C C, Boysen C, Hunkapiller K, Jensen R V, Knight C R, Lee K Y, Ma Y Q, Maqsodi B, Papallo A, Peters E H, Poulter K, Ruppel P L, Samaha R R, Shi L M, Yang W, Zhang L, Goodsaid F M. Evaluation of DNA microarray results with quantitative gene expression platforms. Nature Biotechnology, 2006, 24(9): 1115&horbar;1122
CrossRef Google scholar
[34]
Griffith M, Griffith OL, Mwenifumbo J, Goya R, Morrissy AS, Morin R D, Corbett R, Tang M J, Hou Y C, Pugh T J, Robertson G, Chittaranjan S, Ally A, Asano J K, Chan S Y, Li H Y I, McDonald H, Teague K, Zhao Y J, Zeng T, Delaney A, Hirst M, Morin G B, Jones S GM, Tai I T, Marra M A. Alternative expression analysis by RNA sequencing. Nature Methods, 2010, 7(10): 843&horbar;847
CrossRef Google scholar
[35]
Wang E, Sandberg R, Luo S, Khrebtukova I, Zhang L, Mayr C, Kingsmore S F, Schroth G P, Burge C B. Alternative isoform regulation in human tissue transcriptomes. Nature, 2008, 456(7221): 470&horbar;476
CrossRef Google scholar

RIGHTS & PERMISSIONS

2014 Higher Education Press and Springer-Verlag Berlin Heidelberg
AI Summary AI Mindmap
PDF(709 KB)

Accesses

Citations

Detail

Sections
Recommended

/