RESEARCH ARTICLE

Observations on potential novel transcripts from RNA-Seq data

  • Chao YE,
  • Linxi LIU,
  • Xi WANG,
  • Xuegong ZHANG
  • Key Laboratory of Bioinformatics and Bioinformatics Division, Ministry of Education, Tsinghua National Laboratory for Information Science and Technology/Department of Automation, Tsinghua University, Beijing 100084, China

Received date: 23 Mar 2011

Accepted date: 12 Apr 2011

Published date: 05 Jun 2011

Copyright

2014 Higher Education Press and Springer-Verlag Berlin Heidelberg

Abstract

With the rapid development of next-generation deep sequencing technologies, sequencing cDNA reverse-transcribed from RNA molecules (RNA-Seq) has become a key approach for studying gene expression and transcriptomes. Because RNA-Seq does not rely on the annotation of known genes, it offers the opportunity to discover transcripts that are not annotated in current databases. Studying the distribution of RNA-Seq signals, and taking a systematic view of the potential new transcripts that these signals reveal, is an important step toward understanding transcriptomes.

Cite this article

Chao YE, Linxi LIU, Xi WANG, Xuegong ZHANG. Observations on potential novel transcripts from RNA-Seq data[J]. Frontiers of Electrical and Electronic Engineering, 2011, 6(2): 275-282. DOI: 10.1007/s11460-011-0148-9
