A database called the Eukaryotic Intron Database (EID) was developed from GenBank data. Statistical analysis of EID shows that it contains 103,848 genes, 478,484 introns, and 582,332 exons, an average of 4.61 introns and 5.61 exons per gene. Introns 40-120 nt in length are abundant in the database. Statistical analysis of data from nine model species shows that, in eukaryotes, higher species do not necessarily have more introns or exons per gene than lower species. Other characteristics of EID were also studied, including intron phase, the distribution of different splice sites, and the relationship between genome size and intron proportion or intron density.
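The per-gene averages quoted above follow directly from the reported totals. The short Python sketch below recomputes them as a sanity check; the variable names are illustrative and are not taken from the paper. It also checks that the exon total equals the intron total plus the gene total, which is what one would expect if every gene with n introns contributes n + 1 exons.

```python
# Totals as reported in the abstract (variable names are illustrative).
genes = 103_848
introns = 478_484
exons = 582_332

# Mean number of introns and exons per gene.
introns_per_gene = introns / genes
exons_per_gene = exons / genes

print(f"introns per gene: {introns_per_gene:.2f}")
print(f"exons per gene:   {exons_per_gene:.2f}")

# A gene with n introns has n + 1 exons, so summed over all genes
# the exon total should exceed the intron total by the gene count.
print("exons == introns + genes:", exons == introns + genes)
```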
HE Miao, LI Jidong, ZHANG Shanghong. Statistical characteristics of eukaryotic intron database[J]. Frontiers in Biology, 2006, 1(4): 362-366.
DOI: 10.1007/s11515-006-0047-2