Introduction
Peanut (Arachis hypogaea L.) is an important vegetable oil and protein crop grown in most countries. Peanut varieties differ in morphological, physiological and agronomic characters, but too few useful markers have been available for quantitative trait locus (QTL) analysis and marker-assisted selection (Raina et al., 2001). Genetic research in peanut is therefore hindered by a lack of sufficient molecular markers. Microsatellite, or simple sequence repeat (SSR), markers are co-dominant, multiallelic and highly polymorphic genetic markers (Powell et al., 1996). SSRs derived from expressed sequence tags (EST-SSRs) have received considerable attention as the number of ESTs deposited in databases for various plants has increased (Varshney et al., 2005). EST-SSRs have been found in many plants, such as barley, maize, rice, sorghum and wheat (Kantety et al., 2002), which suggests that EST-SSR markers have potential use in peanut genetic studies. SSR markers have shown more polymorphism than any other marker type in peanut (Han et al., 2004; Tang et al., 2004; Gimenes et al., 2007; Liang et al., 2009). They have been used not only to assess the genetic diversity of peanut (Hopkins et al., 1999; He et al., 2003; Moretzsohn et al., 2004; Tang et al., 2007; Hong et al., 2008) but also to construct genetic linkage maps of peanut (Moretzsohn et al., 2005; Varshney et al., 2009; Hong et al., 2010).
A vast amount of EST sequence data has been generated by peanut EST projects, and ESTs have also been established from a wild Arachis species for gene discovery and marker development (Proite et al., 2007). These data offer an opportunity to identify SSRs in ESTs by data mining. In this study, peanut ESTs deposited in the NCBI database before September 1, 2009 were downloaded and analyzed, and primers were designed to develop more molecular markers for potential use in peanut genetic research.
Materials and methods
Sequence data
Peanut ESTs deposited in the NCBI database (http://www.ncbi.nlm.nih.gov) before September 1, 2009 were downloaded, giving a total of 92403 EST sequences. The DNAstar program was used for sequence assembly and redundancy analysis.
Screening of SSRs
SSR loci were searched using the online Simple Sequence Repeat Identification Tool (SSRIT). A motif was counted as an SSR if it was repeated more than seven times for di-nucleotides, or more than five times for tri-, tetra-, penta- and hexa-nucleotides.
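The search criteria can be illustrated with a short script. This is only a minimal sketch using regular expressions, not SSRIT itself. The names `find_ssrs` and `MIN_REPEATS` are hypothetical, and "more than seven" and "more than five" are read strictly as at least 8 and at least 6 repeats, which is an assumption about the intended thresholds.

```python
import re

# Minimum repeat counts per motif length (assumption: "more than seven"
# means at least 8 for di-nucleotides, "more than five" means at least 6
# for tri- to hexa-nucleotides).
MIN_REPEATS = {2: 8, 3: 6, 4: 6, 5: 6, 6: 6}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # One motif of unit_len bases followed by at least min_rep - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" (mononucleotide run) or "AGAG" (really "AG").
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return hits
```

Calling `find_ssrs` on each non-redundant EST sequence would give the per-sequence loci. Changing `MIN_REPEATS` shows how strongly the reported SSR frequency depends on the chosen thresholds.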
Primer design
The PrimerSelect module of the DNAstar program was used to design primer pairs for the SSRs, with a primer length of 18 to 24 bp, a Tm of 40 to 65°C, and an amplification rate of more than 70%.
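The length and Tm constraints can be expressed as a simple filter on candidate primers. The sketch below is illustrative only. The text does not say how DNAstar computes Tm, so the Wallace rule is used here as a stand-in assumption, the function names are hypothetical, and the amplification-rate criterion is not modeled.

```python
def wallace_tm(primer):
    """Rough melting temperature in °C by the Wallace rule: 2(A+T) + 4(G+C).

    Assumption: this stands in for whatever Tm model the design software uses.
    """
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def passes_filter(primer, min_len=18, max_len=24, min_tm=40, max_tm=65):
    """Check a candidate primer against the length and Tm ranges."""
    return (
        min_len <= len(primer) <= max_len
        and min_tm <= wallace_tm(primer) <= max_tm
    )
```

A 20-mer with 50% GC content, for example, has a Wallace Tm of 60°C and passes the filter, whereas a GC-only 20-mer (80°C) does not.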
Results
Occurrence of SSRs in non-redundant peanut ESTs
Of the 92403 peanut EST sequences downloaded from the NCBI database, 5834 (6.3%) contained SSRs. After redundancy removal, 2267 SSR-containing EST sequences remained. These carried 2594 SSR loci, of which 729 (28.10%) were di-nucleotide loci and 1700 (65.54%) were tri-nucleotide loci (Table 1).
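The percentages follow directly from the counts given in the text. The short check below recomputes them. The variable and function names are illustrative, and only numbers stated above are used.

```python
# Counts as reported in the text.
total_ests = 92403
ssr_ests = 5834
ssr_loci = 2594
loci_by_class = {"di": 729, "tri": 1700}


def percent(part, whole, digits=2):
    """Percentage of part in whole, rounded to the given number of digits."""
    return round(100 * part / whole, digits)


# Share of downloaded ESTs that contain at least one SSR.
ssr_est_rate = percent(ssr_ests, total_ests, 1)

# Share of SSR loci in each repeat class.
class_share = {name: percent(n, ssr_loci) for name, n in loci_by_class.items()}

# Remaining loci (tetra-, penta- and hexa-nucleotides together).
other_share = percent(ssr_loci - sum(loci_by_class.values()), ssr_loci)
```

The recomputed values agree with the reported 6.3%, 28.10% and 65.54%, and leave 6.36% for the other repeat classes.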
Occurrence of different SSR motifs in peanut
A total of 92 motif types were found in the SSR-ESTs, with 3, 18, 26, 18 and 22 types for di-, tri-, tetra-, penta- and hexa-nucleotides, respectively. Di- and tri-nucleotide repeats were predominant (93.64% of loci), and the other classes together accounted for only 6.36%.
The frequency of the repeat units varied greatly. The eight most frequent were AG/TC (20.1%), AAG/TTC (11.8%), AAT/TTA (10.1%), AGG/TCC (6.6%), AGA/TCT (6.3%), AT/TA (5.9%), ACT/TGA (3.8%) and ATG/TAC (3.7%); the other 84 repeat units together accounted for 31.7%. AG/TC made up 71.47% of the di-nucleotide motifs, and AAG/TTC made up 17.94% of the tri-nucleotide motifs (Table 2).
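Labels such as AG/TC pair a motif with the motif read base by base on the complementary strand. How the original tallies were produced is not stated, so the sketch below is only one plausible way to compute such frequencies, with hypothetical function names. It assumes that a motif and its complement are pooled, while cyclic shifts such as AAG and AGA are kept separate, as they are in the list above.

```python
from collections import Counter

# Base-by-base complement (not reverse complement), matching labels
# such as AG/TC and AAG/TTC.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def pair_label(motif):
    """Label a motif together with its complementary-strand motif."""
    motif = motif.upper()
    comp = motif.translate(COMPLEMENT)
    first, second = sorted((motif, comp))
    return f"{first}/{second}"


def motif_frequencies(motifs):
    """Percentage of loci per motif pair, most frequent first."""
    counts = Counter(pair_label(m) for m in motifs)
    total = sum(counts.values())
    return {label: 100 * n / total for label, n in counts.most_common()}
```

Feeding it the motif of every detected locus would give a table of the same shape as the one summarized above.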
Primer design
Only 237 primer pairs were successfully designed from the 2267 SSR-EST sequences. Of these, 213 pairs targeted a single repeat type and covered all five repeat classes, with 56, 143, 9, 2 and 3 pairs for di-, tri-, tetra-, penta- and hexa-nucleotides, respectively. The remaining 24 pairs each included two kinds of repeat types.
Discussion
EST-SSR markers have advantages over general molecular markers such as RFLP, RAPD, AFLP and genomic SSR (gSSR) markers, including high information content and excellent transferability (Decroocq et al., 2003; Shangguan et al., 2010). If an EST-SSR marker is linked to a trait, the EST may be related to the gene controlling that trait. Developing EST-SSR markers is also simple, quick and low in cost.
Random sequencing within cDNA libraries usually produces a high proportion of redundant ESTs. In this study, redundancy was eliminated to reduce the size of the data set and to avoid overestimating the EST-SSR frequency. Of the 92403 peanut EST sequences downloaded from the NCBI database for the SSR search, 6.3% (5834) contained SSR motifs. This frequency is higher than that of rice (3.4%) (Cardle et al., 2000), bread wheat (5.4%) (Gupta et al., 2003), maize (1.4%) and barley (3.4%) (Kantety et al., 2002), but lower than that of soybean (Song et al., 2004). It is similar to previous reports for cultivated peanut (Liang et al., 2009). This indicates that SSRs are relatively abundant in peanut ESTs. The abundance of SSRs is known to depend on the species, the SSR search criteria and the database-mining tools (Varshney et al., 2005).
In earlier reports, di-nucleotide and tri-nucleotide repeats were generally the most common motif classes (Li et al., 2004; Varshney et al., 2005), but the predominant repeat types usually differed. In this study, 92 motif types were found in the SSR-ESTs, and tri-nucleotide repeats (65.54%) and di-nucleotide repeats (28.10%) accounted for most loci. AG/TC, with a frequency of 20.1%, was the most abundant motif, as in most plants, while the frequency of AAG/TTC (11.8%) was similar to that in Arabidopsis thaliana (14%) (Cardle et al., 2000) and soybean (Gao et al., 2003).
A total of 237 primer pairs were successfully designed from the 2267 SSR-EST sequences, a success rate of 10.5%. The polymorphism of these primers was assessed using local peanut varieties of Hebei Province, China.
Higher Education Press and Springer-Verlag Berlin Heidelberg