Novel polymorphic EST-based microsatellite marker isolation and characterization from Poncirus trifoliata (Rutaceae)

Novel Poncirus trifoliata simple sequence repeat (SSR) markers were developed to evaluate their utility for genetic diversity and breeding studies of P. trifoliata and related species. A total of 108 primer pairs were characterized by PCR amplification experiments. Among these, 61 were polymorphic and transferable to other citrus species. The number of alleles per locus ranged from 2 to 6, with an average of 2.37 alleles per locus. The expected heterozygosity and observed heterozygosity ranged from 0 to 0.83 and 0 to 1.00, respectively. These novel polymorphic SSR markers will be useful for citrus cultivar identification and evaluation as well as breeding studies.


Introduction
Poncirus trifoliata (L.) Raf. (trifoliate orange) is a member of the family Rutaceae and is closely related to the genus Citrus. It is also known as Chinese bitter orange, and its center of origin is northern China. It is sometimes included as a species in the genus Citrus. Swingle [1] recorded and compared the characters of Poncirus and Citrus and demonstrated that some pronounced characters, namely being a deciduous tree with compound leaves, pubescent fruit, a shrub-like tree structure and very thorny branches, make it difficult to place Poncirus as an ordinary member of the genus Citrus. The resistance of Poncirus to Phytophthora root rot, citrus tristeza virus (CTV) and the citrus nematode (CN), together with its cold tolerance and graft compatibility with other citrus species, makes it a valuable rootstock for commercial propagation of cultivated citrus species.
SSR markers have been widely used in genetic diversity and population genetic studies, linkage map construction and several breeding applications because of their superior characteristics over other molecular markers [2]. Previous studies suggested that Expressed Sequence Tag (EST) databases are valuable sources from which to develop SSR markers [3][4][5][6][7]. The development of SSR markers specific to P. trifoliata is important for improving its varieties and assisting breeding programs. To date, no EST-SSR markers have been reported for P. trifoliata. Hence, the aim of this study was to develop P. trifoliata-specific EST-SSR markers and to investigate their cross-taxa transferability and utility for germplasm characterization.

Plant materials
A total of 42 accessions (Appendix A, Table S1) of citrus and its relatives were obtained from the Citrus Germplasm and Conservation Center of Huazhong Agricultural University, China. DNA of these accessions was extracted from young leaves using the CTAB (hexadecyltrimethylammonium bromide) method described by Cheng et al. [8].

EST processing and marker development
EST sequences of P. trifoliata were downloaded from NCBI (National Center for Biotechnology Information; http://www.ncbi.nlm.nih.gov) on 18 April 2012. A Perl script, est_trimmer.pl, was used to remove unusual EST sequences, vector contamination, and poly-A and poly-T stretches from the downloaded P. trifoliata EST sequences. The high-quality EST sequences were then assembled, and redundant ESTs were removed, using the CAP3 software (http://mobyle.pasteur.fr/cgi-bin/portal.py#forms::cap3). Assembled ESTs were screened for SSR marker development with the SSRLocator software [9], using a minimum SSR repeat length of 16 bp, an expected product size of 100-300 bp, an annealing temperature of 50-55°C and a GC content of 45%-50%. To identify putative functions of the developed P. trifoliata EST-SSR primers, flanking regions of the primer pairs were searched by BLAST against the NCBI protein database using the Blast2GO tool (www.blast2go.com). A virtual PCR strategy was applied to select in silico polymorphic SSR markers. Selected SSR primers were synthesized by the Sangon Company, Shanghai, China. Table S2). PCR products were separated by 6% denaturing PAGE (polyacrylamide gel electrophoresis) (60 cm × 30 cm × 0.4 cm), and bands were visualized by silver staining according to the protocol of Ruiz et al. [10].

Data analysis
Allele numbers, expected heterozygosity, observed heterozygosity and polymorphic information content (PIC) values were estimated with PowerMarker V3.25.
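As a rough illustration of the statistics named above, the following sketch computes the allele number, observed heterozygosity (Ho), expected heterozygosity (He) and PIC for one locus from diploid genotypes, using the textbook definitions (He = 1 - sum of squared allele frequencies; PIC after Botstein et al.). The function name locus_stats and the example genotypes are hypothetical, and the estimators actually applied by PowerMarker (for example, any sample-size correction) are not specified in the text, so the values may differ slightly from the software's output.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (allele_number, Ho, He, PIC) for one SSR locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid
    individual; individuals with missing data must be removed first.
    """
    n = len(genotypes)
    # Count every allele copy (two per individual).
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]

    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n

    # Expected heterozygosity (gene diversity), without sample-size correction.
    he = 1.0 - sum(p * p for p in freqs)

    # PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    pic = he - cross
    return len(counts), ho, he, pic


# Hypothetical fragment sizes (bp) scored for four individuals at one locus.
example = [(150, 150), (150, 154), (154, 154), (150, 154)]
print(locus_stats(example))  # (2, 0.5, 0.5, 0.375)
```

For two alleles at equal frequency, He reaches 0.5 while PIC is 0.375, which shows why PIC is always at most equal to He for the same locus.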

Results
In this study, 25388 non-redundant P. trifoliata EST sequences (obtained after clustering ESTs from NCBI) were screened for microsatellites using the SSRLocator software [9]. The search was restricted to repeats with a minimum of 8, 7, 5, 4 and 4 complete repeat units for di-, tri-, tetra-, penta- and hexa-nucleotide motifs, respectively. In total, 976 SSRs were identified from 887 ESTs (Appendix A, Table S3). Thus, 3.49% of the P. trifoliata ESTs contained one or more microsatellites, with one SSR per 4 kb of EST sequence (976 SSRs in 22.7 Mb). Of the 887 SSR-containing EST sequences, 695 (78%) were suitable for P. trifoliata EST-SSR (PteSR) marker development. These sequences were further compared by BLAST against the NCBI non-redundant protein database with an E-value cut-off of 1e-3 in order to identify functional EST sequences. As a result, 699 sequences containing di- (333), tri- (170), tetra- (47), penta- (60) and hexa-nucleotide (59) microsatellite units were identified as functional P. trifoliata-specific EST-SSR motifs (Fig. 1a). Microsatellites with higher repeat numbers and significant similarity to existing proteins in the NCBI database were favored because they are generally considered to be more polymorphic and functional. With these criteria, 542 primer pairs were selected for in silico PCR (VPCR) amplification analysis, which was performed using SSRLocator. Whole genomic sequences of Citrus sinensis and C. clementina were used as templates for the virtual PCR amplification analysis, and the results revealed that 226 (41%) primer pairs gave positive VPCR amplification.
To determine the genomic distribution of the developed PteSR markers, we mapped the 542 markers onto the nine chromosomes of C. sinensis. The results revealed that 198 (36%) markers were distributed among the nine chromosomes of C. sinensis, with an average marker density of 0.83 Mb (Fig. 1b, Fig. 2). Chromosome mapping showed the highest frequency of markers on chromosome 5 (38 markers, 7%) and the lowest on chromosome 1 (8 markers, 1.4%) (Fig. 1b). In silico cross-genera transferability was assessed using an in silico PCR strategy in order to select a subset of PteSR markers for subsequent estimation of their utility for genetic diversity studies of citrus and related species. The results showed the highest transferability (42%) in C. sinensis and the lowest (1%) in Arabidopsis, with an average of 12% of PteSR markers transferable to citrus and non-citrus species (Fig. 1c). Based on the in silico cross-genera transferability results, the best 108 primer pairs were selected for subsequent PCR amplification, analysis of transferability to citrus species and determination of their polymorphism in eight Citrus spp. (C. sinensis, C. reticulata, C. grandis, C. aurantifolia, C. limon, C. medica, C. ichangensis, C. paradisi), P. trifoliata and Fortunella hindsii. Among the 108 primer pairs tested, 18 (16%) failed to amplify the expected products. The remaining 90 microsatellite loci produced the expected products, of which 61 (56%) were found to be polymorphic and transferable to citrus-related species.
Furthermore, a subset of 12 primer pairs was used for genotyping 42 accessions from six species (P. trifoliata, Fortunella sp., C. grandis, C. sinensis, C. reticulata, C. limon). Our results revealed that the number of alleles per locus varied from 1 to 6, with an average of 2.38 (Table 1). Values of Ho, He and PIC ranged from 0 to 1.00, from 0 to 0.83 and from 0 to 0.81, respectively. The highest number of alleles per locus (2.75) and diversity (He = 0.453) was recorded for C. reticulata, followed by P. trifoliata (number of alleles = 2.667, He = 0.501). The single taxon from Fortunella appeared to be the least polymorphic for the markers investigated, with only 1.66 alleles per locus and lower expected heterozygosity and PIC values.
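The repeat-number thresholds stated above (8, 7, 5, 4 and 4 units for di- to hexa-nucleotide motifs) can be illustrated with a small regular-expression scan. This is only a sketch of perfect-repeat detection under those thresholds; it is not the SSRLocator algorithm, and the function name find_ssrs and the test sequence are hypothetical.

```python
import re

# Minimum number of complete repeat units per motif length, as stated in the text.
MIN_REPEATS = {2: 8, 3: 7, 4: 5, 5: 4, 6: 4}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One captured unit followed by at least (min_rep - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so that each tract is reported once under its shortest motif.
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Hypothetical sequence with one (AT)8 and one (AAG)7 tract.
demo = "GG" + "AT" * 8 + "CCGT" + "AAG" * 7 + "TT"
print(find_ssrs(demo))  # [(2, 'AT', 8), (22, 'AAG', 7)]
```

Note that the dinucleotide threshold of 8 units corresponds to the 16 bp minimum SSR length given in the methods; mononucleotide runs and imperfect or compound repeats are deliberately ignored in this sketch.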
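The idea behind the in silico (virtual) PCR step can be sketched as follows: a primer pair is counted as amplifying if the forward primer and the reverse complement of the reverse primer both occur on the template, in that order, within the expected product size range. The sketch below assumes exact primer matches on one strand only and uses the 100-300 bp product range from the primer design settings; the function names and sequences are hypothetical, and the matching rules actually used by SSRLocator may be more permissive.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def virtual_pcr(template, fwd, rev, min_size=100, max_size=300):
    """Return predicted amplicon sizes for exact primer matches.

    Only the given strand of the template is searched: the forward primer
    must match as written and the reverse primer as its reverse complement.
    """
    template = template.upper()
    fwd = fwd.upper()
    rev_site = reverse_complement(rev)
    sizes = []
    start = template.find(fwd)
    while start != -1:
        end = template.find(rev_site, start + len(fwd))
        while end != -1:
            size = end + len(rev_site) - start
            if size > max_size:
                break  # later reverse sites only give longer products
            if size >= min_size:
                sizes.append(size)
            end = template.find(rev_site, end + 1)
        start = template.find(fwd, start + 1)
    return sizes


# Hypothetical primers and template giving one 150 bp product.
fwd_primer = "ATGCCGTA"
rev_primer = "TTAGGCAT"
template = "G" * 20 + fwd_primer + "C" * 134 + reverse_complement(rev_primer) + "G" * 20
print(virtual_pcr(template, fwd_primer, rev_primer))  # [150]
```

Running the same primer pair against templates from several species and comparing the predicted product sizes is one simple way to flag candidate in silico polymorphism and transferability, although wet-lab amplification remains necessary, as the 18 failed primer pairs reported above indicate.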

Discussion
A large amount of EST data in the public domain provides a valuable resource for genetic research, including molecular marker identification, and SSRs have been discovered from EST data for many plant species [7][11][12][13][14][15][16]. The SSR frequency in the P. trifoliata ESTs was 1/4.0 kb, and a similar SSR frequency has been reported in poplar (1/4.0 kb), citrus (1/5.2 kb), Arabidopsis (1/6.0 kb) and sweet potato (1/7.1 kb) [17,18], although much lower and higher SSR frequencies have also been reported in plants, viz., rice (1/40.0 kb) and cucumber (1/1.8 kb) [19,20]. Direct comparison of the frequency of occurrence of SSRs in different reports is difficult, given that estimates depend on the SSR mining tools, the search criteria, the size of the data set and the redundancy of the EST sequences. Dinucleotide repeats were the most frequent motif type in the transcribed region of the P. trifoliata genome, as has often been observed in other plant species [21]. In addition, the frequency of motif types varies greatly according to the SSR search criteria and search tools. The percentage of polymorphic P. trifoliata markers was slightly lower than that observed in a previous study with C. grandis EST-SSR markers, in which 57% of the primers were polymorphic [5]. The transferability of SSR markers to related genera makes them useful for genetic research, such as genotyping, linkage map construction and association mapping. P. trifoliata EST-SSR markers showed transferability to the Fortunella and Citrus genera. The rate of SSR transferability to different species is related to the genetic distance between the species from which the SSRs were developed and the other species [22]. We found a similar trend for the transferability of P. trifoliata EST-SSR markers. As expected, 36% of the markers were anchored in the C. sinensis genome, which might be due to the genomic similarity of these two species. This suggests that P. trifoliata EST-SSR markers might be useful for C. sinensis breeding research, such as mapping, genotyping and population structure estimation.

Conclusions
In this study we developed, characterized and utilized the first set of functionally relevant SSR markers for P. trifoliata. These novel highly transferable polymorphic microsatellite markers will be useful for germplasm diversity, genetic mapping, and population structure analysis of Citrus and its relatives, and could help reveal the inter-specific relationships, origin and evolution of cultivated Citrus species.

Note: For each primer pair the number of alleles (AN), expected heterozygosity (He), observed heterozygosity (Ho) and the polymorphic information content (PIC) are given. The sample size for each population is shown in parentheses.