Introduction
In traditional taxonomic studies based on morphology, butterflies are divided into two superfamilies and five families: the Hesperiidae are generally designated as an independent superfamily (Hesperioidea), whereas all species in the other four families (Papilionidae, Pieridae, Lycaenidae, and Nymphalidae) are placed within the superfamily Papilionoidea (Harvey, 1991). However, phylogenetic studies based on molecular data have led to controversy regarding the phylogenetic relationship between Papilionoidea and Hesperioidea.
Weller and Pashley (1995) explored the phylogenetic relationships of butterflies and their related species in the macrolepidopteran superfamilies based on three integrated gene sequences (ND1, 18S rRNA, and 28S rRNA) and 43 morphological characters. The results showed that Hesperioidea and Papilionoidea form a monophyletic group.
Wahlberg et al. (2005) sequenced DNA fragments of substantial length (3258 bp) from three genes (COI, EF-1α, and wingless) and analyzed these together with 99 morphological characters for 57 taxa. The results also showed that Hesperioidea and Papilionoidea form a monophyletic group. Some recent studies have also supported the contention that Hesperioidea and Papilionoidea are sister groups (Warren et al., 2008, 2009; Zou et al., 2009). However, other studies have indicated that Hesperiidae appears to be a sister group of families included in Papilionoidea, thereby invalidating the superfamily status of Hesperioidea (Regier et al., 2008; Mutanen et al., 2010; Kim et al., 2014).
Owing to its maternal inheritance, lack of recombination, and higher nucleotide substitution rate than that of nuclear DNA, the mitochondrial genome has in recent years been routinely used in studies on phylogenetics, comparative and evolutionary genomics, population genetics, and molecular evolution. However, because the mitochondrion is a semi-autonomous organelle, its development is controlled by both mitochondrial and nuclear genes. Nuclear genes may therefore influence the evolution of mitochondrial genes. Moreover, the nuclear genome contains more biological information than the mitochondrial genome. Accordingly, to reflect the phylogenetic relationships among species more accurately, it is preferable to use concatenated mitochondrial and nuclear gene data in phylogenetic analyses. Indeed, it is now common practice among insect molecular systematists to combine one or more mitochondrial genes with one or more nuclear genes, as the two types of data are unlinked and evolve under different constraints (Hsu et al., 2001; Lin and Danforth, 2004; Pena et al., 2006; Silva-Brandao et al., 2008; Kim et al., 2010).
The nuclear wingless gene is a single-copy member of the WNT gene family and plays an important role in the formation of wing shape. To date, the gene has been widely used in phylogenetic analyses of Lepidoptera (Brower, 2000; Campbell et al., 2000; Pena et al., 2006; Silva-Brandao et al., 2008).
Ampittia dioscorides is a representative species of the Hesperiidae (Lepidoptera: Hesperioidea) and is widely distributed in southeastern Asia, including Sri Lanka, India, Malaysia, and China.
To date, few phylogenetic analyses of butterflies have combined the complete mitochondrial genome with nuclear genes. In the present study, taking into account the respective characteristics of the mitochondrial genome and nuclear genes, we sequenced the complete mitochondrial genome and the nuclear wingless gene of A. dioscorides. Using these sequences, together with the corresponding sequences generated in previous studies on 21 other species of true butterflies, we evaluated previous phylogenetic hypotheses regarding the Papilionoidea and Hesperioidea.
Materials and Methods
Specimen collection and DNA extraction
An adult specimen of A. dioscorides was collected from Lingui, Guangxi Province, China. The specimen was immediately preserved in 100% ethanol and then stored at -20°C until genomic DNA extraction. Total DNA was isolated from the muscles of the thorax or leg using a routine phenol/chloroform method (Zhou et al., 2007).
Primer design, PCR, and sequencing
The primers used in this study for the amplification of the complete mitogenome and wingless gene were based on the sequences listed in Table 1. A few universal PCR primers for short-fragment amplifications of the cox1, cox2, nd5, cytb, and 12S genes were synthesized based on sequences from a previous study (Simon et al., 1994). The remaining primers were designed based on sequence alignments of the available complete lepidopteran mitogenomes using Primer Premier 5.0 software (Singh et al., 1998). The entire mitogenome of A. dioscorides was amplified in nine fragments (nd2-cox1, cox1-cox2, cox2-cox3, nd3-nad5, cox3-nd5, nad5-nad4, nad4-cytb, cytb-12S, and 12S-nd2) using long-PCR techniques with TaKaRa LA Taq polymerase under the following cycling conditions: initial denaturation for 5 min at 95°C; 30 cycles of 95°C for 50 s, 45–50°C for 50 s, and 68°C for 2 min 30 s; and a final extension step of 68°C for 10 min. The PCR products were visualized by electrophoresis on a 1.0% agarose gel, purified using a QIAGEN PCR purification kit (QIAGEN, Düsseldorf, Germany), and sequenced directly on an ABI 3730 DNA Analyzer using a BigDye chemistry kit (Applied Biosystems, Inc., Carlsbad, CA, USA) with the same PCR primers. All PCR products were sequenced from both strands. The resultant mitogenome sequence and wingless gene data were deposited in the GenBank database under the accession numbers KM102732 and KP153245, respectively.
Sequence analysis and annotation
Sequence annotation was performed using the BLAST tools from the NCBI website (http://blast.ncbi.nlm.nih.gov/Blast) and the DNAStar package (DNAStar Inc., Madison, USA). The secondary structures of most of the tRNA genes were predicted with tRNAscan-SE 1.21 (Lowe and Eddy, 1997) using invertebrate codon predictors; however, some [e.g., tRNASer(AGN)] were drawn by hand based on the nucleotide sequences of the tRNA genes of other butterflies. The protein-coding genes (PCGs) and rRNAs were confirmed by sequence comparison with ClustalX 1.8 software and the NCBI BLAST search function (Altschul et al., 1990). Nucleotide composition and codon usage were calculated with DAMBE software (Xia and Xie, 2001).
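The composition statistics mentioned above can be illustrated with a minimal Python sketch. It is not the DAMBE implementation; the function names and the toy sequence are hypothetical and are not taken from the A. dioscorides data.

```python
from collections import Counter


def at_content(seq: str) -> float:
    """Return the percentage of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def codon_usage(cds: str) -> Counter:
    """Count in-frame codons of a coding sequence.

    A trailing partial codon (for example a lone T stop) is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


# Made-up toy sequence for illustration only.
toy_cds = "ATGATTATTTTATAAT"
print(f"A+T content: {at_content(toy_cds):.2f}%")
print(codon_usage(toy_cds).most_common())
```

Counting codons in frame from the first base assumes that the sequence passed in starts at the annotated start codon.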
Phylogenetic analysis
To evaluate the phylogenetic relationships among butterflies, we used the complete mitogenomes of A. dioscorides and 21 other butterfly species (obtained from the GenBank database) (Table 2). The phylogenetic trees were constructed based on the complete mitogenomes and wingless genes using Bayesian inference (BI) and maximum parsimony (MP) algorithms. The amino acid sequences of the 13 PCGs and wingless from the 22 sequenced butterflies were aligned, together with those of two Bombycoidea species, Bombyx mori and Bombyx mandarina, which were used as outgroups.
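Concatenating per-gene alignments into a single matrix, as described above, can be sketched as follows. This is an illustrative Python example under stated assumptions, not the procedure used in the study: the function name, the toy taxa, and the choice to pad a missing gene with gap characters are all hypothetical.

```python
def concatenate_alignments(alignments, gene_order, gap="-"):
    """Join per-gene alignments into one sequence per taxon.

    alignments: {gene: {taxon: aligned_sequence}}
    gene_order: genes in the order they should be concatenated
    A taxon lacking a gene is padded with gaps (an assumption of this sketch).
    """
    taxa = sorted({taxon for gene in alignments.values() for taxon in gene})
    parts = {taxon: [] for taxon in taxa}
    for gene in gene_order:
        seqs = alignments[gene]
        lengths = {len(s) for s in seqs.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(seqs.get(taxon, gap * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy amino acid alignments; names and residues are invented.
toy = {
    "cox1": {"sp_A": "MKL", "sp_B": "MRL"},
    "wingless": {"sp_A": "QC"},
}
print(concatenate_alignments(toy, ["cox1", "wingless"]))
```

Every row of the resulting matrix has the same length, which is what tree-building programs generally expect from a concatenated alignment.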
Results and discussion
Genome organization
The complete mitochondrial DNA (mtDNA) sequence of A. dioscorides is 15,313 bp in length (Table 3) and consists of two rRNAs, 22 tRNAs, 13 PCGs, and one major non-coding A+T-rich region. As shown in Fig. 1, and consistent with the mitogenomes of many insects, the major strand encodes more genes (nine PCGs and 14 tRNAs) than the minor strand (four PCGs, eight tRNAs, and two rRNA genes).
Almost all PCGs in the A. dioscorides mitogenome are initiated by typical ATN codons (seven with ATG, four with ATT, and one with ATA); the single exception is the COI gene, which was tentatively assigned a CGA start codon (Table 3). Three types of stop codon were found among the 13 PCGs of A. dioscorides: TAA (ND2, ATPase8, ATPase6, COIII, ND5, ND4L, ND6, and Cytb); TAG (ND1 and ND3); and the incomplete stop codon T (COI, COII, and ND4). Incomplete termination codons are frequently observed in insect mitogenomes (Cha et al., 2007; Hong et al., 2008; Kim et al., 2010) and are completed by post-transcriptional polyadenylation, in which two A residues are added to create a TAA terminator (Anderson et al., 1981; Ojala et al., 1981).
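The stop-codon tally and the completion of an incomplete T stop codon can be illustrated with a short Python sketch. The gene lists are copied from the text above; the function name and the toy sequence are hypothetical, and handling of a two-base TA remainder is an extension of this sketch that is not reported for A. dioscorides.

```python
# Stop codons reported in the text for the 13 PCGs ("T" = incomplete stop).
STOP_CODONS = {
    "TAA": ["ND2", "ATPase8", "ATPase6", "COIII", "ND5", "ND4L", "ND6", "Cytb"],
    "TAG": ["ND1", "ND3"],
    "T": ["COI", "COII", "ND4"],
}


def complete_stop_codon(cds: str) -> str:
    """Mimic polyadenylation by appending A residues to a partial stop codon.

    A sequence whose length is a multiple of three is returned unchanged.
    A trailing 'T' receives 'AA'; a trailing 'TA' (assumption, see lead-in)
    receives 'A'. Any other remainder is rejected.
    """
    cds = cds.upper()
    remainder = len(cds) % 3
    if remainder == 0:
        return cds
    tail = cds[-remainder:]
    if tail not in ("T", "TA"):
        raise ValueError(f"unexpected partial codon: {tail!r}")
    return cds + "A" * (3 - remainder)


print(sum(len(genes) for genes in STOP_CODONS.values()))  # genes tallied
print(complete_stop_codon("ATGATTT"))  # invented toy sequence
```

The tally serves as a consistency check that every one of the 13 PCGs is assigned exactly one stop-codon type.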
tRNA and rRNA genes
The 22 tRNAs vary in size from 57 bp [tRNASer(AGN)] to 71 bp (tRNALys) and have a typical clover-leaf structure, with the sole exception of tRNASer(AGN), which lacks the dihydrouridine (DHU) stem (Fig. 2). The tRNAs contain a total of 34 pair mismatches in their stems, including 10 pairs in the DHU stems, nine pairs in the amino acid acceptor stems, four pairs in the TΨC stems, and 11 pairs in the anticodon stems. The number of mismatches in the A. dioscorides tRNAs detected in the present study is well within the range reported in previous studies on other lepidopteran insect tRNAs (Liu et al., 2008; Jiang et al., 2009; Kim et al., 2010). These tRNA mismatches can be corrected through RNA editing mechanisms, which are well known in arthropod mtDNA (Lavrov et al., 2000).
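Counting mismatched pairs in a tRNA stem can be sketched in Python as shown here. The per-stem totals are copied from the text; everything else is hypothetical, including the function name, the toy stem, and in particular the `allow_wobble` switch, because the text does not state whether G-U pairs were counted as mismatches.

```python
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}

# Mismatched pairs per stem type as reported in the text.
STEM_MISMATCHES = {"DHU": 10, "acceptor": 9, "TPsiC": 4, "anticodon": 11}


def count_stem_mismatches(five_prime: str, three_prime: str,
                          allow_wobble: bool = True) -> int:
    """Count non-pairing positions between the two strands of a stem.

    Both strands are written 5'->3', so the 3' strand is reversed before
    pairing. Whether G-U wobble pairs count as matches is an assumption
    controlled by allow_wobble.
    """
    if len(five_prime) != len(three_prime):
        raise ValueError("stem strands must have equal length")
    allowed = WATSON_CRICK | (WOBBLE if allow_wobble else set())
    pairs = zip(five_prime.upper(), reversed(three_prime.upper()))
    return sum((a, b) not in allowed for a, b in pairs)


print(sum(STEM_MISMATCHES.values()))  # total across the four stem types
print(count_stem_mismatches("GCUA", "UGGC", allow_wobble=False))  # toy stem
```

Making the wobble rule an explicit parameter keeps the count reproducible under either convention.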
As in other insect mitogenomes, two rRNA genes (rrnL and rrnS) were detected in A. dioscorides. The rrnL and rrnS genes of the A. dioscorides mitogenome are 1,316 and 721 bp in length, respectively. They are located between tRNALeu(CUN) and tRNAVal and between tRNAVal and the A+T-rich region, respectively (Fig. 1).
A+T-rich region
The A+T-rich region of A. dioscorides is located between the 12S rRNA and tRNAMet genes. The 389-bp-long A+T-rich region exhibits a higher A+T content (93.32%) than any other region of the A. dioscorides mitogenome. A conserved sequence (5′-AGATTTTTTTTTTTTTTTT-3′) was identified at the 5′-end of the A+T-rich region. A poly-A sequence (AAAAAAAAAAAAA) was found at the 3′-end of the A+T-rich region. Additionally, other short microsatellite-like repeat regions were observed throughout the A+T-rich region, without noticeable macro-repeats.
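Homopolymer stretches such as the poly-T and poly-A runs described above can be located with a regular expression. This is a minimal Python sketch: the two motifs are copied from the text, whereas the function name, the minimum run length, and the filler bases joining the motifs are hypothetical and do not represent the real 389-bp region.

```python
import re


def homopolymer_runs(seq: str, base: str, min_len: int = 8):
    """Return (start, length) for each run of `base` at least min_len long."""
    pattern = re.compile(f"{re.escape(base.upper())}{{{min_len},}}")
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(seq.upper())]


conserved_5p = "AGATTTTTTTTTTTTTTTT"  # motif reported at the 5'-end
poly_a_3p = "AAAAAAAAAAAAA"           # poly-A reported at the 3'-end
# Invented filler between the motifs, for illustration only.
toy_region = conserved_5p + "ATATAT" + poly_a_3p

print(homopolymer_runs(toy_region, "T"))
print(homopolymer_runs(toy_region, "A"))
```

The `min_len` threshold is arbitrary; lowering it would also report the short microsatellite-like repeats mentioned in the text.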
Phylogenetic analysis
The newly sequenced mitogenome and wingless gene of A. dioscorides were used for phylogenetic analyses, together with the mitogenomes of 21 other butterflies, representing six families (Nymphalidae, Danaidae, Lycaenidae, Pieridae, Papilionidae, and Hesperiidae). The phylogenetic analyses were carried out using Bayesian inference (BI) and maximum parsimony (MP) algorithms on the concatenated amino acid sequences of the 13 protein-coding genes and the nuclear wingless gene. Both the BI and MP trees had the same topology: (((((Nymphalidae + Danaidae) + Lycaenidae) + Pieridae) + Papilionidae) + Hesperiidae). In the two phylogenetic trees, species of the six families were divided into two major branches (Figs. 3 and 4). The first clade included species in the Papilionoidea, whereas the second clade included only Hesperiidae. Within Papilionoidea, the five families (Nymphalidae, Danaidae, Lycaenidae, Pieridae, and Papilionidae) are recovered as monophyletic in the phylogenetic trees.
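The family-level topology reported above can be encoded as nested tuples to check which groups form a clade. This is a minimal Python sketch that treats the tree as rooted with Hesperiidae as the earliest-diverging family, as the bracket notation implies; the function names are hypothetical.

```python
def leaf_set(node):
    """Return the set of family names under a node (string = leaf)."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def clades(node):
    """Yield the leaf set of every internal node in the tree."""
    if isinstance(node, str):
        return
    yield leaf_set(node)
    for child in node:
        yield from clades(child)


def is_monophyletic(tree, group):
    """True if `group` (two or more leaves) matches exactly one clade."""
    return frozenset(group) in set(clades(tree))


# Family-level topology as stated in the text.
TREE = ((((("Nymphalidae", "Danaidae"), "Lycaenidae"), "Pieridae"),
         "Papilionidae"), "Hesperiidae")
PAPILIONOIDEA = {"Nymphalidae", "Danaidae", "Lycaenidae", "Pieridae",
                 "Papilionidae"}

print(is_monophyletic(TREE, PAPILIONOIDEA))
print(is_monophyletic(TREE, {"Papilionidae", "Hesperiidae"}))
```

Under this encoding the five papilionoid families form a single clade that excludes Hesperiidae, which corresponds to the two major branches described in the text.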
Conclusion
We sequenced the complete mitochondrial genome and wingless gene of Ampittia dioscorides and analyzed the phylogenetic relationships between Papilionoidea and Hesperioidea. The results based on the concatenated amino acid sequences of 13 protein-coding genes and the nuclear wingless gene are congruent with traditional classification based on morphology and support the view that butterflies are divided into two superfamilies, Papilionoidea and Hesperioidea.
Higher Education Press and Springer-Verlag Berlin Heidelberg