1 INTRODUCTION
Tree architecture is pivotal to horticultural productivity and management. Trees with appropriate canopy and height more efficiently intercept light, increasing productivity and reducing labor costs. In addition, trees with unique architecture (e.g., weeping growth habit) are generally considered to have value as ornamental trees. Over recent decades, favorable architectural traits have been selected to greatly increase field and horticultural crop yields
[1]. Plant architecture is not only determined by the organization and activities of meristems, stems, branches, leaves and inflorescence
[2], but is also affected by environmental factors
[3,4]. Previous studies on plant architecture have focused mainly on herbaceous model species such as
Arabidopsis thaliana[5], rice
[6] and maize
[7]. However, genetic information of tree architecture is poorly understood in perennial woody plants, and this is attributed to their complex segregation patterns, heterozygous genome and extended juvenility.
Currently, inexpensive next generation sequencing (NGS) has led to the identification of genetic information associated with the control of tree architecture, and several causative genes or alleles controlling specific aspects of shoot architecture have been identified, mainly dwarf
[8–10], columnar
[11–13], pillar
[14] and weeping traits
[15–17]. Bulked segregant analysis (BSA) is a frequently used method in genetic mapping and candidate identification by analyzing two pools of genomes of contrasting phenotype
[18], allowing for inexpensive mapping and causal identification of mutation from forward genetic screens.
RNA sequencing (RNA-seq) is a widely adopted NGS technology which reflects relative transcript concentrations and can also be used to mine for single nucleotide polymorphisms (SNPs) derived from coding sequences. A modification of BSA strategy uses bulked segregant RNA-seq (BSR-seq) reads to rapidly and efficiently map genes controlling mutant phenotypes in segregating populations
[19]. One of the advantages of BSR-seq is that it can be used for both the identification of QTLs and tests for global patterns of gene expression. In addition, the BSR-seq strategy has also been extended to mapping genes defined by major QTL loci or dominant mutants
[19]. Some latest studies using a BSR-seq strategy identified key genes or major QTLs in plant architecture, including the gibberellic acid receptor
GID1c for dwarf phenotype
[10] and a conserved sterile alpha motif domain gene for weeping growth habit in peach
[17], the major QTL mapping for weeping trait in apple
[16], and the genes associated with leafy head formation in Chinese cabbage
[20]. In addition, similar studies using BSR-seq data have also been adapted for many other traits in plants such as plant growth and development
[21,22], disease resistance
[23,24] and stress responses
[25]. Although BSR-seq has been used extensively to map important genes in plants, it also has the potential to generate false-positive SNPs or allele-specific expression caused by low-quality reference genomes or uncorrected decisions of tissues
[19,26]. It would therefore be desirable to combine BSR-seq with causal genes or QTLs using more than two approaches and replicates
[27].
The weeping growth habit, a specific phenotype in tree architecture, is common in woody species, representing a particular form of tree architecture and a unique element in landscape architectural aesthetics. Compared with upright trees that grow upwards at certain angles, the shoots of weeping trees initially grow upwards and then their tips gradually grow downward for unknown reasons during shoot growth and development
[28]. The inheritance and allelism of the weeping trait is highly complex and variable. Previous studies indicate that weeping phenotypes in different species have different segregations including a single dominant locus in apple
[16,29], recessive loci in eastern redbud and peach
[30,31], and both recessive and dominant loci in chestnut
[32]. In addition, observations of the weeping phenotype in peach suggest that it is a consequence of incomplete dominance and genetic interactions
[33]. In
P. mume, the weeping trait is controlled by a major recessive locus on linkage group 7 coupled with some unknown minor loci
[15]. In our recent studies, we also revealed genetic interaction involved in the control of weeping traits in
P. mume and the epistatic loci and minor loci have been identified. However, the expression patterns and regulated relationships of genes located in these epistatic loci and QTLs remain poorly defined. Therefore, a better understanding of how these genes coordinate the regulation of the weeping trait in
P. mume would provide important insights into shoot orientation regulation in woody plants with potential horticultural benefits.
Here, we report the use of BSR-seq and comparative transcriptome analysis to reveal molecular pathways involved in weeping trait formation in P. mume. Based on a detailed evaluation of phenotypic variants in weeping and upright trees derived from F1 hybrids of cultivars ‘Liuban’ (upright) × ‘Fentai Chuizhi’ (weeping), we profiled transcript abundance in the buds and stems of plants with contrasting phenotypes. By comparing weeping pools with upright pools, we identified the Pm024074 allele encoding a glucosyltransferase in an overlapping major QTL that may specifically relate to the control of the weeping trait. In addition, weighted correlation network analysis (WGCNA) shows that many genes in the first and second neighborhood of Pm024074 were relevant to plant architecture. Our study provides new insights into understanding the genetic basis of weeping trait formation and molecular breeding in P. mume.
2 MATERIALS AND METHODS
2.1 Plant materials and experimental procedures
The F1 progenies of artificial crossing P. mume cultivars ‘Liuban’ (upright phenotype as the female) with ‘Fentai Chuizhi’ (weeping phenotype as the male) were used in this study. The full-sib family of 432 hybrids were grown in 2012 in Deqing County, Zhejiang Province, China (30.53° N, 119.97° E) in a randomized complete block design. After the evaluation of phenotypes in the field, we created bulks of weeping and upright phenotypes originating from the hybrid population. Each bulk contained 20 genotypes that were observed to be contrasting phenotypes for either weeping or upright branching. To avoid other contrasting traits, traits associated with leaf, basal stem, plant height and crown width were further measured and had effectively normal distributions (Fig. S1). The young buds and stems of seedlings of these 20 genotypes were collected for RNA-seq. Finally, we obtained bulks of contrasting traits of bud and stem with three biological replicates per bulk.
The buds and stems of weeping and upright trees derived from two different growth conditions were collected for the validation of gene expression using qRT-PCR. Samples for the tissue-specific expression analysis were collected from grafting individuals grown in a greenhouse. Different parts of shoots were collected when they sprouted to 5–10 cm in length in spring (March). All samples were stored at -80°C prior to use.
2.2 Evaluation of shoot weeping growth habits
Real-time imaging of shoot weeping growth habit in
P. mume was conducted to monitor the growth dynamics in the studio using time-lapse photography (Fig. S2). The weeping and upright shoots were grafted on the same rootstock to observe the initial status of lateral bud sprouting (Fig. S2). In addition, we also fixed the base of new weeping shoots (<2 cm) to maintain their growth upright, and then observed the growth characteristics before and after removing the shoot apex (Fig. S3). Twenty weeping grafting plants and 20 upright grafting plants were used to observe the growth habit of lateral branches. The weeping phenotype was measured using the nested phenotyping method
[34]. One-year-old dormant shoots were collected and divided into five parts (T1 to T5) and the T5 angle was defined as the weeping angle and measured using ImageJ software
[35]. The branch angle at dormancy stage (defined as A2) was also measured as the weeping related traits (Fig.1(e)). Contrasting individuals were selected through joints of T5 and A2 angles.
2.3 RNA extraction, quantification, library construction and sequencing
Equal amounts of total RNA (2 mg each) from buds and stems of 20 strongly weeping and upright individuals were pooled to generate four segregant bulks. Tissue collection for RNA-seq is shown in Fig.2(a). The RNA was extracted using an EasySpin RNA Reagent (Aidlab Biotechnologies, Beijing, China) according to the manufacturer’s instruction. The RNA concentration, degradation, and integrity were assessed using a Nanodrop spectrophotometer, 1% agarose gel and an Agilent 2100 system, respectively. A total of 3 mg of RNA per bulk was used to construct the cDNA library. This library was generated using a NEBNext Ultra RNA Library Prep Kit for Illumina (NEB, Ipswich, MA) following the manufacturer’s instructions. The library quality was checked using the Agilent Bioanalyzer 2100 system. The amplified fragments were sequenced using an Illumina HiSeq X-10 platform and 150 bp paired-end reads were generated. The insertion size was 270 bp. All raw data were stored in the Genome Sequence Archive under project ID CRA001273.
2.4 Bulked segregant RNA-sequence data processing and analysis
The raw reads of four bulked samples were filtered by removing adapter, poly-N sequences and low-quality sequences using Trimmomatic v.0.39 software
[36]. Then high-quality clean reads of four bulked samples and whole genome resequencing parents were mapped to the reference genome v.1.0 of
P. mume using BWA under the default parameters
[37]. SNPs and InDels were identified using GATK 3.8 software
[38]. VCFtools v.0.1.15
[39] was used to filter low-quality variations with minor allele frequency > 0.2, read depth of > 6, and Phred-quality score > 20. The difference in allele frequency between weeping and upright traits was calculated using G′ method via QTLseqr package
[40,41]. Manhattan plots were used to show the results using the ‘qqman’ R package
[42]. Raw counts of each read were normalized to fragments per kilobase of transcript per million mapped read (FPKM) values to evaluate the expression levels based on the alignments of reads.
2.5 Identification of differentially expressed genes and annotation analysis
Differentially expressed genes (DEGs) of the four bulked samples were analyzed using the DESeq R package
[43]. The Benjamini-Hochberg approach was then used to correct
P-values to control a false discovery rate<0.001. The DEGs were identified with an adjusted
P-value<0.05 and an absolute fold-change ≥1.5 (|log
2Ratio|≥0.585). Functional annotation of all DEGs was conducted using the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases
[44,45]. MapMan software and the Mecrator web tool
[46] were used to identify DEG categories using TAIR version 10 as reference, with
P<0.05 for ontology. Diagrams of PCA analysis, Venn and hierarchical clustering heat maps were generated using R v3.5.3 software.
2.6 Co-expression network analysis using differentially expressed genes
To identify genes significantly correlated with the traits (weeping and upright) or tissues (bud and stem) of
P. mume, the normalized read counts from 1959 DEGs with 2-fold cutoff were used to conduct co-expression gene networks with the WGCNA package in R v3.5.3 software
[47]. The DEGs between upright and weeping samples in bud and stem tissues were initially grouped together. Redundant genes and genes with FPKM<1 were filtered to generate a set of confident DEGs to carry out co-expression analysis. The unsigned weighted network was initially created with the parameters following threshold power of 9, minimum module size of 30, and a branch merge cut height of 0.05. The co-expression gene modules were identified by dynamic hybrid tree cut algorithm and visualized using Cytoscape v3.6.1
[48]. Genes with similar expression patterns across phenotypes, tissues or their combinations were classified into a module.
2.7 Comparative analysis of nucleotide polymorphisms between weeping and upright landraces
The resequencing data of 217 landraces (27 weeping accessions and 190 upright accessions) were used to calculate the nucleotide polymorphisms between weeping and upright landraces. The accession number and Bio-Project number are RP093801 and PRJNA352648, respectively. The methods of data preprocessing, SNP calling and quality control were described by Zhang et al.
[49]. Using VCFtools v.0.1.15
[39] we calculated the genome-wide nucleotide diversity, allele frequency, and genetic differentiation. SNPs were validated using the MassARRAY system
[50]. The genotyping primers were: forward primer ACGTTGGATGGGTGTGTTTCTTTCTAACGAG, reverse primer: ACGTTGGATGATTGACCACAATCCCAGCAG and single nucleotide extension primer: CGGATGTGCAAGAAGTA. The samples and phenotypes are shown in Data S1. Transcription factor binding sites and miRNA targeted sites for variation sequences were predicted using the JASPAR database and miRBase, respectively.
2.8 Candidate gene expression validated by qRT-PCR
The buds and stems from weeping and upright samples were collected to validate the gene expression of putative candidates. Each sample was analyzed in triplicate for all three biological repeats. Total RNA was extracted using the same method described for RNA-seq. First-strand cDNA templates were synthesized from ~2
mg of total RNA using a PrimeScript
TM RT Reagent Kit (TaKaRa, Shiga, Japan) according to the manufacturer’s instructions. A set of primers were designed for qRT-PCR using Integrated DNA Technologies online tool and checked their specificity using the Primer-BLAST tool on NCBI. The
protein phosphatase 2A (
Pm006362) gene was used as an internal reference, in line with previous reports
[51]. One-step qPCR programs were conducted using a CFX96 Real-Time PCR Detection System (Bio-Rad, Hercules, CA) with SYBR Premix Ex
TaqII (TaKaRa) and the qRT-PCR reaction system as described by Xu et al.
[51]. Expression levels were calculated using the delta-delta CT method
[52].
2.9 Cloning of the Pm024074 gene
Specific primers of the
Pm024074 gene were designed based on the CDS sequences from the
P. mume genome. Total RNA was extracted from buds of three heterozygous weeping accessions (cultivars ‘Moshan Chuizhi’, ‘Hanxue Chuizhi’ and ‘Fenpi Chuizhi’) and upright accessions (cultivars ‘Caishan Lve’, ‘Danban Lve’ and ‘Changrui Lve’). After mRNA was isolated and transcribed to cDNA, the gene of
Pm024074 was amplified with the specific primers using methods in our laboratory
[53]. The specific primers were: forward primer ATGGACCCGGTTCAAGACCG and reverse primer TTAGATCTTGAGGCCCT- TCCATACC. All target sequences were confirmed by Sanger sequencing.
3 RESULTS
3.1 Assessment of the weeping growth habit
When dormant lateral buds sprouted the shoots of weeping trees immediately exhibited a horizontal growth tendency but the upright trees were vertical (Fig.1(a)). These observations indicate that the weeping orientation is generated at an early stage of bud development. We also found that the shoots of weeping trees showed upright growth when the shoot base was fixed. After removal of the shoot apex the new lateral branches regenerated with the weeping growth habit and the original branch angle increased in size (Fig.1(b)). We therefore suggest that mechanical stress from the increased weight of the shoots might further accelerated the branch and weeping angles. We then further investigated the effects of gravitropism using time-lapse photography to capture the grafted bud movement over a 12-d period until the shoots showed significant downward growth (Fig.1(d)). We found that the shoots of weeping trees initially grew upwards but the branch angle remained larger than in upright trees. As the shoots grew the branch angles of weeping trees gradually increased and the shoot tips began growing downwards. However, the branch angles were minor in upright trees (Fig.1(c,d)). This phenomenon supports the notion that gravistimulation or a lack of mechanical rigidity of elongating stems might be an important factor accelerating weeping by accelerating branches downward growth.
In addition, we further evaluated the relationship between branch angles and weeping angles through F1 segregating hybrids of ‘Liuban’ (upright phehotype) × ‘Fentai Chuizhi’ (weeping phenotype). Linear regression analysis shows that the branch angle (A2) was strongly correlated with the weeping angle (T5) (R2 = 0.628) (Fig.1(e)). This indicates that A2 and T5 angles can be used to discriminate phenotypes between weeping and upright traits. Accordingly we identified 67 strongly weeping individuals with A2≥70° and T5≥70°, and 72 strongly upright individuals with A2≤50° and T5≤50° (Table S1).
3.2 RNA sequence data processing and global analysis
An RNA-seq method using a BSA strategy was used to identify functionally causative alleles and corresponding gene expression dynamics responsible for weeping traits
[19]. RNA-seq data were generated from contrasting trait bulks of young buds and stems (Fig.2(a)). A total of 174 Gb clean data were generated with high quality (Q30 > 91.6%) from four resulting bulks and slightly higher percentages (75%–79%) of the reads were mapped to the reference genome (Tables S2–S3). Normalized read counts (fragments per kb of transcript per million mapped reads, FPKM) for each gene were calculated, and 17,456 genes were identified to be expressed (FPKM > 1.0) in at least one bulk (Data S2). In total, 24,201 genes were successfully annotated using at least one of six databases (COG, GO, KEGG, Swiss-Prot, eggNOG and Nr) and the annotated results of each database are summarized in Fig.2(d) and Table S4. The detailed annotation of all genes is listed in Data S3.
The gene number for four levels of FPKM values in each bulk was calculated. We found the four bulks had similar numbers of expressed genes (around 16,000) and the largest number of expressed genes was at 10<FPKM<50 level (Fig.2(c)). DEGs with 2-fold cutoff were used for each pairwise comparison. The repeatability and relatedness of RNA-seq data of different samples was evaluated by a cluster dendrogram using 2600 DEGs. The three biological replicates of all bulks showed good correlation (Fig.2(b)). Based on these analyses a total of 495 DEGs in UB-WB pairwise, 328 DEGs in US-WS pairwise, 1533 DEGs in UB-US pairwise and 1507 in WB-WS pairwise were identified (Fig.2(e) and Data S4). Most DEGs were identified from UB versus US and WB versus WS compared groups, indicating a large number of DEGs caused by tissue specificity. Most DEGs were upregulated in weeping pools for comparison of upright pools versus weeping pools in the same tissues (Fig.2(e)).
3.3 Identification of specific DEGs and enrichment analysis
Venn analysis shows that 403 (upregulation/downregulation, 277/126) and 376 (161/215) specific DEGs from bud vs. stem of the same phenotype (UB-US vs. WB-WS) were identified in upright bulks and weeping bulks, respectively, whereas 1131 DEGs overlapped between the two pairwise comparisons (Fig.2(f)). Comparing DEGs from weeping vs. upright of the same tissues (UB-WB vs. US-WS), 276 (235/41) and 109 (83/26), DEGs were specifically expressed in bud and stem pools, respectively, and 219 (184/35) overlapping DEGs were identified in both buds and stems (Fig.2(g)). These results indicate that gene expression levels were overall higher in weeping than in upright trees, but more genes downregulated in the stems of weeping than of upright trees.
MapMan functional categories show that specific DEGs of the four comparison groups (UB-US, WB-WS, UB-WB and US-WS) were mainly enriched for secondary metabolism, hormone metabolism, stress, RNA, protein, signaling and transport (Fig. S4(a) and Table S5). Most downregulated specific DEGs in the stems of weeping trees were mainly associated with secondary metabolism, hormone metabolism, RNA and signaling (Fig. S4(b) and Data S4), including ELI3-1 related to lignin biosynthesis, MYB-related transcription factor involved in circadian rhythm regulation, SAUR-like auxin-responsive protein, and NPH3 family protein involved in phototropic-response downregulated more than 2-fold in the stems of weeping P. mume.
GO enrichment analysis shows that most specifically expressed genes in biological processes were downregulated in the stems of weeping trees (Fig. S5(a,b). In addition, we found that four processes were specifically enriched in weeping trees, namely immune system processes, growth, and rhythmic processes and supramolecular fibers (Fig. S5(c) and Table S6). Genes among these processes might be related to the weeping trait, such as bZIP (Pm004596) involved in light-regulated transcriptional activation and mannitol dehydrogenase (Pm002264) involved in lignin biosynthesis.
3.4 Construction of the gene co-expression networks using weighted correlation network analysis
DEGs (1959) with 2-fold cutoff and median absolute deviation > 0.01 were used for WGCNA to investigate the network relationships between DEGs. Three distinct co-expression modules were identified as shown by the dendrogram (Fig.3(a)). Genes within the same module had high correlation coefficients and similar expression trends across samples, and significantly correlated relationships between the traits (weeping and upright) or tissues (buds and stems) of P. mume and the modules were identified (Fig.3(b)). The brown module was significantly related to trait with highly positive coefficients (0.96). Genes in the turquoise and blue models showed significant tissue-specificity with high coefficients of -0.99 and 0.92, respectively. We further investigated the relationship between genes in modules and traits or tissues, indicating that genes in the module were strongly associated with corresponding traits or tissues (Fig.3(c), Fig. S6). Genes in each significant module can therefore be considered representative of distinct trait or tissue types. Expression patterns of genes in each module are shown in Fig. S7.
In the brown module, most genes were highly and specifically expressed in both buds and stems of weeping tree but slightly expressed in upright trees (Fig.3(d), Fig. S8). The functional classification shows that most genes were associated with hormone metabolism, transcription regulation, signaling, and proteins (Fig.3(d) and Data S5). Genes associated with transcriptional regulation, hormone metabolism, and secondary metabolism have important involvement in the regulation of shoot architecture, especially the genes regulating gibberellin (GA), auxin, brassinosteroid (BR), cytokinin (CK) and lignin biosynthesis pathways
[1,54,55].
Networks of genes associated with transcriptional regulation, hormone metabolism, and secondary metabolism in brown modules with edge weight > 0.45 were constructed to investigate the relationships of hub genes (Fig. S9(a) and Data S6). Ten genes associated with IAA, BR, GA, CK and lignin were of particular interest in the network (Fig. S9(b) and Table S7). Strikingly, seven of ten genes are involved in the regulation of auxin and lignin, including four auxin related genes (
ILR-like5, Prunusmume_newGene_2032;
Cytochrome b561,
Pm012502;
GH3,
Pm021243; and
Aux/IAA,
Pm005182) and three lignin related genes (
AMP-dependent synthetase,
Pm012986;
Cytochrome P450,
Pm027089; and
Caffeoyl-CoA O-methyltransferase (
CCoAMT),
Pm027248). The other three genes are related to CK (
Zeatin O-xylosyltransferase,
Pm023083), GA (
Gibberellin 2-oxidase,
Pm011163), and BR (
Squalene monooxygenase,
Pm021002). All these genes are upregulated in weeping
P. mume except lignin related
CCoAMT (Table S7).
AUX/IAA and
CCoAMT are two key genes that are involved in IAA regulation
[56,57] and biosynthesis of lignin
[58], respectively. Some genes directly connected with
AUX/IAA and
CCoAMT in the networks have been shown to be associated with shoot regulation. For instance, GRAS (
SCARECROW-like 21,
Pm029452) transcription factor involved in the BR signal transduction pathway, highly expressed
OsGRAS19 can reduce shoot strength
[59,60].
3.5 Mapping an overlapping QTL on chromosome 7 via bulked segregant RNA-sequencing
We merged RNA-seq clean reads of WB/WS bulks and UB/US bulks into a weeping bulk and an upright bulk, respectively to identify the gene variation in weeping
P. mume. A total of 3,874,124 SNPs between weeping and upright
P. mume (including two bulks and parents) were identified by mapping the clean reads to the reference and quality control of SNPs. All SNPs were divided into continuous 50-kb windows and the G′ value of each window was calculated and is presented (Fig.4(a)). Five putative QTLs on chromosome 7 (Pm7) were identified with threshold Q value<0.001 (Fig.4(a) and Table 1). Pm7-qtl-4 located at 9,693,685–10,654,831 bp (~ 0.96 Mb) region partially overlapped with our previous hybrid population and landrace population studies
[15,49]. This QTL is of particular interest. Nucleotide diversity analysis of this QTL region further demonstrates a distinct selective sweep located in the 10.1–10.7 Mb region (Fig.4(b)). These results suggest that Pm7-qtl-4 is a higher confidence QTL associated with the weeping trait in
P. mume.
As plant phenotype can be affected by gene expression levels, we further investigated the expression level of genes located in Pm7-qtl-4 QTL interval using RNA-seq data. There were 52 expressed genes in the region but only four DEGs (log
2FC≥1) were identified (Fig.4(c) and Data S7). Functional annotation and fold-change of gene expression indicate that
Pm024074, belonging to the UDP-glycosyltransferase superfamily, was the most likely associated with weeping trait in
P. mume. This gene showed greatly different expression between weeping and upright
P. mume with very large fold-change (log
2FC > 3) (Fig.4(c)). A search for homologous protein revealed a significant domain: coniferyl-alcohol glucosyltransferase (Fig.4(h)). Protein contained in this domain was shown to be involved in the glucosylation of lignin precursors and cambial growth
[61,62]. A recent study indicates that the UDP-glycosyltransferase (UGT) superfamily may also be involved in metabolic redirection and biosynthesis of flavonoids and monolignols
[63]. The qRT-PCR results further indicate that the expression level of
Pm024074 was significantly higher in the weeping parents and individuals than that in the upright parents and individuals, and this difference in expression was not impacted by growth environments (Fig.4(d–f)). Tissue specific expression pattern analysis shows that the overall expression level of
Pm024074 was increasing from shoot tip to basal stem in both weeping and upright
P. mume and was higher in each part of weeping than in upright
P. mume (Fig.4(g)). These results hint that
Pm024074 is potentially associated with weeping traits in
P. mume.
3.6 Discovery of genomic differences between weeping and upright P. mume
The weeping growth habit of
P. mume has important value for landscape architectural aesthetics and most weeping landraces are artificial selections with this aesthetic appeal
[64]. We wanted to determine if candidate
Pm024074 was a key gene for artificial selection. To this end, SNPs of
Pm024074 and its flanking 2-kb were extracted from 27 weeping and 190 upright resequencing accessions. The nucleotide diversity (π) and genetic differentiation (Fst) of 100-bp non-overlapping windows were then calculated. The calculation shows that π and Fst between weeping and upright accessions were clearly different at 10,245,401–10,247,301 region (Fig.5(a)). We further compared the difference in variation sites in
Pm024074 protein coding region between two bulks and the parents. Pm_10245464 SNP (defined
PmW allele) was exclusively identified in both weeping bulk and weeping parents (Fig.5(b)). We successfully genotyped this allele in 235 individuals (Fig.5(c) and Data S8). Allele frequency shows that the
PmW allele was strongly associated with weeping traits (Fig.5(d)). Individuals with CC genotype showed significantly higher branch angle (A2) and weeping angle (T5) compared to individuals with TT genotype (Fig.5(e)).
The
PmW allele was a synonymous mutation SNP located at position 66 bp in the cDNA of the
Pm024074 gene, a transition of T
66 in upright
P. mume to C in weeping
P. mume (Fig.6(b)). This variation cannot cause an amino acid change but gene expression levels could also influence plant phenotype
[65]. We also sought to determine if variation in the promoter region might impact gene expression. We therefore compared the promoter region sequences of the parents, ‘Fentai Chuizhi’ (weeping) and ‘Liuban’ (upright) (Fig.6(a)). A 470-bp deletion was identified in upright
P. mume, which likely contained four transcription factor binding sites (WRKY18, ARF16, Adof1 and ARR18) (Fig.6(b)). This indicates that a difference in expression level might be caused by the variation in the promoter region. Notably, we found that
PmW was a T/C heterozygous locus in most weeping landraces based on allele frequency (Fig.5(d)). Full-length cDNA of
Pm024074 was cloned and sequenced from three heterozygous weeping and upright landraces to investigate whether allele C was specifically expressed in weeping
P. mume. The result shows that only allele C was cloned in weeping
P. mume (Fig.6(c)). Notably, a putative miR838 was predicted to target this site (Fig.6(b)). Overall, the above results indicate that variation in the
Pm024074 gene might change its spatial and temporal expression patterns resulting in abnormal development of shoots in
P. mume. Details of the sequence variation information are shown in Supplementary T1.
3.7 PmUGT72B3 co-expressed correlation networks
The candidate
Pm02474 gene belongs to the UDP-glycosyltransferase superfamily, best matched
Arabidopsis thaliana UGT72B3 (
AT1G01420.1) and is defined as
PmUGT72B3. UGT can catalyze sugar moiety transfer to the specific acceptor which is important in plant growth and development via conjugation of phytohormones with sugars and amino acids
[66,67]. Networks around
PmUGT72B3 were extracted from the brown module with edge weight>0.45 based on WGCNA analysis to investigate the upstream or downstream regulated genes of
PmUGT72B3. The gene co-expression network shows that 23 genes associated with secondary metabolism, hormone metabolism and transcription regulation were directly related to
PmUGT72B3. The first and second neighboring genes of
PmUGT72B3 are shown in Fig.7 and more detailed information on these genes is listed in Table S8.
The genes associated with auxin, gibberellin and lignin, including Chitin-inducible gibberellin-responsive protein 1 (
Pm029452), Gibberellin 2-oxidase 1 (
Pm011163), Phytochrome-associated protein 1 (
Pm005182) and 12-oxophytodienoate reductase 2 (
Pm027056), which correlates with Trans-caffeoyl-CoA 3-O-methyltransferase (
CCoAMT,
Pm027248) are of particular interest among the 23 main hub genes in the
PmUGT72B3 network. All these genes were upregulated except the lignin-related
CCoAMT gene. In addition, transcription factors putatively binding to a 470-bp deletion region of
PmUGT72B3 promoter are also potentially crucial genes involved in the regulation of weeping traits, including WRKY18 (Pm005698), Aux/IAA (Pm005182), cycling DOF factor 3 (CDF3, Pm006485), and early-phytochrome-responsive 1 (EPR1, Pm025253) potentially binding site. Notably, multiple co-expressed genes are associated with the phytochrome signal such as
CIGR1,
PAP1,
EPR1 and
CDF3[68–70], suggesting that
PmUGT72B3 might control weeping traits via influencing phytochrome. Therefore, a possible hypothesis is that
PmUGT72B3 might influence the expression of genes associated with hormones, lignin and light signal to collaboratively manipulate weeping phenotype in
P. mume.
4 DISCUSSION
BSA-seq is an efficient and economic approach for the identification of genetic variants for important horticultural or agronomic traits
[19,71,72]. Weeping growth habits are popular in woody rosaceous plants but their genetic mechanisms vary greatly between different species, including recessive, dominant and epistatic genetic effects
[15,16,33]. Non-allelic weeping alleles have been identified in
Cercis canadensis[30]. These findings suggest that the genetic control of weeping traits is very complicated. In
P. mume, our earlier study indicates that the weeping traits were controlled by a major recessive locus together with some modified loci
[15]. However, insufficient markers and gene expression information hindered the identification of candidates and application of molecular assisted selection of weeping traits in
P. mume breeding programs.
Here, dynamic investigation of the shoot growth habits of weeping and upright P. mume show that the meristem and elongation regions might be key parts for the determination of weeping traits. Integrated transcriptome analysis, WGCNA, bulked segregant analysis and selective sweep analysis provide new insights into genetic control and regulation mechanisms of weeping P. mume. We identified five QTLs located on Pm7 potentially associated with weeping traits of P. mume and a key candidate gene involved in glucosylation of lignin precursors and cambial growth. A causative allele located at the candidate was first identified and validated in P. mume. These findings provide valuable information for understanding the genetic regulation of weeping traits and for molecular breeding in P. mume.
Gravity and light sensing are two of the most important environmental processes that control shoot growth direction in plants
[73–76]. In addition, stem mechanical rigidity via the formation of tension wood on the upper side of stems is also a crucial factor in the upward support of the stem through increasing a contractile force
[77]. Here, we have found that the shoots of weeping trees displayed gravitropic responses at an early stage of bud sprouting and the stem elongation significantly increased branch weeping, causing a larger branch angle in weeping than in upright shoots (Fig.1). When we settled the basal branch to reduce the stress of gravity the shoot tips maintained upward growth. In addition, the classic starch statolith hypothesis suggests that starch makes an important contribution to gravity sensing which occurs in the endodermis of the stem
[73]. In
Arabidopsis, studies have shown that the endodermis plays an important role in shoot gravity sensing, and the inflorescence stems of mutants show no response to gravity in darkness
[78,79]. According to the phenotype observation results we suspect that the endodermis in young shoot tissues (meristem or the elongation region) might be a crucial part in the formation of weeping traits. Tree architecture is highly plastic and largely affected by environmental stimuli. Certainly, more investigations into phototropism, gravitropism, tissue anatomy, hormonal composition and wood strength are needed to comprehensively investigate their contributions to the weeping phenotype in
P. mume.
Bulked segregant analysis strategy RNA-seq was conducted to reveal a potential molecular mechanism underlying the weeping trait in
P. mume. Overall, the number of expressed genes was higher in weeping than in upright trees, and most genes were upregulated in young buds and stems of weeping trees (Fig.2(e–f)). Similar expression regulation was also found in weeping shoot tips in peach
[17]. When comparing buds, numerous specific DEGs associated with secondary, stress, hormone metabolism, RNA, protein, transport, development and signaling were significantly downregulated in the stems of weeping
P. mume (Fig. S4(b)). Functional categories show that these genes have important roles in the regulation of phototropism, gravitropism and wood strength. Some genes involved in cell walls, lignin biosynthesis, auxin response and phototropic response are significantly downregulated by more than double (Data set S1). For instance, 1-aminocyclopropane-1-carboxylic acid synthase regulates cell wall function via interaction with FEI proteins
[80]; changes in the expression of auxin related SAUR-like and IAA carboxyl methyltransferase may alter plant stem elongation, gravitropism, phototropism and gene transcription
[81–84]. In other weeping species, numerous DEGs associated with hormone and light signaling transduction, wood development and tropism are also significantly enriched, for example in crape myrtle
[85],
Salix matsudana[86] and peach
[17].
Next generation BSA has been a particularly fruitful approach for rapidly identifying QTLs associated with a trait of interest
[19,41]. Here, five QTL loci on chromosome 7 were successfully mapped based on BSR and one overlapping QTL was confirmed in cultivars and hybrids. This indicates that the targeted QTL region represents a unique source of gene variants that can be exploited in bulked RNA-seq analysis. The present study has uncovered four QTLs significantly linked to weeping traits inherited from ‘Fentai Chuizhi’. This appears to be consistent with the segregation of phenotypes that some offspring represented diverse phenotypes, such as large branch angle or flat branches (Fig. S10). Our previous study also shows 95% confident QTL intervals almost covered the whole chromosome 7
[15]. These five QTLs might therefore be an important region of chromosome 7 in controlling weeping traits.
Recently, our team found that multiple epistatic loci of weeping and upright
P. mume, including overlapping Pm7-qtl-4 QTL, strongly interacted with another significant QTL located at 11.0–11.3 Mb on the same chromosome and explained 13.9% of the phenotypic variation. Similar genetic interactions of weeping traits have also been found in other rosaceous species such as weeping peach and weeping apple
[16,33]. In a future study it would be interesting to determine if these two interacting loci coregulate the weeping traits in
P. mume. Overlapping Pm7-qtl-4 is considered to be an important QTL for controlling weeping traits, and QTLs and linkage markers around this region were also detected in our previous studies
[15,49,87,88]. However, the core allele variations and candidate genes remain unclear. Here, we have identified a weeping allele called
PmW, located at the
PmUGT72B3 (
Pm024074) gene, potentially coding for a UGT protein with a conserved coniferyl-alcohol glucosyltransferase domain. Proteins with this domain make important contributions to the glucosylation of lignin precursors and cambial growth
[61,89,90]. The
PmW allele located in this candidate was a synonymous mutation SNP, which was predicted to be targeted by miR838. The promoter region of the candidate gene has a 470-bp deletion in upright
P. mume potentially containing multiple TF binding sites. However, whether these variations impact weeping traits remains to be demonstrated by molecular biological evidence, but both our gene expression evidence and SNP validated results support the hypothesis that
PmUGT72B3 controls weeping traits in
P. mume (Fig.5
, Fig.6). Several genetic and molecular biology studies have indicated that UDP-glucosyltransferase family genes are involved in the regulation of plant architecture
[91–93].
Auxin conjugates are known to be critical in maintaining auxin homeostasis in higher plants, mainly ester-linked IAA-sugar conjugates, amide-linked IAA-amino acid conjugates and methyl esters of IAA
[82,92,94]. Glycosylation of auxin is known to be an important mechanism to plant architecture by maintaining auxin homeostasis. In rice, a mutant of the UDP-glucosyltransferase family member, α,1,3-fucosyltransferase, can reduce basipetal auxin transport and gravitropic response and increase tiller angle
[93]. Overexpression of
OsIAAGLU and
OsIAGT1 also resulted in reduced sensitivity resulting in reduced sensitivity of auxin and altered gravitropic response of the roots in the transgenic rice
[91,95]. These results support the hypothesis that exceptionally high expression of
PmUGT72B3 in buds and young stems might reduce the sensitivity of the IAA and gravitropic responses, and result in a weeping phenotype in weeping
P. mume. In addition, several hormone, light and lignin hub genes directly correlated to
PmUGT72B3 are also identified via WGCNA, including (
ILR-like5 and
GH3 related IAA-amino acid conjugate, Aux/IAA related auxin active repressors,
CCoAMT related lignin biosynthesis,
LHY related circadian rhythm,
SCARECROW-like 21 related far-red light response,
Zeatin O-xylosyltransferase related cytokinin, and
Gibberellin 2-oxidase GA related biosynthesis (Table S8). These genes are essential in the control of gravitropic growth and development in light, affecting shoot or root architecture
[1,28,54,96]. Although little is known about the functions of UDP-glucosyltransferase in the regulation of tree architecture, futures studies on the regulated relationship between
PmUGT72B3 and its neighboring genes may help in understanding the genetic mechanism underlying the weeping phenotype in
P. mume.
5 CONCLUSIONS
The phenotype observations show that the weeping traits might be independently determined by the young meristem and stem. According to the BSR-seq analysis and WGCNA, we have identified a core candidate PmUGT72B3 coding a protein containing conserved coniferyl-alcohol glucosyltransferase, which is a key protein in regulating lignin and IAA. Compared to upright P. mume, PmUGT72B3 was exceptionally highly expressed in young buds and stems of weeping P. mume. Higher expression of UDP-glucosyltransferase appears to act by reducing the gravitropic response and inactivating IAA. Variations in PmUGT72B3 were detected and validated, and were significantly associated with weeping phenotypes. Further molecular experiments are required to verify its molecular functions. Identification of this candidate gene from weeping P. mume may greatly contribute to our understanding of the genetics basis of the weeping phenotypes in rosaceous trees and provide benefits for horticultural productivity.
The Author(s) 2020. Published by Higher Education Press. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0)