Telomere-to-telomere gap-free genome assembly provides genetic insight into the triterpenoid saponins biosynthesis in Platycodon grandiflorus

Hanwen Yu, Haixia Wang, Xiao Liang, Juan Liu, Chao Jiang, Xiulian Chi, Nannan Zhi, Ping Su, Liangping Zha, Shuangying Gui

Horticulture Research, 2025, Vol. 12, Issue 5: 30

DOI: 10.1093/hr/uhaf030

Abstract

Platycodon grandiflorus has been widely used in Asia as a medicinal herb and food because of its anti-inflammatory and hepatoprotective properties. P. grandiflorus has important clinical value because of the active triterpenoid saponins in its roots. However, the biosynthetic pathway of triterpenoid saponins in P. grandiflorus remains unclear, and the related genes remain unknown. Therefore, in this study, we assembled a high-quality and integrated telomere-to-telomere (T2T) P. grandiflorus reference genome and combined it with time-specific transcriptome and metabolome profiling to identify the cytochrome P450s (CYPs) responsible for the hydroxylation processes involved in triterpenoid saponin biosynthesis. Nine chromosomes were assembled without gaps or mismatches, and nine centromeres and 18 telomere regions were identified. This genome eliminated redundant sequences from previous genome versions and incorporated structural variation information. Comparative analysis of the P. grandiflorus genome revealed that P. grandiflorus underwent the core eudicot γ whole-genome triplication (γ-WGT) event. We screened 211 CYPs and found that tandem and proximal duplications may be crucial for the expansion of CYP families. We outlined the proposed hydroxylation steps in platycodin biosynthesis, likely catalyzed by the CYP716A/72A/749A families, and identified three PgCYP716A, seven PgCYP72A, and seven PgCYP749A genes that showed a positive correlation with platycodin biosynthesis. By establishing a T2T genome assembly, transcriptome, and metabolome resource for P. grandiflorus, we provide a foundation for the complete elucidation of the platycodin biosynthetic pathway, which could in turn enable heterologous bioproduction, and a fundamental genetic resource for molecular-assisted breeding and genetic improvement of P. grandiflorus.

Cite this article

Hanwen Yu, Haixia Wang, Xiao Liang, Juan Liu, Chao Jiang, Xiulian Chi, Nannan Zhi, Ping Su, Liangping Zha, Shuangying Gui. Telomere-to-telomere gap-free genome assembly provides genetic insight into the triterpenoid saponins biosynthesis in Platycodon grandiflorus. Horticulture Research, 2025, 12(5): 30. DOI: 10.1093/hr/uhaf030


Acknowledgements

We thank Benagen Technology Co., Ltd (China) for the T2T assembly of P. grandiflorus. This work was supported by the National Natural Science Foundation of China (U21A20406), the Excellent Young Scholars Project of Natural Science Foundation of Anhui Province in China (2208085Y30), the Key Project Foundation of Support Program for the Excellent Young Faculties in Universities of Anhui Province in China (gxyqZD2022051), Science Research Project at the Universities of Anhui Province for Distinguished Young Scholars (2023AH020036), Young Elite Scientists Sponsorship Program by CACM (CACM-2023-QNRC2-B23), Traditional Chinese Medicine High-Level Key Discipline Construction Project of National Administration of Traditional Chinese Medicine Science of Chinese Medicinal Material Resources (pharmaceutical botany) (zyyzdxk-2023095), Research Funds of Center for Xin’an Medicine and Modernization of Traditional Chinese Medicine of IHM (2023CXMMTCM008), and China Postdoctoral Science Foundation-Anhui Joint Support Program under grant (2024T030AH).

Author Contributions

S.G., L.Z., and P.S. conceived, supervised, and managed the project. S.G., L.Z., P.S., H.Y., H.W., and X.L. designed the experiments. H.Y., J.L., and C.J. performed bioinformatic analyses. S.G., L.Z., P.S., H.Y., X.C., and N.Z. organized, wrote, and revised the manuscript. All authors have read and approved the final version of the paper.

Data availability

The T2T genome assembly and RNA-seq data of P. grandiflorus were deposited in the National Genome Data Center under the accession numbers PRJCA031148 and PRJCA026736. The genome data referred to in this article are: Arabidopsis thaliana (NCBI: GCF_000001735.4), Arctium lappa (NCBI: GCA_023525745.1), Camellia sinensis (NGDC: GWHASIV00000000), Daucus carota (NGDC: GCA_030127425.1), Cannabis sativa (NGDC: GWHABGK00000000), Helianthus annuus (NCBI: GCA_026651805.1), Lactuca sativa (NGDC: GCF_002870075.4), Oryza sativa (NCBI: GCF_001433935.1), Panax ginseng (NGDC: GWHBEIL00000000.1), Solanum lycopersicum (NCBI: GCA_915070445.1), Vitis vinifera (NCBI: GCF_030704535.1), and Codonopsis lanceolata (https://figshare.com/articles/dataset/First_Report_of_Chromosome-Level_Genome_Assembly_for_Lance_Asiabell_Codonopsis_lanceolata_A_Medicinal_and_Vegetable_Plant_in_the_Campanulaceae_Family/21507774?file=38116599).

Conflict of interests

The authors declare that they have no conflicts of interest.

Supplementary Data

Supplementary data is available at Horticulture Research online.

