Genomic insights into the domestication and genetic basis of yield in papaya

Min Yang , Xiangdong Kong , Chenping Zhou , Ruibin Kuang , Xiaming Wu , Chuanhe Liu , Han He , Ze Xu , Yuerong Wei

Horticulture Research, 2025, Vol. 12, Issue (5): 45. DOI: 10.1093/hr/uhaf045

Abstract

Papaya (Carica papaya L.) is an important tropical and subtropical fruit crop, and understanding its genome is essential for breeding. In this study, we assembled a high-quality 344.17 Mb genome for the newly bred papaya cultivar ‘Zihui’, which contains 22 250 protein-coding genes. By integrating 201 resequenced papaya genomes, we identified four distinct papaya groups and a 34 Mb genomic region with strong domestication selection signals. Within this region, two key genes associated with papaya yield were discovered: Cp_zihui06549, encoding a leucine-rich receptor-like protein kinase, and Cp_zihui06768, encoding the accumulation of photosystem one 1 (APO1) protein. Heterologous expression of Cp_zihui06549 in tomato more than doubled the total number of fruits in transgenic lines compared to wild-type plants, resulting in a significant yield increase. Furthermore, we constructed a papaya pan-genome and obtained 77.41 Mb of nonreference sequence containing 1543 genes. Within this pan-genome, 2483 variable genes were detected, including four genes annotated with the Gene Ontology term ‘terpene synthase activity’ that were lost in cultivars during domestication. Finally, gene retention analyses were performed using gene presence and absence variation data and differentially expressed genes across various tissues and organs. This study provides valuable insights into the genes and loci associated with phenotypes and domestication processes, laying a solid foundation for future papaya breeding efforts.

Cite this article
Min Yang, Xiangdong Kong, Chenping Zhou, Ruibin Kuang, Xiaming Wu, Chuanhe Liu, Han He, Ze Xu, Yuerong Wei. Genomic insights into the domestication and genetic basis of yield in papaya. Horticulture Research, 2025, 12(5): 45. DOI: 10.1093/hr/uhaf045

Acknowledgements

This work was supported by the ‘Young and Middle-aged Academic Leaders’ Training Fund Project of Guangdong Academy of Agricultural Sciences (Grant no. R2023PY-JX005), the Guangdong Basic and Applied Basic Research Foundation (Grant no. 2022A1515010697), the Cultivation Project of Fruit Tree Research Institute, Guangdong Academy of Agricultural Sciences (Grant no. 23107), and the Guangzhou Science and Technology Planning Project (Grant no. 2023B03J1369).

Author Contributions

M.Y. and Y.W. planned and designed the research. M.Y. and X.K. performed most of the experiments and all bioinformatics analyses. C.Z., R.K., X.W., C.L., H.H., and Z.X. helped collect the samples and extract total RNA and DNA. M.Y., X.K., and Y.W. wrote the manuscript. All authors contributed to the article and approved the submitted version.

Data Availability

The Hi-C, HiFi, RNA-Seq, and genome assembly data were deposited in the National Center for Biotechnology Information (NCBI) under BioProject ID PRJNA968045. The whole-genome resequencing data were deposited in NCBI under BioProject ID PRJNA970517. The nonreference contigs of the Carica papaya pan-genome and their annotation are available in the Figshare database.

Conflict of interest statement

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Supplementary data

Supplementary data is available at Horticulture Research online.
