The near-complete genome assembly of allotetraploid Pennisetum purpureum ‘Purple’ reveals the genetic and epigenetic landscape of centromeres

Yongji Huang, Jinbin Lin, Jun Xu, Xinyi Lin, Zuhu Deng, Xiaoxian Zhong, Sheng Zuo, Zhiliang Zhang

Horticulture Research ›› 2026, Vol. 13 ›› Issue (2) : 301

DOI: 10.1093/hr/uhaf301

Abstract

Drastic karyotype changes are a major evolutionary force, potentially involving alterations in centromere position, number, distribution, or strength. Yet the genetic and epigenetic landscape of centromeres, especially in allopolyploid plants during subgenome reshuffling, remains poorly understood. Here, we present a near-complete chromosome-scale genome assembly of the allotetraploid Pennisetum purpureum ‘Purple’, resolving all 14 centromeres. We find that subgenome-biased expansion of six LTR retrotransposons drives architectural divergence between subgenomes. Centromeric satellite repeats (CentPs) show rapid sequence divergence across subgenomes and chromosomes, with CENH3 preferentially binding conserved higher-order repeats. Intriguingly, centromeric retrotransposons in Pennisetum (CRPs) are evolutionarily younger than their noncentromeric counterparts and show marked subgenome B-biased amplification. Notably, CRP insertions flanking CentP satellites correlate with elevated satellite DNA polymorphism, supporting a model in which CentP homogenization processes actively purge retrotransposons from centromeric arrays. Despite rapid sequence diversification of centromeric repeats, the epigenetic landscapes of the centromeres remain evolutionarily conserved between the two subgenomes. Additionally, comparative analyses across Pennisetum species demonstrate rapid species- and chromosome-level turnover of CentPs and CRPs. Overall, our study illuminates the genetic and epigenetic plasticity of centromeres in allopolyploids, revealing how centromeric repeats adapt after subgenome reshuffling.

Cite this article

Yongji Huang, Jinbin Lin, Jun Xu, Xinyi Lin, Zuhu Deng, Xiaoxian Zhong, Sheng Zuo, Zhiliang Zhang. The near-complete genome assembly of allotetraploid Pennisetum purpureum ‘Purple’ reveals the genetic and epigenetic landscape of centromeres. Horticulture Research, 2026, 13(2): 301. DOI: 10.1093/hr/uhaf301

Acknowledgements

We thank Juying Wu (Institute of Grassland, Flower and Ecology, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China) and Linkai Huang (College of Grassland Science and Technology, Sichuan Agricultural University, Chengdu, Sichuan, China) for providing plant materials for this study. This work was supported by funds from the National Natural Science Foundation of China (32001605), the Natural Science Foundation of Fujian Province, China (2025J01332), and the Open Project of State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources (SKLCUSA-b202408). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Author contributions

Y.J.H. and Z.L.Z. conceived the project. X.Y.L. performed experiments. Y.J.H., J.B.L., J.X., and Z.L.Z. analyzed data. Y.J.H., J.B.L., Z.L.Z., S.Z., J.X., Z.H.D., and X.X.Z. interpreted data analyses. Y.J.H. wrote the manuscript with contributions from all authors. All authors read and approved the final manuscript.

Data availability

The whole-genome sequencing data (including Illumina short reads, HiFi reads, and Hi-C interaction reads), ChIP-seq data, and transcriptomes of different tissues used in this study have been deposited at the National Genomics Data Center (NGDC) under accession number PRJCA034549. The raw reads used in this study are available under the following accession numbers (ChIP-seq data: CRA021877; RNA-seq data: CRA021875; Illumina WGS data: CRA021874; HiFi data: CRA021873; Hi-C data: CRA021871). The genome assembly and gene annotation data for P. purpureum ‘Purple’ have been deposited at the Genome Warehouse (GWH) under accession number GWHFIJS00000000.1.

Conflicts of interest statement

The authors declare that they have no conflict of interest.

Supplementary material

Supplementary material is available at Horticulture Research online.
