Graph-based pan-genome analysis reveals diversity of structural variations in native and commercial chicken
Yiming WANG, Zijia NI, Yinhua HUANG
● A graph-based pan-genome of native chickens was constructed.
● Structural variations related to egg production were identified in Leghorn chickens.
● Structural variations related to highland adaptation were identified in Tibetan chickens.
● A methodology for structural variation calling is proposed.
Chickens are among the most important domesticated animals and a major source of protein. Studying genetic variation in chickens to improve their production performance is therefore of great potential value. Next-generation sequencing has enabled precise analysis of single nucleotide polymorphisms and insertions/deletions in chickens, while third-generation sequencing enables accurate identification of structural variations. However, the high cost of third-generation sequencing limits its application in population studies. The graph-based pan-genome strategy can overcome this limitation by enabling structural variations to be detected from cost-effective next-generation sequencing data. This study constructed a graph-based pan-genome for chickens using 12 high-quality genomes. The pan-genome used the linear genome GRCg6a as the reference and contained variant information from two commercial and nine native chicken breeds. Compared with the linear genome, the pan-genome significantly improved the efficiency of structural variation identification. Based on the graph-based pan-genome, high-frequency structural variations related to high egg production in Leghorn chickens were predicted. Additionally, structural variations potentially associated with highland adaptation in Tibetan chickens were identified from next-generation sequencing and transcriptomics data. Using the pan-genome graph, this study presents a new strategy for identifying structural variations related to traits of interest in chickens.
Graph-based pan-genome / chicken / next-generation sequencing / structural variations