A post-GWAS replication study confirming the association of C14H8orf33 gene with milk production traits in dairy cattle

Shaohua YANG; Chao QI; Yan XIE; Xiaogang CUI; Yahui GAO; Jianping JIANG; Li JIANG; Shengli ZHANG; Qin ZHANG; Dongxiao SUN

doi:10.15302/J-FASE-2014037

Front. Agr. Sci. Eng., 2014, 1(4): 321-330

RESEARCH ARTICLE

Abstract

Genome-wide association studies (GWAS) with an Illumina Bovine50K chip have detected 105 SNPs associated with one or multiple milk production traits in the Chinese Holstein population. Of these, 38 significant SNPs detected with high confidence by both the L1-TDT and MMRA methods were selected to further mine potential key genes affecting milk yield and milk composition. By running BLAST searches of the flanking sequences of these 38 SNPs against the bovine genome sequence, combined with comparative genomics analysis, 26 genes were found to contain or lie near such SNPs. Among them, the C14H8orf33 gene is only 87 bp away from the significant SNP Hapmap30383-BTC-005848. Hence, we report herein genotype-phenotype associations to further validate the genetic effects of the C14H8orf33 gene. By pooled DNA sequencing of 14 unrelated Holstein sires, a total of 18 SNPs, including seven novel ones, were identified. Among them, nine SNPs were in the 5′ regulatory region, one in exon 6 and the others in the 3′ UTR and 3′ regulatory region. Nine of the identified SNPs were successfully genotyped by mass spectrometry and analyzed for association with five milk production traits in an independent resource population. The results showed that these SNPs were statistically significant for more than two traits (P values ranging from <0.0001 to 0.0267). In addition, mRNA expression analyses revealed that C14H8orf33 was ubiquitously expressed in eight different tissues, with a relatively higher expression level in the mammary gland than in other tissues. These findings, therefore, provide strong evidence for the association of C14H8orf33 variants with milk yield and milk composition traits, and these variants may be applied in Chinese Holstein breeding programs.

Keywords

GWAS / functional annotation / Chinese Holstein / milk production traits / C14H8orf33 gene / single nucleotide polymorphisms / association study

Introduction

QTL linkage analyses and fine mapping studies have achieved remarkable results in recent decades [1-3]. However, because these methods rely on low-density markers, they cannot capture the genetic variation underlying complex economic traits [4-7]. Genome-wide association study (GWAS), which utilizes a large number of high-density genetic markers throughout the entire genome, provides a new approach to detect causal variations underlying complex traits [8,9]. So far, GWAS has been successfully applied to identify genes involved in human diseases [10,11] and in economic and other complex traits in animals [12,13]. Our previous GWAS with an Illumina 50K chip detected 105 SNPs that were significantly associated with one or multiple milk production traits in dairy cattle [14]. As GWAS is only the first step in gene discovery [15,16], its results still need further functional annotation and validation through genetic association studies. Thus, from these 105 significant SNPs we selected 38 highly significant SNPs detected with high confidence by two statistical methods. Through bioinformatics and comparative genomics analysis, a total of 26 genes were found to contain or be near at least one of the 38 significant SNPs, including the well-known DGAT1 and GHR genes [17,18]. Of these, the chromosome 14 open reading frame 33 ortholog (C14H8orf33) gene was located closest to a significant SNP, Hapmap30383-BTC-005848 [14], and was considered a promising candidate gene for milk production traits.

The Author(s) 2014. This article is published with open access at http://engineering.cae.cn

The C14H8orf33 gene is located on BTA14, which harbors a large number of QTLs for milk production traits, including DGAT1 [19-22]. The bovine C14H8orf33 gene spans 2054 bp and contains 6 exons and 5 introns. The cDNA consists of 1220 bp with an open reading frame encoding a 188-amino acid protein. The gene is 313 kb away from the causal mutation K232A of the DGAT1 gene. However, to date, almost no relevant reports have been available for the C14H8orf33 gene. In this research, an association study was conducted to confirm our previous GWAS result and to search for potential variants of the C14H8orf33 gene affecting milk production traits in dairy cattle.

Materials and methods

Bioinformatics and comparative genomics analysis

To further validate the exact physical location of the 38 SNPs selected from the 105 significant SNPs, we separately compared the 60 bp upstream and 60 bp downstream flanking sequences of each SNP with the Btau 3.1 assembly using the NCBI (http://www.ncbi.nlm.nih.gov) and UCSC (http://genome.ucsc.edu/) websites. From the exact physical location, we inferred the gene that each SNP was located within or near.
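The flank-extraction step that precedes such a sequence comparison can be sketched as follows. This is a minimal illustration only: the function name, the 0-based coordinate convention and the toy sequence are assumptions and are not taken from the study, which used the NCBI and UCSC web tools.

```python
def flanking_sequences(sequence, snp_index, flank=60):
    """Return the (upstream, downstream) flanks of a SNP.

    sequence  -- reference sequence as a string (hypothetical input)
    snp_index -- 0-based position of the SNP within `sequence`
    flank     -- number of bases to take on each side (60 bp in the text)
    Flanks are truncated if the SNP lies close to either end.
    """
    if not 0 <= snp_index < len(sequence):
        raise IndexError("snp_index lies outside the sequence")
    upstream = sequence[max(0, snp_index - flank):snp_index]
    downstream = sequence[snp_index + 1:snp_index + 1 + flank]
    return upstream, downstream


# Toy sequence: 100 A bases, the SNP position (G), then 100 C bases.
toy = "A" * 100 + "G" + "C" * 100
up, down = flanking_sequences(toy, 100)
print(len(up), len(down))
```

The two returned strings are what would be submitted to an alignment tool; the alignment itself is outside the scope of this sketch.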

The potential biochemical and physiological functions of each gene were predicted from its genome sequence by searching for homologous and similar sequences in cattle, human and mouse. For precise and accurate prediction, we used the following websites to obtain functional annotations for each gene: NCBI (http://www.ncbi.nlm.nih.gov), Ensembl (http://asia.ensembl.org/index.html), UniProt (http://www.uniprot.org), KEGG (http://www.genome.jp/KEGG), GeneCards (http://www.genecards.org) and WikiPathways (http://www.wikipathways.org).

Animal resource and DNA extraction

A daughter design was employed in this study. A total of 742 daughters of 14 sires were selected to construct the study population. The number of daughters per sire ranged from 22 to 125. These daughters came from 15 dairy farms of the Beijing Sanyuanlvhe Dairy Farming Center. The official estimated breeding values (EBVs) for the five milk production traits, namely milk yield (MY), fat yield (FY), protein yield (PY), fat percentage (FP) and protein percentage (PP), were provided by the Dairy Data Center of the Dairy Association of China (DAC) (http://www.holstein.org.cn). Genomic DNA was isolated from whole blood samples of cows and frozen semen of sires. A DNA pool was constructed from the DNA of the 14 sires at the same concentration of 50 ng·μL^-1.

SNP identification and genotyping

A total of 18 pairs of PCR primers (Appendix A, Table S1) were designed with Primer Premier 3 (Premier, Canada), according to the genomic sequence of the bovine C14H8orf33 gene, to amplify all exons plus the 5′ and 3′ flanking regions. SNPs were first identified by sequencing the pooled DNA described above, and the identified SNPs were then genotyped in all experimental cows using the iPLEX MassArray system (Sequenom Inc.). In addition, the SNP Hapmap30383-BTC-005848 from a previous GWAS [14] was genotyped for the purpose of replication in this study.

Statistical analyses

Allele and genotype frequencies were compared between the mutant and wild type through a chi-square test. Chi-square tests were also used to determine whether each locus was in Hardy–Weinberg equilibrium by comparing the expected and observed genotype frequencies. Pedigrees of the population were traced back three generations to create the relationship matrix. We calculated linkage disequilibrium between all pairs of biallelic loci using HAPLOVIEW 4.2. For single-locus and haplotype analyses, an animal model was fitted with the MIXED procedure in SAS 9.1.3 as follows:

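$$\mathbf{y} = \mathbf{1}\mu + b\mathbf{x} + \mathbf{Z}\mathbf{a} + \mathbf{e}$$

where $\mathbf{y}$ is the vector of EBVs for each trait, $\mu$ is the overall mean, $b$ is the regression coefficient of EBVs on SNP genotypes, $\mathbf{x}$ is the fixed effect vector, $\mathbf{a}$ is the vector of polygenic effects with $\mathbf{a} \sim N(\mathbf{0}, \mathbf{A}\sigma_a^2)$ (where $\mathbf{A}$ is the additive kinship matrix and $\sigma_a^2$ is the additive variance), and $\mathbf{e}$ is the vector of residual errors distributed as $\mathbf{e} \sim N(\mathbf{0}, \mathbf{W}\sigma_e^2)$ [23].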
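As a sketch of what fitting this model implies, the marginal covariance of the EBVs and the generalized least squares estimate of the fixed effects can be written out explicitly. The following assumes that the variance components are known, that Z is the incidence matrix linking records to animals, and that W is a known weighting matrix for the residuals. The text does not define Z or W, so these readings are assumptions, as is the stacked design matrix X introduced here for illustration.

```latex
\begin{aligned}
\operatorname{Var}(\mathbf{y}) &= \mathbf{V}
  = \mathbf{Z}\mathbf{A}\mathbf{Z}^{\top}\sigma_a^{2} + \mathbf{W}\sigma_e^{2},\\
\mathbf{X} &= \begin{bmatrix}\mathbf{1} & \mathbf{x}\end{bmatrix},\\
\begin{bmatrix}\hat{\mu}\\ \hat{b}\end{bmatrix}
  &= \left(\mathbf{X}^{\top}\mathbf{V}^{-1}\mathbf{X}\right)^{-1}
     \mathbf{X}^{\top}\mathbf{V}^{-1}\mathbf{y}.
\end{aligned}
```

The association test for a SNP then amounts to testing whether b differs from zero. In practice the variance components are estimated from the data (for example by restricted maximum likelihood) rather than known in advance.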

Total RNA isolation and cDNA synthesis

Total RNA from eight different tissues, i.e., heart, liver, small intestine, kidney, mammary gland, ovary, uterus and gluteus, was extracted using Trizol Reagent (Invitrogen, Carlsbad, CA, USA) according to the manufacturer’s protocol, and DNA contamination was removed from the RNA extracts with RNase-free DNase I for 30 min at 37°C. RNA integrity was checked on 1% agarose gels and the quantity was measured with a NANODROP 2000 (Thermo Scientific, DE, USA). One microgram of total RNA from each tissue was reverse transcribed with the PrimeScript^® RT reagent Kit (TaKaRa, Otsu, Japan) to obtain the cDNA. The quality of each cDNA sample was verified by amplification with a pair of specific primers for GAPDH spanning parts of two adjacent exons and the whole intron between them.

Real-time quantitative RT-PCR

With the primers shown in Appendix A (Table S1), quantitative real-time RT-PCR was carried out with a LightCycler 480 Real-Time PCR System (Roche, Hercules, CA, USA). The reaction conditions were as follows: pre-denaturation at 95°C for 10 s, followed by 45 amplification cycles of 95°C for 10 s, 60°C for 10 s, and 72°C for 10 s. The relative expression level was normalized to GAPDH with the 2^-ΔΔCT method as described previously (Livak and Schmittgen, 2001). All measurements of C14H8orf33 gene expression in different tissues were performed in triplicate, and the average values were used. These data were analyzed by a t-test using the SAS 9.0 program (SAS Institute, Inc., Cary, NC, USA), with P<0.05 considered significant.
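The relative quantification of Livak and Schmittgen (2001) computes a fold change as 2 raised to the power of minus ΔΔCT, where ΔCT is the target CT minus the reference-gene CT and ΔΔCT is the sample ΔCT minus the calibrator ΔCT. A minimal sketch of that arithmetic with triplicate averaging follows. The function names, the choice of calibrator tissue and all CT values are hypothetical and are not data from this study.

```python
from statistics import mean


def delta_ct(target_cts, reference_cts):
    """Mean target CT minus mean reference-gene (e.g. GAPDH) CT."""
    return mean(target_cts) - mean(reference_cts)


def fold_change(sample_target, sample_ref, calib_target, calib_ref):
    """Relative expression of a sample versus a calibrator: 2^(-ddCT)."""
    ddct = delta_ct(sample_target, sample_ref) - delta_ct(calib_target, calib_ref)
    return 2.0 ** (-ddct)


# Hypothetical triplicate CT values (illustration only).
tissue_target = [24.0, 24.1, 23.9]   # target gene in the tissue of interest
tissue_gapdh = [18.0, 18.0, 18.0]    # reference gene in the same tissue
calib_target = [26.0, 26.0, 26.0]    # target gene in the calibrator tissue
calib_gapdh = [18.0, 18.0, 18.0]     # reference gene in the calibrator tissue

print(fold_change(tissue_target, tissue_gapdh, calib_target, calib_gapdh))
```

With these illustrative numbers the sample ΔCT is 6 and the calibrator ΔCT is 8, so ΔΔCT is -2 and the target is about four times more abundant in the sample than in the calibrator.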

Results

Function annotation

Based on the 38 SNPs selected with high confidence from the 105 significant SNPs identified by our initial GWAS, a total of 26 genes were obtained through bioinformatics and comparative genomics analysis, and their functions were placed into seven major categories, including body metabolism and nutrient balance; cytoskeleton or extracellular matrix components; regulation of cell proliferation and apoptosis; cell signal transduction and salt ion channel composition; kinase activity; and mRNA transcription and translation regulation (Table 1).

SNP identification and selection

By sequencing pooled DNA from daughters of 14 unrelated sires in a Chinese Holstein population, a total of 18 SNPs, including seven novel ones, were identified (Table 2). Among them, nine SNPs were in the 5′ regulatory region, one in exon 6 and the remainder in the 3′ UTR and 3′ regulatory region. The SNP in exon 6 was non-synonymous, changing the amino acid from proline (CCC) to histidine (CAC). Nine of the identified SNPs were successfully genotyped by mass spectrometry and analyzed for association with five milk production traits in an independent resource population. Chi-square tests showed that all nine SNPs were in Hardy–Weinberg equilibrium (P>0.05). The genotypic and allele frequencies are shown in Table 3.
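A Hardy–Weinberg chi-square test of the kind reported here compares observed genotype counts with the counts expected from the estimated allele frequency. The sketch below shows one way to compute it for a single biallelic SNP. The function name and the genotype counts are hypothetical, and the P value uses the closed form for a chi-square distribution with 1 degree of freedom, which is an implementation choice of this example rather than the procedure documented in the study.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg equilibrium for a biallelic SNP.

    Takes the observed counts of the three genotypes and returns
    (chi2, p_value), with the P value taken at 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele A
    q = 1.0 - p                          # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic locus).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts (illustration only, not study data).
chi2, p_value = hwe_chi_square(260, 360, 122)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A locus would be treated as consistent with equilibrium when the P value exceeds the chosen threshold, 0.05 in the text above.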

Association analyses

The association results are shown in Table 4. These SNPs were significantly associated with protein yield (P values ranging from <0.0001 to 0.0267) but not with the other milk production traits (P>0.05) in the present study. In addition, the SNP Hapmap30383-BTC-005848 identified in our initial GWAS [14] was confirmed to have significant associations with MY, PY, FP and PP in this independent dairy cattle population. This provided convincing statistical support for our previous study.

Linkage disequilibrium analysis

The LD block generated by all nine SNPs within 5 kb (Fig. 1) consisted of three haplotypes, TCACCGTTT, AACTGACAC and TCACCGCTT, with frequencies of 0.44, 0.39 and 0.15, respectively. Statistical analysis of the haplotypes against the EBVs of the five milk production traits showed that the haplotypes were associated with PY (P = 2.31×10⁻⁴) (Table 5). These results were consistent with the single-SNP associations.
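Pairwise LD measures such as D′ and r² can be derived directly from haplotype frequencies. The sketch below applies the standard definitions to the three common haplotypes reported above, after renormalizing their frequencies to sum to one. It is an illustration under that simplification (rarer haplotypes are ignored), the function name and the 0-based position indices are assumptions, and it is not a reproduction of the HAPLOVIEW analysis.

```python
def pairwise_ld(haplotypes, i, j):
    """Return (D, D_prime, r2) between positions i and j (0-based).

    haplotypes -- dict mapping haplotype string to its frequency;
                  frequencies are renormalized to sum to 1.
    """
    total = sum(haplotypes.values())
    freqs = {h: f / total for h, f in haplotypes.items()}
    first = next(iter(freqs))
    a, b = first[i], first[j]            # reference alleles at i and j
    p_a = sum(f for h, f in freqs.items() if h[i] == a)
    p_b = sum(f for h, f in freqs.items() if h[j] == b)
    p_ab = sum(f for h, f in freqs.items() if h[i] == a and h[j] == b)
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# The three haplotypes and frequencies reported in the text.
reported = {"TCACCGTTT": 0.44, "AACTGACAC": 0.39, "TCACCGCTT": 0.15}

# First versus seventh SNP: the only position where the first and
# third haplotypes differ.
d, d_prime, r2 = pairwise_ld(reported, 0, 6)
print(f"D' = {d_prime:.2f}, r2 = {r2:.2f}")
```

On these three haplotypes the first and seventh SNPs give a D′ of 1 with an r² of about 0.54, because only three of the four possible allele combinations are observed, while the first and second SNPs are in complete LD (r² of 1).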

Expression analysis of the bovine C14H8orf33 gene

The relative mRNA expression of C14H8orf33 in eight different tissues was determined by quantitative real-time PCR. The results revealed that C14H8orf33 was ubiquitously expressed in these eight tissues, with a relatively higher expression level in the mammary gland than in other tissues. In addition, the expression of C14H8orf33 in small intestine, kidney, ovary and uterus was also relatively higher than in the three remaining tissues (Fig. 2).

Discussion

In this study, using bioinformatics and comparative genomics analysis, we annotated the functions of 26 genes corresponding to 31 of the 38 highly significant SNPs, and identified several novel C14H8orf33 variants associated with milk production traits.

In previous studies, the known functional genes for milk production traits, DGAT1 [24], ABCG2 [25] and SCD1 [26], have been observed to be highly expressed in the mammary tissue of mammals. In this study, we found that the C14H8orf33 gene was also expressed at a relatively higher level in the mammary gland than in the seven other tissues, suggesting its importance in mammary biological processes in dairy cattle. Furthermore, our association data showed that the nine identified SNPs in the C14H8orf33 gene, including the SNP Hapmap30383-BTC-005848 identified by our initial GWAS [14], were significantly associated with at least one milk trait. Therefore, it was inferred that the C14H8orf33 gene has relatively independent effects on the milk traits. At the same time, this replication study provided convincing support for our previous GWAS. Given these associations, the SNP Hapmap30383-BTC-005848 in the 3′ UTR of the C14H8orf33 gene could be examined in a follow-up investigation to determine whether this mutation is involved in an interaction with miRNAs.

In addition, the C14H8orf33 gene is 313 kb away from K232A, the causal mutation in DGAT1, the gene underlying the true QTL for milk composition. Although the significant associations of the C14H8orf33 gene with milk production traits were accompanied by higher expression in the mammary gland of lactating cows, these associations could be due to linkage disequilibrium (LD) between C14H8orf33 and DGAT1. We found a total of 25 genes between C14H8orf33 and DGAT1, and further investigations are needed to verify whether strong LD exists between these two genes.

Conclusions

This study provided strong evidence for the association of C14H8orf33 variants with milk yield and milk composition traits; these variants may be applied in Chinese Holstein breeding programs.

References

[1]	Georges M, Nielsen D, Mackinnon M, Mishra A, Okimoto R, Pasquino A T, Sargeant L S, Sorensen A, Steele M R, Zhao X, Womack J E, Hoeschele I. Mapping quantitative trait loci controlling milk production in dairy cattle by exploiting progeny testing. Genetics, 1995, 139(2): 907–920

[2]	Khatkar M S, Thomson P C, Tammen I, Raadsma H W. Quantitative trait loci mapping in dairy cattle: review and meta-analysis. Genetics Selection Evolution, 2004, 36(2): 163–190

[3]	Smaragdov M. Genetic mapping of loci responsible for milk production traits in dairy cattle. Russian Journal of Genetics, 2006, 42(1): 1–15

[4]	Zhang H, Wang Z, Wang S, Li H. Progress of genome wide association study in domestic animals. Journal of Animal Science and Biotechnology, 2012, 3(1): 26

[5]	Olsen H G, Lien S, Gautier M, Nilsen H, Roseth A, Berg P R, Sundsaasen K K, Svendsen M, Meuwissen T H. Mapping of a milk production quantitative trait locus to a 420kb region on bovine chromosome 6. Genetics, 2005, 169(1): 275–283

[6]	Gutiérrez-Gil B, Williams J L, Homer D, Burton D, Haley C S, Wiener P. Search for quantitative trait loci affecting growth and carcass traits in a cross population of beef and dairy cattle. Journal of Animal Science, 2009, 87(1): 24–36

[7]	Kneeland J, Li C, Basarab J, Snelling W M, Benkel B, Murdoch B, Hansen C, Moore S S. Identification and fine mapping of quantitative trait loci for growth traits on bovine chromosomes 2, 6, 14, 19, 21, and 23 within one commercial line of Bos taurus. Journal of Animal Science, 2004, 82(12): 3405–3414

[8]	Bolormaa S, Pryce J E, Hayes B J, Goddard M E. Multivariate analysis of a genome-wide association study in dairy cattle. Journal of Dairy Science, 2010, 93(8): 3818–3833

[9]	Mai M D, Sahana G, Christiansen F B, Guldbrandtsen B. A genome-wide association study for milk production traits in Danish Jersey cattle using a 50K single nucleotide polymorphism chip. Journal of Animal Science, 2010, 88(11): 3522–3528

[10]	Hampe J, Franke A, Rosenstiel P, Till A, Teuber M, Huse K, Albrecht M, Mayr G, De La Vega F M, Briggs J, Günther S, Prescott N J, Onnie C M, Häsler R, Sipos B, Fölsch U R, Lengauer T, Platzer M, Mathew C G, Krawczak M, Schreiber S. A genome-wide association scan of nonsynonymous SNPs identifies a susceptibility variant for Crohn disease in ATG16L1. Nature Genetics, 2007, 39(2): 207–211

[11]	Sun L D, Xiao F L, Li Y, Zhou W M, Tang H Y, Tang X F, Zhang H, Schaarschmidt H, Zuo X B, Foelster-Holst R, He S M, Shi M, Liu Q, Lv Y M, Chen X L, Zhu K J, Guo Y F, Hu D Y, Li M, Li M, Zhang Y H, Zhang X, Tang J P, Guo B R, Wang H, Liu Y, Zou X Y, Zhou F S, Liu X Y, Chen G, Ma L, Zhang S M, Jiang A P, Zheng X D, Gao X H, Li P, Tu C X, Yin X Y, Han X P, Ren Y Q, Song S P, Lu Z Y, Zhang X L, Cui Y, Chang J, Gao M, Luo X Y, Wang P G, Dai X, Su W, Li H, Shen C P, Liu S X, Feng X B, Yang C J, Lin G S, Wang Z X, Huang J Q, Fan X, Wang Y, Bao Y X, Yang S, Liu J J, Franke A, Weidinger S, Yao Z R, Zhang X J. Genome-wide association study identifies two new susceptibility loci for atopic dermatitis in the Chinese Han population. Nature Genetics, 2011, 43(7): 690–694

[12]	García-Gámez E, Gutiérrez-Gil B, Sahana G, Sánchez J P, Bayón Y, Arranz J J. GWA analysis for milk production traits in dairy sheep and genetic support for a QTN influencing milk protein percentage in the LALBA gene. PLoS ONE, 2012, 7(10): e47782

[13]	Magwire M M, Fabian D K, Schweyen H, Cao C, Longdon B, Bayer F, Jiggins F M. Genome-wide association studies reveal a simple genetic basis of resistance to naturally coevolving viruses in Drosophila melanogaster. PLOS Genetics, 2012, 8(11): e1003057

[14]	Jiang L, Liu J, Sun D, Ma P, Ding X, Yu Y, Zhang Q. Genome wide association studies for milk production traits in Chinese Holstein population. PLoS ONE, 2010, 5(10): e13661

[15]	Hirschhorn J N. Genomewide association studies—illuminating biologic pathways. The New England Journal of Medicine, 2009, 360(17): 1699–1701

[16]	Hardy J, Singleton A. Genomewide association studies and human disease. The New England Journal of Medicine, 2009, 360(17): 1759–1768

[17]	Grisart B, Farnir F, Karim L, Cambisano N, Kim J J, Kvasz A, Mni M, Simon P, Frère J M, Coppieters W, Georges M. Genetic and functional confirmation of the causality of the DGAT1 K232A quantitative trait nucleotide in affecting milk yield and composition. Proceedings of the National Academy of Sciences of the United States of America, 2004, 101(8): 2398–2403

[18]	Blott S, Kim J J, Moisio S, Schmidt-Küntzel A, Cornet A, Berzi P, Cambisano N, Ford C, Grisart B, Johnson D, Karim L, Simon P, Snell R, Spelman R, Wong J, Vilkki J, Georges M, Farnir F, Coppieters W. Molecular dissection of a quantitative trait locus: a phenylalanine-to-tyrosine substitution in the transmembrane domain of the bovine growth hormone receptor is associated with a major effect on milk yield and composition. Genetics, 2003, 163(1): 253–266

[19]	Coppieters W, Riquet J, Arranz J J, Berzi P, Cambisano N, Grisart B, Karim L, Marcq F, Moreau L, Nezer C, Simon P, Vanmanshoven P, Wagenaar D, Georges M. A QTL with major effect on milk yield and composition maps to bovine chromosome 14. Mammalian Genome, 1998, 9(7): 540–544

[20]	Sun D, Jia J, Ma Y, Zhang Y, Wang Y, Yu Y, Zhang Y. Effects of DGAT1 and GHR on milk yield and milk composition in the Chinese dairy population. Animal Genetics, 2009, 40(6): 997–1000

[21]	Riquet J, Coppieters W, Cambisano N, Arranz J J, Berzi P, Davis S K, Grisart B, Farnir F, Karim L, Mni M, Simon P, Taylor J F, Vanmanshoven P, Wagenaar D, Womack J E, Georges M. Fine-mapping of quantitative trait loci by identity by descent in outbred populations: application to milk production in dairy cattle. Proceedings of the National Academy of Sciences of the United States of America, 1999, 96(16): 9252–9257

[22]	Looft C, Reinsch N, Karall-Albrecht C, Paul S, Brink M, Thomsen H, Brockmann G, Kühn C, Schwerin M, Kalm E. A mammary gland EST showing linkage disequilibrium to a milk production QTL on bovine Chromosome 14. Mammalian Genome, 2001, 12(8): 646–650

[23]	Wang H, Jiang L, Liu X, Yang J, Wei J, Xu J, Zhang Q, Liu J F. A post-GWAS replication study confirming the PTK2 gene associated with milk production traits in Chinese Holstein. PLoS ONE, 2013, 8(12): e83625

[24]	Cases S, Smith S J, Zheng Y W, Myers H M, Lear S R, Sande E, Novak S, Collins C, Welch C B, Lusis A J, Erickson S K, Farese R V Jr. Identification of a gene encoding an acyl CoA:diacylglycerol acyltransferase, a key enzyme in triacylglycerol synthesis. Proceedings of the National Academy of Sciences of the United States of America, 1998, 95(22): 13018–13023

[25]	Bionaz M, Loor J J. Gene networks driving bovine milk fat synthesis during the lactation cycle. BMC Genomics, 2008, 9(1): 366

[26]	Lengi A J, Corl B A. Identification and characterization of a novel bovine stearoyl-CoA desaturase isoform with homology to human SCD5. Lipids, 2007, 42(6): 499–508