Introduction
5-methylcytosine (5mC) is the predominant and most extensively studied epigenetic modification in the mammalian genome (
Bird, 2002). Although it is well established that DNA methylation is deposited by a conserved family of DNA methyltransferases (DNMTs) (
Goll and Bestor, 2005), the mechanisms underlying the enzymatic removal of this modification remain hotly debated (
Ma et al., 2009a;
Zhu, 2009; Wu and Zhang, 2010;
Bhutani et al., 2011). Ten-eleven translocation (TET) family dioxygenases oxidize the 5-methyl group of 5mC to generate 5-hydroxymethylcytosine (5hmC) (
Tahiliani et al., 2009;
Ito et al., 2010) as well as the more oxidized 5-formylcytosine and 5-carboxylcytosine (
He et al., 2011;
Ito et al., 2011). Multiple lines of evidence support a role for 5mC-to-5hmC conversion in the process of DNA demethylation. First, 5hmC is known to prevent DNMT1 from recognizing hemi-methylated DNA in vitro, which may cause “dilution” of the existing DNA methylation pattern in dividing cells (
Valinluck and Sowers, 2007;
Frauer et al., 2011). Indeed, such a passive mode of DNA demethylation was recently found to occur in early mouse embryos (
Inoue and Zhang, 2011). Second, non-replicating 5hmC-containing DNA can be demethylated in both human cell lines and cultured mouse neurons (
Zhang et al., 2010;
Guo et al., 2011c), indicating that 5hmC can also function as a precursor of active DNA demethylation. Finally, gain- and loss-of-function manipulations of TET proteins lead to changes in DNA methylation that are consistent with their proposed roles in active DNA demethylation (
Kohli and Zhang, 2013;
Pastor et al., 2013).
Intriguingly, among all the tissues surveyed thus far, the mammalian brain has the highest abundance of 5hmC (
Kriaucionis and Heintz, 2009;
Globisch et al., 2010). In the brain, 5hmC accounts for 10%–40% of all methylated cytosines, compared to 1%–2% in other tissues, suggesting the potential functional importance of 5hmC in the nervous system. Previously generated genomic maps of 5hmC in largely post-mitotic mouse hippocampus as well as mouse and human cerebellum tissue indicate that 5hmC associates strongly with genic regions, is depleted at transcription start sites (TSSs), and correlates with gene expression (
Song et al., 2011). Our previous studies suggest that 5hmC levels are significantly higher in both the cerebellum and hippocampus of adult mice at six weeks and one year of age than in the early postnatal stage, indicating an age-dependent acquisition of 5hmC and regulation of brain-specific gene expression (
Szulwach et al., 2011b). Recent genome-wide studies from us and others have uncovered global changes of 5hmC associated with Rett syndrome as well as a number of neurodegenerative disorders, suggesting potential roles of precise 5hmC distributions in the development and homeostasis of the nervous system (
Szulwach et al., 2011a;
Szulwach et al., 2011b;
Wang et al., 2013;
Yao et al., 2013). However, the exact role and scope of 5hmC-mediated active DNA demethylation in shaping the landscape of DNA methylation in neurons in vivo are still unclear.
In this study, we systematically characterized the 5hmC distribution in dentate granule neurons (DGNs) in vivo and compared it to the 5hmC map of cultured mouse embryonic stem cells (ESCs). We correlated the 5hmC distribution in DGN with both gene expression and transcription factor occupancy. Cross-comparison between 5hmC profiles and overall DNA methylation distributions revealed the global antagonism between these two modification states, supporting a role for 5hmC in shaping the neuronal DNA methylome on a genome-wide scale.
Materials and methods
Tissue preparation. Adult mice (8 to 10 weeks old, male, C57BL/6 background) were used for analysis in accordance with protocols approved by the Institutional Animal Care and Use Committee. Dentate gyrus tissues were rapidly micro-dissected bilaterally from adult mice. This preparation was highly enriched for mature neurons, containing ~90% NeuN+ dentate granule neurons as shown by immunohistology (Ma et al., 2009b). Previous studies of this preparation showed that its CpG methylation status at selected loci was very similar to that of FACS-purified NeuN+ mature neurons (Guo et al., 2011a). Validation experiments were performed using independent biological samples of bilaterally micro-dissected dentate gyri from individual animals.
5hmC DNA capture. 5hmC enrichment was performed as previously described with an improved selective chemical labeling method (
Song et al., 2011). 5hmC labeling reactions were performed in a 100 µL solution containing 50 mM HEPES buffer (pH 7.9), 25 mM MgCl2, 300 ng/µL sonicated genomic DNA (100–500 bp), 250 µM UDP-6-N3-Glu, and 2.25 µM wild-type β-GT. Reactions were incubated for 1 h at 37°C. DNA substrates were purified via Qiagen DNA purification kit or by phenol-chloroform precipitation and reconstituted in H2O. Click chemistry was performed with the addition of 150 µM dibenzocyclooctyne modified biotin into the DNA solution and incubated for 2 h at 37°C. Samples were purified by Pierce Monomeric Avidin Kit (Thermo) following manufacturer’s recommendations. After elution, biotin-5-N3-gmC-containing DNA was concentrated by 10K Amicon Ultra-0.5 mL Centrifugal Filters (Millipore) and purified by Qiagen DNA purification kit.
Library construction and high-throughput sequencing. Five ng of 5hmC-enriched genomic DNA from 3 independent 5hmC captures or one non-enriched input genomic DNA was end-repaired, adenylated, and ligated to Illumina Genomic DNA Adapters (Genomic DNA adapter oligo mix) according to standard Illumina protocols for ChIP-Seq library construction, maintaining the proper molar ratios of adapter to insert. Adapter-ligated fragments of ~200–350 bp were gel-purified by 2% agarose gel electrophoresis and PCR-amplified for 18 cycles. Libraries were checked for quality and quantified using an Agilent 2100 Bioanalyzer DNA 1000 Chip.
Libraries were sequenced using the Illumina HiScan platform. Cluster generation was performed with Illumina TruSeq cluster kit v2-cBot-HS. Single-read 51-bp sequencing was completed with Illumina TruSeq SBS kit v3-HS. A dedicated PhiX control lane, as well as 1% PhiX spike in all other lanes, was used for automated matrix and phasing calculations. Image analysis and base calling were performed with the standard Illumina pipeline.
Data processing and analysis. FASTQ reads were aligned to NCBI37/mm9 with Bowtie v0.17.2, retaining non-duplicate, unique matches to the genome with no more than 3 mismatches in the first 30 bases. Ensembl gene annotations were downloaded from the UCSC Genome Browser (http://genome.ucsc.edu). Data analysis and visualization were done using built-in functions of R (http://www.r-project.org) and in-house Perl scripts.
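The retention of non-duplicate reads can be pictured with a short sketch. This is not the authors' pipeline, which relied on Bowtie and in-house Perl scripts; the tuple layout and the function name `collapse_duplicates` are assumptions made for illustration, and the input is assumed to contain only uniquely aligned reads.

```python
def collapse_duplicates(alignments):
    """Keep one read per (chromosome, strand, start) position.

    `alignments` is an iterable of (read_id, chrom, strand, start) tuples
    for uniquely mapped reads; this layout is hypothetical.
    """
    seen = set()
    kept = []
    for read_id, chrom, strand, start in alignments:
        key = (chrom, strand, start)
        if key not in seen:  # the first read seen at a position is retained
            seen.add(key)
            kept.append((read_id, chrom, strand, start))
    return kept
```

Reads that share chromosome, strand, and start coordinate are treated here as likely PCR duplicates, which is one common convention rather than a detail stated in the text.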
qPCR validation of 5hmC-enriched loci. One ng of input or 5hmC-enriched DNA, from an independent 5hmC capture experiment, was used in triplicate 20 µL qPCR reactions, each with 1X Power SYBR Green PCR Master Mix (Applied Biosystems), 0.5 µM forward and reverse primers, and water. Reactions were run on an SDS 7500 Fast Instrument using 7500 Standard cycling conditions. Fold enrichment was calculated as 2^(−dCt), where dCt = Ct (5hmC-enriched) − Ct (Input).
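The fold-enrichment formula can be written out as a small function. This is a minimal sketch: averaging the triplicate Ct values before taking the difference is an assumption, since the text does not say how replicates were combined, and the name `fold_enrichment` is illustrative.

```python
from statistics import mean

def fold_enrichment(ct_enriched, ct_input):
    """Return 2^(-dCt), with dCt = mean Ct(enriched) - mean Ct(input).

    Averaging replicate Ct values first is an assumption of this sketch.
    """
    d_ct = mean(ct_enriched) - mean(ct_input)
    return 2 ** (-d_ct)
```

Under this formula, a locus whose 5hmC-enriched sample reaches threshold three cycles earlier than the input sample has dCt = −3 and a fold enrichment of 2^3 = 8.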
Results
Global properties of the neuronal 5hmC profile
We used a previously developed chemical tagging method to profile the 5hmC distribution in the neuronal genome (
Song et al., 2011). The dentate gyrus from an in vivo preparation was used because the majority of cells in this region are Prox1+NeuN+ post-mitotic DGNs (Ma et al., 2009b; Guo et al., 2011a), providing relatively high homogeneity compared to other regions in the mouse brain. A total of ~33 million unique non-duplicate reads were obtained from two biological replicates, which were highly correlated (r = 0.952; Fig. 1A). We also profiled mouse ESCs (~19 million unique non-duplicate reads). To validate the neuronal 5hmC profiles, we used quantitative PCR to measure 5hmC enrichment at specific loci that were identified as 5hmC-marked by sequencing. In total, 16 out of 17 regions showed significant enrichment relative to a negative control region that is not 5hmC-marked (Fig. 1B). Global analysis of input-normalized 5hmC signals showed that all autosomes had comparable 5hmC levels (Fig. 1C), whereas sex chromosomes showed lower 5hmC levels, consistent with previous studies in both ESCs (
Pastor et al., 2011;
Szulwach et al., 2011a) and the brain (
Szulwach et al., 2011b). The mitochondrial genome was depleted in 5hmC, which is consistent with our previous finding that the neuronal mitochondrial genome is virtually non-methylated (
Guo et al., 2011a).
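A correlation between replicates of the kind reported above can be computed from binned read counts. The sketch below is one plausible way to do so, not the authors' procedure: the text does not state the bin size or the exact statistic, so fixed-width bins, 0-based read start positions, and a Pearson coefficient are all assumptions.

```python
from math import sqrt

def bin_counts(read_starts, chrom_length, bin_size):
    """Count reads per fixed-width bin (0-based start positions assumed)."""
    n_bins = (chrom_length + bin_size - 1) // bin_size
    counts = [0] * n_bins
    for pos in read_starts:
        counts[pos // bin_size] += 1
    return counts

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)
```

Applying `bin_counts` to each replicate with the same bin size and passing the two count vectors to `pearson_r` yields a single genome-wide r value per pair of samples.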
We next determined the average 5hmC signal across the genic regions. All regions near or within annotated genes, including TSS upstream regions, 5′-untranslated regions (UTR), coding exons, introns, 3′-UTR, and polyadenylation site (PAS) downstream regions, showed higher 5hmC levels than intergenic regions (Fig. 1D), which is consistent with previous findings that 5hmC is enriched near genes in the brain (
Szulwach et al., 2011b;
Mellén et al., 2012). Meta-gene analysis showed that gene bodies were enriched in 5hmC (Fig. 1E). Within gene bodies, coding exons are most enriched for 5hmC (Fig. 1D and 1E). 5hmC levels gradually increased from 5′ to 3′ ends within gene bodies (Fig. 1E), although to a lesser degree compared to ESCs (
Xu et al., 2011).
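The meta-gene analysis described above rescales every gene body to a common length so that signals can be averaged across genes. The following is a minimal sketch of that idea; the number of bins, the use of read start positions as the signal, and the function names are assumptions rather than details taken from the study.

```python
def gene_body_profile(read_starts, gene_start, gene_end, strand, n_bins=10):
    """Read density (reads per bp) in n_bins equal slices, ordered 5' to 3'.

    Coordinates are 0-based, with gene_end exclusive (an assumption).
    """
    length = gene_end - gene_start
    counts = [0] * n_bins
    for pos in read_starts:
        if gene_start <= pos < gene_end:
            counts[(pos - gene_start) * n_bins // length] += 1
    if strand == "-":
        counts.reverse()  # orient minus-strand genes 5' to 3'
    bin_len = length / n_bins
    return [c / bin_len for c in counts]

def meta_gene(profiles):
    """Average per-gene profiles bin by bin."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]
```

Reversing the bins for minus-strand genes is what lets a 5′-to-3′ trend, such as the gradual increase in 5hmC along gene bodies, show up in the averaged profile.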
Relationship between 5hmC and gene expression in neurons
Given the gene-centric property of 5hmC, we determined the relationship between 5hmC levels and associated gene expression levels by correlating 5hmC profiles with previously obtained DGN RNA-seq results (
Guo et al., 2013). Meta-gene analysis showed a striking position-specific relationship between 5hmC levels and transcript abundance (Fig. 2A). Near TSSs, 5hmC anti-correlated with gene expression level; 5hmC-marked TSSs were restricted to the genes that had the lowest expression levels. However, in all other structures, including the TSS upstream regions, gene body, and PAS downstream regions, 5hmC levels were positively correlated with gene expression (Fig. 2A). Furthermore, the positive correlation gradually increased from 5′ to 3′ within the gene body.
Recent studies have revealed an important role for protein-DNA interactions in determining DNA methylation levels at binding sites (
Lister et al., 2009;
Guo et al., 2011a;
Lienert et al., 2011;
Stadler et al., 2011b). To test whether protein-DNA interactions also shape the distribution of 5hmC at these binding sites, we averaged 5hmC levels within 10-kb windows across all binding sites for each of several transcription factors for which ChIP-seq profiles were available (
Kim et al., 2010). Interestingly, 5hmC was depleted at the binding sites for all the tested neuronal DNA binding factors, including cAMP-responsive element binding protein (CREB), CREB binding protein (CBP), RNA polymerase II (RNAP), and serum response factor (SRF; Fig. 2B). These results suggested that protein-DNA interactions may protect the DNA from methylation and/or hydroxymethylation, or cause DNA demethylation. It is also possible that DNA (hydroxy-) methylation prevents transcription factor binding, which is a general property of many DNA binding factors (
Stadler et al., 2011a;
Spruijt et al., 2013).
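Averaging a signal in windows centered on binding sites can be sketched as follows. This is an illustration under stated assumptions: the signal is taken to be already binned along one chromosome, binding sites are given as bin indices, and sites whose window runs past the chromosome ends are skipped. None of these choices is specified in the text, and `aggregate_profile` is a hypothetical name.

```python
def aggregate_profile(binned_signal, site_bins, flank_bins):
    """Average the binned signal in a window around each binding site.

    Returns a list of length 2 * flank_bins + 1, centered on the sites.
    """
    width = 2 * flank_bins + 1
    totals = [0.0] * width
    n_used = 0
    for center in site_bins:
        lo, hi = center - flank_bins, center + flank_bins
        if lo < 0 or hi >= len(binned_signal):
            continue  # skip sites whose window runs off the chromosome
        for i in range(width):
            totals[i] += binned_signal[lo + i]
        n_used += 1
    if n_used == 0:
        raise ValueError("no binding site has a complete window")
    return [t / n_used for t in totals]
```

With 100-bp bins, for instance, `flank_bins=50` would cover roughly the 10-kb window mentioned above; a dip at the center of the returned profile corresponds to depletion at the binding sites.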
Extensive differences in 5hmC distributions between ESCs and DGNs
5hmC was first identified in the genomic DNA of ESCs (
Tahiliani et al., 2009) and cerebellar neurons (
Kriaucionis and Heintz, 2009). Recent studies have shown that these two cell types have higher 5hmC abundance than other somatic cell types (
Globisch et al., 2010). We compared the two 5hmC profiles obtained using the same chemical tagging method. Global correlations between ESCs and either DGN sample (r = 0.69 and 0.62, respectively) were lower than between the two DGN samples (r = 0.83; Fig. 3A), suggesting extensive differences between the 5hmC distributions in ESCs and DGNs.
A comparison of gene-body 5hmC levels for individual genes between the two cell types identified 1129 genes that showed ≥4-fold higher 5hmC density in neurons than in ESCs (DGN-specific) and 1004 genes that were ESC-specific (Fig. 3B). For example, Sema5a, an autism susceptibility gene that encodes an axon guidance cue, showed a higher 5hmC density in DGNs than in ESCs (Fig. 3C). In contrast, the HoxA gene cluster, which encodes an important group of developmental regulators, had much higher 5hmC levels in ESCs than in DGNs (Fig. 3C). Gene ontology analysis further revealed that neuron-specific 5hmC-marked genes were enriched in ribosomal proteins and synaptic proteins, such as neurotransmitter receptors (Fig. 3D), whereas ESC-specific 5hmC-marked genes were enriched in developmental regulatory pathways. Therefore, differentially 5hmC-marked genes were associated with the functional differences between the two cell types.
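The fold-change criterion for calling cell type-specific genes can be expressed compactly. This sketch assumes that the same 4-fold threshold applies in both directions and that densities have already been normalized for sequencing depth and gene length; the text states the threshold explicitly only for the DGN-specific set, and the dictionary layout is hypothetical.

```python
def classify_genes(dgn_density, esc_density, fold=4.0):
    """Label each gene by comparing its gene-body 5hmC density in two cell types.

    Both arguments map gene name -> normalized density (hypothetical layout).
    """
    labels = {}
    for gene in dgn_density.keys() & esc_density.keys():
        d, e = dgn_density[gene], esc_density[gene]
        if d > 0 and d >= fold * e:
            labels[gene] = "DGN-specific"
        elif e > 0 and e >= fold * d:
            labels[gene] = "ESC-specific"
        else:
            labels[gene] = "shared"
    return labels
```

Comparing `d >= fold * e` rather than dividing avoids a division by zero for genes with no signal in one cell type.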
DNA methylation and 5hmC are antagonistic genome-wide in both DGNs and ESCs
To examine the relationship between overall DNA methylation (i.e. the sum of 5mC and 5hmC) and 5hmC, we overlaid the 5hmC map with the previously determined DNA methylation profile for the same DGNs (
Guo et al., 2011a). Strikingly, the global profiles of these two modification states showed highly complementary patterns (Fig. 4A). The high 5hmC density regions showed low levels of DNA methylation and vice versa. Across the genome, 5hmC and overall DNA methylation were significantly anti-correlated (Fig. 4B).
To rule out the possibility that the global antagonism between 5hmC and DNA methylation was merely due to the association of 5hmC and gene-rich regions in the genome, we calculated the averaged gene-body 5hmC and overall DNA methylation levels for each gene and ranked all the genes by their expression levels. As we showed previously, gene-body 5hmC levels were correlated with gene expression (Fig. 4C), whereas DNA methylation was anti-correlated with gene expression, suggesting that the antagonism between 5hmC and overall DNA methylation also exists at the individual gene level.
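Ranking genes by expression and summarizing a gene-body signal within expression groups can be sketched as below. The number of groups and the use of a simple mean are assumptions, since the text does not specify how the ranked genes were summarized, and the function name is illustrative.

```python
def mean_by_expression_group(genes, n_groups=5):
    """Mean signal per expression group, from lowest to highest expression.

    `genes` is a list of (expression, signal) pairs, where signal could be
    gene-body 5hmC or overall DNA methylation (hypothetical layout).
    """
    if len(genes) < n_groups:
        raise ValueError("need at least one gene per group")
    ranked = sorted(genes, key=lambda g: g[0])
    n = len(ranked)
    means = []
    for k in range(n_groups):
        group = ranked[k * n // n_groups:(k + 1) * n // n_groups]
        means.append(sum(signal for _, signal in group) / len(group))
    return means
```

Running the function once with 5hmC and once with overall DNA methylation as the signal gives two per-group trends that can be compared directly: antagonism at the gene level would appear as one trend rising with expression while the other falls.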
Finally, we tested whether 5hmC and DNA methylation were also antagonistic genome-wide in ESCs. Using the previously determined ESC methylome data (
Meissner et al., 2008), we found a significant anti-correlation between 5hmC and overall DNA methylation in ESCs (Fig. 4D), suggesting that the global antagonism between these two states of cytosine modification is not restricted to post-mitotic neurons.
Discussion
Since the identification of 5hmC in the genomic DNA from ESCs and Purkinje cells, the biological role of this DNA base has been the subject of intensive study. Here we systematically compared the 5hmC distributions between two cell types, ESCs and DGNs, both of which exhibit relatively high abundances of 5hmC. We uncovered both similarities and differences in the 5hmC profiles of these two cell types. In both ESCs and DGNs, 5hmC was enriched near and within genes, especially in coding exons, suggesting a potential role for 5hmC in exon definition and regulation of pre-mRNA splicing, as has been indicated for 5mC (
Shukla et al., 2011). Intragenic 5hmC levels were correlated with gene expression in both cell types. As a result, differentially 5hmC-marked genes between ESCs and DGNs reflected the functional differences between the two cell types, supporting a potential role for 5hmC in the regulation of cell type-specific gene expression.
A central question of 5hmC biology is whether it indeed functions as an intermediate product in active DNA demethylation (
Wu and Zhang, 2010;
Guo et al., 2011b;
Branco et al., 2012;
Kohli and Zhang, 2013). We have previously shown that fully hydroxymethylated exogenous linear DNA can be demethylated in both human cell lines and mouse neurons in culture (
Guo et al., 2011c), suggesting that 5hmC can be converted to non-methylated cytosines by an active mechanism. Here, by correlating the DGN 5hmC profile with our previously reported DGN methylome (
Guo et al., 2011a), we showed that 5hmC and DNA methylation were antagonistic genome-wide, further supporting a role for 5hmC in shaping the genomic DNA methylome by promoting active DNA demethylation in post-mitotic neurons. In addition, the global antagonism between 5hmC and DNA methylation was also observed in ESCs, suggesting that maintaining DNA methylation at low levels is a cell-type–independent function of 5hmC.
Previous studies have shown that DNMT1 has a lower methyltransferase activity toward hemi-hydroxymethylated DNA in vitro (Valinluck and Sowers, 2007), raising the possibility that 5hmC may also promote passive DNA demethylation. However, the DNMT1-associated factor UHRF1 can recognize hemi-hydroxymethylated DNA (
Frauer et al., 2011;
Spruijt et al., 2013). Therefore, whether 5hmC has an influence on the maintenance of the symmetric DNA methylation patterns after DNA replication remains to be tested. On the other hand, TET proteins have been shown to play important roles in locus-specific DNA demethylation in several systems, including preimplantation embryos (
Gu et al., 2011), primordial germ cells (
Dawlaty et al., 2013;
Yamaguchi et al., 2013), and various brain regions (
Kaas et al., 2013;
Rudenko et al., 2013). While it remains unclear to what extent DNA replication and passive dilution of 5hmC play roles in TET-mediated DNA demethylation in embryonic stages, the post-mitotic nature of neurons provides a suitable system for studying the role of 5hmC in active DNA demethylation without any contribution from DNA replication. Future analyses using methods that can distinguish 5mC and its oxidized forms at base resolution (
Booth et al., 2012;
Yu et al., 2012;
Song et al., 2013) will potentially reveal the kinetic details of each individual step of 5hmC-mediated DNA demethylation, which may yield novel insights into the dynamic regulation of DNA modification states in post-mitotic neurons.
DNA methylation has been shown to play a critical role in synaptic plasticity related to learning and memory in mature neurons, likely owing to the regulation of specific gene expression (
Miller and Sweatt, 2007;
Feng et al., 2010). Previous studies of DNA methyltransferases have pointed to a mechanism whereby individual DNMTs play distinctive roles during neurodevelopment and are orchestrated to maintain long-term proper neuronal functions (
Goto et al., 1994;
Feng et al., 2005). Our understanding of the detailed functions and molecular mechanisms of 5hmC and TET proteins during neurodevelopment and in mature neurons is still very limited. A recent study of the dynamic change of TET proteins and 5hmC during neurogenesis revealed increased 5hmC levels during neuronal differentiation (Hahn et al., 2013). It also showed a negative correlation of 5hmC with repressive histone mark H3K27 tri-methylation and its effector Polycomb protein complex. These data indicate a potential role of 5hmC and TET proteins in neurodevelopment, as shown in Xenopus (Xu et al., 2011). Several recent publications simultaneously reported key roles of Tet1 in neurogenesis, cognition and memory formation (
Kaas et al., 2013;
Rudenko et al., 2013;
Zhang et al., 2013). Either depletion or overexpression of Tet1 in mouse brain could lead to severe consequences. For example, Tet1 KO mice display impaired neurogenesis, poor learning and memory, abnormal long-term depression and impaired memory extinction, whereas Tet1 overexpression could result in impaired long-term memory formation. Mechanistically, up- or downregulation of Tet1 was found to lead to dysregulation of genes that are involved in critical neuronal activities, accompanied by local 5hmC changes. Together, these data strongly indicate that the level and genomic distribution of Tet1 must be precisely controlled, possibly in a distinct manner in different neural cell types, during neurodevelopment and in mature neurons. This notion was also supported by the observation that the 5hmC distribution and gene expression in different neural cell types, such as Purkinje neurons, granule cells and Bergmann glial cells, displayed strong cell-type bias (
Mellén et al., 2012). In that study, MeCP2 was also identified as a 5hmC-binding protein, establishing a dual role for MeCP2 in the orchestration of neuronal plasticity by coordinating different cytosine modifications (
Mellén et al., 2012). Our integrative analyses of 5hmC, overall DNA methylation and gene expression profiles in dentate granule neurons in vivo lay the foundation for future studies of 5hmC in shaping the neuronal DNA methylome during neurodevelopment and in mature neurons.
Compliance with ethics guidelines
The authors declare no competing financial interests. Junjie Guo, Keith E. Szulwach, Yijing Su, Yujing Li, Bing Yao, Zihui Xu, Joo Heon Shin, Bing Xie, Yuan Gao, Guo-li Ming, Peng Jin and Hongjun Song declare that they have no conflict of interest. All institutional and national guidelines for the care and use of laboratory animals were followed.
Author contributions
H.S. and P.J. conceived the project. Y.S. prepared genomic DNA samples. K.E.S., Y.L. and Z.X. performed 5hmC-seq, data pre-processing, and qPCR validation. B.X. and Y.G. performed RNA-seq. J.U.G. and K.E.S. analyzed the data. J.U.G., K.E.S., B.Y., G.L.M., P.J., and H.S. wrote the manuscript.
Higher Education Press and Springer-Verlag Berlin Heidelberg