DNA methylation and demethylation

Learn about DNA methylation (5mC) and the mechanisms of DNA demethylation and techniques used to map DNA modifications 5mC, 5hmC, 5fC, and 5caC.

DNA methylation

Throughout DNA, chemical modifications add a layer of regulation to the expression of genes encoded within the DNA sequence. The most well-studied of these chemical modifications is 5-methylcytosine (5mC), a modification most commonly recognized as a stable, repressive regulator of gene expression. Approximately 1% of the human genome consists of methylated cytosine, making it the most abundant and widespread DNA modification (Moore et al 2012). There are several methods available to sequence 5mC throughout the genome, each with pros and cons that we discuss later in this guide. These methods include high-resolution approaches, such as whole-genome bisulfite sequencing (WGBS), and antibody-dependent approaches, such as DNA immunoprecipitation (DIP, or MeDIP when targeting 5mC).

5mC was initially discovered to reside within CpG islands – stretches of DNA commonly found within promoter regions enriched in CpG dinucleotides. It is within these promoter regions that 5mC acts as a stable epigenetic mark repressing gene transcription. Within the mammalian genome, methylated cytosine is initially incorporated into the DNA during early development by the de novo methyltransferase enzymes DNMT3a and DNMT3b (Okano et al 1999). These methylation marks are maintained throughout the genome by an additional methyltransferase, DNMT1, which copies DNA methylation patterns to daughter strands during DNA replication (Vertino et al 1996).

DNA demethylation: 5mC, 5hmC, 5fC, and 5caC

Today the notion of 5mC being an entirely stable DNA modification is less concrete. Many methylated cytosines throughout the genome, particularly within gene bodies, undergo DNA demethylation – a process that ultimately converts 5mC back to an unmodified cytosine (C). DNA demethylation can occur in one of two ways: passive DNA demethylation, where methylated cytosine is diluted from the genome over successive rounds of replication due to an absence of methylation maintenance enzymes, or active DNA demethylation, which involves the oxidation of 5mC by ten-eleven translocation (TET) enzymes into oxidized derivatives of 5mC (reviewed in Wu et al 2017).
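The arithmetic behind passive dilution can be sketched in a few lines. This is a simplified illustration, not a model from the guide: it assumes a site that starts fully methylated on both strands, no maintenance or de novo methylation at all, and no active demethylation. The function name is hypothetical.

```python
def methylated_strand_fraction(divisions):
    """Fraction of DNA strands still carrying the original 5mC at one site
    after `divisions` rounds of replication without maintenance methylation.

    Each replication pairs every old strand with a new, unmethylated strand,
    so the methylated fraction halves at every division.
    """
    if divisions < 0:
        raise ValueError("divisions must be zero or positive")
    return 0.5 ** divisions


for n in range(5):
    print(f"after {n} division(s): {methylated_strand_fraction(n):.4f}")
```

Under these assumptions the mark is halved at each division, so it falls below 10% of strands after four rounds of replication.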

Active DNA demethylation occurs in a cycle, starting with 5mC and finishing with an unmodified C. 5mC is initially oxidized to 5-hydroxymethylcytosine (5hmC), which is further oxidized to 5-formylcytosine (5fC), and finally, this is oxidized once more to 5-carboxylcytosine (5caC). 5fC and 5caC can be removed from DNA by thymine DNA glycosylase (TDG) in combination with base excision repair (BER) to result in an unmodified C (figure 8). 5hmC, 5fC, and 5caC have been the focus of many recent epigenetic studies. More and more is being learned about these marks, including their potential to have stable epigenetic roles of their own. Many sequencing methods have been developed to distinguish these marks throughout the genome, including variations on MeDIP using 5hmC, 5fC, and 5caC antibodies, and variations on bisulfite sequencing such as TET-assisted bisulfite sequencing (TAB-seq). The differences between these methods will be discussed later in this guide.
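The cycle described above can be written down as a small lookup structure, which makes it easy to see which conversions are possible from each cytosine form. This is only a bookkeeping sketch of the steps named in the text; the variable and function names are hypothetical.

```python
# Successive TET-mediated oxidation steps, as described in the text.
TET_OXIDATION = {"5mC": "5hmC", "5hmC": "5fC", "5fC": "5caC"}

# Forms that TDG can excise, with BER then restoring an unmodified C.
TDG_SUBSTRATES = {"5fC", "5caC"}


def next_forms(form):
    """Return the cytosine forms reachable in one step of active demethylation."""
    products = []
    if form in TET_OXIDATION:
        products.append(TET_OXIDATION[form])
    if form in TDG_SUBSTRATES:
        products.append("C")
    return products


for form in ("5mC", "5hmC", "5fC", "5caC", "C"):
    print(form, "->", next_forms(form))
```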

Figure 8. The cycle of DNA demethylation. Oxidized 5mC derivatives are removed either by thymine DNA glycosylase (TDG) coupled with base excision repair (BER), termed active modification–active removal (AM–AR), or by replication-dependent dilution of 5hmC, 5fC, or 5caC, termed active modification–passive dilution (AM–PD).

Bisulfite sequencing

It is not possible to detect 5mC using traditional DNA amplification approaches because the mark is not maintained during sample preparation and amplification. Bisulfite conversion is one of the most widely used approaches to convert DNA methylation marks into a suitable template for amplification and downstream analysis. Bisulfite conversion uses the treatment of DNA with NaOH and sodium bisulfite in a chemical reaction that converts cytosine bases into uracil (U), while methylated cytosines are protected from the conversion (figure 9).

During downstream analysis such as PCR or sequencing, unmethylated C bases that undergo deamination in the bisulfite reaction will be read as thymine (T), whereas 5mC bases will remain unchanged and still be detected as C. This allows you to determine the locations in the genome containing methylated cytosine (Frommer et al., 1992).
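The read-out logic can be illustrated with a short simulation. This is a minimal sketch that assumes a single top-strand read perfectly aligned to its reference, complete conversion, and no sequencing errors; the function names are hypothetical and it is not a substitute for a dedicated bisulfite aligner.

```python
def bisulfite_read_out(seq, methylated_positions):
    """Simulate how a sequence reads after bisulfite conversion and PCR.

    Unmethylated C is deaminated to U and amplified as T; C at any index
    listed in `methylated_positions` is protected and still reads as C.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated_positions:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


def call_methylation(reference, read):
    """Compare a converted read with the reference at every reference C.

    Returns {position: True if methylated, False if unmethylated}.
    Positions with any other read base are skipped as uninformative.
    """
    calls = {}
    for i, (ref_base, read_base) in enumerate(zip(reference.upper(), read.upper())):
        if ref_base != "C":
            continue
        if read_base == "C":
            calls[i] = True
        elif read_base == "T":
            calls[i] = False
    return calls


reference = "ACGTCCGA"
read = bisulfite_read_out(reference, methylated_positions={1})
print(read)
print(call_methylation(reference, read))
```

Here only the cytosine at index 1 is methylated, so it is the only C that survives in the converted read.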

Figure 9. Bisulfite conversion. Treatment of DNA with bisulfite (sulphonation) leads to the deamination of cytosine residues and converts them to uracil, while 5-methylcytosine residues remain the same.

Bisulfite-based applications

Bisulfite conversion has become the basis for several variations and applications designed for high throughput applications or the investigation of broader, whole genome-scale regions.
Here are some examples of bisulfite-based methods.

Genome-wide DNA methylation analysis

Targeted DNA methylation analysis

Bisulfite conversion: technical considerations

Incomplete conversion

Bisulfite conversion is a very powerful method because it is relatively simple to perform, and it can deliver single-base resolution of DNA methylation status. However, the method does have some drawbacks: incomplete conversion (or on occasion, over-conversion) can occur under sub-optimal reaction conditions leading to insufficient DNA denaturation, or when the DNA strands re-anneal before completion of the reaction.
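One way to check for incomplete conversion is to measure how many cytosines were converted in DNA that is expected to carry no methylation. The sketch below assumes you have reads from such an unmethylated control (for example a spike-in; this control is an assumption and is not described in this guide) already aligned to its reference on the top strand. The function name is hypothetical.

```python
def conversion_efficiency(control_reference, control_reads):
    """Fraction of control cytosines that were read as T after conversion.

    Every C in the control is assumed to be unmethylated, so any C that
    is still read as C counts as a conversion failure.
    """
    converted = 0
    unconverted = 0
    for read in control_reads:
        for ref_base, read_base in zip(control_reference.upper(), read.upper()):
            if ref_base != "C":
                continue
            if read_base == "T":
                converted += 1
            elif read_base == "C":
                unconverted += 1
    total = converted + unconverted
    if total == 0:
        raise ValueError("no informative cytosine positions in the control reads")
    return converted / total


control = "ACCGTC"
reads = ["ATTGTT", "ATCGTT"]  # the second read keeps one unconverted C
print(f"conversion efficiency: {conversion_efficiency(control, reads):.3f}")
```

A low value suggests that methylation levels in the real sample would be overestimated, because unconverted unmethylated cytosines are indistinguishable from 5mC.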

Distinguishing 5hmC

DNA degradation is often a byproduct of the harsh bisulfite conversion reaction conditions, which can make working with smaller samples challenging. Insufficient desulfonation of the reaction will leave behind residues that can inhibit DNA polymerases used in PCR. Evidence also indicates that bisulfite conversion does not distinguish between 5mC and 5hmC, as both are read as C. In addition, because most cytosines are converted to uracil, bisulfite conversion lowers the overall complexity of the DNA sequence. This reduction in sequence complexity can complicate primer design for downstream PCR-based interrogation or introduce challenges when attempting to uniquely map sequencing reads to a reference genome.
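The loss of sequence complexity is easy to see in silico. The sketch below assumes fully unmethylated top-strand sequences, so every C reads as T after conversion; the example sequences and the function name are made up for illustration.

```python
def fully_converted(seq):
    """Read-out of a completely unmethylated sequence after bisulfite conversion."""
    return seq.upper().replace("C", "T")


# Two different genomic sequences (hypothetical examples).
locus_a = "ACGTTCGA"
locus_b = "ATGTTTGA"

print(fully_converted(locus_a))
print(fully_converted(locus_b))
print("distinct before:", locus_a != locus_b)
print("distinct after: ", fully_converted(locus_a) != fully_converted(locus_b))
```

Sequences that differ only at C/T positions become identical after conversion, which is why converted reads are harder to place uniquely on a reference genome.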

DNA immunoprecipitation (DIP)

Another method commonly used to map the location of DNA methylation marks is DIP. DIP relies heavily on having antibodies capable of recognizing the DNA modifications of interest. However, once you have these, DIP is a straightforward and effective method. It is also considerably cheaper and easier to analyze than WGBS, which requires the whole genome to be sequenced, because DIP only requires sequencing of the small sheared DNA regions pulled down in your IP step.

DIP has been successfully carried out for the most well-characterized DNA modifications: 5mC, 5hmC, 5fC, and 5caC (Pastor et al., 2011, Shen et al., 2013). It has been used in a range of samples, including embryonic stem (ES) cells, brain tissue, and zebrafish embryos. The method is similar to ChIP, but your starting material is genomic DNA with no chromatin required. This genomic DNA is sheared to approximately 150–300 bp, and the sheared DNA is then heat denatured. This step is essential as the antibody will only be able to access the modifications within denatured (open) DNA.

After DNA denaturation, the sheared DNA is incubated with the antibody recognizing your modification of interest, usually overnight. The samples then undergo an IP step to pull down all the DNA bound to the antibody, and any unbound DNA is washed away. We recommend using magnetic beads for this type of IP step. When you carry out DIP, it is important to treat your initial genomic DNA with RNase to remove any RNA from the samples.

Figure 10. DIP methodology. Genomic DNA is sheared, and immunoprecipitation is carried out using antibodies against your DNA modification. Pulldown DNA and input samples can then be used for qPCR, microarray, or NGS.
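When pulldown and input samples are analyzed by qPCR, enrichment is often expressed as a percentage of input. The sketch below uses the percent-input calculation commonly applied to ChIP-qPCR and assumes it carries over to DIP-qPCR; it also assumes 100% PCR efficiency (a doubling per cycle). The function and parameter names are hypothetical.

```python
import math


def percent_input(ct_pulldown, ct_input, input_fraction):
    """Pulldown signal as a percentage of the starting material.

    `input_fraction` is the share of the starting DNA kept as the input
    sample (for example 0.01 for a 1% input). The input Ct is first
    adjusted to what it would be for 100% of the material.
    """
    if not 0 < input_fraction <= 1:
        raise ValueError("input_fraction must be in the range (0, 1]")
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_pulldown)


# Hypothetical Ct values for one target region with a 1% input sample.
print(f"{percent_input(ct_pulldown=27.0, ct_input=25.0, input_fraction=0.01):.3f}% of input")
```

Comparing this value between a region expected to carry the modification and a region expected to lack it gives a quick check that the pulldown worked before committing to sequencing.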

DIP-based applications

Genome-wide DIP analysis

Targeted DIP analysis

DIP: technical considerations

Shear your samples appropriately.

Unlike WGBS, DIP does not give single-base resolution. When you shear your DNA samples, it is important to get the DNA fragments to a size of 150–300 bp to improve the resolution of your DIP sequencing. Larger fragments mean you will inevitably pull down more DNA that flanks your DNA modification of interest but does not itself carry it. This results in broad, non-specific peaks in your sequencing analysis.

Get good antibodies.

Another problem with DIP is that you need to have an antibody specific for your modification of interest. You need to make sure there is minimal cross-reactivity with similar modifications. For example, a 5fC antibody that also recognizes 5hmC is not suitable for mapping the location of 5fC throughout the genome. The use of antibodies for this type of sequencing also has advantages, as you are limited only by the antibodies available to you. If you wanted to investigate a modification not previously characterized in DNA, eg m6A (more commonly associated with RNA), you could do so provided that you have a specific m6A antibody.

Alternate methods to capture 5hmC, 5fC, and 5caC

The biggest drawback of traditional bisulfite sequencing is that it is unable to distinguish the oxidized derivatives of 5mC and will profile only 5mC itself. Fortunately, there have been many variations on bisulfite sequencing and some entirely new approaches to tackling the problem of sequencing 5hmC, 5fC, and 5caC. Here we look at some of these new methods in more detail.

5hmC mapping

5fC/5caC mapping

Comparison of DNA modification sequencing methods

It is important that you choose the method for detecting DNA modifications that best suits your needs. Consider whether you need single-base resolution, whether you need to quantify the absolute levels of the modification, and how feasible the method will be to use in your model system or sample type. Below you can find a table where we have summarized these key features for some of the available methods for sequencing 5hmC, 5fC, and 5caC.

| Name | Description | Single-base resolution? | Allows absolute quantification of the modification? | Reference |
| --- | --- | --- | --- | --- |
| 5hmC mapping only | | | | |
| 5hmC-DIP | Using 5hmC-specific antibodies to enrich for 5hmC. | No | No | Pastor, W. A. et al. Nature 2011 |
| TAB-seq | 5hmC is converted to 5gmC to protect it. 5mC is converted to 5caC by TET enzymes. After bisulfite conversion, 5hmC is read as C. 5mC and 5caC are read as T. | Yes | Yes | Yu, M. et al. Cell 2012 |
| oxBS-seq | Chemical conversion of 5hmC to 5fC using KRuO4 allows the differentiation of 5mC and 5hmC at single-base resolution. | Yes | Yes | Booth, M. J. et al. Science 2012 |
| hMe-Seal | Glucosylation of 5hmC with an azide-containing glucose molecule and biotin allows for 5hmC enrichment using a biotin/streptavidin pulldown. | No | No | Song, C. X. et al. Nature Biotechnology 2011 |
| SCL-exo | Azide-glucose glucosylation of 5hmC followed by a biotin reaction allows exonuclease activity to stall at biotin-5gmCs. | Yes | No | Sérandour, A. A. et al. Genome Biology 2016 |
| 5fC and 5caC mapping | | | | |
| 5fC/5caC DIP | Using 5fC- and 5caC-specific antibodies to enrich for these marks. | No | No | Shen, L. et al. Cell 2013 |
| MAB-seq | M.SssI treatment of DNA converts unmodified C at CpG sites into 5mC. Bisulfite conversion will then cause all C, 5mC, and 5hmC at these sites to read as C. All 5fC and 5caC will read as T. | Yes | Yes | Wu, H. et al. Nature Biotechnology 2014 |
| fCAB-seq | EtONH2 protects all 5fC in the genome from deamination during bisulfite treatment. | Yes | Yes | Song, C. X. et al. Cell 2013 |
| caCAB-seq | EDC is used to catalyze the formation of amide bonds to 5caC, preventing deamination of 5caC on bisulfite conversion. | Yes | Yes | Lu, X. et al. JACS 2013 |
| CLEVER-seq | Malononitrile selectively labels 5fC, creating a 5fC-M adduct which is read as T in the sequencing. | Yes | Yes | Zhu, C. et al. Cell Stem Cell 2017 |

Table 1: DNA modification sequencing methods
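The read-out of the bisulfite-based methods in Table 1 can be summarized as a lookup of which cytosine forms are still read as C after each treatment. This is a simplified sketch that assumes complete chemical and enzymatic conversion; the names are hypothetical, and the subtraction of oxBS-seq from standard bisulfite signal is the usual way 5hmC is inferred from that pair of experiments rather than something spelled out in the table.

```python
FORMS = ("C", "5mC", "5hmC", "5fC", "5caC")

# Forms read as C after each treatment; every other form reads as T.
# For MAB-seq, unmodified C reads as C only at CpG sites methylated by M.SssI.
READ_AS_C = {
    "BS-seq": {"5mC", "5hmC"},
    "oxBS-seq": {"5mC"},
    "TAB-seq": {"5hmC"},
    "MAB-seq": {"C", "5mC", "5hmC"},
}


def read_out(method, form):
    """Base call ('C' or 'T') expected for a cytosine form under a method."""
    if form not in FORMS:
        raise ValueError(f"unknown cytosine form: {form}")
    return "C" if form in READ_AS_C[method] else "T"


def estimate_5hmc_fraction(bs_c_fraction, oxbs_c_fraction):
    """5hmC level at a site: (5mC + 5hmC from BS-seq) minus (5mC from oxBS-seq).

    Sampling noise can make this slightly negative at sites with no 5hmC.
    """
    return bs_c_fraction - oxbs_c_fraction


for method in READ_AS_C:
    print(method, {form: read_out(method, form) for form in FORMS})
print(f"estimated 5hmC fraction: {estimate_5hmc_fraction(0.8, 0.6):.2f}")
```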

Liquid chromatography tandem-mass spectrometry (LC/MS/MS)

If you have access to LC-MS/MS, then this is the best way to quantify the amount of a DNA modification within total genomic DNA (Le et al 2011 and Fernandez et al., 2018). Using absolute quantification methods, LC-MS/MS gives you parallel quantification of all the DNA modifications found in total DNA from any organism and cell type (Zhang et al 2012). For absolute quantification, you are limited only by the isotopic standards you have available to measure your sample against.
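The role of the isotopic standard can be shown with a small calculation. This sketch assumes a known amount of an isotope-labeled standard is spiked into each sample and that the analyte and its labeled standard give the same signal per mole; the peak areas, amounts, and function names are hypothetical.

```python
def absolute_amount(analyte_area, standard_area, standard_amount):
    """Amount of a nucleoside, in the same units as `standard_amount`.

    The analyte peak area is scaled by the peak area of the co-injected
    isotope-labeled standard of known amount.
    """
    if standard_area <= 0:
        raise ValueError("standard_area must be positive")
    return analyte_area / standard_area * standard_amount


def percent_of_total_cytosine(amounts, modification):
    """Level of one cytosine form relative to all measured cytosine forms."""
    return 100.0 * amounts[modification] / sum(amounts.values())


# Hypothetical peak areas with 50 fmol of each labeled standard spiked in.
amounts = {
    "C": absolute_amount(960000.0, 10000.0, 50.0),
    "5mC": absolute_amount(40000.0, 10000.0, 50.0),
}
print(amounts)
print(f"5mC: {percent_of_total_cytosine(amounts, '5mC'):.1f}% of measured cytosine")
```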

Using this technique combined with DIP (DIP-MS) allows you to determine if your DNA modification antibody is binding to your modification of interest and it will also allow you to see if it binds any other non-specific modifications. If you generate LC-MS/MS data of your DIP input and pull-down samples, you should see an enrichment of your modification of interest in the pulldown sample compared to the input. You can also then check other modifications with these same data to see if anything else came out as enriched in your samples to test for non-specific antibody binding. There is software being developed now that can even help you with this type of analysis.
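The comparison of pulldown and input can be sketched as a simple fold-enrichment calculation. The values below are hypothetical modification levels (for example, percent of total cytosine measured by LC-MS/MS), and the threshold used to flag enrichment is an arbitrary assumption that you would need to set for your own data.

```python
def fold_enrichment(pulldown, input_sample):
    """Pulldown level divided by input level for each modification."""
    return {
        mod: pulldown[mod] / input_sample[mod]
        for mod in pulldown
        if input_sample.get(mod, 0) > 0
    }


def enriched(folds, threshold=2.0):
    """Modifications enriched at or above `threshold`, sorted by name."""
    return sorted(mod for mod, fold in folds.items() if fold >= threshold)


# Hypothetical LC-MS/MS levels for a DIP carried out with a 5hmC antibody.
input_sample = {"5mC": 4.0, "5hmC": 0.1, "5fC": 0.002}
pulldown = {"5mC": 4.2, "5hmC": 2.0, "5fC": 0.002}

folds = fold_enrichment(pulldown, input_sample)
print(folds)
print("enriched:", enriched(folds))
```

If only the intended modification is enriched, the antibody is behaving specifically; any other modification appearing in the enriched list points to cross-reactivity.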

LC/MS/MS: technical considerations

DNA modification IHC/ICC

It is also possible to carry out IHC/ICC for DNA modifications. This can be done with a few simple additions to your standard IHC/ICC protocol. The most significant difference you will need to consider is that antibodies against DNA modifications cannot access and bind to the modification if it sits within double-stranded DNA. This means that you will need to denature the DNA making it single-stranded and accessible by the antibodies.

The most common method of DNA denaturation is to treat your samples with acid. This is usually 4N hydrochloric acid (HCl) applied directly to your IHC/ICC slides (Yamaguchi et al., 2013 and Kaefer et al., 2016). The best time to add this step to your protocol is before the addition of the primary antibody. Once you have permeabilized your cells or tissues with a detergent (eg PBS 0.1% Triton) you can wash and add 4N HCl to denature the DNA strands. It is important to thoroughly wash the acid off once the step is complete and neutralize the acid with an alkali (eg 100 µM NaOH in PBS). After the acid is washed and neutralized you can proceed with your usual IHC/ICC steps and add the primary antibody.

When carrying out IHC/ICC for DNA modifications you should also be aware that your antibody may recognize very similar modifications on RNA (eg 5mC on DNA and m5C on RNA). To avoid this problem, you can include an RNase treatment step to remove all RNA present. This step should be optimized, as leaving your sample in RNase for too long can also cause damage to the DNA present.

DNA modification IHC/ICC: technical considerations

Methyl binding domain proteins (MBDs)

5mC and its oxidized derivatives play an important role in gene silencing and promoting gene expression after DNA demethylation. It is now known that some of these DNA modifications can act as markers to recruit proteins to specific DNA sites, altering gene expression and acting as epigenetic marks. MBD3 and methyl CpG binding protein 2 (MECP2) have both been shown to bind 5hmC in addition to 5mC. Once bound to 5hmC they play a role in DNA accessibility and activation of transcription (Yildirim et al., 2011 and Mellén et al., 2012).

A common method to screen for binders of a DNA modification is to use a pull-down technique followed by MS to identify any proteins pulled down. This method has been successfully used to find binders of 5mC, 5hmC, and 5fC (Iurlaro et al., 2013 and Spruijt et al., 2013). For this experiment, you need to create a synthetic DNA bait containing the modification you are interested in, as well as baits containing other modifications and unmodified cytosine to act as controls. This DNA bait should be linked to a biotin molecule at one end that can be used to tether the bait to streptavidin-linked magnetic beads. Protein extract from your sample of interest can then be added to the tethered bait, followed by various wash steps to remove any non-specifically bound proteins. After this, you can elute the remaining proteins and carry out MS analysis to find out what your specific binders are.
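The logic of comparing the modified bait against the control baits can be sketched as a simple filter. Everything here is illustrative: the protein names and MS intensities are hypothetical placeholders, and the two-fold cutoff is an arbitrary assumption rather than a recommended threshold.

```python
def specific_binders(intensities, bait, controls, min_fold=2.0):
    """Proteins whose signal on `bait` exceeds every control bait by `min_fold`.

    `intensities` maps protein name -> {bait name: MS intensity}.
    """
    hits = []
    for protein, by_bait in intensities.items():
        strongest_control = max(by_bait[control] for control in controls)
        signal = by_bait[bait]
        if signal > 0 and signal >= min_fold * strongest_control:
            hits.append(protein)
    return sorted(hits)


# Hypothetical intensities for three proteins across three DNA baits.
intensities = {
    "protein_A": {"5hmC": 900.0, "5mC": 100.0, "C": 80.0},   # prefers 5hmC
    "protein_B": {"5hmC": 500.0, "5mC": 480.0, "C": 510.0},  # binds any DNA
    "protein_C": {"5hmC": 50.0, "5mC": 700.0, "C": 40.0},    # prefers 5mC
}

print(specific_binders(intensities, bait="5hmC", controls=["5mC", "C"]))
```

Including the unmodified and alternatively modified baits is what separates general DNA-binding proteins from proteins that recognize the modification itself.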

MBDs: Technical considerations

Novel DNA modifications

New DNA modifications could still be out there, just not discovered yet. It has been demonstrated that some modifications traditionally considered to be RNA modifications may also be present within DNA. One good example of this is N6-adenine methylation, known as m6A within RNA and 6mA within DNA. This modification is one of the most famous and abundant RNA modifications, but it is now known to also reside within DNA. One of the first studies to show this was from John Gurdon’s lab in 2016 (Koziol et al., 2016). They showed that 6mA is present in the Xenopus laevis, mouse, and human genomes, using an antibody against 6mA to carry out DIP-seq.

Since this study, there have been several more claims that 6mA is present within DNA in zebrafish and pig genomes (Liu et al., 2016), the mouse brain following environmental stress (Yao et al., 2017), and within the Arabidopsis thaliana genome (Liang et al., 2018). One study from 2018 took this one step further and uncovered the enzymes responsible for 6mA methylation and demethylation: N6AMT1 and ALKBH1, respectively (Xiao et al., 2018). The presence of enzymes actively adding and removing the DNA modification suggests that it has a real purpose and potentially its own epigenetic function.

Novel DNA modifications: technical considerations
