Overview of DNA Methylation (DNAm)
About 70% of CpG sites in the human genome are methylated27. CpGs are concentrated in CpG islands (CGIs), regions at least 200 base pairs (bp) in length in which G and C nucleotides make up more than 50% of the sequence1,28. CGIs house the promoter regions of ~70% of human genes28,29. The effect of DNAm on gene expression is influenced by CpG density in these promoter regions30.
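The length and composition thresholds used to describe a CGI can be checked directly on a sequence. The following is a minimal sketch, not a replacement for a dedicated CGI-calling tool. The function names are hypothetical. Beyond the 200 bp and 50% thresholds stated above, it assumes the commonly used additional criterion that the observed-to-expected CpG ratio exceeds 0.6, which this text does not state.

```python
def cgi_metrics(seq: str) -> dict:
    """Return length, GC content and observed/expected CpG ratio of a sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CpG dinucleotides (C followed by G)

    gc_content = (c + g) / n if n else 0.0
    # Expected CpG count if C and G were distributed independently.
    expected_cpg = (c * g) / n if n else 0.0
    obs_exp = cpg / expected_cpg if expected_cpg else 0.0
    return {"length": n, "gc_content": gc_content, "obs_exp_cpg": obs_exp}


def looks_like_cgi(seq: str, min_len: int = 200, min_gc: float = 0.5,
                   min_obs_exp: float = 0.6) -> bool:
    """Apply the thresholds; min_obs_exp is an assumed, commonly used criterion."""
    m = cgi_metrics(seq)
    return (m["length"] >= min_len
            and m["gc_content"] > min_gc
            and m["obs_exp_cpg"] > min_obs_exp)
```

For example, `looks_like_cgi("CG" * 100)` passes all three thresholds, whereas an AT-only sequence of the same length does not. Real CGI annotation typically scans the genome with a sliding window rather than testing isolated fragments.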
Several technologies have been developed for assessing DNAm, but arrays and sequencing protocols account for most of the literature. Both rely on bisulfite conversion of DNA, which converts unmethylated cytosines to uracil while leaving methylated cytosines unchanged. Arrays compare signal intensities between methylated and unmethylated probes at specific sites, whereas sequencing calculates the proportion of methylated cytosines at each site. Three arrays have been most commonly used to study DNAm in humans: the legacy Illumina HumanMethylation27 BeadChip31, the Illumina HumanMethylation450 BeadChip32 and the Illumina MethylationEPIC BeadChip33. Each successive array expands CpG coverage and increases the representation of different genomic regions. The EPIC array covers ~30x more CpGs than the 27K array and places greater emphasis on CpGs outside of CGIs, as these regions are important for gene regulation33.
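The two ways of quantifying methylation at a site can be illustrated as follows. This is a minimal sketch with hypothetical function names. It assumes the conventional beta-value formulation for arrays, including a stabilizing offset of 100 in the denominator, and a simple methylated-over-total read fraction for sequencing; neither formula is given in this text.

```python
from typing import Optional


def array_beta(meth_intensity: float, unmeth_intensity: float,
               offset: float = 100.0) -> float:
    """Beta value from methylated (M) and unmethylated (U) probe intensities.

    beta = M / (M + U + offset); the offset (assumed to be 100 here)
    stabilizes the ratio when both intensities are low.
    """
    return meth_intensity / (meth_intensity + unmeth_intensity + offset)


def sequencing_methylation(methylated_reads: int,
                           total_reads: int) -> Optional[float]:
    """Proportion of reads supporting a methylated cytosine at one site.

    Returns None when the site has no coverage.
    """
    if total_reads == 0:
        return None
    return methylated_reads / total_reads
```

Both measures range from 0 (unmethylated) to 1 (fully methylated). For instance, `sequencing_methylation(15, 20)` gives 0.75, meaning 15 of the 20 reads covering the site retained a cytosine after bisulfite conversion.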