1 Introduction
Understanding how organisms adapt to variable environmental conditions is pivotal for weighing the relevance of natural selection in the evolution of species and populations. Epigenetic mechanisms contribute significantly to phenotypic plasticity, stress responses and acclimation (Moler et al., 2019). Among epigenetic modifications, DNA methylation has been shown to be key in the control of several biological phenomena in eukaryotes and prokaryotes (Jones, 2012), and in recent years variation in epigenetic responses has attracted the attention of several investigators (Chen et al., 2020). Third-generation sequencing technologies, namely single molecule real-time (SMRT) sequencing (Flusberg et al., 2010; Fang et al., 2012) and Oxford Nanopore Technologies (ONT) nanopore sequencing (Clarke et al., 2009; Simpson et al., 2017), allow the direct identification of the most commonly methylated bases (Gouil and Keniry, 2019; Sánchez-Romero and Casadesús, 2020; Rand et al., 2017). These methods are boosting genome-wide DNA methylation studies, especially in prokaryotes, where the compact size of genomes allows whole-genome methylomes to be generated with relative ease. In prokaryotic microorganisms DNA methylation plays various roles, ranging from control of the cell cycle and protection against phages (e.g. Restriction-Modification systems) to regulation of gene expression (see, for example, Sánchez-Romero and Casadesús, 2021). Concerning cell cycle control, genome-wide DNA methylation profiles have been shown to vary in ecologically relevant contexts (e.g. bacterial differentiation; diCenzo et al., 2022), and strain-by-strain or population-level variation has been documented for Restriction-Modification systems (diCenzo et al., 2022).
Consequently, there is growing interest in computational pipelines that can easily profile DNA methylation features genome-wide, thus allowing strains and individuals to be compared across multiple conditions. Several tools have been developed for the analysis of DNA methylation profiles derived from bisulphite sequencing and microarrays (e.g. Müller et al., 2019; Teng et al., 2020; Hillary and Marioni, 2021; Aryee et al., 2014; Bock et al., 2005; for a recent benchmark see Nunn et al., 2021). Recently, three packages have been released (Su et al., 2021; Leger, 2020; De Coster et al., 2020) that allow methylation profiles from SMRT or ONT sequencing data to be visualized. A tool has also recently been made available on GitHub specifically for analysing DNA methylation profiles in metagenomic data (https://github.com/hoonjeseong/Meta-epigenomics). However, to the best of our knowledge, no pipeline has been developed specifically to extract DNA methylation information from sequencing data, directly quantify and compare the positions of methylated sites with respect to genome-derived features (such as coding and noncoding sequences), and report outputs that can be used in population epigenomic analyses.
Here we present MeStudio, a pipeline for the integration and visualization of SMRT sequencing methylation data. MeStudio combines methylation data with genome sequence and annotation to facilitate the extraction of biological information from DNA methylation profiles and to visualize the results of these analyses. We demonstrate the use of MeStudio on a set of SMRT outputs from two strains of the bacterial species Sinorhizobium meliloti.