2.7 16S rRNA gene amplicon sequencing of the microbiome from
Ellinge WWTP
For detailed information see supplementary method 1 and for information
on sequenced samples see supplementary table 3. Briefly, sequencing
libraries were made in a dual-PCR setup. In the first PCR, the V3-V4
region of the 16S rRNA gene was amplified with the primers Uni341F and
Uni806R (Yu et al., 2005). In the second PCR, primers
introducing sequencing adaptors and barcode tags were used
(Nunes et al., 2016). 16S
rRNA gene amplicon sequencing was done using an Illumina MiSeq Desktop
Sequencer (Illumina Inc.). Raw sequence reads were trimmed using
cutadapt version 2.3 (Martin,
2011). Primer-trimmed sequence reads were error-corrected and merged, and
amplicon sequence variants (ASVs) were identified, using the DADA2
version 1.10.0 plugin (Callahan et al., 2016)
for QIIME2 (Bolyen et
al., 2019). For rarefaction curves see supplementary figure 6. A
multiple sequence alignment of the ASVs was performed with MAFFT v7.407
(Katoh & Standley, 2013)
and used to build an approximately-maximum-likelihood tree with FastTree v2.1.10
(Price et al., 2010). R
(R Core Team, 2020) was used
for sequence and data analysis for the 16S rRNA gene community
profiling. Furthermore, the tidyverse
(Wickham et al., 2019) and
phyloseq (McMurdie & Holmes,
2013) packages were used for visualization and general data handling.
Taxonomy was assigned with the dada2 package
(Callahan et al., 2016)
using the Genome Taxonomy Database (GTDB;
https://doi.org/10.5281/zenodo.2541239)
(Parks et al., 2018). The
alpha diversity metrics Faith’s phylogenetic diversity
(Faith, 1992) and mean pairwise
distance (Webb et al., 2002)
were calculated with the PhyloMeasures package
(Tsirogiannis & Sandel,
2016). For the beta diversity, weighted UniFrac distances were
calculated (Lozupone &
Knight, 2005). The phylogenetic tree (figure 3.a) was made using the
iTOL webtool (Letunic &
Bork, 2019). For investigations of the low biomass samples (sewage
community/pB10 and sewage community/R27), such as alpha diversity
measures, phylogeny, and abundances, a cleaned data object was used (see
supplementary method 1) to avoid the influence of the kitome (i.e. the
background signal of the kits used) and other potential contaminants, to
which low biomass samples are more vulnerable than high biomass samples
(Davis et al., 2018). Data
cleaning for the eight low biomass samples removed 24165 reads, reducing
the total from 162040 to 137875 reads, i.e. a removal of 14.9% of the
reads. The mean number of reads for the cleaned samples was 17234, and
the minimum/maximum was 15038/20999 reads. The number of taxa for these
samples was reduced from 299 to 65, i.e. a removal of 78.26% of the taxa.
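
The summary statistics for the data cleaning can be reproduced from the reported totals. The following is a minimal sketch in Python, not the analysis code used in this study (which was written in R); the variable and function names are illustrative assumptions, and only the counts are taken from the text.

```python
# Counts reported in the text for the eight low biomass samples.
reads_before = 162040
reads_after = 137875
n_samples = 8
taxa_before = 299
taxa_after = 65


def percent_removed(before, after):
    """Percentage of the original count that was removed by cleaning."""
    return 100 * (before - after) / before


reads_removed = reads_before - reads_after
reads_removed_pct = percent_removed(reads_before, reads_after)
mean_reads = reads_after / n_samples
taxa_removed_pct = percent_removed(taxa_before, taxa_after)

print(f"reads removed: {reads_removed} ({reads_removed_pct:.1f}%)")
print(f"mean reads per cleaned sample: {mean_reads:.0f}")
print(f"taxa removed: {taxa_before - taxa_after} ({taxa_removed_pct:.2f}%)")
```

The percentages are expressed relative to the counts before cleaning, and the mean is taken over the eight cleaned samples.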