Comparisons of deep metagenomic, shallow metagenomic, and 16S amplicon sequencing
Unless otherwise specified, only reads identified as bacterial were retained for taxonomic analyses, and all realized sampling depths reported hereafter refer to the fraction of reads classified as bacterial. All datasets were normalized by rarefaction prior to analysis, except for datasets used in ANCOM-BC differential abundance testing. A schematic of our study design can be found in the supporting information (Figure S1).
To determine the shotgun metagenomic sequencing depth required to accurately characterize the Sable Island horse microbiome, we first analyzed a deeply sequenced library of 16 samples prepared with Nextera XT. Successively rarefied versions of this dataset were benchmarked against a minimally rarefied dataset (9.3 million read pairs). Similarly, to determine the effect of sequencing depth on functional profile reconstruction, we compared successively rarefied datasets of MetaCyc reaction and pathway abundance tables.
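As an illustration of the successive rarefaction described above, the following minimal sketch subsamples one sample's taxon counts without replacement to progressively shallower depths. It is not the pipeline used in this study: the function name `rarefy`, the example counts, and the depths are hypothetical, and only NumPy is assumed.

```python
import numpy as np


def rarefy(counts, depth, seed=None):
    """Subsample a vector of per-taxon read counts to `depth` reads
    without replacement (hypothetical helper, for illustration only)."""
    counts = np.asarray(counts, dtype=np.int64)
    total = int(counts.sum())
    if depth > total:
        raise ValueError(f"depth {depth} exceeds the {total} reads available")
    rng = np.random.default_rng(seed)
    # Expand the counts into one taxon label per read, then draw reads.
    reads = np.repeat(np.arange(counts.size), counts)
    kept = rng.choice(reads, size=depth, replace=False)
    # Tally the retained reads back into a per-taxon count vector.
    return np.bincount(kept, minlength=counts.size)


# Hypothetical counts for five taxa in a single sample.
sample = np.array([500, 300, 150, 45, 5])

# Successively shallower versions of the same sample.
for depth in (800, 400, 100):
    print(depth, rarefy(sample, depth, seed=1))
```

Rare taxa tend to drop out first as depth decreases, which is why shallower datasets are benchmarked against the deepest one.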
Next, to identify discrepancies between prevailing and newer high-throughput library preparation methods, we compared results from shotgun metagenomic libraries prepared from the same DNA extracts using the Nextera XT and iGenomx Riptide methods. For this comparison, datasets were rarefied to the lowest sequencing depth observed among these 13 paired sample datasets, 1.6 million read pairs.
Finally, we identified differences between metagenomic and amplicon-based characterization of the bacterial microbiome using the same DNA extracts from 13 samples. As above, these datasets were rarefied to a common minimal sequencing depth, in this case 35,000 read pairs. Taxon relative abundances were compared between shotgun metagenomic and amplicon datasets primarily at the family level, since this was the finest resolution to which most 16S rRNA amplicon reads could be classified. Analyses across these methodological comparisons consisted of general linear models to quantify correlations in alpha diversity (Shannon diversity indices) and taxon relative abundances between dataset types. We also tested for correlations between beta-diversity estimates using Mantel tests of Bray-Curtis dissimilarity matrices.
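The diversity metrics named above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the software used in this study: the function names, the two small count tables, the two-sided permutation scheme, and the number of permutations are all assumptions.

```python
import numpy as np


def shannon(counts):
    """Shannon diversity index (natural log) for one sample."""
    counts = np.asarray(counts, dtype=float)
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log(p)).sum())


def bray_curtis_matrix(table):
    """Pairwise Bray-Curtis dissimilarities; rows are samples."""
    table = np.asarray(table, dtype=float)
    n = table.shape[0]
    d = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            num = np.abs(table[i] - table[j]).sum()
            d[i, j] = d[j, i] = num / (table[i] + table[j]).sum()
    return d


def mantel(d1, d2, permutations=999, seed=None):
    """Mantel test: Pearson r between two distance matrices, with a
    two-sided permutation p-value (assumed scheme, for illustration)."""
    d1, d2 = np.asarray(d1, dtype=float), np.asarray(d2, dtype=float)
    rng = np.random.default_rng(seed)
    iu = np.triu_indices_from(d1, k=1)  # upper triangle, no diagonal
    r_obs = np.corrcoef(d1[iu], d2[iu])[0, 1]
    hits = 0
    for _ in range(permutations):
        perm = rng.permutation(d1.shape[0])
        # Permute rows and columns of d1 together to keep it symmetric.
        r_perm = np.corrcoef(d1[perm][:, perm][iu], d2[iu])[0, 1]
        if abs(r_perm) >= abs(r_obs):
            hits += 1
    return float(r_obs), (hits + 1) / (permutations + 1)


# Hypothetical family-level counts for the same five samples under two methods.
shotgun = np.array([[10, 0, 5, 1], [8, 2, 4, 0], [0, 9, 1, 7],
                    [1, 8, 0, 6], [5, 5, 5, 5]])
amplicon = np.array([[9, 1, 6, 0], [7, 3, 4, 0], [1, 8, 0, 7],
                     [0, 9, 1, 6], [4, 6, 5, 5]])

print([round(shannon(row), 3) for row in shotgun])
r, p = mantel(bray_curtis_matrix(shotgun), bray_curtis_matrix(amplicon), seed=1)
print(r, p)
```

Per-sample Shannon values from the two dataset types could then be related with a linear model, while the Mantel statistic summarizes agreement between the two dissimilarity matrices.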