Evaluating effects of applying SCNIC to discern microbes that
differ between groups in the HIV and Great Lakes datasets
We used ANCOM [28] to test each feature (OTU or module) for differences
by MSM status (HIV study) and between lakes (Great Lakes study). For the
HIV study, we used only samples from HIV-negative individuals so that
differences in the microbiome between MSM and non-MSM would not be
confounded by HIV infection status.
We chose ANCOM because it is also a tool designed specifically for
working with compositional microbiome data. ANCOM was applied to the
original feature table (without SCNIC) and to feature tables produced by
SCNIC using SparCC correlations and the SMD algorithm at different
R-value thresholds.
When applying SparCC, SCNIC follows the recommendation of the SparCC
manuscript to filter features based on average relative abundance across
samples [38]. The SparCC manuscript suggests this filter because
removing features with high abundances, even in a few samples, will
upset the ability of the method to control for the number of reads per
sample in its compositionality adjustment. Because this filter can
retain OTUs that are highly abundant in only a single sample, we removed
features that had 0 values in more than 20% (~29/146)
of samples before applying ANCOM but after applying SparCC. Features
were considered to differ significantly between groups if their W-value
exceeded the threshold determined by ANCOM.
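
The two filtering steps described above can be illustrated with a minimal
sketch. This is not the SCNIC or SparCC implementation: the function names,
the toy table, and both cutoff values are hypothetical placeholders chosen
only to show how a filter on average relative abundance can retain a feature
that is abundant in a single sample, and how a subsequent filter on the
fraction of zero values removes it.

```python
from typing import Dict, List

# Feature ID -> raw counts, one entry per sample (hypothetical layout).
Table = Dict[str, List[float]]


def filter_by_mean_relative_abundance(table: Table, min_mean_rel: float) -> Table:
    """Keep features whose mean relative abundance across samples is >= min_mean_rel."""
    n_samples = len(next(iter(table.values())))
    # Total reads per sample, used to convert counts to relative abundances.
    totals = [sum(counts[j] for counts in table.values()) for j in range(n_samples)]
    kept = {}
    for feature, counts in table.items():
        rel = [c / t if t > 0 else 0.0 for c, t in zip(counts, totals)]
        if sum(rel) / n_samples >= min_mean_rel:
            kept[feature] = counts
    return kept


def filter_by_zero_fraction(table: Table, max_zero_fraction: float) -> Table:
    """Drop features that have zero counts in more than max_zero_fraction of samples."""
    kept = {}
    for feature, counts in table.items():
        zero_fraction = sum(1 for c in counts if c == 0) / len(counts)
        if zero_fraction <= max_zero_fraction:
            kept[feature] = counts
    return kept


# Toy table with five samples; all values are invented for illustration.
toy = {
    "otu_common": [50, 40, 60, 55, 45],
    "otu_spike": [0, 0, 0, 0, 400],  # abundant in a single sample only
    "otu_rare": [1, 0, 1, 0, 0],
}

# Placeholder cutoffs, not the values used in the study.
after_abundance = filter_by_mean_relative_abundance(toy, min_mean_rel=0.01)
after_prevalence = filter_by_zero_fraction(after_abundance, max_zero_fraction=0.5)
```

In this toy example the single-sample spike passes the abundance filter
because its mean relative abundance is high, which is the behavior that
motivates the additional zero-value filter before differential abundance
testing.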