Random sampling in metagenomic profiling leads to overestimated
microbial stochasticity inference using null models
Abstract
Revealing the mechanisms governing complex microbial community
assembly is a central issue in microbial ecology. Null models are
commonly used to quantitatively disentangle the relative importance of
deterministic vs. stochastic processes in structuring the compositional
variation. However, microbial profiling is influenced by random
sampling issues, which lead to overestimated β-diversity of microbial
communities and may further affect stochasticity inference. Using
simulated datasets, we investigated whether and how
microbial stochasticity inference is affected by random sampling issues.
Our results provided solid evidence that random sampling led to
dramatically overestimated β-diversity of microbial communities,
which in turn led to overestimated community stochasticity.
The effects of random sampling on stochasticity inference for the
whole community and the abundant subcommunities differed among
null models. The stochasticity of rare subcommunities,
however, was persistently overestimated no matter which null model was
used. Such effects of random sampling on community stochasticity
inference were consistently observed for communities with different
α-diversity. As more studies begin to focus on the different mechanisms
governing abundant and rare subcommunities, we urge caution
in microbial stochasticity inference based on β-diversity (e.g. null
models), especially for rare subcommunities with a stochastic ratio
slightly higher than 0.5. When necessary, the cutoff used for judging
the relative importance of deterministic vs. stochastic processes should
be redefined.
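
The random sampling effect described above can be illustrated with a minimal sketch. It is not the simulation used in this study: the log-normal abundance distribution, the sequencing depth, the 0.01% rarity threshold, and all function names are assumptions chosen for illustration. Two samples are drawn from one and the same source community, so the true β-diversity between them is zero, and any Bray-Curtis dissimilarity that is observed comes from random sampling alone.

```python
import random


def sample_counts(rel_abund, depth, rng):
    """Draw `depth` reads at random from a community (multinomial sampling)."""
    counts = [0] * len(rel_abund)
    for taxon in rng.choices(range(len(rel_abund)), weights=rel_abund, k=depth):
        counts[taxon] += 1
    return counts


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two count vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0  # both vectors empty: treat as identical
    return sum(abs(x - y) for x, y in zip(a, b)) / total


rng = random.Random(42)

# Hypothetical source community: 2000 taxa, log-normal abundances (assumption).
raw = [rng.lognormvariate(0, 2) for _ in range(2000)]
rel = [x / sum(raw) for x in raw]

# Two independent samples of the SAME community (assumed depth: 5000 reads).
s1 = sample_counts(rel, 5000, rng)
s2 = sample_counts(rel, 5000, rng)

# Assumed rarity threshold: true relative abundance below 0.01%.
rare = [i for i, p in enumerate(rel) if p < 1e-4]
abundant = [i for i, p in enumerate(rel) if p >= 1e-4]

bc_whole = bray_curtis(s1, s2)
bc_rare = bray_curtis([s1[i] for i in rare], [s2[i] for i in rare])
bc_abundant = bray_curtis([s1[i] for i in abundant], [s2[i] for i in abundant])

print(f"whole community: {bc_whole:.3f}")
print(f"abundant taxa:   {bc_abundant:.3f}")
print(f"rare taxa:       {bc_rare:.3f}")
```

Under these assumptions, all three dissimilarities are greater than zero although the underlying community is identical, and the value for the rare taxa is the largest. This is in line with the persistent overestimation reported for rare subcommunities.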