For example, assigning taxa to r-strategists based solely on their affiliation with a phylum that soil microbiologists generally assume to comprise fast-growing organisms (e.g. Proteobacteria), and then using these assumptions to explain processes in soil samples, should be avoided (Jeewani et al. 2020).
However, these prediction-based software packages can be used to generate valuable hypotheses for further investigation or to provide an additional line of evidence in support of a finding. In such cases, we recommend following up with FISH-based counting of the identified species, functional gene-targeted sequencing, or SIP experiments to learn more about the species or community hypothesized to perform an ecosystem process (further discussed in 'Complementary approaches to improve ecological insights').
Inference from co-occurrence analysis
Challenges associated with amplicon sequencing analysis and interpretation also complicate the use of co-occurrence network analysis for soil samples. Networks are built from significant correlations between taxa and can be used to investigate properties of microbial communities, including organismal co-existence (e.g. REF), the identification of keystone species (e.g. \cite{Banerjee2018}) and the stability of community structure (e.g. \cite{de2018,Shi2016}). Generally, co-occurrence analysis generates networks in which nodes represent biological species and edges represent associations between them. The number of studies that construct association networks for microbial communities has increased recently (REF). However, many of these studies have been criticized for their highly descriptive use of networks, which does not propose an ecological interpretation of the detected patterns (REFs?).
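As a minimal sketch of the construction described above, the following Python example derives an edge list by thresholding pairwise Spearman correlations. The simulated taxa, the threshold of 0.6, and the helper names `spearman_matrix` and `edge_list` are illustrative assumptions and do not correspond to any particular network tool.

```python
import numpy as np

def spearman_matrix(table):
    """Spearman correlations between the columns (taxa) of a samples x taxa matrix."""
    # Double argsort yields ranks; ties are not averaged (fine for continuous toy data).
    ranks = np.argsort(np.argsort(table, axis=0), axis=0).astype(float)
    return np.corrcoef(ranks, rowvar=False)

def edge_list(corr, names, threshold=0.6):
    """Return (taxon, taxon, correlation) for every pair with |r| >= threshold."""
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                edges.append((names[i], names[j], round(float(corr[i, j]), 2)))
    return edges

rng = np.random.default_rng(42)
n_samples = 30
taxon_a = rng.normal(size=n_samples)
taxon_b = taxon_a + 0.3 * rng.normal(size=n_samples)  # tracks taxon A by construction
taxon_c = rng.normal(size=n_samples)                  # unrelated to both
table = np.column_stack([taxon_a, taxon_b, taxon_c])

edges = edge_list(spearman_matrix(table), ["A", "B", "C"])
print(edges)
```

An edge obtained this way only records a statistical association; on its own it does not show that the two taxa interact.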
The difficulty in interpretation stems from inferring causal relationships between taxa from correlations, which is a long-standing topic of discussion in ecology \cite{Blanchet_2020,Barner_2018}. Particularly for soil, it is important to keep in mind that the data contained in each environmental sample are only a snapshot of complex spatio-temporal dynamics (see section XYZ). In fact, these data capture a noisy signal that reflects several biological processes, including reproduction, death, dispersal, and intra- and inter-specific interactions, all subject to environmental filtering (REF). Moreover, while interactions occur at the level of individual microorganisms, abundance patterns can only be measured on relatively large and highly heterogeneous soil samples (see section 'Spatial Heterogeneity'). The heterogeneity and resulting sparsity of amplicon datasets represent an additional confounding effect that may introduce spurious associations, posing challenges unique to the study of soil ecosystems.
For microbiome data, associations are most often assigned by detecting significant correlations between relative abundances, where spurious links can be detected if compositional data are not handled appropriately (as explained in 'Addressing and interpreting compositional sequencing data'). Several popular network construction tools, including SparCC (log ratios) and SPIEC-EASI (centered log-ratio, clr), apply log-ratio transformations to address compositionality during network construction \cite{Kurtz2015,Friedman2012}. Another option is to convert relative abundances into absolute values by using the total gene copy numbers obtained from qPCR (see section XYZ). To improve this analysis, we suggest carefully comparing the data with null models, which helps to interpret the results and to eliminate some indirect associations between species \cite{Connor_2017}. Additionally, the use of complementary environmental measurement data can improve ecological insights from networks \cite{Goberna2019,Lima_Mendez_2015}. Inferences made through network analysis should be tested in follow-up experiments that further investigate the potential interactions. In summary, the field of network inference is rapidly evolving and alternatives are emerging to address current issues. Nevertheless, we still lack a definitive framework that allows for straightforward interpretation of the generated co-occurrence networks.
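To illustrate why relative abundances can produce spurious links, the following Python sketch simulates taxa whose absolute abundances are independent by construction, converts them to proportions, and compares the average pairwise correlation before and after this closure. It also includes a plain clr transform. All parameter values (200 samples, 5 taxa, log-normal abundances) and function names are assumptions chosen for illustration; this is not the procedure implemented in SparCC or SPIEC-EASI.

```python
import numpy as np

def clr(table, pseudocount=0.0):
    """Centered log-ratio transform of a samples x taxa matrix.

    A pseudocount (an assumption; zero handling is its own topic) is
    required whenever the table contains zeros.
    """
    log_x = np.log(np.asarray(table, dtype=float) + pseudocount)
    return log_x - log_x.mean(axis=1, keepdims=True)

def mean_offdiag_corr(table):
    """Average Pearson correlation over all pairs of distinct taxa."""
    corr = np.corrcoef(table, rowvar=False)
    return corr[~np.eye(corr.shape[0], dtype=bool)].mean()

rng = np.random.default_rng(0)
n_samples, n_taxa = 200, 5

# Independent absolute abundances: no taxon influences another.
absolute = rng.lognormal(mean=5.0, sigma=0.5, size=(n_samples, n_taxa))
# Closure to proportions, as obtained from sequencing.
relative = absolute / absolute.sum(axis=1, keepdims=True)

corr_abs = mean_offdiag_corr(absolute)
corr_rel = mean_offdiag_corr(relative)
print(f"mean correlation, absolute abundances: {corr_abs:.3f}")
print(f"mean correlation, relative abundances: {corr_rel:.3f}")

# clr depends only on ratios, so it gives the same values for both tables.
clr_same = np.allclose(clr(absolute), clr(relative))
print("clr identical for absolute and relative data:", clr_same)
```

With equally abundant taxa, the proportions are pushed towards a mean correlation of about -1/(n_taxa - 1), although no associations were simulated. Note that clr values still sum to zero within each sample, so correlations computed on them are not free of this constraint either.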
6. Suggestions for more robust statistical analyses in sequencing studies
Amplicon sequencing data are well suited for exploratory analysis and hypothesis generation in soil research, but can also be applied to targeted hypothesis testing if appropriate statistical methods are selected (see Fig 2, 4?; e.g. Gloor et al., 2017). As amplicon datasets from soil are characterized by compositionality, heterogeneity and sparsity, the use of standard statistical methods (including Pearson correlations or t-tests on proportions) can lead to very high false discovery rates (up to 100%; 56, 57). Almost any soil microbiome data set will show significant correlations, as the data consist of thousands of individual variables. The ease of obtaining significant results may therefore also lead to misuse of statistical significance (also referred to as “p-hacking”). These effects are further compounded by spatio-temporal dynamics, which add to the challenges of statistical inference from amplicon sequencing in soils. Consequently, we suggest that researchers apply caution when inferring effects or associations based solely on statistical significance. Recent discussion surrounding the misuse of p-values has resulted in alternatives and in suggestions to use more stringent significance thresholds to reduce the false discovery rate (58-62). Such stricter thresholds would require a 70% increase in sample size, which is often unrealistic; however, we recognize that this could spare future efforts built on unsubstantiated research.
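The effect of testing thousands of variables can be illustrated with a small simulation in which every taxon is pure noise and therefore unrelated to the measured soil variable. The Python sketch below counts nominally significant correlations at 0.05, at a stricter threshold of 0.005 (an assumed example value), and after Benjamini-Hochberg correction, which is a common false discovery rate procedure included here for comparison and not one prescribed in the text. P-values use the Fisher z approximation, and all names and sizes are hypothetical.

```python
import numpy as np
from math import erfc, sqrt

def pearson_with_vector(table, y):
    """Pearson correlation of every column of `table` with the vector `y`."""
    xz = (table - table.mean(axis=0)) / table.std(axis=0)
    yz = (y - y.mean()) / y.std()
    return xz.T @ yz / len(y)

def approx_p_values(r, n):
    """Two-sided p-values from the Fisher z approximation (not an exact test)."""
    z = np.arctanh(r) * sqrt(n - 3)
    return np.array([erfc(abs(v) / sqrt(2)) for v in z])

def benjamini_hochberg(pvals, alpha=0.05):
    """Boolean mask of hypotheses rejected by the Benjamini-Hochberg procedure."""
    p = np.asarray(pvals)
    m = p.size
    order = np.argsort(p)
    passed = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        k = np.max(np.nonzero(passed)[0])   # largest rank meeting its threshold
        reject[order[: k + 1]] = True
    return reject

rng = np.random.default_rng(7)
n_samples, n_taxa = 20, 2000
taxa = rng.normal(size=(n_samples, n_taxa))   # noise standing in for transformed abundances
soil_variable = rng.normal(size=n_samples)    # hypothetical measurement, unrelated to all taxa

pvals = approx_p_values(pearson_with_vector(taxa, soil_variable), n_samples)
n_raw = int((pvals < 0.05).sum())
n_strict = int((pvals < 0.005).sum())
n_bh = int(benjamini_hochberg(pvals).sum())
print("p < 0.05:", n_raw, "| p < 0.005:", n_strict, "| Benjamini-Hochberg:", n_bh)
```

Because no taxon is related to the soil variable in this simulation, every correlation flagged as significant is a false positive; roughly 5% of the taxa are expected to pass the uncorrected 0.05 threshold.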
To explore how sample size influences statistical power in soil microbiome analyses, we calculated how the statistical power of permutational multivariate analysis of variance (PERMANOVA) depends on effect size for data sets with different numbers of replicates. Our data set featured a range of soils \cite{Zheng_2019} and can therefore be regarded as an example that appropriately covers the heterogeneity inherent to soil and its associated microbial communities (see section XYZ). We used the R package "micropower" \cite{Kelly_2015}, which simulates distance matrices from a set of parameters to estimate the achievable PERMANOVA power or the sample size required for a planned microbiome analysis. Data for both the 16S rRNA gene and the ITS1 region were filtered to include only bacteria and archaea (16S) and fungi (ITS) (HOW WERE THEY FILTERED?).
We explored the impact of sample replication on statistical power in soil microbiome analysis using a published dataset that features a range of soils representative of the heterogeneity and biological diversity of soils \cite{Zheng_2019}. The Jaccard similarity index was applied to simulate OTU/ASV tables with similar parameters for both ITS and 16S rRNA sequences, accounting for the average and standard deviation across all samples (Supplementary Fig. 1a,b). We then computed how the statistical power of permutational multivariate analysis of variance (PERMANOVA) depends on effect size for four datasets with varying replicate numbers (4, 5, 8 and 10 replicates; Fig. 5). Figure 5a shows how the statistical power to detect a significant difference increases with effect size for groups of different sample sizes. The graph clearly shows that increasing the sample size increases the power to detect small differences, even when only a few samples are added. To better visualize these differences, we further calculated the average statistical power for ranges of effect sizes (ω²) defined as 'Low' (0.001-0.04), 'Medium' (0.04-0.08) and 'High' (0.08-0.12). Our analysis showed that the number of replicates hardly affects the statistical power if microbial communities feature strong differences (Figure 5b, "High"). However, for communities with higher similarity, we found that increasing the replicate number from 4 to 5 was sufficient to almost double the statistical power for small effect sizes ("Low") and to achieve the recommended power above 0.8 for medium effect sizes (Figure 5b, "Low" and "Medium"). As expected, these effects were more pronounced when the number of replicates was doubled (4 to 8; Figure 5b). Similar effects were obtained for the fungal data set (Supplementary Fig. 1c).
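The logic of such a power analysis can be sketched in Python as follows. This is a simplified illustration under stated assumptions and not a reimplementation of "micropower": communities are simulated as independent presence/absence draws for two groups, the group difference is set by a hypothetical `shift` in occurrence probability instead of ω², and all function names and parameter values are invented for the example. Power is estimated as the fraction of simulated data sets in which a PERMANOVA on Jaccard distances yields p ≤ 0.05.

```python
import numpy as np

def jaccard_distances(presence):
    """Pairwise Jaccard distances for a samples x taxa boolean matrix."""
    p = presence.astype(int)
    shared = p @ p.T                                   # taxa present in both samples
    totals = p.sum(axis=1)
    union = totals[:, None] + totals[None, :] - shared
    dist = 1.0 - shared / np.maximum(union, 1)
    np.fill_diagonal(dist, 0.0)
    return dist

def pseudo_f(dist, groups):
    """PERMANOVA pseudo-F statistic computed from a distance matrix."""
    n = len(groups)
    labels = np.unique(groups)
    d2 = dist ** 2
    ss_total = d2.sum() / (2 * n)          # full matrix counts each pair twice
    ss_within = 0.0
    for g in labels:
        idx = np.flatnonzero(groups == g)
        ss_within += d2[np.ix_(idx, idx)].sum() / (2 * len(idx))
    a = len(labels)
    return ((ss_total - ss_within) / (a - 1)) / (ss_within / (n - a))

def permanova_p(dist, groups, n_perm, rng):
    """Permutation p-value; the tolerance guards against rounding in ties."""
    f_obs = pseudo_f(dist, groups)
    hits = sum(pseudo_f(dist, rng.permutation(groups)) >= f_obs - 1e-12
               for _ in range(n_perm))
    return (hits + 1) / (n_perm + 1)

def simulate_power(n_rep, shift, n_taxa=200, n_sim=40, n_perm=99,
                   alpha=0.05, seed=1):
    """Fraction of simulated two-group data sets with PERMANOVA p <= alpha."""
    rng = np.random.default_rng(seed)
    groups = np.repeat([0, 1], n_rep)
    base = np.full(n_taxa, 0.3)            # occurrence probability per taxon
    shifted = base.copy()
    shifted[: n_taxa // 4] += shift        # a quarter of taxa are more frequent in group 1
    probs = np.vstack([base] * n_rep + [shifted] * n_rep)
    significant = 0
    for _ in range(n_sim):
        presence = rng.random(probs.shape) < probs
        p = permanova_p(jaccard_distances(presence), groups, n_perm, rng)
        significant += int(p <= alpha)
    return significant / n_sim

power = {n_rep: simulate_power(n_rep, shift=0.3) for n_rep in (4, 5, 8, 10)}
for n_rep, value in power.items():
    print(f"{n_rep} replicates per group: estimated power {value:.2f}")
```

With only four replicates per group, the small number of distinct permutations also limits how small the p-value can become, which is one reason why adding even a single replicate can noticeably raise power. The numbers of simulations and permutations are kept low here for speed and would need to be increased for a real planning exercise.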
In practice, knowing a priori how strongly soil microbial communities differ is difficult. If preliminary sequencing data are available, we encourage researchers to perform such power analyses before planning their experiments. If not, one could still estimate the likely degree of (dis)similarity from biogeochemical parameters or from microbiome data published on similar soils. For example, if soils from the same field site are analyzed in an experiment that targets short-term effects of treatments (e.g. fertilization), one could assume that the microbial communities are rather similar in structure, which would suggest a need for increased replication. Such considerations should also include the number of technical replicates that will be pooled to alleviate the spatial heterogeneity of soils (see section XYZ) and should be made with care. We refer to further literature for experimental planning and robust statistical analyses (e.g. for time series; Coenen et al. 2020; 24 other REFs).