5. Linking sequences to ecological context

5.1. Soil spatial complexity occurs on micro- and macroscales

Investigating microbial community composition in soils presents unique challenges. In contrast to well-mixed ecosystems, microbial life (i.e., growth, activity, dormancy, and turnover) in the soil is strongly limited by the complex network of pores, as well as gas transport and diffusion in the aqueous phase \citep{Bickel2020a,Young_2004,Vos_2013}. Soil microarchitecture is a key factor that influences the potential for microorganisms to interact with each other \citep{Wilpiszeski_2019}. In practice, however, the analysis of soil microbial communities through amplicon sequencing does not account for soil microarchitecture. Researchers commonly use bulk homogenization approaches to extract nucleic acids from 250--500 mg of fresh soil, which inevitably obscures the physical structure and spatial arrangement of microbial cells in the sample. From the microbial perspective, nucleic acid extraction represents a macroscopic measurement of the “whole” microbial community. This practice does not negatively affect soil microbiome analyses unless interactions among microbial taxa are inferred (e.g., via network analysis, see section 5.4).
The spatial heterogeneity of soil and the microbial communities therein is not confined to the microscale but also persists at the centimeter, meter, field, and ecosystem scales \citep{Becker_2006,Wolfe_2006,Franklin_2003}. Sampling “the same soil” a few meters apart or at different depths in the soil profile might result in individual samples with varying biogeochemical properties such as pH, water saturation, and soil texture, as well as differing plant root distribution \citep{Zhang_2021}. Choosing a sufficient number of replicates to assess sample or plot variability while balancing the cost-to-gain ratio is an important measure to address soil heterogeneity (see section 6). Thus, it is critical to carefully evaluate the representativeness of technical and biological replicates. In hypothesis testing, a treatment effect can only be detected if the variability among replicates is not larger than the effect itself. However, a recent study showed distinct and consistent differences in bacterial and fungal communities between individual replicate soil samples throughout a season even though 10--15 cores were randomly sampled in individual subplots and pooled \citep{Carini_2020}. Another study showed that chemical soil properties, as well as microbial biomass and communities, exhibited high levels of spatial variation across 49 samples in a 6 \(\times\) 6 m forest plot \citep{_tursov__2016}. Pooling samples, individual DNA/RNA extractions, and/or amplification reactions made from a single DNA template can dampen the confounding effects of community heterogeneity. Nevertheless, the existing intraplot variability and the representativeness of samples, as well as the appropriateness of the sampling strategies used to address them, must be critically assessed in any study on soil microbiomes.
Otherwise, generalized macroecological conclusions drawn from soil samples taken and pooled across large distances may be speculative at best \citep{Zhang_2020,Dini_Andreote_2020}.
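
As an illustration of how replicate variability can be weighed against a treatment effect, the following minimal sketch compares the mean Bray-Curtis dissimilarity among replicates of the same treatment with the mean dissimilarity between treatments. The count tables, the group labels, and the helper names (`bray_curtis`, `within_between`) are hypothetical and chosen only for illustration; the choice of Bray-Curtis is likewise an assumption, and a real analysis would use a dedicated statistical test rather than a comparison of means.

```python
from itertools import combinations


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator if denominator else 0.0


def mean_dissimilarity(pairs):
    """Average Bray-Curtis dissimilarity over a list of sample pairs."""
    values = [bray_curtis(a, b) for a, b in pairs]
    return sum(values) / len(values)


def within_between(groups):
    """Return (mean within-group, mean between-group) dissimilarity.

    groups maps a treatment label to a list of replicate count vectors.
    """
    # All replicate pairs that share a treatment label.
    within = [pair for reps in groups.values() for pair in combinations(reps, 2)]
    # All sample pairs whose members come from different treatments.
    between = [
        (a, b)
        for g1, g2 in combinations(groups, 2)
        for a in groups[g1]
        for b in groups[g2]
    ]
    return mean_dissimilarity(within), mean_dissimilarity(between)


# Hypothetical amplicon counts for four taxa in three replicates per treatment.
groups = {
    "control": [[40, 30, 20, 10], [38, 32, 18, 12], [42, 28, 22, 8]],
    "treated": [[20, 30, 40, 10], [22, 28, 38, 12], [18, 32, 42, 8]],
}

within, between = within_between(groups)
print(f"mean within-treatment dissimilarity:  {within:.3f}")
print(f"mean between-treatment dissimilarity: {between:.3f}")
```

If the within-treatment value approaches or exceeds the between-treatment value, replicate heterogeneity is likely to mask the treatment effect, and the sampling design or the number of replicates may need to be reconsidered.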

5.2. Temporal scales to consider when analyzing microbial dynamics

When designing an experiment, one must consider not only the spatial scales at which microorganisms live and interact but also the temporal scale, i.e., the frequency at which sampling should occur to capture temporal dynamics. Amplicon sequencing represents a snapshot of microbial prevalence at a given moment. Given that microbial community turnover among different soils may range from weeks to years (e.g., \citealt{Spohn_2016}), it is difficult to assess the best temporal sampling strategy a priori. If, for example, the effects of root exudation on soil microbial community dynamics are of interest, it is important to consider the different temporal scales of the processes to be correlated. Root exudation varies with plant developmental stage and shows diurnal patterns \citep{Oburger_2014}, whereas community changes at the DNA level may not be detectable on such a short temporal scale (in contrast to RNA, see below). Any pattern observed at a single sampling time point would therefore more likely represent a legacy community that established around plant roots than the current state of a community that can be linked to root exudation (composition, rate) measured at the same time point.
Another soil parameter that might mask the detection of community shifts is intrinsically linked with microbial turnover: relic or environmental/exogenous DNA. Relic DNA is extracellular DNA from nonviable cells that has leaked into the environment and is thought to persist in soils for months to years \citep{Levy_Booth_2007,Carini_2016}. Relic DNA has been estimated to comprise between 30\% and 97\% of the amplifiable soil DNA pool and has been successfully removed from soil samples via the application of DNases or propidium monoazide \citep{Lennon_2018,Carini_2020}. The latter study found greater differences in soil communities across several time points when relic DNA was removed than when it was still present. Consequently, the presence of relic DNA may complicate the interpretation of sequencing data by causing microbial diversity to be over- or underestimated, which may be of particular concern when temporal dynamics are key to the scientific question.
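
The dampening of temporal signals by relic DNA can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, that the sequenced DNA pool is a linear mixture of a living community and a relic pool, that the relic pool does not change between the two time points, and that relic DNA makes up 50\% of the amplifiable pool (a hypothetical value). All abundance values and function names are invented for this example and do not come from the studies cited above.

```python
def relative(counts):
    """Convert counts to relative abundances that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors of equal length."""
    numerator = sum(abs(x - y) for x, y in zip(a, b))
    denominator = sum(a) + sum(b)
    return numerator / denominator if denominator else 0.0


def observed_pool(living, relic, relic_fraction):
    """Mix living and relic relative abundances.

    relic_fraction is the assumed share of relic DNA in the amplifiable pool.
    """
    liv = relative(living)
    rel = relative(relic)
    return [(1 - relic_fraction) * l + relic_fraction * r for l, r in zip(liv, rel)]


# Hypothetical living communities at two time points (four taxa).
living_t1 = [50, 30, 15, 5]
living_t2 = [10, 20, 30, 40]
# Hypothetical relic pool, assumed to stay constant between time points.
relic = [40, 30, 20, 10]
relic_fraction = 0.5  # assumed value for illustration only

true_shift = bray_curtis(relative(living_t1), relative(living_t2))
observed_shift = bray_curtis(
    observed_pool(living_t1, relic, relic_fraction),
    observed_pool(living_t2, relic, relic_fraction),
)
print(f"shift in living community:     {true_shift:.3f}")
print(f"shift seen with relic DNA:     {observed_shift:.3f}")
```

Under these simplifying assumptions, the apparent shift between time points shrinks in proportion to the relic fraction. Real relic pools turn over and differ in composition among soils, so this sketch only shows the direction of one possible bias, not its magnitude.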
One possibility to address short-term temporal dynamics while eliminating the bias of relic DNA is ribosomal RNA (rRNA) amplicon sequencing via complementary DNA (cDNA) synthesis. The lifetime of rRNA in soils is relatively short and has been estimated to range from days to a few weeks depending on biogeochemical parameters such as temperature, pH, and water saturation \citep{Schostag_2020,Blazewicz_2013}. Thus, rRNA-targeted amplicon sequencing may increase the chances of capturing dynamics within soil microbial communities over time and may be used to carefully assess the "active" fraction thereof \citep{Vieira_2019} (see Table S2). Caution should still be taken when sequencing nucleic acids at higher frequencies, even if relic DNA has been removed or RNA is used. If community dynamics are to be investigated at short time intervals (e.g., minutes to hours), we suggest combining amplicon sequencing with methods for targeting the metabolically active cell fraction (as discussed in section 7).