2.4 Sequence data processing
The raw FASTQ files were subjected to a series of standard processing steps, including demultiplexing, read merging, quality filtering and chimera removal, using DADA2 (version 1.20.0) (Callahan et al., 2016). A relative-abundance microbial profile was generated at the amplicon sequence variant (ASV) level. The taxonomy of each ASV was assigned using the Ribosomal Database Project (RDP) Classifier (http://rdp.cme.msu.edu) (Cole et al., 2009) with a confidence threshold of 80%. The “Rarefy” function in the “GUniFrac” package was used to rarefy the microbial profile to the same sequencing depth before further statistical analyses were carried out. All analyses were completed in R v4.1.2.
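The rarefaction step can be illustrated with a minimal sketch in Python; it is not the GUniFrac implementation used in the study. It assumes that samples are rows and ASVs are columns, that the target depth is the smallest sample depth (the text does not state the depth used), and that reads are drawn without replacement. The function names and the toy table are hypothetical.

```python
import random
from collections import Counter


def rarefy(counts, depth, seed=0):
    """Subsample one sample's ASV counts to `depth` reads without replacement."""
    if depth > sum(counts):
        raise ValueError("depth exceeds the number of reads in the sample")
    rng = random.Random(seed)
    # Expand the counts into one entry per read, labelled by ASV index.
    reads = [asv for asv, n in enumerate(counts) for _ in range(n)]
    picked = Counter(rng.sample(reads, depth))
    return [picked.get(asv, 0) for asv in range(len(counts))]


def rarefy_table(table, seed=0):
    """Rarefy every sample (row) to the smallest sample depth (an assumption)."""
    depth = min(sum(row) for row in table)
    return [rarefy(row, depth, seed + i) for i, row in enumerate(table)]


# Hypothetical toy table: 3 samples x 3 ASVs.
toy_table = [[10, 5, 0], [3, 3, 3], [100, 50, 25]]
rarefied = rarefy_table(toy_table)
```

After this step every row of `rarefied` has the same total, so relative abundances are comparable across samples.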
Microbial ASVs were classified into two categories, abundant and rare taxa, according to their relative abundance and/or frequency (Bickel & Or, 2021). Here, ASVs with an average relative abundance of <0.01% across all samples were defined as rare taxa, whereas the remaining ones were defined as abundant ASVs. Notably, different studies have used different criteria to define abundant and rare taxa (Jiao, Chen, & Wei, 2017; Y. Xue et al., 2018). Because only 72 ASVs were not rare, we classified all of them as abundant and did not distinguish finer categories such as occasional taxa.
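The classification rule can be sketched as follows in Python. This is an illustration only, assuming a count table with samples as rows and ASVs as columns and no empty samples; the function name and the toy data are hypothetical, and the 0.01% cutoff is expressed as the fraction 1e-4.

```python
def classify_asvs(table, threshold=1e-4):
    """Label each ASV 'rare' or 'abundant' by its mean relative abundance.

    `table` holds read counts (rows = samples, columns = ASVs).
    `threshold` is a fraction: 1e-4 corresponds to 0.01%.
    """
    n_samples = len(table)
    n_asvs = len(table[0])
    mean_rel = [0.0] * n_asvs
    for row in table:
        total = sum(row)  # assumed > 0 (no empty samples)
        for j, count in enumerate(row):
            mean_rel[j] += count / total / n_samples
    return ["rare" if m < threshold else "abundant" for m in mean_rel]


# Hypothetical toy table: 2 samples x 3 ASVs, 100,000 reads per sample.
toy_counts = [[99000, 995, 5], [98000, 1995, 5]]
labels = classify_asvs(toy_counts)
```

In this toy table the third ASV averages 0.005% of reads and falls below the cutoff, while the other two are labelled abundant.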
For all ASVs, niche breadth and niche overlap indices were calculated to assess how well they may adapt to the environment. Niche breadth was evaluated using Levins’ standardized niche breadth index (Feinsinger, Spears, & Poole, 1981). Niche overlap was calculated using Pianka’s niche overlap index, which ranges from 0 (no overlap) to 1 (complete overlap) (Pianka, 1974). The R package “spaa” was employed to calculate both indices.
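For reference, the two indices can be written out in a short Python sketch based on their textbook definitions: Levins’ breadth B = 1 / Σ p_i², standardized as (B − 1) / (n − 1), and Pianka’s overlap O = Σ p_i q_i / √(Σ p_i² · Σ q_i²). Here p_i is assumed to be the proportion of a taxon’s total abundance found in sample i, with each sample treated as one resource state. The function names are hypothetical, and the exact conventions of the “spaa” package may differ from this sketch.

```python
import math


def proportions(abundances):
    """Convert one taxon's abundances across samples to proportions."""
    total = sum(abundances)  # assumed > 0 (taxon present in some sample)
    return [a / total for a in abundances]


def levins_standardized(abundances):
    """Levins' standardized niche breadth, from 0 (specialist) to 1 (generalist)."""
    p = proportions(abundances)
    n = len(p)  # number of resource states; assumed >= 2
    breadth = 1.0 / sum(x * x for x in p)
    return (breadth - 1.0) / (n - 1)


def pianka_overlap(abund_a, abund_b):
    """Pianka's niche overlap between two taxa, from 0 to 1."""
    p, q = proportions(abund_a), proportions(abund_b)
    numerator = sum(x * y for x, y in zip(p, q))
    denominator = math.sqrt(sum(x * x for x in p) * sum(y * y for y in q))
    return numerator / denominator
```

A taxon spread evenly over all samples has a standardized breadth of 1 and a taxon confined to one sample has a breadth of 0; two taxa with proportional distributions overlap completely, and two taxa that never co-occur have an overlap of 0.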