Community richness and structure at multiple thresholds of genetic similarity
Filtered ASVs were used to generate a UPGMA tree using F84 corrected genetic distances, within which haplotypes were grouped into clusters of genetic similarity at 0.5%, 1%, 2%, 3%, 5%, 8% and 15% thresholds for the analysis of α and β diversity from intraspecific haplotype level (h ) variation through to supraspecific lineages. Subsequent community-level analyses were performed for either a selection of hierarchical levels (h , 3%, and 15% clustering) or the complete set of thresholds.
To test for significant differences in community richness (α diversity) among different habitats and soil layers for h , 3%, and 15%-level clusters, repeated-measures ANOVAs were conducted using habitat and soil layer as grouping factors and sampling site as a within-subject factor. DEEP and SUP samples were then combined within each sampling site (n=52), and Kruskal-Wallis rank sum tests were conducted using habitat as a grouping factor to assess whether α diversity differed between the communities of each of the four habitats. Endemicity at the scale of individual sampling sites was also calculated for h , 3%, and 15%-level clusters measured as the proportion of total lineages within a given sampling site that occur exclusively within that sampling site. Kruskal-Wallis rank sum tests were conducted to test for differences in community endemicity among the four habitats. Total observed richness (ɣ diversity) and accumulation curves (random method, 1000 permutations, specaccum function) were estimated for each habitat for h , 3% and 15%-level clusters, and total extrapolated richness (Chao equation, specpool function) by habitat was estimated. Total community dissimilarity across the communities of each habitat was estimated at all clustering levels, and pairwise community matrices were generated using total β diversity (Sorensen index, βsor) and its additive turnover (Simpson index, βsim) and nestedness (βsne) components (Baselga & Orme, 2012). Community composition matrices were used for non-parametric multidimensional scaling (NMDS) for h , 3% and 15%-level clusters, and plots were created with the ordispider option to visualise the compositional ordination of communities according to their respective habitat. Permutational ANOVAs were conducted over the community dissimilarity matrices using 999 permutations and the habitat as the grouping factor.
Variation in community composition with spatial distance was assessed following the ’multi-hierarchical macroecology’ approach of Baselgaet al. (2013), where distance decay of similarity is contrasted across hierarchical levels. For each habitat, the relationship between community similarity and spatial distance between sampled sites (1 – pairwise β diversity, see above) was assessed for each clustering level. The spatial distance was calculated using the R package gdistance(van Etten, 2017), which uses Tobler’s hiking function to provide the shortest route between two points given the slope of the terrain (m ) (Tobler, 1993). Pairwise calculations were made among sites within the same habitat. The lowest and highest elevations of each habitat within our sampling sites were used to constrain altitudinal movement, to avoid shortest paths transgressing a different habitat. A negative exponential function was used to adjust a generalised linear model (GLM) with Sorensen similarity as the response variable, spatial distance as the predictor, log link and Gaussian error, and maintaining untransformed spatial distances (Gómez-Rodríguez & Baselga, 2018). Fractal patterning (power-law function) among distance‐decay curves was assessed by a log-log Pearson correlation across clustering levels for (a) the number of lineages, (b) the initial similarity, and (c) the mean similarity of the distance-decay curves. High correlation values are indicative of self-similarity in lineage branching (i.e., number of lineages) and spatial geometry of lineage distributional ranges (i.e., initial and mean similarity; Baselga et al. , (2015)), which are predicted under a predominant neutral process of community evolution. Analyses were also conducted to assess the relationship between community similarity and environmental distance, computed using Gower’s distance over the elevation and 19 bioclimatic variables (from WORLDCLIM at 30 arc-seconds resolution), characterising each sampling site (Table S3). When significant relationships were found, variance partitioning was conducted to assess the fractions of variance in community dissimilarity that are uniquely and jointly explained by spatial and environmental distance.
Finally, we compared biodiversity measures for haplotypes and 3% OTUs for the four habitat types in Tenerife with those obtained by Arribaset al. (2020) in three forest and three grassland sampling regions in a continental setting (n=12 for each habitat on each sampling region). Kruskal-Wallis rank sum tests were used to compare α diversity by sample with insularity (Tenerife island n=52; continent n = 72) and sampling region as grouping factors. Comparisons of β diversity by sampling region were restricted to a comparable spatial scale of 15 km, conducting a Kruskal-Wallis rank sum test with insularity as a grouping factor. Comparisons of β diversity were repeated for intervals of spatial distance between 0-5 km, 5-10 km, and 10-15 km. Finally, ɣ diversity (total species richness) was estimated for each habitat and region using accumulated haplotypes and 3% OTUs across 12 community samples (using specaccum function when the available number of samples was higher). All analyses were performed using the R-packagesvegan (Oksanen et al., 2016), cluster (Maechler, Rousseeuw, Struyf, Hubert, & Hornik, 2021), PMCMR (Pohlert, 2014), hier.part (Mac Nally & Walsh, 2004), ecodist(Goslee & Urban, 2007), and betapart (Baselga & Orme, 2012).