Testing the influence of ecoregion-scale variables, ecoregion
identity, and spatial autocorrelation
We averaged species-level tip-based metrics across the species of each
assemblage to obtain tip-based metrics at the level of the ecological
assemblage (hereafter: aTR, aST, aLT) and ran hypothesis tests (Fig. 1).
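The averaging step can be illustrated with a minimal sketch. The species names, site names, metric values, and the species-level labels TR, ST, and LT are hypothetical placeholders chosen for illustration; only the assemblage-level labels aTR, aST, and aLT come from the text.

```python
from statistics import mean

# Hypothetical species-level tip-based metrics (illustrative values only).
species_metrics = {
    "sp_a": {"TR": 0.10, "ST": 0.40, "LT": 2.0},
    "sp_b": {"TR": 0.30, "ST": 0.20, "LT": 4.0},
    "sp_c": {"TR": 0.20, "ST": 0.60, "LT": 6.0},
}

# Hypothetical assemblages: the species recorded at each site.
assemblages = {
    "site_1": ["sp_a", "sp_b"],
    "site_2": ["sp_a", "sp_b", "sp_c"],
}


def assemblage_metrics(species_in_site, metrics):
    """Average each species-level metric across the species of one assemblage."""
    names = ("TR", "ST", "LT")
    return {
        "a" + name: mean(metrics[sp][name] for sp in species_in_site)
        for name in names
    }


# One row of assemblage-level metrics (aTR, aST, aLT) per site.
result = {
    site: assemblage_metrics(spp, species_metrics)
    for site, spp in assemblages.items()
}
```

Each assemblage contributes a single value per metric, which is then the response variable in the models described next.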
We estimated the effect of ecoregion-scale variables on each
assemblage-level tip-based metric using linear mixed models (LMM,
Pinheiro and Bates 2000). Linear mixed models are a class of models that
allow estimating the effects of grouping factors describing the study
design (random effects), of spatial autocorrelation (as an error term),
and of the ecological processes of interest (as fixed effects, Table 1)
when modeling variation in aTR, aST, and aLT.
Here, ecoregion identity was
treated as a random effect in the LMM analysis because ecoregions were part of the
sampling design, and differences in their shape and convolutedness could mask
differences between cores and ecotones.
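One way to write the model structure described above is sketched below. The notation is an assumption introduced for illustration, not the authors' own: y_{ij} is the response (aTR, aST, or aLT) of assemblage i in ecoregion j, x_{k,ij} are the p ecoregion-scale variables of Table 1, b_j is the random intercept for ecoregion identity with standard deviation \sigma, and \mathbf{R} is a spatial correlation matrix for the errors.

```latex
y_{ij} = \beta_0 + \sum_{k=1}^{p} \beta_k \, x_{k,ij} + b_j + \varepsilon_{ij},
\qquad
b_j \sim \mathcal{N}(0, \sigma^2),
\qquad
\boldsymbol{\varepsilon} \sim \mathcal{N}\!\left(\mathbf{0}, \, \sigma_{\varepsilon}^{2} \mathbf{R}\right)
```

With \mathbf{R} equal to the identity matrix, this reduces to a random-intercept model without spatial autocorrelation.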
We identified high spatial autocorrelation (Moran’s I > 0.5,
P < 0.001) for all tip-based metrics analyzed here. We then
looked for spatial autocorrelation in the residuals of our LMMs, with
either aTR, aST, or aLT as the response variable, ecoregion-scale variables
as fixed effects (Table 1), and ecoregion identity as a random effect.
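For readers unfamiliar with Moran's I, a minimal sketch of the statistic follows. It assumes binary neighbor weights (1 for points within a hypothetical threshold `max_dist`, 0 otherwise) and planar Euclidean distances; the weighting scheme and distance measure actually used in the analysis are not stated here, and latitude and longitude would normally call for great-circle distances. The significance test is omitted.

```python
import math


def morans_i(values, coords, max_dist):
    """Moran's I with binary weights: w_ij = 1 if 0 < distance <= max_dist."""
    n = len(values)
    mean_v = sum(values) / n
    dev = [v - mean_v for v in values]

    cross = 0.0   # sum of w_ij * dev_i * dev_j over all ordered pairs
    w_sum = 0.0   # sum of all weights
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = 1.0 if math.dist(coords[i], coords[j]) <= max_dist else 0.0
            cross += w * dev[i] * dev[j]
            w_sum += w

    total_ss = sum(d * d for d in dev)
    return (n / w_sum) * (cross / total_ss)


# Four hypothetical points on a line, one unit apart.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
gradient = morans_i([1.0, 2.0, 3.0, 4.0], coords, max_dist=1.0)
alternating = morans_i([1.0, -1.0, 1.0, -1.0], coords, max_dist=1.0)
```

A smooth gradient yields a positive value and an alternating pattern a negative one; the same function can be applied to model residuals instead of raw metric values.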
Spatial autocorrelation was incorporated in the model through an
exponential correlation structure with a nugget effect, based on the
latitude and longitude values of each point. We used an exponential
structure with a nugget effect because the variograms generally showed a
very steep decrease in spatial autocorrelation, mainly between very
close points. Comparisons of models with and without a nugget effect
generally supported the model with a nugget effect (Table S2).
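The shape of such a correlation structure can be sketched as follows, assuming the common parameterization in which the correlation between two points a distance d apart is (1 - n) exp(-d / r), with range r and nugget n. The function name and the parameter values are hypothetical, and the parameterization used in the fitted models may differ.

```python
import math


def exp_corr_with_nugget(distance, range_r, nugget_n):
    """Exponential spatial correlation with a nugget effect.

    Returns 1 at distance zero; for any positive distance the correlation
    starts at (1 - nugget_n) and decays exponentially with distance / range_r.
    """
    if distance == 0:
        return 1.0
    return (1.0 - nugget_n) * math.exp(-distance / range_r)


# Hypothetical parameter values, for illustration only.
RANGE_R = 10.0
NUGGET_N = 0.3

# Correlation at increasing distances (same units as the range).
profile = [exp_corr_with_nugget(d, RANGE_R, NUGGET_N) for d in (0, 1, 5, 10, 50)]
```

The nugget produces an immediate drop from 1 to 1 - n between coincident and very close points, which is the pattern the variograms are described as showing.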
To account for phylogenetic uncertainty in tip-based metrics, we ran one
LMM analysis per estimate of aTR, aST, and aLT. We used
a randomly subsampled set of 2,000 of the
10,000 estimated values because of computational limitations when
estimating fixed, random, and spatial parameters for the whole dataset
of estimates. Thus, uncertainty in the random effect (standard deviation,
σ), spatial autocorrelation (range, r, and nugget, n), and
fixed effects (regression intercept and regression coefficient of each
variable) was represented by the standard deviation calculated across
estimates from the 2,000 models. The LMM intercept represents the
average tip-based metric when quantitative variables are at their
average (i.e., zero in the standardized scale), and factors are at their
first level of contrast (Table 1). The regression coefficient of each
variable represents the number of standard deviations from the
intercept: the larger the coefficient, the stronger the effect of a
variable on the response variable (Schielzeth 2010). We used density
plots to represent and infer the effect of ecoregion-scale variables
because these plots show the most likely average parameter value and
effect size, as supported by most of the phylogenies. Boxplots in the
margins represent the average and the first and third quartiles of the
distribution of parameter estimates across the 2,000 models.
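The subsampling and summary steps can be sketched as below. The function `fit_model_stub` is a hypothetical stand-in that returns a simulated coefficient rather than fitting an LMM, so the numbers it produces are placeholders and not results; only the counts of 10,000 estimates and 2,000 models come from the text.

```python
import random
import statistics

random.seed(1)  # fixed seed so the subsample is reproducible

N_ESTIMATES = 10_000  # estimates of each tip-based metric (from the text)
N_SUBSAMPLE = 2_000   # number of models fitted (from the text)

# Indices of the estimates kept for model fitting, drawn without replacement.
kept = random.sample(range(N_ESTIMATES), N_SUBSAMPLE)


def fit_model_stub(estimate_index):
    """Hypothetical stand-in for fitting one LMM to one metric estimate.

    Returns a simulated regression coefficient; a real analysis would fit
    the mixed model here and extract the parameter of interest.
    """
    rng = random.Random(estimate_index)
    return rng.gauss(0.5, 0.1)


# One coefficient per fitted model.
coefs = [fit_model_stub(i) for i in kept]

# Summaries across models: the standard deviation expresses uncertainty,
# and the mean and quartiles correspond to what the marginal boxplots show.
q1, median, q3 = statistics.quantiles(coefs, n=4)
summary = {
    "mean": statistics.mean(coefs),
    "sd": statistics.stdev(coefs),
    "q1": q1,
    "q3": q3,
}
```

The same summary would be repeated for every parameter (intercept, each regression coefficient, σ, r, and n) and for each of the three response variables.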