Fig. 1 The density distribution of assemblages. (a)
Co-occurrence plots covering Norway, and (b) presence-only pseudo-plots
spanning across Europe. Both panels represent all vascular plant species
recorded during the baseline period from 1950 to 1979 per 0.5 x 0.5°
degrees cell.
Temperature data
From here and henceforth, we will refer to assemblage for both
co-occurrence plots and presence-only pseudo-plots. Temperature was
extracted for each vegetation assemblage geographical location
(co-occurrence and pseudo-plot) using CHELSAcruts. This climate dataset
provide monthly data for every year from 1901 up to 2016 at a spatial
resolution of 30 arc seconds (Karger et al., 2017; Karger & Zimmermann,
2018). We calculated mean annual temperatures (MAT) by averaging maximum
and minimum temperature values for all months of the respective year.
The final MAT for each vegetation assemblage was calculated as the
average of the five years preceding the respective field survey,
including surveyed year (Steinbauer et al., 2018).
Community thermal index -
CTI
To estimate CTI for each assemblage, a transfer function was first
calibrated with the assemblages from the period before the recent major
increase in temperature, the baseline period (1950-1979, Fig.
1 , see Pacheco-Riaño et al. (2023) for a justification for selecting
this period). Subsequently, this transfer function was used to project
CTI for the assemblages before and after this baseline period (i.e.
1905-1949 and 1980-2007/16, respectively). The transfer function was
calibrated using a weighted-averaging partial least squares regression
(WA-PLS) using the fxTWAPLS package (Liu et al., 2020). This is an
efficient approach to creating unbiased CTI estimates (Bhatta et al.,
2019; Liu et al., 2020).
The WA-PLS is based on estimating species optima along a gradient of MAT
and assumes that the full temperature range of a species is captured in
the calibration dataset. If this is not the case, the niches are
truncated leading to an overestimation of the CTIs at the lower end and
an underestimation at the higher end of the temperature range (Liu et
al., 2020). Prior to the main analyses, we therefore evaluated the
effect of using three different datasets in the calibration process, and
computed three different transfer functions based on either 1) the
co-occurrence plots from Norway, 2) the pseudo-plots derived from
presence-only data from Europe (assuming that the extent captured a
wider thermal gradient than Norwegian plant species have and thus avoid
niche truncation, at least at the warmer limit), and 3) the two datasets
combined. To ensure sufficient species representation for the transfer
function, we only retained species that occurred in more than ten
assemblages in each dataset. In the calibration subset, this resulted in
717 species for the co-occurrence dataset, 2,273 species for the
presence-only dataset, and 2,290 species for the combined dataset. In
addition, we kept only those assemblages that contained a minimum of ten
species for the projection of CTI. Through the application of these
criteria, we ensured that each assemblage had sufficient species
representation to estimate CTI using a transfer function (Bhatta et al.,
2018).
The performance of each transfer function was evaluated through
leave-one-out cross-validation, specifically focusing on the calibration
assemblages located in Norway and utilizing only 75% of the calibrating
data from these Norwegian assemblages. We also used these
cross-validation analyses to identify the statistically reliable
component of the WA-PLS, which was determined by the combination of the
lowest root mean square error of prediction (RMSEP), the highest
R2 value, and the lowest average bias (Liu et al.,
2020). For all three datasets the inclusion of the second component gave
the RMSEP, the highest R2, and the lowest average bias
within Norway. For this evaluation, only assemblages from the same
period as used for the calibration itself were included. The
co-occurrence dataset had the highest values for RMSEP (3.54),
R2 (0.81) and average bias (0.117), followed by the
presence-only dataset with 3.12, 0.68 and 0.043, respectively, and the
combined dataset had the lowest values with 3.09, 0.67 and 0.032,
respectively (Fig. S3 ).
Plotting the predicted CTI versus the observed MAT for the assemblages
in the baseline period demonstrated that all transfer functions
overestimated the CTI at low temperatures and underestimated the CTI at
high temperatures (Fig.S4 ). To correct for this remaining over-
and underestimations due to niche truncation in the ends, we used a
local nonparametric regression (LOESS) with a span of 0.75 to fit the
residuals of the WA-PLS model. The difference between the residuals of
the WA-PLS model and the LOESS model was then used to correct the
initially predicted CTI in the main analyses. We fitted the LOESS only
for the vegetation assemblages within Norway due to their colder climate
relative to the rest of Europe, preventing potential underestimation of
CTI values (Fig. S4 ). Finally, to reduce niche truncation
effects near the edges during projection phase, we further refined our
dataset by excluding communities with extreme temperatures and focused
our CTI projections on assemblages within a MAT range of -2.5°C to 8.5°C
(initial MAT range -5.4 ° - 13°C), representing 95% of the calibration
data (Pacheco‐Riaño et al., 2023). After these corrections, the
co-occurrence dataset had an R2 of 0.86, the
presence-only dataset of 0.79 and the combined dataset of 0.78
(Fig. S5 ). Next, we carried out CTI projections for each
vegetation assemblage, covering the period of 1905 to 1949 before the
baseline and for the contemporary period spanning from 1980 to 2007/16.
Because the evaluation of the different datasets indicated that the
three different transfer functions had different advantages based on the
RMSEP, R2, and average bias, we made projections with
all three transfer functions to compare co-occurrence and presence-only
in the subsequent analyses.
Comparison of CTI between co-occurrence and
presence-only
Given that co-occurrence plots are commonly employed in CTI and
thermophilization analyses (De Frenne et al., 2013; Fadrique et al.,
2018; Richard et al., 2021), we established the CTI of co-occurrence
plots as a benchmark. We then assessed the degree of similarity between
the estimated CTI derived from pseudo-plots generated from presence-only
data and the CTI estimated from co-occurrence plots for the three
calibration models (co-occurrence, presence-only, and combined). To be
able to make this comparison, we
reproduced any spatiotemporal biases of the co-occurrence dataset in the
presence-only dataset. For this we paired up the co-occurrence plots
with the pseudo-plots by subsampling the latter, finding the closest
available pseudo-plot that was sampled. We only paired up a pseudo-plot
with a co-occurrence plot if the centre of a pseudo-plot was found
within a radius of 1 km from the co-occurrence plot, if the difference
in sampling time was less than two years, and if the MAT of the
assemblages had a difference of less than 0.5° C. If a co-occurrence
plot did not have a corresponding pseudo-plot given these criteria, it
was discarded. As the co-occurrence dataset only extends until 2007, the
paired period was constrained to the same timeframe, resulting in 3,839
assemblage pairs for the period 1905-2007. For each assemblage pair we
quantified the difference between the CTI estimated based on the
pseudo-plot and on the co-occurrence plot (Δ CTI). Positive Δ CTI
indicates an overestimation of CTI for the pseudo-plot relative to the
co-occurrence plot. A value of zero indicates equal CTI estimation while
negative Δ CTI indicates an underestimation of CTI for the pseudo-plots.
To statistically assess these differences, we conducted t-tests.
Further, to evaluate the effectiveness of the pseudo-plots in estimating
community responses, and to reveal any temporal deviance from the
co-occurrence plots, we examined the temporal trend of thermophilization
independently for the two already paired datasets. We estimated the
thermophilization index as the difference between the CTI and the
observed baseline temperature (mean MAT 1950-1979) for each vegetation
assemblage in the paired dataset (Pacheco‐Riaño et al., 2023). A
positive value indicates an increase in the CTI (an increase in
thermophilic species) compared to the baseline period of 1950-1979,
while a negative value suggests a decrease in the CTI (an increase in
cryophilic species) during the same reference period. Note that this
thermophilization trend was not bias-corrected as it was done in
Pacheco-Riaño et al. (2023), but the same potential spatiotemporal bias
was kept for both datasets.