Introduction
Ongoing climate warming induces range shifts of species which track
their optimal temperature conditions (Lenoir et al., 2020; Mamantov et
al., 2021; O’Sullivan et al., 2021; Steinbauer et al., 2018). These
shifts subsequently lead to local reorganizations of species communities
and assemblages (Menéndez et al., 2006; Walther, 2010), often resulting
in a relative increase of warm-demanding species and/or a decreasing
number of cold-demanding species, a pattern referred to as
thermophilization (Gottfried et al., 2012). Thermophilization has been
reported globally from a wide variety of habitats and areas, from
mountain tops (Gottfried et al., 2012) to forests (Zellweger et al.,
2020) and from temperate (Pacheco‐Riaño et al., 2023) to tropical
regions (Fadrique et al., 2018).
Studies on multi-decadal vegetation responses to climate change such as
thermophilization commonly rely on permanent or semi-permanent
plot-based datasets as the main source of information (Bertrand et al.,
2011; Chytrý et al., 2014; Fadrique et al., 2018; Freeman et al., 2021;
Götzenberger et al., 2012; Kapfer et al., 2017; Richard et al., 2021).
Permanent plots are best suited to track vegetation dynamics, and
multiple initiatives were set up at the beginning of the 21st century
for continuous long-term monitoring
(Chytrý et al., 2014; Gottfried et al., 2012; Haider et al., 2022).
Although the number of permanent plots is continuously increasing, they
are still geographically scattered, and most cover relatively short
time spans. Historical co-occurrence plots (herein broadly defined as
species records co-occurring in a specific site) were initially sampled
to describe the structure and diversity of vegetation types (e.g.,
phytosociological plots, also called relevés). Resurveys of these
semi-permanent plots have also proven to be a valuable source of
information to describe vegetation dynamics over decades (Kapfer et al.,
2017), study range shifts (Felde et al., 2012; Lenoir et al., 2008;
Rumpf et al., 2018) and thermophilization responses (Pacheco‐Riaño et
al., 2023).
To study vegetation dynamics over time, an alternative to co-occurrence
datasets is the aggregation of species occurrence records (i.e.,
presence-only data from individual species observations) such as museum
and herbaria collections, or, more recently, observations from various
structured or unstructured citizen science projects. Presence-only data
have been collected extensively over the last century and their number
has increased enormously over the past 20 years (Heberling et al.,
2021). This type of data generally has more extensive temporal and
spatial coverage compared to co-occurrence data due to the vast network
of data collectors but comes at the cost of missing information about
absences. The world’s largest biodiversity data network, mediated by the
Global Biodiversity Information Facility (GBIF; http://gbif.org),
stands as the leading open-access data portal for geo-referenced species
occurrence data collected from a myriad of different sources (König et
al., 2019; Wüest et al., 2020). GBIF provides access to more than 1.5
billion species records from across the globe and the tree of life. In
addition to lacking absence data, many records are prone to
identification errors and spatially biased sampling owing to the
diversity of collectors and data sources, among other issues (Beck
et al., 2014; Meyer et al., 2016). Therefore, these data are commonly
considered unreliable for many community analyses and have so far only
been exploited to a limited extent to assess community responses to
global warming (Bottin et al., 2020; Duchenne et al., 2021; Feeley,
2012; Lajeunesse & Fourcade, 2023). Nevertheless, presence-only data
offer several opportunities to investigate community responses. They can
be used for regions or species that lack historical co-occurrence data,
or for regions where contemporary data are limited. The two data types
can also be combined, as integrating presence-only data and
co-occurrence data can help mitigate some of the biases inherent to each
data type.
Assessing thermophilization relies on species co-occurrences in a
specific area combined with species-specific thermal indicator values to
calculate community temperature indices (CTI), i.e., the (weighted)
average of the thermal indicator values for species assemblages. The CTI
approach is an effective way to summarize thermophilization trends by
comparing changes in CTI over time (Duque et al., 2015; Feeley et al.,
2020; Freeman et al., 2021; Richard et al., 2021). It can provide an
unbiased estimate of thermophilization regardless of sampling
differences, as long as neither warm-demanding nor cold-adapted species
are collected disproportionately relative to their actual occurrence in
the area. In other words, if sampling effort does not favour species
according to their thermal preferences, the CTI can be assumed to
accurately reflect the degree of thermophilization in each community.
Presence-only data
could thus hold a great potential to fill spatial and temporal gaps in
studies of species dynamics, and, therefore, allow for a more
comprehensive understanding of climate-driven responses of species
across their geographical ranges (König et al., 2019).
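The CTI definition above, a (weighted) average of species thermal indicator values, can be illustrated with a minimal sketch. The function name, the species labels and the indicator values below are hypothetical and serve only to show the arithmetic; for presence-only data every recorded species is assumed to receive a weight of 1.

```python
def community_temperature_index(weights, thermal_values):
    """Weighted mean of thermal indicator values for one assemblage.

    weights: dict mapping species -> abundance (use 1 for presence-only data)
    thermal_values: dict mapping species -> thermal indicator value (deg C)
    Species without a thermal indicator value are ignored.
    """
    shared = [sp for sp, w in weights.items() if w > 0 and sp in thermal_values]
    if not shared:
        raise ValueError("no species with a known thermal indicator value")
    total_weight = sum(weights[sp] for sp in shared)
    weighted_sum = sum(weights[sp] * thermal_values[sp] for sp in shared)
    return weighted_sum / total_weight


# Hypothetical indicator values (deg C) and two surveys of the same site.
thermal = {"sp_cold": 4.0, "sp_mid": 8.0, "sp_warm": 12.0}
historical = {"sp_cold": 1, "sp_mid": 1}
recent = {"sp_mid": 1, "sp_warm": 1}

cti_historical = community_temperature_index(historical, thermal)  # 6.0
cti_recent = community_temperature_index(recent, thermal)          # 10.0
cti_change = cti_recent - cti_historical                           # 4.0
```

In this toy case the loss of the cold-demanding species and the gain of the warm-demanding species raise the CTI, which is the pattern described as thermophilization.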
One approach to quantifying the CTI is the transfer function approach,
a technique from palaeoecology (Bertrand et al., 2011; Pacheco‐Riaño et al.,
2023). Transfer functions are mathematical models that represent the
relationship between species occurrences and environmental variables
from a certain period, assuming that species have symmetrical, unimodal
response curves with an ecological optimum for climate variables
(Hutchinson, 1957). If the relationship between species and climate
remains constant through time (ecological uniformitarianism) (Rull,
2010), the inferred species-climate relationship can be used to
reconstruct past or present climates from community composition (Salonen
et al., 2011). Palaeoecologists have used transfer functions
extensively, calibrating them on current relationships between species
co-occurrences and climatic conditions (Guiot & de Vernal, 2007;
Juggins & Birks, 2012; Schäbitz et al., 2013) and then applying them to
community compositions from sediment cores to reconstruct palaeoclimates
(Birks & Simpson, 2013). This approach has
been used to reconstruct various environmental conditions, from water
chemistry (e.g. pH) using diatoms (ter Braak & Juggins, 1993), to
temperature (Chevalier et al., 2020) and precipitation (Lu et al., 2019)
using fossil pollen. A corresponding approach has recently been utilized
in modern ecology to estimate thermophilization by inferring temperature
from co-occurrence vegetation data based on a CTI approach (Bertrand et
al., 2011; Pacheco‐Riaño et al., 2023). In these studies, historical
co-occurrence plots sampled prior to major climatic changes were used to
calibrate a transfer function, which was subsequently used to predict
the CTI from more recent vegetation plot data (Bertrand et al.,
2011; Bhatta et al., 2018; Pacheco‐Riaño et al., 2023). Based on this
approach, thermophilization can be estimated as the difference between
the floristically inferred temperature (i.e., CTI) and the observed
temperature from the calibration period (Pacheco‐Riaño et al., 2023).
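The calibrate-then-predict logic described above can be sketched with simple weighted averaging, one of the classic transfer function techniques. This variant is chosen here purely for illustration and is not necessarily the model used in the cited studies; it also omits the deshrinking step that weighted-averaging models normally include. All function names, species labels and temperatures are hypothetical.

```python
def calibrate_optima(plots, temperatures):
    """Estimate each species' thermal optimum as the abundance-weighted
    mean temperature of the calibration plots in which it occurs."""
    sums, weights = {}, {}
    for composition, temp in zip(plots, temperatures):
        for species, abundance in composition.items():
            if abundance <= 0:
                continue
            sums[species] = sums.get(species, 0.0) + abundance * temp
            weights[species] = weights.get(species, 0.0) + abundance
    return {sp: sums[sp] / weights[sp] for sp in sums}


def infer_temperature(composition, optima):
    """Floristically inferred temperature (CTI): abundance-weighted mean
    of the optima of the species present in a plot."""
    known = {sp: w for sp, w in composition.items() if w > 0 and sp in optima}
    if not known:
        raise ValueError("no species with a calibrated optimum")
    return sum(w * optima[sp] for sp, w in known.items()) / sum(known.values())


# Hypothetical historical calibration plots and their observed temperatures.
calibration_plots = [{"a": 1, "b": 1}, {"b": 1, "c": 1}]
calibration_temps = [4.0, 8.0]
optima = calibrate_optima(calibration_plots, calibration_temps)
# optima == {"a": 4.0, "b": 6.0, "c": 8.0}

# A recent survey at a site whose calibration-period temperature was 6.0.
recent_plot = {"b": 1, "c": 1}
inferred = infer_temperature(recent_plot, optima)  # 7.0
thermophilization = inferred - 6.0                 # 1.0
```

A positive difference indicates that the recent community is composed of species with warmer optima than the calibration-period temperature at that site would suggest.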
Exploiting the vast amount of presence-only data to analyse the
responses of communities to climate warming requires, however, a
rigorous evaluation of robustness and reliability (Bayraktarov et al.,
2019). Therefore, our aim was to answer two key questions. First, we
wanted to determine whether changes in community dynamics due to climate
warming, as inferred from presence-only data from GBIF, corresponded
with those inferred from co-occurrence plot data. Second, we
aimed to assess whether these two datasets could be used
interchangeably, either individually or in combination, to yield similar
community responses during the model calibration or prediction phases.
In our study, we combined co-occurrence plot data from Norway with
spatially and temporally aggregated presence-only data (referred to as
pseudo-plots) from across Europe. We intentionally employed a broader
geographical scope for the presence-only data to avoid niche
truncation, which the broad spatial coverage of GBIF makes possible.
Within this context, we assessed differences in the CTI and the
thermophilization index between the two data types. We hypothesized
that both types of data would exhibit a consistent pattern and could be
employed interchangeably for our analyses.
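One way to picture the spatial and temporal aggregation of presence-only records into pseudo-plots is to bin records by grid cell and time period and keep the set of species recorded in each bin. The sketch below is only an illustration under assumed settings: the record layout, the 1-degree cell size, the 10-year period and the function name are all hypothetical and are not taken from the study.

```python
import math
from collections import defaultdict


def build_pseudo_plots(records, cell_deg=1.0, period_years=10):
    """Group presence-only records into pseudo-plots.

    records: iterable of (species, latitude, longitude, year) tuples
    Returns a dict mapping (lat_cell, lon_cell, period_start) -> set of species.
    """
    pseudo_plots = defaultdict(set)
    for species, lat, lon, year in records:
        key = (
            math.floor(lat / cell_deg),            # grid row index
            math.floor(lon / cell_deg),            # grid column index
            (year // period_years) * period_years,  # first year of the period
        )
        pseudo_plots[key].add(species)
    return dict(pseudo_plots)


# Hypothetical occurrence records: (species, latitude, longitude, year).
records = [
    ("sp_a", 60.2, 10.4, 1995),
    ("sp_b", 60.7, 10.9, 1998),
    ("sp_a", 60.3, 10.1, 2015),
    ("sp_c", 61.5, 10.2, 1995),
]
pseudo_plots = build_pseudo_plots(records)
# Three pseudo-plots: cell (60, 10) in the 1990s and the 2010s,
# and cell (61, 10) in the 1990s.
```

Each resulting species set can then be treated like a presence/absence plot when computing a CTI, bearing in mind that species missing from a bin may simply not have been recorded.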