Discussion
Our assessment demonstrated the potential of employing presence-only
data for evaluating community responses to climate warming, specifically
by quantifying the Community Temperature Index (CTI) and
thermophilization. The aggregation of presence-only data in pseudo-plots
yields CTI values remarkably consistent with those obtained from
co-occurrence plots, which are conventionally perceived as more reliable
(Bertrand et al., 2011; Pacheco‐Riaño et al., 2023). Presence-only data
typically originate from museum collections and citizen science
projects. As such, they often suffer from various spatial and
taxonomic biases, e.g., variable sampling density per area and
poor quality control of species identification (Beck et al., 2014).
Despite the inherent biases and errors associated with presence-only
data, our study reveals a consistent estimate of the CTI for the paired
plots and a consistent temporal trend in thermophilization from the two
data sets. This consistency can be attributed to the absence of biases
in species observations regarding their temperature indicator values. In
simpler terms, there is no tendency to record more cold-adapted or
warm-adapted species in presence-only data compared to co-occurrence
data. Even if the species in an area are under-sampled in the presence-only
pseudo-plots, the likelihood of observing a species does not depend on its
temperature indicator value. Likewise, the misidentification of species is
not systematically associated with higher or lower temperature
indicator values. In addition, even if a
proportion of the species was misidentified, we expect this to be minor
compared to the substantial number of correct identifications and thus
expect misidentification to have a minor impact on our analyses.
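To make the aggregation step concrete, the following minimal sketch groups presence-only records into grid-cell pseudo-plots and computes a CTI per pseudo-plot. It rests on several assumptions that are not taken from our analysis: CTI is simplified here to the unweighted mean of the temperature indicator values of the species present (rather than a transfer-function estimate), the 0.1-degree cell size is arbitrary, and the function names and species labels are hypothetical.

```python
from collections import defaultdict
from math import floor
from statistics import mean


def pseudo_plot_id(lat, lon, cell_deg=0.1):
    """Map a coordinate to a grid cell (cell size is an illustrative assumption)."""
    return (floor(lat / cell_deg), floor(lon / cell_deg))


def cti_by_pseudo_plot(records, indicator, cell_deg=0.1):
    """Compute a simplified CTI per pseudo-plot.

    records   -- iterable of (species, latitude, longitude) presence-only records
    indicator -- dict mapping species to a temperature indicator value (deg C)

    Each species counts once per pseudo-plot, so repeated observations of the
    same species do not change the CTI. Species without an indicator value
    are skipped.
    """
    species_in_cell = defaultdict(set)
    for species, lat, lon in records:
        if species in indicator:
            species_in_cell[pseudo_plot_id(lat, lon, cell_deg)].add(species)
    return {
        cell: mean(indicator[s] for s in spp)
        for cell, spp in species_in_cell.items()
    }
```

Because each species enters the mean only once per pseudo-plot, uneven numbers of records per species do not shift the CTI in this sketch; only the set of species detected matters.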
By including presence-only data in CTI analysis, we are able to cover
larger geographic areas than is possible with traditional co-occurrence
plots alone (König et al., 2019). This can improve the global
completeness and representativeness of species data and yield more
realistic projections of community responses. Larger temporal and
spatial coverage also provides improved opportunities to unravel the
impacts of various climatic drivers and how climate interacts with other
variables. This becomes particularly valuable in situations where
co-occurrence surveys are restricted, either in terms of spatial or
temporal coverage. The inclusion of presence-only data also provides a
cost-effective way of monitoring ongoing dynamics in species
communities, as an alternative to more intensive co-occurrence surveys,
which are often considered to yield information of the highest quality
but can be time-consuming and expensive (Dengler et al., 2011). This can
also be particularly advantageous in areas where field data are difficult
to obtain or where intensive fieldwork is not feasible. However,
both datasets still underrepresent some areas, such as the tropics and
continents like Africa, as well as colder and more remote areas at
higher latitudes and elevations.
Although the CTI values were very similar, we observed that the
differences in predicted CTI values between the two datasets (Δ CTI)
were more pronounced towards the colder end of the thermal gradient.
This deviation could potentially be attributed to the distinct
methodologies employed in data collection. Co-occurrence data are often
gathered through more or less organized expeditions and may therefore
cover a more representative sample of the topographic relief in an
area, including higher and more remote sites. The presence-only data,
however, are often compiled from opportunistic observations made in
unplanned citizen science projects. In colder areas, which in most cases
are mountain regions in our study area, the more accessible
parts are the valley bottoms (where roads are located), resulting in a
bias towards lower elevations in topographically heterogeneous areas. Our
ad hoc analysis substantiated this expectation (Fig. 3),
revealing that co-occurrence plots are predominantly situated at higher
elevations in areas where CTI from pseudo-plots is overestimated
compared to the co-occurrence plots. With this in mind, the potential
bias could be adjusted for by also incorporating
elevation when aggregating species to pseudo-plots, or by adjusting for
the bias in CTI by using the observed elevations of the presence-only
data. This would be especially important when comparing pseudo-plots, or
when combining co-occurrence plots and presence-only data in areas of
different topographical relief.
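One way to picture the first of these adjustments is to add an elevation band to the pseudo-plot key, so that valley-bottom and high-elevation records from the same grid cell are no longer pooled. The sketch below is illustrative only and was not part of our analysis: the 200 m band width and 0.1-degree cell size are arbitrary assumptions, the function name is hypothetical, and CTI is again simplified to the unweighted mean of species temperature indicator values.

```python
from collections import defaultdict
from math import floor
from statistics import mean


def stratified_cti(records, indicator, cell_deg=0.1, band_m=200):
    """Compute a simplified CTI per (grid cell, elevation band).

    records   -- iterable of (species, latitude, longitude, elevation_m)
    indicator -- dict mapping species to a temperature indicator value (deg C)

    The key is (lat cell, lon cell, elevation band index), so records from
    different elevation bands within one grid cell form separate pseudo-plots.
    Cell size and band width are illustrative assumptions.
    """
    groups = defaultdict(set)
    for species, lat, lon, elev in records:
        if species not in indicator:
            continue  # skip species without an indicator value
        key = (
            floor(lat / cell_deg),
            floor(lon / cell_deg),
            int(elev // band_m),
        )
        groups[key].add(species)
    return {key: mean(indicator[s] for s in spp) for key, spp in groups.items()}
```

A co-occurrence plot could then be compared with the pseudo-plot from the matching elevation band rather than with a cell-wide aggregate dominated by valley-bottom records. Narrower bands reduce pooling across elevations but leave fewer species per pseudo-plot, so the band width would need to be chosen with the record density in mind.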
Overall, we noticed that the variations among the CTIs are comparable
when calibrating the transfer functions using all three types of
datasets. However, the difference observed in the model relying only on
the co-occurrence dataset can be attributed to the extent of the dataset
employed for calibrating the model. We found that including plant
communities from a larger area improved the transfer function, notably
when including communities with thermophilic species from lower
latitudes (e.g., from central Europe). This inclusion effectively
addresses the issue of niche truncation at the warmer end of the thermal
range. However, including presence-only data from outside our focus area
did not improve the accuracy of the CTI values at the colder end; this
is likely due to the lack of species in the dataset from higher
latitudes or elevations. Given the ongoing global warming trend,
correcting the overestimation at the cold end may become less necessary,
as thermophilic species will be increasingly represented in the
communities. However, paying special attention to species adapted to
warmer conditions remains highly relevant in Norway, given its
predominantly cold climate. In contrast, in locations with milder
temperatures, especially those not situated at high latitudes or
elevations, relying on presence-only data would help mitigate the
truncation of species at both the cold and warm extremes. Moreover, we
found that the differences were larger for older assemblages than for more
recent ones, which could be attributed to the higher accuracy of
more recent records.
It is important to note that the accuracy of the CTI values produced by
pseudo-plots will depend on the quality of the presence-only data being
used. The exponential growth of presence-only data records in the last
two decades has resulted in greater public access to these records (Jin
& Yang, 2020). However, before using these data for ecological
analyses, cleaning and standardization procedures must be undertaken.
This step is especially important due to the varying sources and
properties of the data, as temporal, spatial, and taxonomic criteria all
need to be considered (Meyer et al., 2016).
Presence-only data have been widely integrated into species distribution
modelling (Beck et al., 2014; Pacifici et al., 2017; Smith et
al., 2023). However, very few studies have explored their use in understanding
community responses, such as thermophilization, to environmental changes
(Feeley, 2012). Our study demonstrates that by including presence-only
data, we can better understand and learn from their advantages and
drawbacks in biogeographical analyses. Although the outcomes produced by
this method may have some flaws, they could serve as a first approximation
for many regions and taxonomic groups and provide a good starting
point for further research. As demonstrated here, open data portals,
such as GBIF, can be utilized to consolidate datasets that are used to
analyse communities’ responses to environmental change. To bridge
existing data gaps, the digitization and mobilization of scientific
biological collections and personal archives of researchers must be
continued. This will help to monitor changes in species composition,
abundance, and diversity and to identify potential threats to ecosystem
functioning, leading to a better understanding of the environmental
impact of global climate change.
Our study indicates that presence-only data can be used to
estimate community indicators (e.g., CTI) accurately. They serve as an
additional source of information to complement more traditional
co-occurrence plot-based datasets. Our findings suggest that the
integration of presence-only data can improve the calibration
of transfer function models and our understanding of vegetation
responses to climate change. Moreover, the overall patterns and
trends of thermophilization remain largely consistent across the two
datasets, suggesting that the thermophilization values are not
significantly different when using pseudo-plots. Importantly,
this consistent trend in thermophilization holds independently
of the calibration dataset. We
additionally presented an outline that can be used to study community
responses in global change research. Our main findings, therefore,
demonstrate that presence-only data can be used to quantify
thermophilization. Although care is needed when
integrating traditional co-occurrence plots with presence-only data,
there is a substantial potential to unlock new opportunities for rapid
and cost-effective monitoring of communities in response to changes in
climate.