Methods

All analyses were conducted in R v.4.0.2 (R Core Team, 2022). The package tidyverse v.1.3.2 (Wickham et al., 2019) was used to clean and handle the data, sf v.1.0-9 (Pebesma, 2018) and raster v.3.6 (Hijmans, 2023) for spatial data manipulation, and ggplot2 v.3.4.0 for visualization (Wickham et al., 2019).

Data

Co-occurrence dataset

We used the same non-standardized co-occurrence dataset as Pacheco-Riaño et al. (2023). It consists of 605,637 records of terrestrial vascular plants (3,617 taxa) collected in Norway between 1900 and 2007 at particular locations without marked spatial boundaries. The records originate from a series of field campaigns, primarily involving relevés, conducted by various collectors across Norway since the early 1900s. They were documented in the field notes of the Agder naturmuseum og botaniske hage, the University of Oslo, the University of Trondheim, and the Norwegian University of Life Sciences, and were later integrated into GBIF and stored as occurrence data. To reconstruct the co-occurrence plots, we grouped the records by coordinates, year, elevation, and collector, as described in Pacheco-Riaño et al. (2023). We kept only taxa identified at species level and merged lower taxonomic units to species level using GBIF’s backbone taxonomy tool (The Global Biodiversity Information Facility, 2020). This resulted in 41,993 co-occurrence plots with 2,888 species, covering the period from 1905 to 2007 (Fig. 1).
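The grouping step can be illustrated with a minimal sketch. It is written in Python for illustration only (the analyses themselves were run in R), and the record layout, the example values, and the function name `build_plots` are assumptions rather than the original implementation.

```python
from collections import defaultdict

# Made-up example records: (species, lon, lat, year, elevation_m, collector)
records = [
    ("Betula pubescens", 10.40, 63.43, 1950, 120, "Collector A"),
    ("Vaccinium myrtillus", 10.40, 63.43, 1950, 120, "Collector A"),
    ("Betula pubescens", 10.40, 63.43, 1950, 120, "Collector A"),  # repeated record
    ("Picea abies", 10.40, 63.43, 1962, 120, "Collector A"),  # same site, later year
]


def build_plots(records):
    """Group records that share coordinates, year, elevation, and collector.

    Each distinct key is treated as one co-occurrence plot; the value is the
    set of species recorded in it (repeated records collapse automatically).
    """
    plots = defaultdict(set)
    for species, lon, lat, year, elevation, collector in records:
        plots[(lon, lat, year, elevation, collector)].add(species)
    return dict(plots)


plots = build_plots(records)
```

In this sketch the first three records form one plot with two species, and the 1962 record forms a second plot, because the year is part of the grouping key.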

Presence-only dataset

We downloaded from GBIF all global georeferenced records for the plant species included in the non-standardized co-occurrence dataset described above (3,617 taxa, Table S1). The download comprised 212,286,166 records from 6,831 datasets (8 December 2022; GBIF.org, 2022; https://doi.org/10.15468/DL.VZVGK7).
We applied six automated cleaning procedures to eliminate known issues with presence-only data using the package “CoordinateCleaner” (Zizka et al., 2019): 1) equal coordinates (records with identical longitude and latitude), 2) zero coordinates (plain zeros in the coordinates and a radius around them), 3) capitals (radius around capital cities), 4) centroids (radius around country and province centroids), 5) sea coordinates (non-terrestrial records), and 6) biodiversity institutions (radius around biodiversity institutions). All flagged records, as well as records with missing coordinates or missing record years, were removed. As for the co-occurrence dataset, we harmonized the taxonomy using GBIF’s backbone taxonomy tool (The Global Biodiversity Information Facility, 2020). We kept records identified at the species or infraspecific level (e.g., subspecies) and merged all of them to species level. Subsequently, duplicated records with the same species name, coordinates, and year were removed, and only occurrences within Europe (longitude: -12° to 35°, latitude: 45° to 72°) were kept. Lastly, to prevent any duplication with the co-occurrence dataset, we excluded records with the same species name and geographical coordinates as a record in the co-occurrence dataset. This process resulted in a cleaned presence-only dataset of 68,112,130 occurrence records of 2,742 species.
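The filters applied after the automated flagging can be illustrated with a minimal sketch. It is written in Python for illustration only, it does not reproduce the CoordinateCleaner tests or the taxonomic harmonization, and the record layout, the example values, and the function name `filter_presence` are assumptions. Coordinates are compared for exact equality here, which is also an assumption. Because every step is a per-record filter or a de-duplication, the order in which they are applied does not change the result.

```python
# Bounding box for Europe as given in the text
LON_MIN, LON_MAX = -12.0, 35.0
LAT_MIN, LAT_MAX = 45.0, 72.0


def filter_presence(records, cooccurrence_keys):
    """Filter presence-only records given as (species, lon, lat, year).

    cooccurrence_keys is a set of (species, lon, lat) tuples taken from the
    co-occurrence dataset; matching presence-only records are excluded.
    """
    seen = set()
    kept = []
    for species, lon, lat, year in records:
        # Drop records with missing coordinates or missing year
        if lon is None or lat is None or year is None:
            continue
        # Keep only occurrences inside the bounding box
        if not (LON_MIN <= lon <= LON_MAX and LAT_MIN <= lat <= LAT_MAX):
            continue
        # Exclude records already present in the co-occurrence dataset
        if (species, lon, lat) in cooccurrence_keys:
            continue
        # Remove duplicates with the same species, coordinates, and year
        key = (species, lon, lat, year)
        if key in seen:
            continue
        seen.add(key)
        kept.append(key)
    return kept


# Made-up example data
records = [
    ("Picea abies", 10.0, 60.0, 1990),
    ("Picea abies", 10.0, 60.0, 1990),   # duplicate
    ("Picea abies", 10.0, 40.0, 1990),   # south of the bounding box
    ("Picea abies", None, 60.0, 1990),   # missing longitude
    ("Betula nana", 15.0, 68.0, 2001),   # also in the co-occurrence dataset
    ("Betula nana", 15.0, 68.5, None),   # missing year
    ("Betula nana", 20.0, 69.0, 2001),
]
cooccurrence_keys = {("Betula nana", 15.0, 68.0)}

cleaned = filter_presence(records, cooccurrence_keys)
```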
To build “communities” with spatio-temporal information from the presence-only data, we created “pseudo-plots” by aggregating species occurrences temporally and spatially, using the same grid cells as the temperature dataset described below (i.e., 30 arc-second resolution). Only pseudo-plots from 1905 to 2016 were kept for further analyses (Fig. 1).
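The aggregation into pseudo-plots can be illustrated with a minimal sketch. It is written in Python for illustration only and rests on several assumptions that the text does not specify: the grid origin (-180°, -90°), the calendar year as the temporal unit, and the names `cell_index` and `build_pseudo_plots`. In practice the cell boundaries would be taken from the temperature raster itself.

```python
import math
from collections import defaultdict

CELLS_PER_DEGREE = 120  # 30 arc seconds = 1/120 of a degree


def cell_index(lon, lat):
    """Return the (column, row) of the 30 arc-second cell containing a point.

    Assumes a global grid with its origin at (-180, -90).
    """
    col = math.floor((lon + 180.0) * CELLS_PER_DEGREE)
    row = math.floor((lat + 90.0) * CELLS_PER_DEGREE)
    return col, row


def build_pseudo_plots(records, first_year=1905, last_year=2016):
    """Aggregate (species, lon, lat, year) records by grid cell and year."""
    plots = defaultdict(set)
    for species, lon, lat, year in records:
        if first_year <= year <= last_year:
            plots[(cell_index(lon, lat), year)].add(species)
    return dict(plots)


# Made-up example data
occurrences = [
    ("Picea abies", 10.4012, 63.4305, 1990),
    ("Betula pubescens", 10.4051, 63.4310, 1990),  # same cell, same year
    ("Picea abies", 10.4012, 63.4305, 1991),       # same cell, other year
    ("Picea abies", 10.4200, 63.4305, 1990),       # different cell
    ("Picea abies", 10.4012, 63.4305, 2018),       # outside 1905-2016
]

pseudo_plots = build_pseudo_plots(occurrences)
```

Here the first two occurrences fall into the same cell and year and therefore form one pseudo-plot with two species, while the 2018 record is dropped.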