Guidelines for using SDMs to project marine species
We break down the SDM analysis process into six main steps: goal
setting, data selection, model building, model evaluation and
validation, interpretation of results, and communication of results. We
propose guidelines that support a logical workflow starting from
articulating the goals of the study, through the modeling process, and
finally communicating results to other scientists, resource managers,
and policy makers (Figure 1). For each step in the SDM process, we have
identified key questions for analysts to consider and linked them to the
guidelines that will help to answer those questions. At each step, we
outline best practices with a focus on how to identify and minimize
uncertainty, when possible, and how to transparently communicate the
uncertainty that cannot be avoided.
1. Frame the research question
Clearly stating the research questions (i.e., the problem, the
objectives, and the hypotheses) is essential to ensure that objectives
are considered throughout the analysis and supports transparent and
reproducible SDM results (Araújo et al. 2019, Zurell et al. 2020). A
research outline (Table 1) can communicate the intention of the
research, explicitly state the scope of the study, and help identify any
assumptions that may impact the outcome of the study. This understanding
can support qualitative identification of the tolerance for uncertainty.
For example, if projections of occurrence, rather than biomass or
abundance, are suitable for the objectives of the study, it may be
possible to combine data collected using different surveys because
presence-absence data are less sensitive than biomass data to
differences in gear type and methodology. Laying out the study plan
provides a clear communication tool for all parties involved in the
research and its outcomes.
2. Ensure the scope of study is relevant, both in space and in time
The choice of extent and resolution in both space and time can impact
the accuracy of SDM projections and affect their utility to support
management decisions. Projecting distributions into future climates
assumes that species distributions across spatial climate gradients
will match species responses to temporal changes in climate.
Applications of SDMs to marine species have often involved fitting
models with observations from a subset of the species’ range within
geopolitical boundaries (e.g., Thorson et al. 2015). While these types
of SDMs may be appropriate for questions related to specific
assessments, they are ill-suited to climate change applications. Using
only a subset of data in space or time will usually lead to truncated
species-environment relationships and introduce uncertainty in the
fitted SDM parameters. When projecting into future climates, these
truncated models are likely to have reduced transferability as they are
extrapolating beyond the range of observed conditions where they are not
calibrated or validated, and therefore generate poor distribution
projections (Charney et al. 2021, Muhling et al. 2020, Thuiller et al.
2004). To characterize the species’ full niche, species observations
should be sourced from the widest spatial and temporal extent available
that best addresses the research question (Barbet-Massin et al. 2010,
Thuiller et al. 2004).
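As a minimal sketch of how the extrapolation problem described above
might be screened for, the example below reports the fraction of
projection cells whose covariate value falls outside the range seen in
the training observations. The function names and temperature values are
hypothetical and serve only as illustration; a single-covariate range
check is a simplification and does not detect novel combinations of
covariates.

```python
def training_range(values):
    """Return the (min, max) of a covariate in the training observations."""
    return min(values), max(values)


def fraction_extrapolated(train_values, projection_values):
    """Fraction of projection cells outside the training range of one covariate."""
    lo, hi = training_range(train_values)
    outside = [v for v in projection_values if v < lo or v > hi]
    return len(outside) / len(projection_values)


# Hypothetical bottom temperatures (deg C): survey observations used to
# fit the SDM, and values on a future projection grid.
train_temp = [4.1, 5.0, 6.3, 7.8, 9.2]
future_temp = [5.5, 8.9, 9.6, 10.4, 11.0]

share = fraction_extrapolated(train_temp, future_temp)
print(f"{share:.0%} of projection cells are outside the training range")
```

A large fraction suggests that the fitted species-environment
relationship is being extended beyond the conditions it was calibrated
on, which is one argument for widening the spatial or temporal extent of
the training data.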
The spatial resolution of environmental covariates should also be at a
biologically relevant scale for the taxa being modeled (Austin and Van
Niel 2011). For example, the relevant scale for the relationship between
bathymetry and a highly migratory pelagic fish species (e.g., tuna) is
likely coarser than that for an intertidal invertebrate (e.g., oyster).
One challenge with modeling at an appropriate scale is that the
available spatial resolution of environmental covariates may not match
the resolution of the species observations. In these cases,
environmental covariates should be up- or down-scaled (Araújo et al.
2019, Hijmans et al. 2005). Future climatic variables are necessarily
coarse since they are typically modeled at a global scale. Downscaling
methods can be applied to match the desired scale in an attempt to
capture the variability at the scale relevant to the organism; however,
this process may introduce additional uncertainty. Modeling at coarser
spatial resolutions than is biologically appropriate can increase
uncertainty in projections by over- or under-predicting habitat
(Franklin et al. 2013, Gottschalk et al. 2011, Randin et al. 2009, Seo
et al. 2008, Willis and Bhagwat 2009). Importantly, the spatial scale at
which species projections are generated should be considered when making
management decisions. Coarser resolution models (e.g., 100 km) that do
not resolve local topographic features, for example, may not be well
suited to support local management decisions (e.g., within a 10 km
squared coastal protected area).
The temporal resolution of environmental covariates is another important
consideration for building models that characterize the full species
niche. Ideally, the temporal resolution of the environmental covariates
should match the scale of the species data to reduce uncertainty in the
species-environment relationship (Araújo et al. 2019, Batalden et al.
2007). Many SDMs are static and are built using environmental covariates
derived from climatologies (i.e., long-term means) (Bateman et al.
2012). These models ignore interannual variability and exclude extreme
weather events and thus will not be well calibrated to the full range of
conditions experienced by the species over time (Bateman et al. 2012).
Comparison between models built with different temporal data may be
necessary.
Errors in the observation and environmental data, as well as spatial and
temporal sampling biases, can impact the extent of available data and
create uncertainty in projections (Fernandes et al. 2019, Naimi et al.
2014, Osborne and Leitão 2009). Although it may not be feasible to
resolve these issues, mapping both observations and environmental data
can illustrate where these gaps occur and may be important information
to share with end users.
3. Identify appropriate species data
While consistent and standardized datasets of presence/absence or
abundance are ideal for minimizing uncertainty when building SDMs, they
may not be readily available or logistically feasible. For example,
marine species of commercial importance may have standardized stock
assessment or catch monitoring data available, whereas non-commercial
species may only have sporadic presence-only data. Existing data may
also come from a biased subset of a species’ range or be biased to a
certain time of year due to logistical constraints or data collection
priorities.
Alternative information sources may confirm or expand species
observation data. For example, environmental DNA (eDNA) is becoming
increasingly viable, particularly for bony fishes (Muha et al. 2017).
Advancements in imagery analysis also allow for biological surveys of
coastal habitats with remotely piloted aircraft (e.g., drones; Monteiro
et al. 2021). Citizen science platforms and global databases can provide
observational data, trading sample size for potential inaccuracy and
spatial bias (Beck et al. 2014, Johnston et al. 2020). Expert and
Indigenous Knowledge can also be used in conjunction with survey data to
capture the extent of a species’ distribution (Merow et al. 2017,
Skroblin et al. 2021). Although they each have limitations, these data
sources are increasing the availability of species data.
Combining data sources can fill in gaps in any individual dataset. For
example, this approach has been used to define the spatio-temporal
distribution of Killer Whales (Watson et al. 2019). However, analysts
must consider the biases that may result from differences across data
sources. For example, catchability often varies by fishing gear type,
and data collected from fisheries may be non-random and preferentially
sampled (Fletcher et al. 2019). Hybrid models using more complex
statistical structures to combine datasets from different sources can
increase the power of a model while still accounting for biases and
variances of the individual datasets (Rufener et al. 2021, Thorson et
al. 2021).
Information on a species’ ecology can be used to reduce uncertainty
in model predictions. For instance, dispersal
barriers, ontogenetic shifts, and biotic influences on aggregations
(e.g., spawning) affect model accuracy and performance (Robinson et al.
2011). Dispersal barriers are less common in marine systems (Carr et al.
2003), but may be important to incorporate as post-hoc constraints to
SDM predictions for species with lower dispersal capacities (Robinson et
al. 2011). Uncertainty may be reduced by splitting observation data
between adults and juveniles if a species occupies habitats with
different environmental conditions across its life stages (Petitgas et
al. 2013). Experimentally derived responses can be applied to compare
the fundamental niche of a species relative to the realized niche
modeled by SDMs (Franco et al. 2018, Martínez et al. 2015) or
incorporated as priors in Bayesian SDMs (Gamliel et al. 2020). Though
physiological limits are unknown for many marine species, this
information is particularly valuable for SDM projections as
distributions will be underestimated when observed locations are
constrained by non-climatic factors (Araújo and Peterson 2012).
4. Determine relevant climatic and non-climatic environmental variables
There are two key considerations when identifying relevant environmental
variables: 1) their ability to describe species responses to current
environmental conditions; and 2) the uncertainties that exist in how
those responses may change in future climates (guideline #8). Many
studies have shown temperature-related variables to be among the most
powerful predictors of species distributions (Bosch et al. 2018, Bradie
and Leung 2017). A variety of mechanisms have been identified through
experiments, models, and observations of extreme thermal events whereby
temperature affects biological processes such as development, dispersal,
growth, and species interactions (Boyd et al. 2013, Kordas et al. 2011,
O’Connor et al. 2007, Sunday et al. 2012). Understanding these
mechanisms can help to determine the most suitable temporal values
(e.g., average daily maximum temperature, warmest month, or cumulative
values such as growing degree days). However, data availability and
realism must also be considered when selecting climatic variables. If
biological knowledge suggests that extreme temperature events contribute
to limiting the local-scale distribution of a species, it is necessary
to determine whether the spatial and temporal resolution of the data
(both from observations and climate models) are sufficient to resolve
such events. Global climate models are most suited to projecting changes
in the statistics of a climate phenomenon (e.g., mean temperature or the
frequency of an event), rather than the magnitude of an extreme event,
and the confidence in those extreme event projections can depend on the
variable and region (Seneviratne et al. 2012).
Static, non-climatic variables are essential to reduce uncertainty when
projecting species distributions (Willis and Bhagwat 2009). Ignoring
non-climatic variables that limit species distributions increases the
risk of overfitting the climatic variables, and over- or
under-estimating changes in a species’ distribution and extinction risk
under climate change (Beaumont et al. 2005, Hof et al. 2012, Virkkala et
al. 2010, Zangiabadi et al. 2021). In the marine realm, excluding
physical habitat variables such as bathymetry can be problematic as they
are often correlated with climatic variables that are difficult to
measure or model, such as food availability, but integral to predicting
habitat (Luoto and Heikkinen 2008). Unlike climatic variables, static
variables can either be used as predictors in a model or used as a
filter to constrain the model domain depending on the question and
research objective. For example, when projecting kelp distribution,
which requires hard substrate for attachment, substrate type can be
included as a model covariate, or the model projections can be
restricted to areas with hard substrate.
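The second option in the kelp example, using a static variable as a
filter rather than a predictor, can be sketched as follows. This is an
illustrative example only: the grids, values, and the function name
mask_projection are hypothetical, and real applications would operate on
raster layers rather than nested lists.

```python
def mask_projection(suitability, hard_substrate):
    """Set projected suitability to 0 wherever hard substrate is absent."""
    return [
        [s if hard else 0.0 for s, hard in zip(s_row, h_row)]
        for s_row, h_row in zip(suitability, hard_substrate)
    ]


# Hypothetical 2 x 3 grid of projected kelp habitat suitability (0 to 1)
suitability = [[0.9, 0.4, 0.7],
               [0.2, 0.8, 0.6]]

# Hypothetical static layer: True where the seabed is hard substrate
hard_substrate = [[True, False, True],
                  [False, True, True]]

masked = mask_projection(suitability, hard_substrate)
for row in masked:
    print(row)
```

Filtering in this way keeps the substrate constraint out of the fitted
climate responses, at the cost of assuming that the substrate layer
itself is accurate and will not change over the projection period.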
Highly complex and overfit models tend to perform well within the
environmental space the model was trained with but may perform poorly
when projecting into future conditions (Bell and Schlaepfer 2016,
Moreno-Amat et al. 2015). To limit model complexity, biological
knowledge should be relied on to select the relevant environmental
variables (Austin and Van Niel 2011). Preference should be to include
the most proximate variables, those that have a direct physiological
effect on the species being modeled, over more distal or indirect
variables that are often used as proxies when proximal variables are
missing (Anderson 2013, Gardner et al. 2019). Some commonly used static
variables (e.g., depth and distance from shore; Bosch et al. 2018,
Johnson et al. 2019) are considered proxies for other variables, such
as pressure and exposure. When proxy variables are needed to represent
important processes, practitioners should note that an assumption of
stationarity between the proxy variable and the more direct variable it
aims to represent is implicit when projecting species distributions.
Variable selection can simplify complex models by seeking subsets of
predictor variables that still allow good predictive accuracy (Piironen
and Vehtari 2017). Nevertheless, careful consideration of the causal
link between each environmental variable and the focal species is needed
to prevent the removal of an environmental variable that may be
influential in a different set of conditions. In addition, collinearity
between variables can make their independent influence on a species’
range hard to distinguish. This can be particularly problematic for
temperature and depth in marine systems; although they are often highly
correlated at regional scales, temperature is projected to warm while
depth remains constant (e.g., Thompson et al. 2022a). Projections
require that SDMs have accurately estimated how these two variables
shape species ranges. A solution is to include species data from across
a broader spatial extent where latitudinal temperature gradients can
break down the collinearity between temperature and depth (Thompson et
al. 2022b).
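The effect of spatial extent on temperature-depth collinearity can be
illustrated with a small sketch. The station depths and temperatures
below are invented for illustration, and Pearson correlation is only one
of several possible collinearity diagnostics; the point is simply that
adding observations from a warmer region at the same depths weakens the
correlation between the two covariates.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical regional survey: deeper stations are colder
depth_regional = [50, 100, 150, 200, 250]          # m
temp_regional = [10.0, 9.0, 8.1, 7.0, 6.1]         # deg C

# Hypothetical broader extent: a warmer region sampled at the same depths
depth_broad = depth_regional + [50, 100, 150, 200, 250]
temp_broad = temp_regional + [16.0, 15.1, 14.0, 13.0, 12.1]

print("regional r:", round(pearson(depth_regional, temp_regional), 3))
print("broad r:   ", round(pearson(depth_broad, temp_broad), 3))
```

With the regional data alone the two covariates are nearly
interchangeable, so a model cannot separate their effects; with the
broader extent the correlation is much weaker.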
5. Select the SDM model
SDM models range from parametric, to semiparametric (e.g., Shelton et
al. 2014), to various forms of non-parametric approaches including
MaxEnt (Phillips et al. 2006) and machine- or deep-learning models
(e.g., Christin et al. 2019, Elith et al. 2008). Furthermore, SDMs can
be purely phenomenological (e.g., correlative, Jarnevich et al. 2015) or
built on assumed mechanisms and calibrated to data (e.g., Essington et
al. 2022, Kearney and Porter 2009). Correlative models may perform well
on existing data but not extrapolate well if those correlations break
down (e.g., Davis et al. 1998). Mechanistic models are grounded in
physiological and biological principles, and may outperform correlative
models in future conditions, but are often challenging to construct
(Kearney and Porter 2009, Urban 2019). Hybrid models incorporate known
mechanisms in addition to phenomenological correlations, and have the
potential to borrow advantages from both kinds of models (Kearney and
Porter 2009). Creating ensembles by combining the outputs from several
individual models utilizing different algorithms can improve predictive
ability (Araújo and New 2007, but see Hao et al. 2020) and can be as
simple as unweighted or weighted averages (Araújo and New 2007) or as
complex as super-ensembles tuned to simulated or trusted data (Anderson
et al. 2017). However, an ensemble is only as good as the individual
models used to build it, so some effort is required to choose a
high-quality candidate set; using models with different covariates or
structure may help identify misspecification of any single model.
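The simplest ensembles mentioned above, unweighted and weighted
averages, can be sketched as follows. The three model labels, their
predictions, and the use of a skill score as the weight are hypothetical
choices made for illustration; how weights should be chosen is a
modeling decision that this sketch does not settle.

```python
def ensemble_mean(predictions, weights=None):
    """Cell-by-cell (weighted) average of predictions from several models.

    predictions: list of equal-length lists, one list per model
    weights: one non-negative weight per model (equal weights if None)
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    total = sum(weights)
    n_cells = len(predictions[0])
    return [
        sum(w * p[i] for w, p in zip(weights, predictions)) / total
        for i in range(n_cells)
    ]


# Hypothetical occurrence probabilities at three cells from three models
model_a = [0.2, 0.6, 0.9]
model_b = [0.3, 0.5, 0.8]
model_c = [0.1, 0.7, 1.0]

unweighted = ensemble_mean([model_a, model_b, model_c])

# Hypothetical weights, e.g. a cross-validated skill score per model
weighted = ensemble_mean([model_a, model_b, model_c], weights=[0.70, 0.75, 0.80])

print(unweighted)
print(weighted)
```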
A recent advance in SDMs is the move from single-species models to
multi-species models known as Joint Species Distribution Models (JSDMs;
Warton et al. 2015). For example, JSDMs have been used to understand the
joint influence of ongoing environmental change and fishing pressure on
groundfish species richness in Canada’s Pacific waters (Thompson et al.
2022a). The flexible hierarchical structure makes it possible to account
for correlation among species and provide more robust uncertainty
estimates, and allows relevant biological information (e.g., functional
trait and phylogenetic information) to be added to the model. While
species correlations from JSDMs do not necessarily represent species
interactions (Dormann et al. 2018, Pollock et al. 2014), they can be
used to understand when there is substantial statistical correlation
between species in their shared response to the environment (as
represented in the model) or residual correlation (not explained by the
model). Finally, there are models for different taxonomic and spatial
scales (e.g., for alpha, beta, and gamma diversity; summarized in
Pollock et al. 2020) that can be appropriate depending on the specific
objectives. For example, if the objective can be evaluated with species
diversity or biomass rather than information from individual species,
then macroecological models could provide sufficient results with fewer
input data.
Model choice can influence uncertainty and should therefore be guided by
the objectives of the analysis, the model fit, and model evaluation.
First, it is critical to start with a set of candidate models that
can support the objectives of the analysis. These candidate models may
include different variables or differing parameterization of these
variables. Second, it is necessary to evaluate candidate models for any
problems in the fit itself (e.g., failure to converge, non-sensible
response curves) as well as violations of their assumptions (e.g.,
residual analysis, Rufener et al. 2021; posterior predictive checks,
Gelman et al. 1996). Several approaches are available to compare among
candidate models meeting the above criteria. Information theoretic
approaches such as AIC (Akaike 1973) or predictive model selection tools
such as the Leave One Out Cross-Validation Information Criterion (LOOIC)
(Vehtari et al. 2017) can help evaluate model parsimony; a more
parsimonious model should in theory make better predictions (e.g., Aho
et al. 2014). However, these approaches are not typically designed to
evaluate projections and are generally limited to parametric models.
Finally, practitioners should compare the predictive accuracy of all
candidate models using hold-out data, such as in cross-validation.
Threshold-independent statistics (e.g., receiver operating
characteristic curve plots) can be used to assess overall model
performance and the models’ discriminatory ability across species and
locations, while
threshold-dependent statistics (e.g., sensitivity, specificity, true
skill statistic) can support accuracy assessment (Freeman and Moisen
2008, Liu et al. 2011).
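To make the distinction between threshold-dependent and
threshold-independent statistics concrete, the sketch below computes
sensitivity, specificity, and the true skill statistic (TSS) at a chosen
threshold, and the area under the receiver operating characteristic
curve (AUC), which needs no threshold. The hold-out observations,
predicted probabilities, and the 0.5 threshold are hypothetical.

```python
def threshold_stats(observed, predicted_prob, threshold=0.5):
    """Return (sensitivity, specificity, TSS) for presence/absence data."""
    tp = fp = tn = fn = 0
    for obs, prob in zip(observed, predicted_prob):
        pred = prob >= threshold
        if obs and pred:
            tp += 1
        elif obs and not pred:
            fn += 1
        elif pred:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)   # share of presences predicted present
    specificity = tn / (tn + fp)   # share of absences predicted absent
    return sensitivity, specificity, sensitivity + specificity - 1


def auc(observed, predicted_prob):
    """AUC as the share of presence-absence pairs ranked correctly (ties count 0.5)."""
    pos = [p for o, p in zip(observed, predicted_prob) if o]
    neg = [p for o, p in zip(observed, predicted_prob) if not o]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical hold-out data: 1 = presence, 0 = absence
observed = [1, 1, 1, 1, 0, 0, 0, 0]
predicted = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec, tss = threshold_stats(observed, predicted, threshold=0.5)
print(sens, spec, tss, auc(observed, predicted))
```

Because the threshold-dependent statistics change with the cut-off, the
threshold used should be reported alongside them.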
6. Identify climate model uncertainty
Global Climate Models (GCMs) are process-based models that include
coupled atmosphere, ocean and land models, representing the fundamental
components of the climate system (Flato 2011). When coupled to models of
biogeochemical cycling, they are known as Earth System Models (ESMs) and
are the primary scientific tools for estimating future climate
states. ESMs from major climate modeling centres participate in
coordinated experiments, including the Coupled Model Intercomparison
Project (CMIP), which has evolved through six discrete phases of
activity over the past 30 years. The future trajectory of human activity
and the associated greenhouse gas emissions are unknown, so future
socio-economically based emissions scenarios are developed to illustrate
the range of possible pathways. Climate models driven by these emissions
scenarios produce projections of the future climate state. Each phase of
CMIP contains new scenarios and updated models, and concludes with the
release of open data for downstream climate change studies (Eyring et
al. 2016).
Global climate projections have three sources of uncertainty: 1)
internal variability; 2) model uncertainty; and 3) scenario uncertainty
(Hawkins and Sutton 2009). Internal variability arises from fluctuations
in climate (such as El Niño), and within a single year this fluctuation
can be larger than the climate signal itself. The precise evolution of
internal variability in future decades cannot be predicted. However, the
range of possible outcomes resulting from internal variability can be
quantified by the spread across an ensemble of realizations from the
same model and scenario. Each realization starts from different initial
conditions, and while they will differ in their variability, they will
each experience the same overall climate change.
Climate model uncertainty results from an imperfect understanding of the
climate system, and from assumptions and compromises made in
representing this understanding in software-based numerical models. For
example, the global scale and process complexity of ESMs, together with
limited supercomputing capacity, constrain the feasible resolution to about 100
km. Processes that are not resolved at this scale (e.g., mesoscale ocean
eddies) are approximately represented by parameterizations that are
imperfect and often differ between models. Climate model uncertainty can
be quantified by the spread obtained when multiple independent climate
models are run using the same climate scenario. Summary reports such as
the IPCC Assessments normally report on the multi-model mean result
(IPCC 2021), which is generally more accurate than the projections from
any one model.
Regional SDMs often require information at finer spatial scales
than ESMs can resolve, so the ESM outputs must be downscaled to a finer
spatial resolution. Dynamical downscaling uses a nested modeling
approach in which regional models are forced at their boundaries by ESMs
to generate finer resolution projections (e.g., Holdsworth et al. 2021,
Peña et al. 2019). These models directly solve the equations of motion
at regional scales and are particularly effective in regions where
topographic effects on wind, temperature, and precipitation are
important. Regional model uncertainty can be quantified by the spread
obtained when an ensemble of independent regional models is run using
the same driving ESMs and climate scenario. Statistical downscaling
methods can be used to downscale ensembles of climate models. They rely on the
assumption that regional climates are driven by large-scale influences
and often require a target fine-resolution simulation to train on. Both
downscaling techniques inherit all the uncertainties from their parent
ESMs and also introduce their own sources of uncertainty (e.g., Giorgi
and Gutowski 2015). To minimize model uncertainty, bias correction
methods can be applied prior to using global or regionally downscaled
climate variables in SDMs, though depending on the research question,
this may introduce additional uncertainty to the analysis process (Maraun
2016, Xu et al. 2021).
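As one illustration of what a bias correction step can look like, the
sketch below applies an additive delta (change-factor) correction, in
which the change projected by a climate model is added to an observed
climatology. This is a simple and widely used option chosen here for
illustration; it is not necessarily the method used in the studies cited
above, and the temperature values are hypothetical. It assumes that the
model bias is the same in the historical and future periods.

```python
def delta_correct(observed_clim, model_hist, model_future):
    """Add the model's projected change to an observed climatology, cell by cell."""
    return [
        obs + (fut - hist)
        for obs, hist, fut in zip(observed_clim, model_hist, model_future)
    ]


# Hypothetical sea surface temperatures (deg C) at three grid cells
observed_clim = [8.0, 9.5, 11.0]   # observed historical climatology
model_hist = [9.0, 10.0, 10.5]     # climate model, same historical period
model_future = [10.5, 11.8, 12.0]  # climate model, future period

corrected = delta_correct(observed_clim, model_hist, model_future)
print([round(v, 2) for v in corrected])
```

An additive delta suits variables such as temperature; for variables
bounded at zero, a multiplicative change factor is often preferred.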
Finally, scenario uncertainty arises because the future of human
behavior, and the resulting emissions and land use changes, are unknown.
Scenario uncertainty is quantified by comparing different scenarios run
by the same model (or ensemble of models). CMIP6 created an ensemble of
projections for a discrete range of climate scenarios. Broadly, the
uncertainty is given by the range between the highest and lowest
emissions scenarios (SSP585 and SSP119 in CMIP6). However, it has been
argued that the extreme high and low scenarios are less plausible and
unnecessarily inflate uncertainty (Hausfather and Peters 2020).
Communities of practice are forming to help inform relevant scenario
selection by users (Stammer et al. 2021).
The relative magnitude of each source of uncertainty (internal, model,
and scenario) largely depends on the spatial and temporal scales and
variables of interest (Hawkins and Sutton 2009). At global averaging
scales, scenario uncertainty tends to dominate, and internal variability
is typically the least important, particularly in the distant future.
However, at regional scales and for nearer-term time horizons
(<20 years), model uncertainty and internal variability can be
significantly larger (Frölicher et al. 2016).
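The three sources of uncertainty can be compared with a simple variance
partition across an ensemble, sketched below. This is a simplified
illustration in the spirit of Hawkins and Sutton (2009), not a
reproduction of their method, and the scenario names, model names, and
warming values are hypothetical. Internal variability is taken as the
spread among realizations of one model and scenario, model uncertainty
as the spread among model means within a scenario, and scenario
uncertainty as the spread among the multi-model means of each scenario.

```python
from statistics import mean, pvariance


def partition_uncertainty(projections):
    """Return each source's share of total variance.

    projections[scenario][model] is a list of realizations, each a single
    projected value (e.g. end-of-century warming in deg C).
    """
    internal = mean(
        [pvariance(runs) for models in projections.values() for runs in models.values()]
    )
    model = mean(
        [pvariance([mean(runs) for runs in models.values()])
         for models in projections.values()]
    )
    scenario = pvariance(
        [mean([mean(runs) for runs in models.values()])
         for models in projections.values()]
    )
    total = internal + model + scenario
    return {
        "internal": internal / total,
        "model": model / total,
        "scenario": scenario / total,
    }


# Hypothetical ensemble: 2 scenarios x 2 models x 2 realizations
projections = {
    "low": {"ESM-A": [1.0, 1.2], "ESM-B": [1.4, 1.6]},
    "high": {"ESM-A": [3.0, 3.2], "ESM-B": [3.6, 3.8]},
}

shares = partition_uncertainty(projections)
print({k: round(v, 3) for k, v in shares.items()})
```

In this invented example scenario uncertainty dominates, as expected for
a distant time horizon; repeating the calculation for a near-term period
or a single region would typically shift weight toward the other two
sources.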
Propagation of climate projection uncertainties into downstream SDM
models presents a challenge. Ideally, SDM projections would be generated
from all possible regional models, which had downscaled all possible
ESMs, for all possible scenarios. While this approach is not practically
possible, it conceptually illustrates the full cascade of uncertainty,
which increases at each step of the process in moving from ESM climate
projections to end-use impact studies such as species distributions
(Falloon et al. 2014). A more feasible approach to estimating these
uncertainties is to generate several SDM projections from a
representative range of regional models, which themselves are driven by
a representative ensemble of ESMs and scenarios. Unfortunately, the
necessary data for these robust uncertainty estimates are often not
available. While there is some coordination under projects like the
Coordinated Regional Downscaling Experiment (CORDEX; Giorgi and Gutowski
2015), there is no equivalent to the CMIP ensemble, particularly for the
ocean. Hence, users are forced to construct these representative
downscaled ensembles themselves, and to be explicit about the
uncertainties that cannot be represented in their SDM projections.
7. Identify SDM uncertainty
Species distribution models can have at least three main sources of
uncertainty (sensu Hilborn 1987). The first is regular
environmental and biological variation (‘noise’) that influences a
species’ distribution but is well observed and can be accounted for in a
model; it contributes to parameter uncertainty and observation error.
The second source of uncertainty is the impact of extreme and
unpredictable events, and their effect on species distributions, which
can be dramatic (Anderson and Ward 2019). Unanticipated events (e.g.,
tsunamis, disease outbreaks, extreme heat waves) not captured in the
observations used to fit the SDM may only be partially accounted for in
the SDM projections. For example, it may be unknown how a species will
respond to extreme temperatures that are beyond observed values used to
build the projections and beyond the documented temperature range for
the species. Finally, there is the uncertainty stemming from ecological
patterns and processes that are only partially understood, or what
Hilborn (1987) calls uncertain states of nature. This can include
uncertainty related to climate model outputs (guideline #6), the
suitability of one environmental variable as a proxy for another, and
the influence of eco-evolutionary processes (e.g., species interactions,
dispersal limitation, local adaptation; guideline #8).
A variety of approaches are available to account for uncertainty across
possible states of nature. Multiple models can be used to evaluate the
influence of different combinations of covariates, or to characterize
the effect of a given covariate via linear or non-linear relationships.
For example, Brodie et al. (2020) applied three model types with
different covariate configurations (spatiotemporal only, environmental
only, and both spatiotemporal and environmental) to estimate responses
of fish species in the eastern Bering Sea. Predictions from multiple SDM
models or modeling assumptions can also be used to characterize the
range of such uncertainty (e.g., Nephin et al. 2020, Thuiller et al.
2019).
It is also critical to evaluate model accuracy and whether uncertainty
intervals encompass true values. Cross-validation provides a general
tool to characterize how well an SDM may be accounting for uncertainty.
Central to effective cross-validation is choosing an appropriate
blocking scheme to characterize the uncertainty of interest (e.g.,
Roberts et al. 2017). For example, spatial blocking can assess how well
an SDM can predict into areas that are omitted from the training data,
and temporal blocks can assess how well an SDM can forecast periods of
time that are omitted from the training data. However, no cross-validation
strategy will fully encompass the uncertainty introduced by predicting
under new climate change conditions.
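A minimal sketch of spatial blocking is given below. Observations are
assigned to square blocks by their coordinates, and each fold holds out
one whole block so that the model is always tested on an area it has not
seen. The coordinates, the 100 km block size, and the function names are
hypothetical; choosing a block size that reflects the scale of spatial
autocorrelation is left to the analyst.

```python
import math


def assign_spatial_blocks(coords, block_size):
    """Label each (x, y) point with the grid block of side block_size it falls in."""
    return [
        (math.floor(x / block_size), math.floor(y / block_size))
        for x, y in coords
    ]


def blocked_folds(block_ids):
    """Yield (train_indices, test_indices), holding out one whole block per fold."""
    for held_out in sorted(set(block_ids)):
        test = [i for i, b in enumerate(block_ids) if b == held_out]
        train = [i for i, b in enumerate(block_ids) if b != held_out]
        yield train, test


# Hypothetical survey locations (km east, km north)
coords = [(5, 5), (12, 8), (95, 10), (110, 15), (20, 130), (30, 140)]

blocks = assign_spatial_blocks(coords, block_size=100)
folds = list(blocked_folds(blocks))
for train, test in folds:
    print("train:", train, "test:", test)
```

Temporal blocking follows the same pattern, with the survey year (or a
group of years) used as the block label in place of the spatial block.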
In addition, to accurately project uncertainty from SDMs, the model
needs to be statistically valid, accounting for major sources of
residual correlation caused by sampling schemes or spatial correlation
from unmodeled covariates (Legendre and Fortin 1989). Whenever possible,
SDM model uncertainty should be included in projections through error
propagation methods (e.g., via hierarchical modeling or
simulation–extrapolation; Stoklosa et al. 2015). Random effects can
provide a unified framework with which to integrate over uncertainty
from latent variables and residual correlation (Anderson et al. 2022,
Shelton et al. 2014, Thorson and Minto 2014). However, the omission of
relevant climate variables may cause spatial or spatiotemporal random
effects to absorb climate-driven variation and thereby underestimate
projected impacts of climate change (guideline #4).
8. Identify eco-evolutionary uncertainty
SDM modeling assumes that a species’ environmental niche can be
estimated by correlating occurrences or abundances with environmental
variation across space. However, environmental conditions are only one
determinant of species distributions. Distributions are also influenced
by interactions with other species, spatial patterns of dispersal, and
stochasticity (i.e., random events; Thompson et al. 2020, Vellend 2016).
Furthermore, SDMs assume that all individuals of a species share
the same environmental response curves (Zurell et al. 2020), but this may
not be true if subpopulations are locally adapted to the conditions they
experience (Aitken et al. 2008) or if environmental responses differ
across life stages in an organism (Kingsolver et al. 2011). Together,
eco-evolutionary processes make the relationship between species
distributions and environmental conditions context-dependent (Urban et
al. 2016) which introduces three types of uncertainty when SDMs are used
to project responses to future conditions: 1) uncertainty in the model
parameters, 2) uncertainty in the assumption that all individuals within
a species will share the same environmental responses, and 3)
uncertainty in how well current species-environment relationships will
reflect future species-environment relationships.
While parameter uncertainty may be partially captured by the
fitted model (guideline #7), uncertainty regarding how eco-evolutionary
processes will alter species-environment relationships will not be. This
uncertainty stems from eco-evolutionary processes influencing whether or
not a species will shift its distribution at the same rate as the
climate changes (Urban et al. 2016). If species are dispersal limited or
if habitat connectivity is low, they may not be able to shift their
distributions fast enough to keep pace with the changing climate
(Schloss et al. 2012). Species will also only be able to establish in
new habitats if there is sufficient food, if obligate mutualists are
also present, and if predators, competitors, parasites, and diseases are
not too abundant or prevalent (Alexander et al. 2015, Brown and Vellend
2014, Thompson and Gonzalez 2017, Zarnetske et al. 2012). The northward
movement of the predatory whelk Mexacanthina lugubris into new
habitats is an example of range expansion that is mediated by a trophic
interaction (Wallingford and Sorte 2022). Alternatively, the loss of a
competitor or predator may allow a species to expand its distribution to
a wider range of environmental conditions than it historically occupied
(Urli et al. 2016). Additionally, species that adapt—either
evolutionarily or behaviorally—quickly to changing environmental
conditions will not need to shift their distributions as quickly, if at
all (Bell and Gonzalez 2009, Carlson et al. 2014, Thompson and Fronhofer
2019). These complex eco-evolutionary processes mean that species
distributions under future climates will inevitably differ from what
SDMs project based on current species-environment associations, and
projections should thus be communicated as hypotheses (Urban et al.
2016). Such deviations may be due to the emergence of extreme and
unpredictable events (Anderson et al. 2017) such as disease outbreaks,
species interactions, invasive species, or simply to the fact that
species ranges may not perfectly track changes in climate (Wiens 2016).
Eco-evolutionary uncertainty is
distinct from uncertainty associated with statistical model fitting
(guideline #7) and from climate model uncertainty (guideline #6). In
cases where evidence of local adaptation or phenotypic plasticity to
climate variation is available, this information can be incorporated
into SDMs (e.g., Benito Garzón et al. 2011, Homburg et al. 2014, Lowen
et al. 2019, Valladares et al. 2014); however, for most species, this
information is lacking. One signal of local adaptation is that SDM
parameter coefficients may vary across the species range. This
uncertainty can be assessed using spatial block cross-validation or
spatially varying coefficients. In addition, practitioners can account
for eco-evolutionary uncertainty in the interpretation and communication
of the results (guideline #9). Much of the uncertainty associated with
eco-evolutionary processes stems from whether species will successfully
establish in new locations, and whether they will be lost in areas where
conditions are projected to become unsuitable. Researchers can be
reasonably certain of areas where species are projected to persist in
future climates, but less certain of areas where species are projected
to shift, and this can be highlighted when communicating SDM results
(see Box 1). Where species are expected to shift, either as a range
retraction or an expansion, monitoring programs can help to understand
species’ range dynamics and provide data to refine model(s) over time.
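The distinction between areas of projected persistence and areas of
projected shift can be made explicit by comparing present and future
presence/absence maps cell by cell, as in the sketch below. The maps and
category labels are hypothetical, and the binary maps are assumed to
have already been derived from model output using some threshold.

```python
def classify_change(present, future):
    """Label each cell by comparing present and future presence (True/False)."""
    labels = []
    for now, later in zip(present, future):
        if now and later:
            labels.append("persistence")   # occupied now and in the future
        elif now:
            labels.append("retraction")    # occupied now, projected unsuitable
        elif later:
            labels.append("expansion")     # unoccupied now, projected suitable
        else:
            labels.append("absent")        # unsuitable in both periods
    return labels


# Hypothetical binary maps for five cells along a coastline
present = [True, True, True, False, False]
future = [False, True, True, True, False]

change = classify_change(present, future)
print(change)
```

Mapping these categories separately allows the more certain persistence
areas to be distinguished from the expansion and retraction areas, where
eco-evolutionary uncertainty is greatest and monitoring is most
informative.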