Temperature and species distribution data sources
We used National Oceanic and Atmospheric Administration (NOAA) long-term
surveys from three temperate marine continental shelf regions in the US:
the Northeast (annual spring survey 1968-2018), the West Coast
(triennial fall survey 1977-2004 and annual fall survey 2003-2018), and
the Eastern Bering Sea (annual summer survey 1989-2018; earlier years
omitted due to limited spatial extent (Lauth and Conner 2014)). These
surveys use trawl gear and a random stratified or fixed station sampling
design to target demersal and benthic fishes and invertebrates on the
continental shelf, up to several hundred meters deep. The West Coast and
Eastern Bering Sea raw datasets were downloaded using the R package
“FishData”, and the Northeast data were obtained from the 2019
OceanAdapt release (Stuart and Pinsky 2019), a data portal to access
NOAA trawl survey records (Lauth and Conner 2014, Politis et al. 2014,
Keller et al. 2017). The Northeast dataset, which was pre-processed by
OceanAdapt for quality control and taxonomic accuracy, contained records
for 74 species. We limited our West Coast analysis to the 54 species
that were recorded in both the triennial and the annual surveys. In the
Eastern Bering Sea dataset, we downloaded data on the 100 most
frequently observed taxa and proceeded with analysis for the 81 taxa
that were identified to species. We retrieved higher taxonomy for all
species using the R package “taxize” (Chamberlain and Szöcs 2013) and
grouped species as either fishes (belonging to classes Actinopterygii or
Elasmobranchii) or invertebrates (everything else). All data processing
and analyses were conducted in R version 3.6.0 (R Core Team 2018). All
code used is available at:
https://github.com/afredston/range-edge-niches.
Throughout our analysis, we compared distribution data for a given
species with temperature data from the preceding 12
months—specifically, the 12 months preceding the earliest possible
start month for each region’s survey for analysis (March in the
Northeast, May in the West Coast, and July in the Eastern Bering Sea).
For example, range edges derived from the spring 1999 Northeast survey
were compared to temperature records from March 1998 to February 1999.
We used two historical sea surface temperature (SST) datasets. The NOAA
NCEI Optimum Interpolation Sea Surface Temperature dataset (NOAA NCEI
2018) is available daily from 1982 onward at 0.25ºx0.25º resolution; we
downloaded this dataset for all regions. We also downloaded the Hadley
Centre Global Sea Ice and Sea Surface Temperature dataset, available
monthly at 1º resolution from 1870 (Rayner 2003) to fill in earlier
years for the Northeast and West Coast regions. To ensure comparability
between the two data sources, we performed mean bias correction and
converted the daily SST records from the higher-resolution OISST dataset
into monthly means for each grid cell (see Appendix 1); all temperature
metrics described henceforth are based on monthly mean SSTs. We tested
for change in mean, minimum, and maximum monthly regional SSTs over time
by calculating the annual region-wide mean of grid cell-specific mean,
minimum (coldest month), and maximum (warmest month) SSTs, and
performing a linear regression of those region-wide means over time for
each region with a significance threshold of α = 0.05.
Range edge positions were compared to warm and cold temperature
extremes, defined as the warmest and coldest months of the 12 months
preceding the survey (i.e., coldest month the previous winter and
warmest month the previous summer). To generate edge-specific estimates
of warm and cold extreme temperatures (see Range edge analysis ),
we constructed generalized additive models (GAMs) of maximum and minimum
monthly temperatures in each year along the axis of measurement for each
region (see Spatiotemporal reconstruction of species ranges )
using the “mgcv” package in R (Wood 2017). Each regional GAM predicted
warm or cold temperature extremes in each year, given a position along
the axis (one GAM was fit for all years in a given region with axis
position as a smoothed predictor estimated separately for each year, as
well as a separate year factor).