2. METHODS
2.1 Study Area
The Chagos Archipelago is located in the central IO at 6° S and 72° E at
the southern limit of the Chagos-Laccadive ridge, and is over 1,500 km
from the nearest continental land mass (Carr, 2012). Fifty-five islands
are clustered within the atolls of Diego Garcia, Peros Banhos, Salomon,
Egmont, and on the Great Chagos Bank (Figure 1a) and together constitute
approximately 60 km² of land area. The territory
encompasses approximately 60,000 km² of shallow photic
reefs, and 580,000 km² of primarily oceanic habitat,
with a maximum depth over 6,000 meters (Carr, 2011; Dumbraveanu &
Sheppard, 1999). The climate is tropical, characterised by oceanic
conditions and the seasonally reversing monsoon (Sheppard, 1999). Situated
in the inter-tropical convergence zone (ITCZ), the archipelago has
moderate winds generally from the north-west (October to April) and the
south-east (May to September). Sea surface temperature has an
approximately bimodal distribution with maxima in December–January and
March–April with a yearly mean of 28°C (Pfeiffer, Dullo, Zinke &
Garbe-Schönberg, 2009).
2.2 Seabird observations
In order to identify the influence of oceanographic conditions and
island rat infestation on seabird distribution, we conducted a multiyear
survey of seabirds at sea across the archipelago. The survey was
conducted from 2012 to 2017, between November and April, to overlap with
the moderate phase of the monsoon. This period generally coincides with
peak breeding activity in the Chagos Archipelago (Carr et al., 2019;
Carr, 2011; Carr, 2015). During the months of sampling, the BIOT marine
reserve and the IO experienced two seasons of modestly positive IO
Dipole (2012 and 2013), followed by three neutral IO Dipole seasons
(2014-2016) and one strongly positive event (2017; NOAA
Earth System Research Laboratory [NOAA ESRL], 2017). Seabird count
samples (n = 425) were collected from a vessel during six expeditions.
Three different sample types were generated. Transect counts (n = 329)
were generated during vessel transit, by adapting the method of Tasker,
Jones, Dixon & Blaker (1984). Each transect count had a duration of 30
minutes, during which the vessel typically steamed at 12 knots and
travelled c. 11 km. Aggregation counts were generated
opportunistically during any seabird feeding aggregation (n = 87). Each
aggregation was observed until all birds had been counted (median
duration 60 min; Letessier et al., 2016). Finally, point
counts (n = 9) were generated when the vessel was stationary (nominal
count duration 30 min). All samples were generated within a 180° arc
forward of the ship, out to approximately 300 meters (Table 1, Figure 1,
Appendix S1). All seabird observations were conducted by Pete Carr, a
co-author of this manuscript and an expert on seabirds within the
Archipelago (e.g. Carr, 2011; Carr, 2012; Carr, 2015). Using a single
observer throughout removes inter-observer variability as a source of
bias. Observations were
predominantly made in proximity to the islands and the shallow reefs
(Figure 1b-1g).
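As a quick check of the transect geometry described above, the distance covered in one transect count follows from the vessel speed and the count duration. The sketch below assumes only the standard conversion of 1 knot = 1.852 km/h; the function name is illustrative and not part of the survey protocol.

```python
KM_PER_NAUTICAL_MILE = 1.852  # exact, by international definition


def transect_length_km(speed_knots: float, duration_min: float) -> float:
    """Distance travelled during one transect count, in kilometres."""
    return speed_knots * KM_PER_NAUTICAL_MILE * duration_min / 60.0


length = transect_length_km(speed_knots=12, duration_min=30)
print(f"{length:.1f} km")  # 11.1 km
```

At the typical steaming speed of 12 knots, a 30-minute count therefore covers roughly 11 km, in line with the figure given above.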
2.3 Oceanic habitat modelling
2.3.1 Response variables
In order to model the oceanic distribution of seabirds, we selected the
most frequent and abundant seabird families in the BIOT marine protected
area as our response variables. This comparatively high-level taxonomic
classification allowed us to generate more statistical power by
increasing our counts. This grouping approach requires the assumption
that taxonomically similar species have similar ecological requirements,
in relation to habitat-use or energetic needs (Mannocci, Catalogna, et
al., 2014; Mannocci, Laran, et al., 2014). The oceanic seabird
distributions were modelled based on geomorphic and oceanographic
variables using Generalised Additive Models (GAM; Wood, 2006),
accounting for the different sampling types (Appendix S2). The GAMs were
fitted using individual family count per sample (a proxy for abundance)
as the response variables, against all possible combinations of four of
the six variables (depth, slope, year, sea level anomalies [SLA], sea
surface temperature [SST] and chlorophyll-a concentration [CHL]). We
avoided including highly correlated variables (Spearman coefficient,
r > 0.60 or r < -0.60) in the same model, following Mannocci, Laran, et
al. (2014), and retained the models with the lowest generalised
cross-validation (GCV) score. We used the explained deviance to evaluate
the explanatory power of the models. GAMs were fitted using the mgcv
package in R (version 3.3.3; R Development Core Team, 2017), which
determines the degrees of freedom for each smoother internally when
fitting the model (Wood, 2006). Splines
were limited to three knots in order to maintain ecological sense and to
avoid overfitting (Mannocci, Laran et al., 2014).
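The candidate-model screening described above can be sketched as follows. This is a minimal illustration in Python, not the R code used in the study: it enumerates every four-variable subset of the six variables and drops any subset containing a pair whose Spearman coefficient exceeds 0.60 in absolute value. The data values and function names are invented for the example.

```python
from itertools import combinations


def ranks(values):
    """Average ranks (1-based); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman coefficient, computed as the Pearson correlation of ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def candidate_models(data, k=4, threshold=0.60):
    """All k-variable subsets in which no pair has |r| above the threshold."""
    return [combo for combo in combinations(list(data), k)
            if all(abs(spearman(data[a], data[b])) <= threshold
                   for a, b in combinations(combo, 2))]


# Invented toy values; depth and slope are perfectly (negatively) correlated.
data = {
    "depth": [100, 800, 1500, 2500, 3200, 4000],
    "slope": [9, 7, 6, 4, 2, 1],
    "year": [2014, 2012, 2017, 2013, 2016, 2015],
    "SLA": [0.02, -0.05, 0.11, 0.07, -0.01, 0.04],
    "SST": [28.4, 29.1, 27.9, 28.8, 28.2, 29.3],
    "CHL": [0.12, 0.09, 0.15, 0.08, 0.11, 0.10],
}
print(len(list(combinations(data, 4))))  # 15 subsets before screening
print(candidate_models(data))
```

Each retained subset would then be fitted as a GAM and ranked by its GCV score; that fitting step is outside the scope of this sketch.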
2.3.2 GAM Predictions
Spatial predictions in unsampled areas were limited to the convex hull
defined by the BIOT marine reserve and restricted by the range values of
the variables used to build each model. This ensured that predictions
were only made in areas with similar environmental conditions. Using
this approach, we avoided extrapolating beyond the range of the model,
whilst generating meaningful predictions beyond our sampled area (Yates
et al., 2018). Whenever year was retained in a model, predictions were
rendered for the last year of sampling (2017). Uncertainty for each model
was derived from the Bayesian covariance matrix of model coefficients
(Wood, 2006). We rendered predictions and modelled uncertainty on a
0.4 × 0.4 decimal degree grid. This resolution is considered a
reasonable trade-off in order to capture distribution for species with
uncertain range sizes (Seo, Thorne, Hannah, & Thuiller, 2008).
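The restriction of predictions to sampled environmental conditions can be illustrated with a short sketch. It assumes a simple per-variable range check on a regular 0.4 degree grid; the convex hull test against the reserve boundary is omitted, and the bounding box, covariate values and function names are hypothetical.

```python
def covariate_ranges(samples):
    """Minimum and maximum of each covariate over the sampled locations."""
    names = samples[0].keys()
    return {n: (min(s[n] for s in samples), max(s[n] for s in samples))
            for n in names}


def within_envelope(cell, ranges):
    """True if every covariate of a grid cell lies inside the sampled range."""
    return all(lo <= cell[name] <= hi for name, (lo, hi) in ranges.items())


def grid_centres(lon_min, lon_max, lat_min, lat_max, step=0.4):
    """Cell centres of a regular step x step degree grid over a bounding box."""
    n_lon = int(round((lon_max - lon_min) / step))
    n_lat = int(round((lat_max - lat_min) / step))
    return [(lon_min + (i + 0.5) * step, lat_min + (j + 0.5) * step)
            for j in range(n_lat) for i in range(n_lon)]


# Invented example values, for illustration only.
samples = [{"depth": 50.0, "SST": 28.0}, {"depth": 3000.0, "SST": 29.5}]
ranges = covariate_ranges(samples)
cells = [{"depth": 1000.0, "SST": 28.7}, {"depth": 5000.0, "SST": 28.7}]
print([within_envelope(c, ranges) for c in cells])  # [True, False]
print(len(grid_centres(70.0, 74.0, -8.0, -4.0)))  # 100
```

Cells failing the range check would be left blank on the prediction maps rather than filled with extrapolated values.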
2.4 Modelling the effect of rat infestation
We hypothesise that seabird distribution is sensitive to rat-infestation
on islands and that this sensitivity restricts seabird distributions in
the water adjacent to infested islands. We modelled the effect of rat
infestation on seabird distribution at sea using Boosted Regression
Trees (BRT). BRT are considered an advanced form of regression
(Friedman, Hastie & Tibshirani, 2000) that use boosting to combine and
adapt large numbers of relatively simple tree models, enabling
performance optimization (Elith, Leathwick & Hastie, 2008). A pair of
BRT models was fitted for each of the Laridae, Sulidae and
Procellariidae families (Appendix S3). Both models were fitted using the
set of variables selected by the GAMs. The first model of the pair
additionally included the variable ‘distance to the closest rat-free
island (km)’. This model was considered to represent bird distribution
at its theorised maximum abundance, in the absence of any rat effect.
The second model additionally included the variable ‘distance to the
closest rat-infested island (km)’. For each BRT we also included a
nearest island area variable (in m²), to account for the potential
effect of landmass availability. To reduce variability, we used transect
count samples only. BRTs were fitted following the methodology and
adapting the code in Elith et al. (2008), using the gbm package in R
(version 3.3.3; R Development Core Team, 2017). The BRT models were
fitted using a trade-off between learning rate and number of trees
(Elith et al., 2008; D’agata et al., 2014).
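To make the trade-off between learning rate and number of trees concrete, the following is a minimal sketch of boosting with single-split regression trees (stumps) under squared-error loss. It is not the gbm implementation or the authors' code, which may use a different loss and deeper trees; the data are invented and all function names are hypothetical.

```python
def fit_stump(x, residuals):
    """Best single-split regression stump on one predictor (least squares)."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean, right_mean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]


def boost(x, y, n_trees, learning_rate):
    """Fit stumps sequentially to the residuals; return a prediction function."""
    baseline = sum(y) / len(y)
    fitted = [baseline] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - fi for yi, fi in zip(y, fitted)]
        threshold, left_mean, right_mean = fit_stump(x, residuals)
        stumps.append((threshold, left_mean, right_mean))
        fitted = [f + learning_rate * (left_mean if xi <= threshold else right_mean)
                  for xi, f in zip(x, fitted)]

    def predict(value):
        total = baseline
        for threshold, left_mean, right_mean in stumps:
            total += learning_rate * (left_mean if value <= threshold else right_mean)
        return total

    return predict


# Invented toy data: counts declining with distance to the nearest island.
distance_km = [1, 5, 10, 20, 40, 80]
counts = [30, 24, 15, 8, 3, 1]


def training_mse(model):
    """Mean squared error of a fitted model on the toy data."""
    return sum((model(d) - c) ** 2 for d, c in zip(distance_km, counts)) / len(counts)


fast = boost(distance_km, counts, n_trees=20, learning_rate=0.5)
slow = boost(distance_km, counts, n_trees=20, learning_rate=0.05)
print(training_mse(fast), training_mse(slow))
```

Each tree contributes only a fraction (the learning rate) of its fit, so a smaller learning rate generally needs more trees to reach a comparable training error. In practice the balance is chosen by cross-validation rather than by training error.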
We identified thresholds (break-points, BP) up to which seabird
distribution is influenced by the distance to rat-free or rat-infested
islands, using Davies’ test. To test for significant differences between
families, and between rat-free and rat-infested islands, break-points
were determined with their 95% confidence intervals (CI). To determine
the net gain in seabird abundance under a scenario of an
archipelago-wide rat eradication programme, we subtracted the
predictions of the rat-infested models from the predictions of the
rat-free models. Because we assume that no new islands will become
infested, the resulting net gains and net losses in seabird
distribution were mapped only where the nearest island was rat-infested.
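The net-gain calculation can be written out as a short sketch. It assumes one prediction per grid cell from each model of the pair, plus a flag indicating whether the nearest island is rat-infested; the values and the function name are invented for illustration.

```python
def net_gain(pred_rat_free, pred_rat_infested, nearest_is_infested):
    """Rat-free minus rat-infested prediction per cell; None where not mapped."""
    return [free - infested if flag else None
            for free, infested, flag
            in zip(pred_rat_free, pred_rat_infested, nearest_is_infested)]


# Invented predictions for three grid cells.
pred_rat_free = [12.0, 8.0, 5.0]
pred_rat_infested = [7.0, 9.5, 5.0]
nearest_is_infested = [True, True, False]

print(net_gain(pred_rat_free, pred_rat_infested, nearest_is_infested))
# [5.0, -1.5, None]
```

Positive values indicate a predicted net gain after eradication, negative values a net loss, and cells whose nearest island is already rat-free are left unmapped.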