Data analyses
We profiled species diversity and relative abundance across the
landscape using a relative abundance index (RAI) calculated from the
metabarcoding results. The RAI of species i is the number of
occurrences of species i at a site (or across the entire landscape)
divided by the total number of pooled samples at that site (or the
number of pooled samples across the landscape). Occurrence of a species
in a sample was determined by the presence of sequence reads for that
taxon after quality control of the sequence reads.
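The RAI definition above can be illustrated with a minimal Python sketch. This is not the code used in the study; the function name, the species labels, and the example samples are hypothetical, and each pooled sample is assumed to be represented as the set of species detected in it after quality control.

```python
from collections import Counter


def relative_abundance_index(samples):
    """Return {species: RAI} for a list of pooled samples.

    Each sample is an iterable of the species detected in that pool.
    RAI = (number of samples in which the species occurs) / (number of samples).
    """
    if not samples:
        raise ValueError("at least one pooled sample is required")
    occurrences = Counter()
    for detected in samples:
        # Presence/absence only: a species counts once per sample.
        occurrences.update(set(detected))
    n_samples = len(samples)
    return {species: count / n_samples for species, count in occurrences.items()}


# Hypothetical pooled samples from a single site (illustrative labels only).
site_samples = [
    {"species_a", "species_b"},
    {"species_a"},
    {"species_b", "species_c"},
    {"species_a"},
]
rai = relative_abundance_index(site_samples)
```

Passing the samples from one site gives the site-level RAI; passing all samples pooled across sites gives the landscape-level value.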
We used NMDS ordinations to examine the potential separation of the
vector and the non-vector communities and the host and the non-host
communities. The function envfit was used to determine if any of
the environmental metrics were significantly associated with the
community composition of sandfly or vertebrate species. The amounts of
forest and pasture were logit-transformed and all environmental variables
were rescaled prior to analysis. We also included Julian day as a predictor
for the sandfly species ordination due to the association of
leishmaniasis (and other zoonotic diseases) incidence with the wet
season. To better tease apart patterns between landscape structure and
the host and the vectors, we used generalized linear models to test the
hypothesis that the measures of forest cover, pasture cover, and
distance to the major urban center (and Julian day for the sandfly
models) predict the likelihood and density of disease-competent
taxonomic groups. We used a Poisson regression model to ask if the
counts of sandfly pools were influenced by phenology (Julian day),
percentage forest, percentage pasture, and distance to the urban center.
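Written out, a Poisson regression of this kind takes the following general form. The log link is assumed here because it is the canonical choice (the text does not state the link function), and the index j for a sampling unit and the coefficient symbols are notation introduced only for illustration.

```latex
\begin{aligned}
Y_j &\sim \mathrm{Poisson}(\lambda_j), \\
\log \lambda_j &= \beta_0 + \beta_1\,\mathrm{Julian}_j + \beta_2\,\mathrm{Forest}_j
                + \beta_3\,\mathrm{Pasture}_j + \beta_4\,\mathrm{Distance}_j
\end{aligned}
```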
We then built binomial models with a random effect for site to assess
whether Julian day, percentage forest, percentage pasture, and distance
to the urban center affected the probability that a sandfly pool contained a
medically important vector. Lastly, we built binomial models with a
random effect for site to assess whether the environmental variables
affected the probability that any sandfly pool contained a host or
non-host species.
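The predictor preparation mentioned in this paragraph (logit transformation of forest and pasture cover, followed by rescaling) can be sketched in Python as follows. Several details are assumptions rather than the authors' method: cover is taken to be a proportion between 0 and 1, "rescaled" is interpreted as centering to mean 0 and scaling to unit standard deviation, proportions of exactly 0 or 1 are clamped by a small hypothetical epsilon, and the example values are invented.

```python
import math
from statistics import mean, stdev


def logit(p, eps=1e-6):
    """Logit of a proportion; eps clamps 0 and 1 (an assumed convention)."""
    p = min(max(p, eps), 1 - eps)
    return math.log(p / (1 - p))


def rescale(values):
    """Center to mean 0 and scale to unit sample standard deviation."""
    m = mean(values)
    s = stdev(values)
    if s == 0:
        raise ValueError("cannot rescale a constant variable")
    return [(v - m) / s for v in values]


# Hypothetical forest cover proportions for four sites.
forest_prop = [0.10, 0.35, 0.62, 0.80]
forest_scaled = rescale([logit(p) for p in forest_prop])
```

Putting all predictors on a common scale in this way makes the fitted coefficients comparable in magnitude across variables measured in different units.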
We constructed bipartite networks to examine how vector-host
interactions are restructured between the most forest-intact sites
(>60% forest cover) and the most deforested sites
(<30% forest cover). We first subsetted samples to only
include those in which vector species accounted for the majority of DNA
sequences in that sample (samples where more than 50% of the reads were
vector species) because our aim was to better understand changing
vector-host relationships due to deforestation. This subsetted dataset
contained six sites, with five sites containing 14 samples categorized
as “intact” and three sites containing 24 samples categorized as
“deforested”. The bipartite networks were constructed using the
bipartite package in R (Dormann et al., 2011) and display a
weight that is equal to the number of interactions at the pooled sample
level between a sandfly and any vertebrate species identified in the
same sample.
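The sample filter and edge weights described above can be illustrated with a short Python sketch. It does not use the bipartite R package; it only shows how co-occurrence weights of this kind could be tallied. The data layout, function name, and species labels are hypothetical. Two further assumptions are made: the 50% threshold is computed over sandfly reads only, and every sandfly species detected in a retained sample is paired with every vertebrate species in that sample.

```python
from collections import Counter


def edge_weights(samples, vector_species):
    """Count sandfly-vertebrate co-occurrences across vector-dominated samples.

    Each sample is a dict with:
      "sandfly_reads": {sandfly species: read count}
      "vertebrates":   set of vertebrate species detected in the same pool
    Returns a Counter keyed by (sandfly species, vertebrate species).
    """
    weights = Counter()
    for sample in samples:
        reads = sample["sandfly_reads"]
        total = sum(reads.values())
        vector_reads = sum(n for sp, n in reads.items() if sp in vector_species)
        # Keep only samples where vectors make up more than 50% of reads.
        if total == 0 or vector_reads / total <= 0.5:
            continue
        for sandfly, n_reads in reads.items():
            if n_reads == 0:
                continue
            for vertebrate in sample["vertebrates"]:
                weights[(sandfly, vertebrate)] += 1
    return weights


# Hypothetical pooled samples (illustrative labels and read counts only).
samples = [
    {"sandfly_reads": {"vector_1": 80, "nonvector_1": 20},
     "vertebrates": {"host_a", "host_b"}},
    {"sandfly_reads": {"vector_1": 60},
     "vertebrates": {"host_a"}},
    {"sandfly_reads": {"vector_1": 10, "nonvector_1": 90},
     "vertebrates": {"host_b"}},  # excluded: vectors are only 10% of reads
]
weights = edge_weights(samples, vector_species={"vector_1"})
```

Each key of the resulting counter corresponds to one edge of the network, and its value is the number of retained pooled samples in which that sandfly and that vertebrate were detected together.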