Data analyses
We profiled species diversity and relative abundance across the landscape using a relative abundance index (RAI) calculated from the metabarcoding results. The RAI of species i is the sum of occurrences of species i at a site (or over the entire landscape) divided by the total number of pooled samples at that site (or across the landscape). A species was scored as occurring in a sample if sequence reads assigned to that taxon remained after quality control.
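The definition above can be restated compactly as follows. The symbols o_{ik}, N_s, and the site subscript s are notation introduced here for illustration only.

```latex
\mathrm{RAI}_{i,s} = \frac{1}{N_s} \sum_{k=1}^{N_s} o_{ik},
\qquad
o_{ik} =
\begin{cases}
1 & \text{if reads of species } i \text{ remain in pooled sample } k \text{ after quality control},\\
0 & \text{otherwise},
\end{cases}
```

Here N_s is the number of pooled samples at site s; for the landscape-level index, the sum and N_s run over all pooled samples. Under this definition the RAI lies between 0 and 1.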
We used nonmetric multidimensional scaling (NMDS) ordinations to examine the potential separation of the vector and non-vector communities and of the host and non-host communities. The function envfit was used to determine whether any of the environmental metrics were significantly associated with the community composition of sandfly or vertebrate species. The amounts of forest and pasture were logit-transformed, and all environmental variables were rescaled prior to analysis. We also included Julian day as a predictor for the sandfly species ordination because the incidence of leishmaniasis (and other zoonotic diseases) is associated with the wet season.
To better tease apart the relationships between landscape structure and hosts and vectors, we used generalized linear models to test the hypothesis that forest cover, pasture cover, and distance to the major urban center (and Julian day for the sandfly models) predict the likelihood and density of disease-competent taxonomic groups. We used a Poisson regression model to ask whether the counts of sandfly pools were influenced by phenology (Julian day), percentage forest, percentage pasture, and distance to the urban center. We then built binomial models with a random effect for site to assess whether Julian day, percentage forest, percentage pasture, and distance to the urban center affected the probability that a sandfly pool contained a medically important vector. Lastly, we built binomial models with a random effect for site to assess whether the environmental variables affected the probability that any sandfly pool contained a host or non-host species.
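The regression models described above can be sketched in equation form. This is an illustrative specification under stated assumptions, not the fitted models: the canonical log and logit links, the normally distributed site intercept, the coefficient symbols, and the indexing of observations are all assumptions, and the unit over which sandfly pools were counted is left generic (index j).

```latex
\begin{align}
% Poisson model for the count of sandfly pools y_j (log link assumed)
y_j &\sim \mathrm{Poisson}(\lambda_j), \\
\log \lambda_j &= \beta_0 + \beta_1\,\mathrm{Julian}_j + \beta_2\,\mathrm{Forest}_j
                + \beta_3\,\mathrm{Pasture}_j + \beta_4\,\mathrm{Urban}_j, \\
% Binomial model for pool k at site s, with a random intercept for site (logit link assumed)
z_{ks} &\sim \mathrm{Bernoulli}(p_{ks}), \\
\operatorname{logit}(p_{ks}) &= \gamma_0 + \gamma_1\,\mathrm{Julian}_{ks} + \gamma_2\,\mathrm{Forest}_s
                + \gamma_3\,\mathrm{Pasture}_s + \gamma_4\,\mathrm{Urban}_s + u_s, \\
u_s &\sim \mathcal{N}(0, \sigma_u^2).
\end{align}
```

In this sketch z_{ks} equals 1 when pool k at site s contains a medically important vector. The host and non-host models would take the same form, with the response redefined accordingly and the environmental variables as predictors.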
We constructed bipartite networks to examine how vector-host interactions are restructured between the most forest-intact sites (>60% forest cover) and the most deforested sites (<30% forest cover). We first subset the samples to include only those in which vector species accounted for the majority of DNA sequences (samples where more than 50% of the reads were vector species), because our aim was to better understand how vector-host relationships change with deforestation. This subsetted dataset contained six sites, with five sites containing 14 samples categorized as “intact” and three sites containing 24 samples categorized as “deforested”. The bipartite networks were constructed using the bipartite package in R (Dormann et al., 2011) and display edge weights equal to the number of interactions, counted at the pooled-sample level, between a sandfly species and any vertebrate species identified in the same sample.
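The edge-weight rule above amounts to counting co-occurrences. The following minimal Python sketch shows that counting step only; it is not the analysis code, which used the bipartite package in R. The function name, the input layout, and the taxon labels are hypothetical placeholders, and the sketch assumes each pooled sample is recorded as a list of sandfly taxa and a list of vertebrate taxa.

```python
from collections import Counter
from itertools import product


def edge_weights(samples):
    """Count, for each (sandfly, vertebrate) pair, the pooled samples containing both."""
    weights = Counter()
    for sample in samples:
        # set() guards against a taxon being listed twice within one sample,
        # so each pooled sample contributes at most 1 to any pair.
        pairs = product(set(sample["sandflies"]), set(sample["vertebrates"]))
        for pair in pairs:
            weights[pair] += 1
    return weights


# Hypothetical pooled samples with placeholder taxon labels (not real data).
samples = [
    {"sandflies": ["sandfly_A"], "vertebrates": ["vertebrate_X", "vertebrate_Y"]},
    {"sandflies": ["sandfly_A", "sandfly_B"], "vertebrates": ["vertebrate_X"]},
    {"sandflies": ["sandfly_B"], "vertebrates": []},
]
weights = edge_weights(samples)
```

A sample with no vertebrate detections contributes no edges, so the third placeholder sample leaves the weights unchanged. The resulting pair counts can be arranged as a sandfly-by-vertebrate matrix, one per forest-cover category, for network plotting.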