Species delimitation and introgression analyses
We conducted species delimitation testing using BFD* (Leaché et al. 2014) implemented in the SNAPP plug-in (Bryant et al. 2012) for BEAST 2 (Bouckaert et al. 2014). SNAPP uses the multispecies coalescent (MSC) to estimate trees, effective population sizes, and divergence times from SNPs by inferring probabilities of allele frequency change, and then outputs a posterior distribution that represents different estimations of the species tree (Bryant et al. 2012). BFD* outputs marginal likelihood estimations (MLE) for each species delimitation model, which are used to calculate Bayes Factors (Grummer et al. 2014) and determine the best supported model (Leaché et al. 2014). Because this program is computationally demanding, we reduced the total number of individuals to 27 to speed up the analysis. Using the K = 6 Structure analysis as a guide (results described below), we included between three to six specimens from major genetic clusters as follows: six each from S. zerene(sampled from Alberta, Idaho, Nevada, and California) and S. atlantis (sampled from Alberta, Ontario, and Colorado), five from the northern cluster of S. hesperis (sampled from Alberta, British Columbia, Montana, Colorado, and South Dakota), and, for the southern clusters of S. hesperis , we sampled four specimens from the New Mexico and Arizona population, and three specimens each from the southern Utah population and the central population (sampled from Idaho, Colorado, and Utah). We chose individuals for the BFD* analysis that had little or no genomic admixture in the Structure and TESS results to ensure that these analyses weren’t biased by contemporary hybridization.
We tested five species delimitation models of S. atlantis, S. hesperis, and S. zerene, informed by existing species delimitations and alternate assignments recovered in the clustering and phylogenetic analyses: (i) the “a priori ” model, following the current species delimitation for S. zerene, S. hesperis, andS. atlantis ; (ii) the “2 species” model, following the K= 2 Structure results; (iii) the “4 species” model, which splitsS. hesperis into northern and southern species; (iv) the “3 species” model, which lumps northern S. hesperis and S. atlantis ; and (v) the “6 species” model, following the K = 6 Structure and TESS results.
Following BFD* recommendations, we set the mutation parameters uand v to 1, and allowed the coalescence rate to be sampled via MCMC to reflect probable differences in population size between lineages. We also included non-polymorphic sites in the analysis because our dataset contained some missing data. BFD* implements a birth-only Yule tree prior, which we set to have a gamma distribution with a single parameter, λ, governing speciation rate (Leaché et al. 2014). We calculated λ from the maximum likelihood consensus SNP tree output from IQ-TREE using the package phytools 0.6-99 (Revell 2012) implemented in R 3.6.1 (R Core Team 2017), and used it to determine the β scale parameter with an α shape parameter of 2. This gave us a gamma distribution of λ=65.96, α=2, and β=32.9. To avoid potential sampling bias by the program due to the narrow parameter distribution for λ and β indicated by the data, we further relaxed our λ to 200 and our β to 100. Following program recommendations, we set our rate priors to also follow a gamma distribution, with λ=10, α=1, and β=250 (Leaché and Bouckaert 2018), and ran each scenario with 1 million MCMC chains, 200,000 burn-in generations, and 24 path sampling steps. Convergence was assessed using Tracer v. 1.6.0 (Rambaut et al. 2018), and TreeAnnotator 2.4.7 (Drummond & Rambaut 2007) was used to generate the maximum clade credibility tree. We additionally used DensiTree v. 2.0.1 (Bouckaert 2010) to visualize topological discordance in the posterior distribution of trees recovered during BFD* model testing.
We used TreeMix (Pickrell & Pritchard 2012) to assess putative introgression between S. zerene, S. hesperis, and S. atlantis . TreeMix uses a Gaussian model to estimate drift between populations, and identifies population pairs that have the highest residuals, which are interpreted by the program as putatively experiencing gene flow. TreeMix subsequently plots migration edges and infers directionality of admixture between these populations in order to improve the likelihood score of the model (Pickrell & Pritchard 2012; Kozak et al. 2018). We built an unrooted maximum likelihood phylogeny with 100 bootstrap replicates using blocks of 50 SNPs, and then sequentially added 0-5 migration edges to the tree. For each model, we re-ran TreeMix three times to ensure consistency in our results, and used jackknifing to estimate the weight and significance of each migration edge. We additionally calculated f 3 statistics for all possible combinations of populations to substantiate plotted migration edges using the threepop command implemented in TreeMix.f 3 tests estimate admixture between triplets of specified parental and mixed populations; a significantly negative f 3 statistic supports the hypothesis of an introgression event between two parental populations and a putatively admixed population (Reich et al. 2009; Pickrell & Pritchard 2012).
Ultimately, support for migration events between populations of S. hesperis, S. atlantis, and S. zerene was determined by considering multiple lines of evidence, including an assessment of the residuals for each model, the statistical significance of putative migration events, and concordance between the migration edges plotted on the maximum likelihood tree and admixture events indicated by thef 3 statistics.
Ecological niche divergence estimation of S. atlantis andS. hesperis
We created a series of ecological niche models (ENMs) to infer whether three major genomic lineages identified in this study, S. atlantis , northern S. hesperis , and southern S. hesperis , satisfy criteria for classification as distinct ecological species (sensu Nosil 2012) . ENMs were fit using MaxEnt software (Phillips et al . 2006), which uses machine-learning maximum entropy modelling of presence-only data to quantify species’ ecological niches as well as predict habitat associations and potential geographic distributions across heterogeneous landscapes (Elith et al. 2006, 2011). For each of the three lineages, inputs for ENMs included the georeferenced localities of sequenced individuals and a set of geographic information systems (GIS) predictor variables, classified as either geographic or environmental; derivation and sources of the predictor variables used in these analyses are provided in Supplemental File 1.
Before fitting ENMs, we generated buffered (100 km) minimum convex polygons around the localities of sequenced individuals assigned to each of the three lineages. GIS data layers were clipped to these polygons and 10,000 background points were generated within polygons to sample available habitat. For each lineage, we removed duplicate locality coordinates and generated five different locality lists, each withholding a different 20% of the data to allow for K -fold analysis. For each of these five lists, the remaining 80% of localities were used to fit an ENM (via the R package dismo, Hijmans et al.2011) and the withheld 20% were used to evaluate its predictive power. Model evaluation was completed using receiver operating characteristic (ROC) analysis and area-under-the-curve (AUC) scores (Phillips et al. 2006). AUC scores are bound between 0 and 1, with higher values indicating greater predictive power. Background points (n=10,000) used in model evaluation were confined to each lineage’s buffered minimum convex polygon, meaning only habitat within polygons was defined as available in AUC estimation to avoid inflated estimates of the models’ predictive power. Following model evaluation, fitted ENMs for each lineage were used to predict habitat suitability across the entire study area. Each 1-km grid cell received a habitat suitability score ranging from 0 – 1, with higher values indicating higher suitability.
Niche divergence among genomic lineages was quantified in a pairwise fashion. For each pair of lineages, we assessed power with which one lineage’s set of ENMs predicted the localities of the other lineage within its respective buffered minimum convex polygon. To accomplish this, ENMs for one lineage were used as fitted models and the other lineage’s localities were used as validation data for estimation of “between-lineage” AUC scores. Background points (n =10,000) used in these model evaluations were confined to the buffered minimum convex polygon of the validation data. Within- and between-lineage AUC scores for each pair of lineages were then used as dependent variables in generalized linear mixed effects models, fit using a beta distribution and a logit link using the R package glmmTMB (Brookset al. 2017). Within linear mixed effects models, a “within-vs. between-lineage” binary predictor variable reflected whether each AUC score corresponded to a within-lineage (0) or between-lineage (1) model evaluation. For pairs of lineages, these models thereby quantified whether there was a significant reduction in the power of one lineage’s ENMs when predicting the other lineage’s localities. To control for nonindependence resulting from partially overlapping sets of localities used to fit each lineage’s five ENMs, the ID of the lineage for which each ENM was fitted was included as a random effect. Within each linear mixed effects model, a significant negative effect of the “within- vs. between-lineage” binary variable would indicate that predicted habitat suitability of the two lineages involved in the pairwise comparison are significantly different, suggesting significant niche divergence.
It is possible that spurious inferences of niche divergence may arise from differences in available habitat used to parameterize ENMs. To investigate this possibility, we built a series of null ENMs for each lineage using randomly generated localities confined to the buffered minimum convex polygons of the actual localities of sequenced individuals (1 point/10,000 km2 for each lineage).K -fold analysis was completed as described above using five different random locality lists, each withholding a different 20% of the random localities. We then repeated within- and between-lineage model evaluations for each pair of lineages. Within- and between-lineage AUC scores were again compared using linear mixed effects models, identical in structure to those described above. Within each linear mixed effects model, the absence of a significant negative effect of the “within- vs. between-lineage” binary variable would indicate that observed niche divergences cannot be attributed to biases arising from differences in available habitat.