Phylogenetic and population genetic analyses
Phylogenetic analyses for both the COI gene and the filtered
genomic SNPs were conducted in IQ-TREE 1.3.10 (Nguyen et al.2015). Model testing, SH-aLRT branch testing, and 1000 replicates of
ultrafast bootstrapping (Hoang et al. 2018) were conducted in the
program.
Mitochondrial COI gene data was used to build a minimum spanning
haplotype network (Bandelt et al. 1999) in the program PopART
(Leigh & Bryant 2015), which outputs a visual representation of the
population genetic relationships between COI haplotypes. We used
Structure version 2.3.4 (Pritchard et al. 2000) and TESS version
2.3.1 (Chen et al. 2007) to infer population structure in the SNP
dataset. While both programs take a similar Bayesian approach to
population clustering based on changes in allele frequency, TESS differs
from Structure by additionally incorporating a spatial component for
inferring genetically disparate populations that may result from
geographic discontinuities. This is particularly useful when genetic
structure correlates to isolation by distance, which can contribute to
population over-splitting in non-spatial programs (Chen et al.2007). Campbell et al. (2019) showed strong genetic
sub-structuring within S. hesperis that corresponded to sets of
populations sampled southwest and northeast of the Rocky Mountains; TESS
and Structure were compared to clarify the extent that geography
influenced these results.
Structure analyses were run using the admixture model without using
sampling locations as a prior. We tested K = 1-10 with a burn-in
period of 150,000 generations, 750,000 MCMC chains, and 10 replicate
runs for each K value. We also conducted separate substructure
analyses for S. zerene, S. atlantis, and the northern
cluster of S. hesperis , testing K = 1-5 for each. We used
CLUMPAK v. 1.1 (Kopelman et al. 2015) to average runs and
determine the optimal K considering both ΔK (Evannoet al. 2005) and LnPr(K ) (Pritchard et al. 2000).
We used TESS to infer K = 2-7, and incorporated weights on the
Voronoi network by computing pairwise Euclidean distances between the
geographic coordinates for each specimen. These weights correct for
regions with irregular or unequal sampling during Voronoi neighbourhood
estimation. We ran this analysis using the CAR admixture model (Durandet al. 2009) for 10 replicates per K , with a burn-in
period of 50,000 and 200,000 sweeps (analogous to “generations” in
Structure), and sampled the spatial interaction parameter and variance
during the MCMC runs. Following program recommendations, we averaged the
Deviance Information Criterion (DIC) score for the 10 runs of each value
of K , and then identified the optimal K as the lowest
value of K at which the DIC scores stabilized.