Phylogenomic and population genomic analyses
Maximum Likelihood analysis based on 3997 SNPs (contig dataset) inferred
a phylogenetic split between Virginia and Ohio samples (Fig. 4). In all
cases, technical replicates of the same specimen were placed together,
even with levels of missing data as high as 87% (USNM 525251).
Principal component analyses were strongly affected by levels of missing
data (Fig. 5A) and by differences between RADseq and capture-based
replicates (Fig. 5B). When analyzing all samples (3997 SNPs, n=32), PC 1
(24%) separated samples by geography, but with less separation of
replicates with high levels of missing data (Fig. 5A). PC 2 (17%)
separated replicates with high levels of missing data, as well as the
RADseq and capture-based replicates to some degree. The filtered dataset
in which samples with high missingness were removed (3997 SNPs, n=27)
separated replicates by geography along PC 1 (29%) and separated RADseq
and capture-based replicates along PC 2 (18%) (Fig. 5B). The contig
dataset filtered to only capture-based replicates and loci shared by
90% of samples (713 SNPs, n=25) split samples by geography along PC 1
(20%), and again by level of missingness along PC 2 (12%). The most
divergent samples along PC 2 were all formalin-fixed samples, which had
the highest amounts of missing data (Fig. S8).
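The two missing-data filters described above (dropping samples with high
missingness, and keeping only loci shared by a minimum fraction of
samples) can be illustrated with a minimal sketch. The genotype coding
(0, 1, 2 alternate-allele counts, with None for a missing call), the
function names, and the toy matrix are assumptions made for illustration
and are not taken from the pipeline used in this study.

```python
from typing import List, Optional

Genotype = Optional[int]  # 0/1/2 = alternate-allele count; None = missing call


def sample_missingness(matrix: List[List[Genotype]]) -> List[float]:
    """Fraction of missing calls per sample (rows = samples, columns = SNPs)."""
    return [sum(g is None for g in row) / len(row) for row in matrix]


def filter_loci(matrix: List[List[Genotype]],
                min_call_rate: float) -> List[List[Genotype]]:
    """Keep only SNP columns called in at least min_call_rate of samples."""
    n_samples = len(matrix)
    n_loci = len(matrix[0])
    keep = [
        j for j in range(n_loci)
        if sum(row[j] is not None for row in matrix) / n_samples >= min_call_rate
    ]
    return [[row[j] for j in keep] for row in matrix]


# Hypothetical toy data: 4 samples x 4 SNPs.
toy = [
    [0, 1, None, 2],
    [0, None, None, 1],
    [1, 1, 2, None],
    [2, 0, None, 0],
]

per_sample = sample_missingness(toy)   # one fraction per sample
shared = filter_loci(toy, 0.75)        # drops the third SNP (called in 1 of 4)
```

With a threshold of 0.90 the same function would correspond to the
"loci shared by 90% of samples" filter, under the coding assumed here.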
Estimates of nucleotide diversity yielded similar values for supernatant
and pellet replicates: supernatant = 0.35 (SD = 0.004), pellet = 0.34
(SD = 0.008), but significantly different values for formalin-fixed and
RADseq replicates: formalin-fixed = 0.25 (SD = 0.007), RADseq = 0.26 (SD
= 0.004; Fig. 6). The strict SNP filtering regime (95% complete, 298
SNPs) reduced differences between the formalin-fixed and other
capture-based replicates, but still inferred significant differences in
estimates between RADseq and all capture-based replicates (Fig. S3A).
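For readers unfamiliar with the statistic, one common way to compute
per-SNP nucleotide diversity from diploid genotypes is sketched below.
This is an assumption about the general form of the estimator (the
unbiased expected heterozygosity at a biallelic site, averaged over
SNPs), not a description of the software or settings used here; the
function names are hypothetical.

```python
from typing import List, Optional


def site_pi(genotypes: List[Optional[int]]) -> Optional[float]:
    """Unbiased diversity at one biallelic SNP from diploid genotypes.

    Genotypes are alternate-allele counts (0, 1, 2); None marks a missing call.
    Returns None when fewer than two chromosomes were sampled.
    """
    called = [g for g in genotypes if g is not None]
    n = 2 * len(called)              # number of sampled chromosomes
    if n < 2:
        return None
    p = sum(called) / n              # alternate-allele frequency
    return (n / (n - 1)) * 2 * p * (1 - p)


def mean_pi(matrix: List[List[Optional[int]]]) -> float:
    """Average site_pi over all SNP columns that have usable data."""
    n_loci = len(matrix[0])
    values = [site_pi([row[j] for row in matrix]) for j in range(n_loci)]
    values = [v for v in values if v is not None]
    return sum(values) / len(values) if values else float("nan")
```

Because missing calls shrink the number of sampled chromosomes at each
site, estimates of this kind can shift with the missingness filter, which
is consistent with the effect of the strict filtering regime noted above.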
Counts of heterozygous sites inferred similar levels of heterozygosity
among the three capture-based replicate types: supernatant (31.6%, SD =
5.8), pellet (30%, SD = 3.3), and formalin-fixed (29.3%, SD = 5.6). RADseq
replicates had a significant homozygote bias (12.4% SD = 3.4) compared
with the capture-based replicates (Fig. S4).
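The heterozygosity comparison can be sketched as a per-sample percentage
of heterozygous calls among non-missing genotypes, summarized by mean and
standard deviation within each replicate type. The genotype coding
(1 = heterozygous, None = missing), the function names, and the use of
the sample standard deviation are assumptions for illustration; the text
does not state which form of SD was reported.

```python
from statistics import mean, stdev
from typing import List, Optional, Tuple


def percent_heterozygous(genotypes: List[Optional[int]]) -> Optional[float]:
    """Percentage of called genotypes that are heterozygous (coded as 1)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    return 100.0 * sum(g == 1 for g in called) / len(called)


def summarize(samples: List[List[Optional[int]]]) -> Tuple[float, float]:
    """Mean and sample SD of percent_heterozygous across samples in a group.

    Samples with no called genotypes are skipped; at least two usable
    samples are needed for the SD.
    """
    values = [percent_heterozygous(s) for s in samples]
    values = [v for v in values if v is not None]
    return mean(values), stdev(values)


# Hypothetical toy group of three replicates.
group = [
    [0, 1, None, 2, 1],   # 2 of 4 called sites heterozygous
    [1, 1, 1, 0],         # 3 of 4
    [0, 0, 1, 2],         # 1 of 4
]
group_mean, group_sd = summarize(group)
```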