Conclusions and remaining questions
We show here that allozyme supernatant and formalin-fixed samples can be
used for both phylogenomic and many population genomic applications. In
particular, allozyme supernatant replicates performed similarly to the
frozen tissue pellet replicates in all analyses. On the other hand, only
six of ten formalin-fixed samples recovered >25% of SNPs
and were useful for analyses. The four formalin-fixed samples that
failed had lower extraction yields and high levels of exogenous DNA,
possibly reflecting the length of time the larval samples were left
in formalin (10+ years). We recommend that libraries derived from
formalin-fixed DNA be sequenced at greater depth and that multiple
samples be included for lineages or populations of interest in case
some samples fail. We also document potential biases
associated with combining RADseq and capture datasets in shared
analyses, including biases in heterozygous SNP calls, clustering by
replicate type rather than by specimen, and systematic differences in
estimates of genetic diversity. We found that mapping reads to longer
reference sequences derived from assembled contigs rather than mapping
directly to RAD loci addressed some of these discrepancies. However,
systematic biases between RADseq and capture replicates remained in our
dataset, and researchers should be aware of these issues,
especially in studies where such a bias could affect the
interpretation of results (e.g., inferring changes in heterozygosity
through time).
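
As a rough illustration of the >25% SNP recovery criterion used above to separate usable from failed formalin-fixed samples, the following minimal Python sketch flags samples by the fraction of SNPs they recover. The function names, sample labels, and counts are hypothetical placeholders, not values or code from this study; the only element taken from the text is the 25% threshold.

```python
from typing import Dict


def snp_recovery(called_snps: int, total_snps: int) -> float:
    """Fraction of the reference SNP set with a genotype call in one sample."""
    if total_snps <= 0:
        raise ValueError("total_snps must be positive")
    if not 0 <= called_snps <= total_snps:
        raise ValueError("called_snps must lie between 0 and total_snps")
    return called_snps / total_snps


def flag_usable(counts: Dict[str, int], total_snps: int,
                threshold: float = 0.25) -> Dict[str, bool]:
    """Mark a sample usable when its recovery is strictly above the threshold.

    The default threshold mirrors the >25% criterion in the text; the strict
    inequality means a sample at exactly 25% is not counted as usable.
    """
    return {
        sample: snp_recovery(called, total_snps) > threshold
        for sample, called in counts.items()
    }


# Hypothetical counts for illustration only (not data from this study).
example_counts = {"ff_sample_A": 300, "ff_sample_B": 100, "ff_sample_C": 250}
usable = flag_usable(example_counts, total_snps=1000)
```

In practice the per-sample counts would come from the genotype matrix after filtering, and the threshold could be adjusted to the missing-data tolerance of the downstream analysis.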