RADseq sample preparation and target capture probe design
Double-digest RADseq libraries for seven of our samples were prepared as
part of another study (KPM unpublished) following Peterson et al. (2012)
with minor modifications. Samples were digested using the enzymes SphI
and EcoRI (NEB, Ipswich, MA, USA) and size selected
for fragments of 450–550 bp following Hime et al. (2019). Samples were
sequenced with 396 other samples on a single NovaSeq 6000 S4 PE150 run,
resulting in an average of 8.5 million reads per sample. We analyzed
these samples using ipyrad v.0.9.12 (Eaton & Overcast, 2020) with a
clustering threshold of 0.95 and left all other settings at default. We
selected all loci that were present in all samples and included at least
one SNP, resulting in 32,547 RADseq loci of 132 bp in length. We
selected sequences from one individual at random to serve as a reference
and sent these data to Arbor Biosciences, who further filtered loci by
soft masking simple repeats and low-complexity regions and by
GC content (> 35% and < 50%), leaving 14,426 of the original 32,547 loci.
Arbor Biosciences randomly sampled 10,000 target loci from this
remaining set and synthesized 20,000 baits of 80 bp each, giving
~2x tiling of the 132 bp target loci.
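
The GC filtering, random subsampling, and bait tiling described above can be illustrated with a minimal Python sketch. This is not the pipeline used by Arbor Biosciences, whose exact procedure is not described here. The function names (gc_fraction, passes_gc, bait_starts, design_baits), the reading that loci with GC content strictly between 35% and 50% were the ones retained, the even spacing of baits across each locus, and the random seed are all assumptions made for illustration.

```python
import random


def gc_fraction(seq: str) -> float:
    """Return the G+C fraction among unambiguous bases (A, C, G, T)."""
    s = seq.upper()  # soft-masked (lowercase) bases are counted as well
    called = sum(s.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (s.count("G") + s.count("C")) / called


def passes_gc(seq: str, low: float = 0.35, high: float = 0.50) -> bool:
    """Assumed filter: keep a locus if low < GC fraction < high."""
    return low < gc_fraction(seq) < high


def bait_starts(locus_len: int, bait_len: int = 80, n_baits: int = 2) -> list[int]:
    """Evenly spaced bait start positions spanning the locus (assumed layout)."""
    if locus_len < bait_len:
        return []  # locus too short to hold a single bait
    if n_baits == 1 or locus_len == bait_len:
        return [0]
    span = locus_len - bait_len  # last possible start position
    return [round(i * span / (n_baits - 1)) for i in range(n_baits)]


def design_baits(loci, n_targets, bait_len=80, n_baits=2, seed=0):
    """Filter loci by GC, randomly sample targets, and cut baits from each."""
    kept = [seq for seq in loci if passes_gc(seq)]
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    targets = rng.sample(kept, min(n_targets, len(kept)))
    return [
        target[start:start + bait_len]
        for target in targets
        for start in bait_starts(len(target), bait_len, n_baits)
    ]
```

Under these assumptions, bait_starts(132) places two 80 bp baits at positions 0 and 52 of a 132 bp locus, so the central 28 bp are covered by both baits. Two baits per locus across 10,000 target loci gives the 20,000 baits reported above.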