Genotyping, imputation, and downsampling
GBS libraries were prepared as previously described (Poland et al., 2012) using simultaneous restriction-ligation with HindIII-HF, MseI, and T4 DNA Ligase (NEB). Following the TASSEL GBS pipeline v2 (Glaubitz et al., 2014), BWA (Li and Durbin, 2009) was used to align 64 bp tags to the reference assembly of the maternal parent species (P1), the paternal parent species (P2), or both the maternal and paternal assemblies simultaneously (P1+P2). Reference assemblies for pistachio species were obtained from Palmer et al. (2022), for J. microcarpa from Zhu et al. (2019), and for J. regia from Marrano et al. (2020). Only tags that aligned uniquely (MAPQ ≥ 20) were retained. Before SNP calling, the SNPQualityProfilerPlugin in TASSEL was used to remove candidate SNPs with low depth (log(depth) < -1) or a low inbreeding coefficient (F < -0.05 for P1 and P2 alignments; F < 0.9 for P1+P2 alignments). VCFtools (Danecek et al., 2011) was used to remove taxa with >90% missing data and, for the P1 and P2 datasets only, to apply a minimum genotype depth threshold (--minDP 5). Imputation with Beagle 5.4 (Browning et al., 2018) was performed with no reference panel, using a window size and walk speed of 12 and 4 Mb, respectively. Downsampling (50%) was performed using the reformat.sh script in BBMap (Bushnell, 2014).
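The two threshold filters described above (per-site inbreeding coefficient and per-taxon missingness) can be illustrated with a minimal Python sketch. It is not the TASSEL or VCFtools implementation: it assumes the common estimator F = 1 - Ho/He for a biallelic site, which may differ in detail from what SNPQualityProfilerPlugin computes, and the function names and the genotype encoding (alternate-allele counts 0, 1, 2, with None for missing calls) are hypothetical choices made for illustration only.

```python
from typing import List, Optional

# Hypothetical encoding: alternate-allele count (0, 1, 2), or None if missing.
Genotype = Optional[int]


def inbreeding_coefficient(genotypes: List[Genotype]) -> Optional[float]:
    """Estimate F = 1 - Ho/He at one biallelic site.

    Returns None when F is undefined (no called genotypes, or a
    monomorphic site where expected heterozygosity is zero).
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    n = len(called)
    p = sum(called) / (2 * n)        # alternate-allele frequency
    h_exp = 2 * p * (1 - p)          # expected heterozygosity under HWE
    if h_exp == 0:
        return None
    h_obs = sum(1 for g in called if g == 1) / n
    return 1 - h_obs / h_exp


def keep_site(genotypes: List[Genotype], min_f: float) -> bool:
    """Keep a site unless F is undefined or below the threshold.

    min_f would be -0.05 for P1 or P2 alignments and 0.9 for P1+P2,
    following the thresholds stated in the text.
    """
    f = inbreeding_coefficient(genotypes)
    return f is not None and f >= min_f


def keep_taxon(genotypes: List[Genotype], max_missing: float = 0.9) -> bool:
    """Keep a taxon unless more than max_missing of its calls are missing."""
    missing = sum(1 for g in genotypes if g is None) / len(genotypes)
    return missing <= max_missing
```

For example, a site genotyped as [0, 1, 1, 2] is in Hardy-Weinberg proportions (F = 0), so it would pass the -0.05 threshold used for P1 and P2 alignments but fail the 0.9 threshold used for P1+P2 alignments, whereas a site with only homozygous calls such as [0, 0, 2, 2] (F = 1) would pass both.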