Data processing and linkage map construction in Lep-MAP3
We used mpileup in SAMtools v1.9 (Li et al., 2009), together with the pileupParser2 and pileup2posterior scripts implemented in Lep-MAP3 (Rastas, 2017), to compute genotype posteriors from trimmed reads aligned separately to the male and female Hi-C HiRise genome assemblies. The resulting posterior files were used as input to Lep-MAP3 to produce male- and female-aligned linkage maps. We used identity-by-descent (IBD) scores to verify the assignment of individuals to discrete families, removing individuals with less than 25% IBD to at least half of the individuals within their respective families. The ParentCall2 module imputed missing genotypes for the F1 parents that were not recovered from the bolts. The Filtering2 module removed markers with high segregation distortion or excessive missing data (data tolerance score of 0.01, following Lep-MAP3 recommendations).
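The family-verification rule can be illustrated with a minimal Python sketch. It is not the code used in this study: the function name flag_low_ibd, the pairwise IBD dictionary, and the toy values are hypothetical, and we assume "at least half" is evaluated against the other members of the same family.

```python
from itertools import combinations


def flag_low_ibd(family, ibd, min_ibd=0.25, min_fraction=0.5):
    """Return individuals whose IBD falls below min_ibd with at least
    min_fraction of the other members of their family.

    family: list of individual IDs assigned to one family.
    ibd: dict mapping frozenset({id_a, id_b}) to a pairwise IBD score
         (hypothetical input format, assumed for illustration).
    """
    flagged = []
    for ind in family:
        others = [o for o in family if o != ind]
        if not others:
            continue  # a family of one cannot be evaluated
        low = sum(1 for o in others if ibd[frozenset((ind, o))] < min_ibd)
        if low / len(others) >= min_fraction:
            flagged.append(ind)
    return flagged


# Toy data: D is weakly related to every other member of the family.
family = ["A", "B", "C", "D"]
ibd = {frozenset(pair): 0.5 for pair in combinations(family, 2)}
for other in ("A", "B", "C"):
    ibd[frozenset(("D", other))] = 0.05

print(flag_low_ibd(family, ibd))
```

In this toy family only D is flagged, because A, B, and C each have low IBD with just one of their three relatives.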
Next, we used the SeparateChromosomes2 module of Lep-MAP3 to separate SNPs into distinct linkage groups representing putative chromosomes. We required retained linkage groups to contain at least 70 SNPs and set the informativeMask parameter to 23, which excluded markers that were informative only for the fathers (i.e., we retained markers that were informative either for the mothers or for both mothers and fathers). Including markers informative only for fathers substantially reduced the number of SNPs assigned to linkage groups. We adjusted the LOD score threshold until the number of retained linkage groups closely matched the known number of chromosomes (11 autosomes + 2 neo-sex chromosomes) based on mountain pine beetle karyology (Lanier & Wood, 1968). In general, the appropriate LOD score should be similar to the number of chromosomes in the genome (Rastas, 2017).
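The LOD adjustment amounts to comparing, across candidate thresholds, how many linkage groups pass the size filter. The Python sketch below illustrates that comparison under stated assumptions: each candidate run is represented as a list of per-marker linkage group numbers with 0 meaning unassigned, and the helper names count_linkage_groups and pick_lod are hypothetical. The toy call uses a small expected count and size cutoff for brevity; the defaults reflect the values reported here (13 groups, at least 70 SNPs).

```python
from collections import Counter


def count_linkage_groups(assignments, min_size=70):
    """Count linkage groups containing at least min_size markers.

    assignments: iterable of per-marker group numbers; 0 is treated as
    unassigned (an assumption about the input format).
    """
    sizes = Counter(lg for lg in assignments if lg != 0)
    return sum(1 for n in sizes.values() if n >= min_size)


def pick_lod(maps_by_lod, expected=13, min_size=70):
    """Return the LOD threshold whose retained group count is closest
    to the expected chromosome number; ties go to the lowest threshold."""
    return min(
        sorted(maps_by_lod),
        key=lambda lod: abs(
            count_linkage_groups(maps_by_lod[lod], min_size) - expected
        ),
    )


# Toy example: three candidate thresholds, two expected groups.
toy_maps = {
    5: [1] * 10,                                # everything lumped together
    10: [1] * 5 + [2] * 4 + [0],                # two well-sized groups
    15: [1] * 3 + [2] * 2 + [3] * 2 + [0] * 3,  # over-split
}
print(pick_lod(toy_maps, expected=2, min_size=3))
```

A low threshold lumps chromosomes together and a high one splits them into groups too small to retain, so the retained count is closest to the target at an intermediate value.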
We then ordered each linkage group in both maps five times using the OrderMarkers2 module of Lep-MAP3 and, for each group, selected the marker order with the highest likelihood score. We checked each selected order for incorrect marker ordering by visualizing linkage group graphs with xdot v1.1 (Fonseca, 2019). If a graph indicated improper marker ordering, we discarded that replicate, chose the replicate with the next-highest likelihood score, and checked it again. This produced separate male and female recombination distances for the SNPs in each linkage group, which were used as input for ALLMAPS (Tang et al., 2015). These linkage maps were used to inform the joining, ordering, and orientation of scaffolds in the male and female Hi-C HiRise genome assemblies (described above). After these assembly modifications and subsequent steps were completed, we reproduced the male- and female-aligned linkage maps and ALLMAPS figures using the final versions of the male and female genome assemblies and the same parameters described above. We then repeated the marker ordering step in Lep-MAP3, this time outputting a single, sex-averaged distance for each SNP in the linkage groups, to produce chromosome maps using the LinkageMapView package (Ouellette et al., 2018) in R v3.6.1 (R Core Team, 2020). We visualized each chromosome as a density map to identify regions with strong genetic linkage, indicated by shorter per-locus cM distances.
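The replicate-selection logic for a single linkage group can be summarized in a short Python sketch. It is illustrative only: the function select_order, the replicate labels, and the likelihood values are hypothetical, and the is_well_ordered callable stands in for the manual inspection of linkage group graphs, which was not automated.

```python
def select_order(replicates, is_well_ordered):
    """Return the replicate with the highest likelihood that passes the
    ordering check, or None if every replicate fails.

    replicates: list of (name, log_likelihood) tuples for one linkage group.
    is_well_ordered: callable taking a replicate name and returning True
        when its graph shows no sign of improper marker ordering.
    """
    ranked = sorted(replicates, key=lambda r: r[1], reverse=True)
    for name, _loglik in ranked:
        if is_well_ordered(name):
            return name
    return None


# Toy example: the best-scoring replicate fails the visual check,
# so the next-highest likelihood is chosen instead.
replicates = [("run1", -1250.4), ("run2", -1198.7), ("run3", -1210.2)]
failed_visual_check = {"run2"}
chosen = select_order(replicates, lambda name: name not in failed_visual_check)
print(chosen)
```

Ranking once and walking down the list mirrors the procedure described above: a replicate is only considered after every higher-likelihood replicate has been rejected.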