2.3 | Genetic diversity and inbreeding
We compared levels of genetic diversity, size and abundance of ROHs, and
relatedness between combined samples of S. catenatus and S.
tergeminus and between individual populations of S. catenatus.
We used ROHan v.1.0 (Renaud, Hanghøj, Korneliussen, Willerslev, &
Orlando, 2019) to estimate genome-wide heterozygosity (ΘW), the fraction
of the genome in ROH (F_ROH), and the number of ROHs (N_ROH) for each
sample. This program combines
local Bayesian and hidden Markov models to generate reliable estimates
of ΘW and identify ROHs from low-coverage samples;
furthermore, it does not require stringent mapping and base quality
filters, since these metrics are informative for the models (Renaud et
al., 2019).
Following Benazzo et al. (2017), we defined ROHs as genomic regions > 50 kb
with a heterozygosity rate ≤ 5 × 10⁻⁴ (i.e., ≤ 25 heterozygous genotypes in
50-kb sliding windows), thus accounting for potential sequencing errors. For
this analysis we used individual BAM files downsampled to 5x coverage to
make samples statistically comparable. Also, to maximize our ability to
detect long ROHs (see below), we limited searches to the 135 scaffolds ≥
2 Mb in length, which comprised ~24% of the S.
catenatus genome assembly (Broe et al., in prep.).
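
The correspondence between the heterozygosity rate and the per-window count stated above is simple arithmetic (50,000 bp × 5 × 10⁻⁴ = 25), and a minimal Python sketch can make it concrete. The function names are hypothetical, and the hard per-window cutoff shown here is only an illustration of the threshold; it is not ROHan's hidden Markov model, which assigns ROH state probabilistically.

```python
import math

# Illustrative sketch only: a hard per-window cutoff, not ROHan's HMM.
WINDOW_SIZE_BP = 50_000   # 50-kb windows, as stated in the text
MAX_HET_RATE = 5e-4       # heterozygosity rate threshold from the text


def max_het_sites(window_size_bp: int = WINDOW_SIZE_BP,
                  max_het_rate: float = MAX_HET_RATE) -> int:
    """Largest number of heterozygous sites allowed in one window."""
    # The small epsilon guards against floating-point error in the product.
    return math.floor(window_size_bp * max_het_rate + 1e-9)


def is_roh_window(n_het_sites: int,
                  window_size_bp: int = WINDOW_SIZE_BP,
                  max_het_rate: float = MAX_HET_RATE) -> bool:
    """True if the window holds no more heterozygous sites than allowed."""
    return n_het_sites <= max_het_sites(window_size_bp, max_het_rate)
```

Under these assumptions, a 50-kb window with 25 heterozygous sites is still compatible with an ROH, whereas a window with 26 is not.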
We quantified the impact of inbreeding on genome-wide levels of
variation by comparing individual estimates of F_ROH with individual
estimates of ΘW and N_ROH. Inbred individuals
should show a high F_ROH and low
ΘW (Saremi et al., 2019), whereas individuals from
populations that have experienced a recent bottleneck should show an
excess of N_ROH (Ceballos et al., 2018). To make
cross-study comparisons with other threatened and endangered species and
subspecies (Benazzo et al., 2017; Grossen et al., 2020; Robinson et al.,
2019; Saremi et al., 2019; van der Valk et al., 2019), we also
recalculated F_ROH using minimum ROH lengths of 0.1, 1, 2,
and 2.5 Mb.
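
The recalculation across minimum ROH lengths can be sketched as follows. This is a minimal illustration, assuming that F_ROH is the summed length of retained ROHs divided by the length of sequence scanned (here, the scaffolds ≥ 2 Mb); the function name, its arguments, and the example values are hypothetical and are not taken from the study's data.

```python
from typing import Dict, Iterable, Sequence, Tuple


def froh_by_min_length(
    roh_lengths_bp: Iterable[int],
    scanned_bp: int,
    min_lengths_mb: Sequence[float] = (0.1, 1.0, 2.0, 2.5),
) -> Dict[float, Tuple[float, int]]:
    """Return {min length in Mb: (F_ROH, N_ROH)} for one individual.

    Assumes F_ROH = (total bp in ROHs >= cutoff) / (bp of sequence scanned).
    """
    if scanned_bp <= 0:
        raise ValueError("scanned_bp must be positive")
    lengths = list(roh_lengths_bp)
    results = {}
    for min_mb in min_lengths_mb:
        # Convert Mb to an integer bp cutoff to avoid float comparisons.
        min_bp = int(round(min_mb * 1_000_000))
        kept = [n for n in lengths if n >= min_bp]
        results[min_mb] = (sum(kept) / scanned_bp, len(kept))
    return results


# Hypothetical individual: three ROHs in 10 Mb of scanned sequence.
example = froh_by_min_length([150_000, 1_200_000, 2_600_000], 10_000_000)
```

Because longer cutoffs retain only the longest ROHs, F_ROH and N_ROH can only stay the same or decrease as the minimum length increases.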
Finally, estimates of individual relatedness can provide an additional
evaluation of the level of inbreeding occurring within populations. We
used ANGSD v.0.930 (Korneliussen, Albrechtsen, & Nielsen, 2014) and
NgsRelate v.2 (Korneliussen & Moltke, 2015) to calculate relatedness
for pairs of individuals from the same population based on genotype
likelihood distributions from low-coverage data. We focused on the r_xy statistic (Hedrick & Lacy, 2015) because it
is designed for estimating individual relatedness in populations where
inbreeding occurs. We used the downsampled BAM files (with reads mapped
only to the 135 scaffolds ≥ 2 Mb), grouped by population, for the
analysis. ANGSD/NgsRelate filters required mapping and base qualities
≥ 20, a SNP P-value ≤ 1 × 10⁻⁶, and a minor allele frequency (MAF)
≥ 0.05.
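
To illustrate why an inbreeding-aware relatedness statistic matters, the sketch below assumes that r_xy takes the form 2·f_xy / √((1 + F_x)(1 + F_y)), where f_xy is the pairwise kinship coefficient and F_x and F_y are the individual inbreeding coefficients. This form is stated here as an assumption and should be checked against Hedrick & Lacy (2015) and the NgsRelate documentation; the function name and input values are hypothetical, and this is not the likelihood-based estimation that NgsRelate performs.

```python
import math


def relatedness_rxy(kinship_xy: float, f_x: float, f_y: float) -> float:
    """Inbreeding-adjusted relatedness under the assumed form
    r_xy = 2 * f_xy / sqrt((1 + F_x) * (1 + F_y))."""
    if f_x <= -1.0 or f_y <= -1.0:
        raise ValueError("inbreeding coefficients must be greater than -1")
    return 2.0 * kinship_xy / math.sqrt((1.0 + f_x) * (1.0 + f_y))


# Hypothetical inputs: non-inbred full siblings (kinship 0.25, F = 0).
r_outbred = relatedness_rxy(0.25, 0.0, 0.0)

# Hypothetical inputs: a pair with higher kinship whose members are inbred.
r_inbred = relatedness_rxy(0.375, 0.25, 0.25)
```

Under this form, an individual compared with itself (kinship (1 + F)/2) has a relatedness of 1 regardless of its inbreeding coefficient, which keeps the statistic on a comparable scale in inbred populations.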