2.3 SNP calling: genotype likelihoods and genotype calls
We called SNPs in two different ways: by calculating genotype likelihoods
with ANGSD and by calling genotypes (generating a vcf file) with Bcftools. Genotype likelihoods take into account
genotype uncertainty and allow SNPs to be identified at very low coverage (Lou
et al., 2021). We calculated genotype likelihoods for two datasets, one
including Black Sea (BLS) individuals and one excluding them,
using the Samtools model (GL 1). We kept SNPs with a
minimum minor allele frequency (MAF) of 0.05, data in at least
75% of the individuals and a SNP
p-value < 1e-6. The beagle file generated with ANGSD was used as input for the population structure analysis
and to calculate population genomics summary statistics. Genotypes were
called with the Bcftools commands mpileup and call, using the multiallelic and rare-variant calling option (-m), on
alignments with a minimum mapping quality (-q) and minimum base quality (-Q) of 30. We also used Bcftools to subsequently retain only
high-quality SNPs: we removed non-biallelic sites, indels, SNPs with MAF
below 0.05 and SNPs genotyped in fewer than 75% of the individuals. The resulting vcf file was used
as an input in the demographic history and seascape genomics analysis.
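
The site-level filters described above (biallelic SNPs only, no indels, MAF of at least 0.05, genotypes in at least 75% of the individuals) can be illustrated with a minimal Python sketch. This is not the Bcftools command used in this study. The function name passes_filters, the genotype encoding (a pair of allele indices, with None for a missing call) and the choice to compute MAF over called genotypes only are assumptions made for illustration.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical encoding: (allele, allele) with 0 = reference, 1 = alternate;
# None marks an individual without a genotype call.
Genotype = Optional[Tuple[int, int]]


def passes_filters(
    ref: str,
    alts: Sequence[str],
    genotypes: Sequence[Genotype],
    min_maf: float = 0.05,
    min_call_rate: float = 0.75,
) -> bool:
    """Return True if a site survives the four filters named in the text."""
    # Biallelic: exactly one alternate allele.
    if len(alts) != 1:
        return False
    # SNP rather than indel: both alleles are single bases.
    if len(ref) != 1 or len(alts[0]) != 1:
        return False
    called = [g for g in genotypes if g is not None]
    if not called:
        return False
    # Call rate: fraction of individuals with a genotype.
    if len(called) / len(genotypes) < min_call_rate:
        return False
    # Minor allele frequency, computed over called genotypes only (assumption).
    alt_count = sum(allele for g in called for allele in g)
    alt_freq = alt_count / (2 * len(called))
    maf = min(alt_freq, 1.0 - alt_freq)
    return maf >= min_maf


# Example site: three of four individuals genotyped, one alternate allele.
example = [(0, 1), (0, 0), (0, 0), None]
keep = passes_filters("A", ["G"], example)
```

In this example the call rate is exactly 0.75 and the alternate allele is carried on one of six called chromosomes, so the site is retained. A site with a second alternate allele, an insertion or deletion, or more than 25% missing genotypes would be dropped.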