2.3 SNP calling: genotype likelihoods and genotype calls
We called SNPs in two ways: by calculating genotype likelihoods with ANGSD and by calling genotypes (generating a VCF file) with Bcftools. Genotype likelihoods account for genotype uncertainty and allow SNPs to be identified at very low coverage (Lou et al., 2021). We calculated genotype likelihoods for two datasets, one including Black Sea (BLS) individuals and one excluding them, using the Samtools model (-GL 1) and keeping SNPs with a minimum minor allele frequency (MAF) of 0.05, data in at least 75% of the individuals, and a SNP p-value < 1e-6. The beagle file generated with ANGSD was used as input for the population structure analysis and for calculating population genomic summary statistics. Genotypes were called with the Bcftools commands mpileup and call, using the multiallelic and rare-variant calling option (-m), from alignments with a minimum mapping quality (-q) and minimum base quality (-Q) of 30. We then used Bcftools to retain only high-quality SNPs: we removed non-biallelic sites, indels, SNPs with MAF below 0.05, and SNPs without genotype information in at least 75% of the individuals. The resulting VCF file was used as input for the demographic history and seascape genomics analyses.
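To make the two site-level thresholds concrete, the following minimal Python sketch applies a MAF filter (at least 0.05) and a call-rate filter (at least 75% of individuals genotyped) to hard-called genotypes. It is an illustration only, not the Bcftools or ANGSD implementation: it assumes diploid, biallelic sites, genotypes coded as alternate-allele counts (0, 1, 2) with None for missing calls, and MAF computed from called genotypes only. The function names and example sites are hypothetical.

```python
from typing import Optional, Sequence

# Assumed coding: alternate-allele count (0, 1 or 2) for a diploid
# individual, or None when no genotype was called.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of individuals with a called (non-missing) genotype."""
    return sum(g is not None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Minor allele frequency computed from called genotypes only."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))  # two alleles per individual
    return min(alt_freq, 1.0 - alt_freq)


def passes_filters(
    genotypes: Sequence[Genotype],
    min_maf: float = 0.05,
    min_call_rate: float = 0.75,
) -> bool:
    """Keep a site only if it meets both thresholds described in the text."""
    return (
        call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    )


# Hypothetical example sites.
site_a = [0, 1, 2, 0, None, 1, 0, 0]           # common variant, 7 of 8 called
site_b = [0, 0, None, None, None, 1, 0, 0]     # only 5 of 8 called
site_c = [0] * 39 + [1]                        # one alternate allele in 80

kept = [name for name, site in
        [("site_a", site_a), ("site_b", site_b), ("site_c", site_c)]
        if passes_filters(site)]
print(kept)
```

In this sketch site_b is dropped for missing data and site_c for low MAF, so only site_a is retained. The genotype-likelihood dataset is filtered on analogous criteria, but there allele frequencies are estimated from the likelihoods rather than from hard calls.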