Population diversity
The two isolates composing the New Zealand population were omitted from
further population genetic analyses. As linkage disequilibrium (LD) is
expected to decay rapidly with large effective population size and high
recombination rate, here we calculated LD within 5 kb non-overlapping
windows for the European and the Japanese populations separately using
the --geno-r2 option in VCFtools (Danecek et al. 2011). The mean r² values for each distance between
loci were plotted in R to visualize LD decay. Given that LD decayed
rapidly in the Japanese population, a genomic window of 10 kb
was chosen as a compromise between LD decay and SNP density for the
analyses of genome-wide diversity within and between Japan and Europe.
Nucleotide diversity π (Nei & Li 1979) was calculated for both
populations using VCFtools. Levels of genetic differentiation between
populations were estimated by calculating FST (Hudson et al. 1992) and the number of nucleotide substitutions per site
(DXY) using the Python script from Simon Martin
(https://github.com/simonhmartin). VCFtools was used to
estimate Tajima’s D (Tajima 1989) across the 10 kb windows
in order to detect departures from the standard neutral model. Tajima’s D
was estimated genome-wide and for specific genomic windows of
interest. Estimated at a single locus, positive values of Tajima’s D
indicate balancing selection, while negative values indicate directional
selection. The genome-wide distribution of Tajima’s D values, in contrast, can
give insights into demographic events, with negative values
indicating population expansion and positive values indicating
population contraction. Manhattan plots were created using the R package
qqman (Turner 2014) to visualize FST, DXY, π and
Tajima’s D along the whole genome and for scaffolds of
interest.
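
To make the windowed statistics concrete, the following minimal Python sketch computes the mean number of pairwise differences and Tajima's D (Tajima 1989) for one window from per-site allele counts. It is an illustration only, not the VCFtools implementation used in this study: the function names, the input format (a list of non-reference allele counts per site) and the assumption of complete data for n sampled chromosomes are ours.

```python
import math


def pairwise_diversity(allele_counts, n):
    """Mean number of pairwise differences summed over the sites of a window.

    allele_counts: hypothetical input, one non-reference allele count per site.
    n: number of sampled chromosomes (assumes no missing genotypes).
    """
    return sum(
        2.0 * k * (n - k) / (n * (n - 1))
        for k in allele_counts
        if 0 < k < n  # monomorphic sites contribute nothing
    )


def tajimas_d(allele_counts, n):
    """Tajima's D for one window; returns NaN when it is undefined."""
    s = sum(1 for k in allele_counts if 0 < k < n)  # segregating sites
    if s == 0 or n < 2:
        return float("nan")

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    variance = e1 * s + e2 * s * (s - 1)
    if variance <= 0:
        return float("nan")

    # Difference between the two estimators of theta, scaled by its
    # expected standard deviation under the standard neutral model.
    return (pairwise_diversity(allele_counts, n) - s / a1) / math.sqrt(variance)


if __name__ == "__main__":
    n_chromosomes = 4
    window_length = 10_000  # 10 kb window, as in the text

    singletons = [1, 1, 1]    # excess of rare variants
    intermediate = [2, 2, 2]  # excess of intermediate-frequency variants

    print(pairwise_diversity(singletons, n_chromosomes) / window_length)
    print(tajimas_d(singletons, n_chromosomes))
    print(tajimas_d(intermediate, n_chromosomes))
```

Dividing the summed pairwise differences by the window length gives per-site π for that window. In this toy example, the window containing only singletons yields a negative D and the window containing only intermediate-frequency variants yields a positive D, matching the interpretation given above.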