Results
Patterns of neutral genetic variation
Neutral genetic diversity and population structure
The analyses of genetic diversity with the Populations software yielded
observed heterozygosity (HO) estimates between 0.17 and 0.28, expected
heterozygosity (HE) estimates between 0.19 and 0.28, and Fis estimates
between 0.002 and 0.079 (Table 1). The populations had 2 - 137 private
alleles, and the number of private alleles was higher in the three
allopatric lake populations (63 - 137 per population) than in all other
populations (2 - 42 per population) (Table 1). Only a small proportion
of the loci were not in HWE (100 - 283 loci for each population, P <
0.05), and the Fis distributions were unimodal and peaked at zero
for all populations (Figure S3), suggesting that the majority
of the loci were not affected by null alleles and that the populations
were in HWE (Ravinet et al., 2016).
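As an illustration of how the three diversity statistics relate to each other, the following minimal Python sketch computes HO, HE and Fis for a single biallelic locus from genotype counts. It is not the procedure implemented in the Populations software. The function name and the genotype counts are hypothetical, and the standard relationship Fis = 1 - HO/HE is assumed.

```python
def heterozygosity_stats(n_AA, n_Aa, n_aa):
    """Return (HO, HE, Fis) for one biallelic locus.

    Hypothetical helper for illustration only; inputs are genotype counts.
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("at least one genotyped individual is required")
    p = (2 * n_AA + n_Aa) / (2 * n)   # frequency of allele A
    ho = n_Aa / n                     # observed heterozygosity
    he = 2 * p * (1 - p)              # expected heterozygosity under HWE
    # Fis is undefined for a monomorphic locus (HE = 0)
    fis = 1 - ho / he if he > 0 else float("nan")
    return ho, he, fis


# Genotype counts that match Hardy-Weinberg proportions give Fis = 0
ho, he, fis = heterozygosity_stats(25, 50, 25)
```

A heterozygote deficit (HO below HE) gives a positive Fis, which is why a Fis distribution peaking at zero is consistent with loci being in HWE.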
Analysis of pairwise population differentiation revealed that all
populations were significantly differentiated from each other (P <
0.0001; FST range: 0.06 to 0.25; Figure 2). The highest FST values
were found in the pairwise comparisons including the anadromous
population Harfjärden, or at least one of the freshwater populations.
The fastSTRUCTURE analysis of all populations suggested that the most
likely number of genetic clusters (K) was either 6 (for model
complexity that maximizes marginal likelihood) or 7 (for model used to
explain structure in the data) (Figure 1). The results obtained
using K = 6 and K = 7 were generally similar, and only
differed with regard to K = 7 assigning Snäckstavik to a
population-specific cluster, whilst it was assigned to a shared cluster
for K = 6. All three freshwater lake populations were mainly
assigned to population-specific clusters, indicating strong
differentiation from all other populations. Within the Baltic Sea
samples (anadromous and resident populations), both ecotype and region
seemed to influence the patterns of genetic structuring. The majority of
the anadromous populations (5 and 4 of 7 for K = 6 and K =
7, respectively) were assigned to a shared 'anadromous genetic cluster'.
All the populations assigned to this anadromous cluster were located on
the Swedish mainland, whilst the two anadromous populations that were
not included in the main anadromous cluster (for either K = 6 or
K = 7) were Askeby from Denmark and Harfjärden from the
island of Öland. Notably, the two populations from Denmark (Askeby and
Stege Nor) were assigned to a shared genetic cluster despite belonging
to different ecotypes.
The fastSTRUCTURE analysis of the subset including only the Baltic Sea
populations (anadromous and resident) suggested that the most likely
number of K was either 3 (for model complexity that maximizes
marginal likelihood) or 4 (for model used to explain structure in the
data), and revealed the same patterns as were found when analysing all
populations (Figure S4). The results from the PCA were in
agreement with the findings from fastSTRUCTURE, showing that the
freshwater lake populations formed separate distinct clusters
(suggestive of differentiation). They further indicated that the
anadromous mainland populations formed an overlapping cluster, and that
the two Danish populations were genetically close to each other
(Figure S5).
Isolation by distance
Based on the Mantel tests, there was no association between geographic
and genetic distance, either across all Baltic Sea populations
(anadromous and resident; R = -0.161, P = 0.678) or within the
anadromous ecotype (R = -0.204, P = 0.640; Figure S6).
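The logic of a Mantel test can be sketched as follows. This is a minimal, standard-library Python illustration of a one-sided permutation test on two distance matrices, not the implementation used for the analyses reported here. The function names, the number of permutations, and the one-sided alternative (a positive association) are assumptions made for illustration.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def upper_triangle(matrix, order):
    """Upper-triangle entries after reordering rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]]
            for i in range(n) for j in range(i + 1, n)]


def mantel(geo, gen, permutations=999, seed=0):
    """Return (r, p) for a one-sided Mantel test of gen against geo."""
    rng = random.Random(seed)
    identity = list(range(len(geo)))
    x = upper_triangle(geo, identity)
    r_obs = pearson(x, upper_triangle(gen, identity))
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)  # permute populations (rows and columns together)
        if pearson(x, upper_triangle(gen, perm)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical example: four populations along a line, with genetic
# distance exactly proportional to geographic distance
positions = [0.0, 1.0, 3.0, 7.0]
geo = [[abs(a - b) for b in positions] for a in positions]
gen = [[2.0 * d for d in row] for row in geo]
r, p = mantel(geo, gen)
```

Rows and columns are permuted together because the pairwise distances are not independent observations. The negative R values with high P values reported above correspond to the case where permuted matrices frequently correlate at least as strongly as the observed one.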
Associations with environmental variables
The db-RDA based on the full dataset revealed significant effects on
genetic distance of both midrange salinity
(F1,232 = 9.46, P < 0.001) and
latitude (F1,232 = 6.02, P <
0.001). In addition, a significant interaction effect was found between
latitude and midrange salinity (F1,231 = 8.60, P < 0.001). The biplot
revealed that the individuals
clustered within populations, and that most of the populations were
separated (with the exception of some of the more closely located
anadromous populations overlapping). The direction of separation
associated with midrange salinity was almost parallel with the CAP1
axis, and the direction of separation associated with latitude
corresponded fairly well with the CAP2 axis. In addition, the biplot
suggested that the interaction effect reflects that the direction
of separation associated with midrange salinity explains the
differentiation between the ecotypes, whilst the direction of separation
associated with latitude explains the differentiation among the
anadromous populations (the arrow aligns with the separation among the
anadromous populations; Figure 3a).
Phylogenetic analysis
The phylogenetic analysis revealed overall low levels of variation and
shallow branching, which suggests that the populations are closely
related (Figure 4). The samples were, however, clearly grouped
into separate populations, and recent nodes within populations were
supported by relatively high bootstrap values. The outgroup (reference
genome published by Rondeau et al. (2014)) was most closely related to
one of the anadromous populations
(Ängerån), and the anadromous populations constituted a paraphyletic
group. However, bootstrap values associated with the deeper nodes were
low, and the relationships between the populations could therefore not
be reliably resolved.
Patterns of adaptive genetic variation
Loci putatively under selection
All approaches utilized for the outlier analyses identified loci
putatively under selection. Locus-specific effects were found for 28
loci with BayeScan (q-values < 0.05), for 231 loci with
Fdist (P < 0.01), and for 635 loci with LOSITAN
(P < 0.01). Of all these loci, 17 were identified by
all three programs (Figure 5, Table 2). These 17 loci were used
to search for candidate genes putatively under selection, which showed
that 10 of the loci were located in previously annotated genes; because
two of the loci resided in the same gene, nine candidate genes
were identified (Table 2).
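The consensus step described above (retaining only loci flagged by every method, then collapsing loci to the genes in which they reside) can be sketched as a set intersection. This Python sketch is illustrative only: the function names, locus identifiers and gene names are hypothetical, and it does not reproduce the output formats of BayeScan, Fdist or LOSITAN.

```python
def consensus_outliers(*outlier_sets):
    """Loci flagged as outliers by every method supplied."""
    return set.intersection(*(set(s) for s in outlier_sets))


def candidate_genes(loci, locus_to_gene):
    """Distinct annotated genes hit by the given loci.

    Loci missing from `locus_to_gene` are treated as unannotated, and
    several loci in the same gene collapse to a single candidate gene.
    """
    return {locus_to_gene[locus] for locus in loci if locus in locus_to_gene}


# Hypothetical locus identifiers flagged by three methods
method_a = {"locus_1", "locus_2", "locus_3"}
method_b = {"locus_2", "locus_3", "locus_4"}
method_c = {"locus_2", "locus_3", "locus_9"}
shared = consensus_outliers(method_a, method_b, method_c)

# Hypothetical annotation in which both shared loci lie in the same gene
genes = candidate_genes(shared, {"locus_2": "gene_A", "locus_3": "gene_A"})
```

Collapsing loci to genes in this way is why the number of candidate genes can be smaller than the number of annotated outlier loci.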
When utilizing BayeScEnv to search for associations between allele
frequencies and the environmental variables (midrange salinity and
latitude), 13 loci were identified as outliers (q-value
< 0.05; Figure 5, Table 3), of which eight
were located in previously annotated genes. Because two of the loci
resided in the same gene, seven candidate genes were identified
(Table 3), of which three were also identified in the test of
locus-specific effects described above. The
annotation further revealed that some of the identified candidate genes
have previously been found to be associated with salinity. One of these
genes was potassium voltage-gated channel subfamily A member 10
(KCNA10), which encodes a voltage-dependent potassium-selective channel
that has been found to be associated with salinity stress in blue
mussels (Mytilus spp.) (Lockwood & Somero, 2011). Another gene was
vesicle-associated membrane protein 7 (VAMP7), which is crucial for
calcium-regulated lysosomal exocytosis, and has been found to be
involved in salt tolerance in Arabidopsis thaliana (Leshem et al.,
2006). These
genes therefore seem particularly interesting as candidate genes
involved in adaptation to salinity. The results also revealed some genes
that might be associated with temperature. One gene of particular
interest is zinc-finger protein 436-like (ZNF436-like), which was
also identified as putatively under selection in the previous study of
three of the populations included in the present study
(Sunde et al., 2020a). The exact
function of ZNF436 in fish is not known, but it is a transcription
factor that has been found to have a critical role in regulating early
cardiac development in humans (Fu et al., 2018). Other transcription
factors have previously been found to be important in responses to
heat stress in sea urchin (Strongylocentrotus intermedius) (Zhan et
al., 2019), and some ZNF genes have been associated with acclimation to
low salinity in the euryhaline half-smooth tongue sole (Cynoglossus
semilaevis) (Si et al., 2018) and with migratory behaviour in brown
trout (Salmo trutta) (Lemopoulos, Uusi-Heikkilä, Huusko, Vasemägi, &
Vainikka, 2018).
Adaptive genetic diversity and structure, and effects of environmental variables
For the adaptive dataset, the Populations software yielded HO estimates
in the range 0.08 - 0.26 and HE estimates in the range 0.09 - 0.27
(Table 1). The fastSTRUCTURE
analysis revealed that the pattern of adaptive genetic structure
differed from that of neutral genetic structure (Figure 1). The
grouping of the mainland anadromous populations in a shared genetic
cluster observed for the full (neutral) dataset was not detected in the
adaptive dataset. The adaptive dataset instead revealed a pattern of
adaptive structuring associated with latitude, and further suggested
that two of the freshwater populations (that were strongly neutrally
differentiated) shared a genetic cluster (Figure 1).
As for the full dataset, the db-RDA based on the adaptive dataset
revealed significant effects on genetic distance of both midrange
salinity (F1,232 = 11.90, P < 0.001) and latitude (F1,232 = 14.48,
P < 0.001), and a significant interaction effect
(F1,231 = 17.29, P < 0.001). The
biplot revealed that there was considerably more overlap between
populations and ecotypes compared to the full dataset. The direction of
separation associated with latitude corresponded fairly well with the
CAP1 axis, and the direction of separation associated with midrange
salinity did not correspond with either of the first two CAP axes. As
for the full dataset, the direction of separation associated with
latitude aligned with the separation of the anadromous populations
(Figure 3b). However, the separation among ecotypes was not as
clear for the adaptive dataset as for the full dataset, and one of the
freshwater populations overlapped with some of the anadromous
populations.