Results

Patterns of neutral genetic variation

Neutral genetic diversity and population structure

The analyses of genetic diversity with the Populations software yielded observed heterozygosity (HO) estimates between 0.17 and 0.28, expected heterozygosity (HE) estimates between 0.19 and 0.28, and Fis estimates between 0.002 and 0.079 (Table 1). The populations had 2 to 137 private alleles, and the number of private alleles was higher in the three allopatric lake populations (63 to 137 per population) than in all other populations (2 to 42 per population) (Table 1). Only a small proportion of the loci deviated from Hardy-Weinberg equilibrium (HWE; 100 to 283 loci per population, P < 0.05), and the Fis distributions were unimodal and peaked at zero for all populations (Figure S3), suggesting that the majority of the loci were not affected by null alleles and that the populations were in HWE (Ravinet et al., 2016).
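To illustrate how the three diversity statistics relate to each other, the following minimal Python sketch computes HO, HE and Fis for a single biallelic locus from genotype counts. It assumes the textbook definitions (HE = 2p(1 - p) and Fis = 1 - HO/HE, without small-sample correction); the function name locus_diversity and the example counts are hypothetical, and the Populations software may use different estimators.

```python
def locus_diversity(n_AA, n_Aa, n_aa):
    """Return (HO, HE, Fis) for one biallelic locus from genotype counts.

    Textbook definitions only; no small-sample correction is applied.
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no genotyped individuals at this locus")
    h_obs = n_Aa / n                   # observed heterozygosity
    p = (2 * n_AA + n_Aa) / (2 * n)    # frequency of allele A
    h_exp = 2 * p * (1 - p)            # expected heterozygosity under HWE
    # Fis is undefined for a monomorphic locus (HE = 0)
    f_is = 1 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f_is
```

For genotype counts in exact Hardy-Weinberg proportions (for example 25, 50 and 25), this sketch gives Fis = 0, whereas a heterozygote deficit (for example 30, 40 and 30) gives a positive Fis.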
Analysis of pairwise population differentiation revealed that all populations were significantly differentiated from each other (P < 0.0001; FST range: 0.06 to 0.25; Figure 2). The highest FST values were found in pairwise comparisons that included the anadromous population Harfjärden or at least one of the freshwater populations.
The fastSTRUCTURE analysis of all populations suggested that the most likely number of genetic clusters (K) was either 6 (the model complexity that maximizes marginal likelihood) or 7 (the model components used to explain structure in the data) (Figure 1). The results obtained using K = 6 and K = 7 were generally similar, and differed only in that K = 7 assigned Snäckstavik to a population-specific cluster, whereas K = 6 assigned it to a shared cluster. All three freshwater lake populations were mainly assigned to population-specific clusters, indicating strong differentiation from all other populations. Within the Baltic Sea samples (anadromous and resident populations), both ecotype and region seemed to influence the patterns of genetic structuring. The majority of the anadromous populations (5 and 4 of 7 for K = 6 and K = 7, respectively) were assigned to a shared 'anadromous genetic cluster'. All the populations assigned to this anadromous cluster were located on the Swedish mainland, whilst the two anadromous populations that were not included in the main anadromous cluster (for either K = 6 or K = 7) were Askeby from Denmark and Harfjärden from the island of Öland. Notably, the two populations from Denmark (Askeby and Stege Nor) were assigned to a shared genetic cluster despite belonging to different ecotypes.
The fastSTRUCTURE analysis of the subset including only the Baltic Sea populations (anadromous and resident) suggested that the most likely value of K was either 3 (the model complexity that maximizes marginal likelihood) or 4 (the model components used to explain structure in the data), and revealed the same patterns as were found when analysing all populations (Figure S4). The results from the PCA agreed with the findings from fastSTRUCTURE, showing that the freshwater lake populations formed separate, distinct clusters (suggestive of differentiation). They further indicated that the anadromous mainland populations formed an overlapping cluster, and that the two Danish populations were genetically close to each other (Figure S5).

Isolation by distance

Based on the Mantel tests, there was no association between geographic and genetic distance, either across all Baltic Sea populations (anadromous and resident; R = -0.161, P = 0.678) or within the anadromous ecotype (R = -0.204, P = 0.640; Figure S6).
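The logic of a Mantel test can be sketched as follows. This is a minimal, standard-library-only Python illustration and not the implementation used for the analyses above: it assumes a Pearson correlation between the upper triangles of two symmetric distance matrices and a one-sided permutation test for a positive association, and the function names, the number of permutations and the seed are all hypothetical choices.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(geo, gen, permutations=999, seed=1):
    """Return (r, p) for a one-sided Mantel test of positive association.

    Rows and columns of the genetic matrix are permuted together, so the
    dependence among pairwise distances is preserved under the null.
    """
    rng = random.Random(seed)
    n = len(geo)
    geo_flat = upper_triangle(geo)
    r_obs = pearson(geo_flat, upper_triangle(gen))
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[gen[order[i]][order[j]] for j in range(n)]
                    for i in range(n)]
        if pearson(geo_flat, upper_triangle(permuted)) >= r_obs:
            hits += 1
    # Count the observed arrangement as one of the permutations
    p_value = (hits + 1) / (permutations + 1)
    return r_obs, p_value
```

Under this sketch, a negative or near-zero correlation combined with a large P value, as reported above, means that permuted matrices frequently yield correlations at least as high as the observed one.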

Associations with environmental variables

The db-RDA based on the full dataset revealed significant effects on genetic distance of both midrange salinity (F1,232 = 9.46, P < 0.001) and latitude (F1,232 = 6.02, P < 0.001). In addition, a significant interaction effect was found between latitude and midrange salinity (F1,231 = 8.60, P < 0.001). The biplot revealed that the individuals clustered within populations, and that most of the populations were separated (with the exception of some of the more closely located anadromous populations, which overlapped). The direction of separation associated with midrange salinity was almost parallel with the CAP1 axis, and the direction of separation associated with latitude corresponded fairly well with the CAP2 axis. In addition, the biplot indicated that the interaction effect reflects that the direction of separation associated with midrange salinity explains the differentiation between the ecotypes, whilst the direction of separation associated with latitude explains the differentiation among the anadromous populations (the arrow aligns with the separation among the anadromous populations; Figure 3a).

Phylogenetic analysis

The phylogenetic analysis revealed overall low levels of variation and shallow branching, which suggests that the populations are closely related (Figure 4). The samples were, however, clearly grouped into separate populations, and recent nodes within populations were supported by relatively high bootstrap values. The outgroup (reference genome published by Rondeau et al. (2014)) was most closely related to one of the anadromous populations (Ängerån), and the anadromous populations constituted a paraphyletic group. However, bootstrap values associated with the deeper nodes were low, and the relationships between the populations could therefore not be reliably resolved.

Patterns of adaptive genetic variation

Loci putatively under selection

All approaches used for the outlier analyses identified loci putatively under selection. Locus-specific effects were found for 28 loci with BayeScan (q-values < 0.05), for 231 loci with Fdist (P < 0.01), and for 635 loci with LOSITAN (P < 0.01). Of all these loci, 17 were identified by all three programs (Figure 5, Table 2). These 17 loci were used to search for candidate genes putatively under selection. Ten of the loci were located in previously annotated genes, and because two of the loci resided in the same gene, nine candidate genes were identified (Table 2).
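The consensus step, in which only loci flagged by every method are retained, amounts to a set intersection. The following Python sketch is purely illustrative: the function name consensus_outliers and the locus identifiers are hypothetical and do not correspond to loci in the dataset.

```python
def consensus_outliers(*outlier_sets):
    """Return the loci that appear in every supplied collection of outliers."""
    if not outlier_sets:
        return set()
    return set.intersection(*(set(loci) for loci in outlier_sets))


# Hypothetical locus identifiers, for illustration only
bayescan_hits = {"locus_12", "locus_48", "locus_301"}
fdist_hits = {"locus_12", "locus_48", "locus_77", "locus_301"}
lositan_hits = {"locus_12", "locus_301", "locus_950"}

shared = consensus_outliers(bayescan_hits, fdist_hits, lositan_hits)
```

Requiring agreement among all methods is conservative: the consensus set can never be larger than the smallest individual set, which here is the set of BayeScan outliers.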
The BayeScEnv search for associations between allele frequencies and the environmental variables (midrange salinity and latitude) identified 13 outlier loci (q-value < 0.05; Figure 5, Table 3), eight of which were located in previously annotated genes. Because two of the loci resided in the same gene, seven candidate genes were identified (Table 3), three of which were also identified in the tests for locus-specific effects described above. The annotation further revealed that some of the identified candidate genes have previously been found to be associated with salinity. One of these genes was potassium voltage-gated channel subfamily A member 10 (KCNA10), which encodes a voltage-dependent potassium-selective channel that has been found to be associated with salinity stress in blue mussels (Mytilus spp.) (Lockwood & Somero, 2011). Another gene was vesicle-associated membrane protein 7 (VAMP7), which is crucial for calcium-regulated lysosomal exocytosis and has been found to be involved in salt tolerance in Arabidopsis thaliana (Leshem et al., 2006). These genes therefore seem particularly interesting as candidate genes involved in adaptation to salinity. The results also revealed some genes that might be associated with temperature. One gene that seems particularly interesting is zinc-finger protein 436-like (ZNF436-like), which was also identified as putatively under selection in a previous study of three of the populations included in the present study (Sunde et al., 2020a). The exact function of ZNF436 in fish is not known, but it is a transcription factor that has been found to have a critical role in regulating early cardiac development in humans (Fu et al., 2018).
Other transcription factors have also previously been found to be important in responses to heat stress in sea urchin (Strongylocentrotus intermedius) (Zhan et al., 2019), and some ZNF genes have been associated with acclimation to low salinity in the euryhaline fish half-smooth tongue sole (Cynoglossus semilaevis) (Si et al., 2018) and with migratory behaviour in brown trout (Salmo trutta) (Lemopoulos, Uusi-Heikkilä, Huusko, Vasemägi, & Vainikka, 2018).

Adaptive genetic diversity and structure, and effects of environmental variables

For the adaptive dataset, the Populations software yielded HO estimates between 0.08 and 0.26 and HE estimates between 0.09 and 0.27 (Table 1). The fastSTRUCTURE analysis revealed that the pattern of adaptive genetic structure differed from that of neutral genetic structure (Figure 1). The grouping of the mainland anadromous populations in a shared genetic cluster observed for the full (neutral) dataset was not detected in the adaptive dataset. The adaptive dataset instead revealed a pattern of adaptive structuring associated with latitude, and further suggested that two of the freshwater populations (which were strongly neutrally differentiated) shared a genetic cluster (Figure 1).
As for the full dataset, the db-RDA based on the adaptive dataset revealed significant effects on genetic distance of both midrange salinity (F1,232 = 11.90, P < 0.001) and latitude (F1,232 = 14.48, P < 0.001), and a significant interaction effect (F1,231 = 17.29, P < 0.001). The biplot revealed considerably more overlap between populations and ecotypes than for the full dataset. The direction of separation associated with latitude corresponded fairly well with the CAP1 axis, whereas the direction of separation associated with midrange salinity did not correspond with either of the first two CAP axes. As for the full dataset, the direction of separation associated with latitude aligned with the separation of the anadromous populations (Figure 3b). However, the separation among ecotypes was not as clear for the adaptive dataset as for the full dataset, and one of the freshwater populations overlapped with some of the anadromous populations.