Discussion
Our comparative assessment revealed that a typical microsatellite
dataset (18 loci in H. molleri and 14 in P. cultripes ) can
yield similar range-wide patterns of genetic structure than those
inferred with a few thousand SNPs (15,412 and 33,140, respectively).
Differences across marker types involved mainly inference of the optimal
number of clusters (K), and assessment of individual and population
admixture levels.
Effect of marker type in model-based clustering and genetic
diversity
We found overall concordance between markers in recovering the same
major genetic clusters in STRUCTURE analyses (Figure 1), although the
model-free clustering approach based on NJ yielded poorly supported
clustering for microsatellites compared to SNPs (Supplementary File S1).
This shows that population genetic models implemented in STRUCTURE are
efficient to infer genetic structure from a few highly polymorphic
microsatellite loci, with comparable performance to analyses using
thousands of bialellic SNPs. Many population genetics studies often rely
on a few microsatellites compared to the hundreds or thousands of SNPs
needed to address similar questions regarding population structure
(Haasl & Payseur, 2011; Puckett, 2017). Previous research comparing
both marker types claimed that SNPs offered a “better” resolution to
address biological questions when compared to microsatellites, usually
referring to SNPs being able to identify more differentiated genetic
clusters (Rašić et al., 2014; Malenfant et al., 2015; Jeffries et al.,
2016; Puckett & Eggert, 2016; Elbers et al., 2017; Hodel et al., 2017;
McCartney-Melstad et al., 2018). These assertions in favor of SNPs over
microsatellites could potentially be exaggerated, because they mostly
derive from non-parametric (e.g. PCA or DAPC) instead of model-based
methods.
There were, however, discordances between markers. The inferred optimal
number of clusters was not consistent across marker types and method of
estimation (Figure S4; Table S3). The clear peak of ΔK at K = 2 in the
SNP dataset in H. molleri contrasted with the peak of ΔK at K = 4
and K = 6 in microsatellites in our results and in Sánchez‐Montes et al.
(2019), respectively. Also, the PI pointed to higher larger optimal
values of K than those selected by Evanno’s ΔK (Figure S4). The clear
peaks of ΔK at K = 2 in the SNP datasets describe the top level of
hierarchical population structure and must be interpreted cautiously,
since K = 2 is the optimal K value most often reported across studies
even when further genetic substructure is present (Janes et al., 2017).
The number of samples per population can have a strong effect on the
optimal K value inferred (Puechmaille, 2016). Furthermore, the history
of populations is often more complex than the “top-level” clustering
approach in STRUCTURE, and as K increases, violations in the assumptions
of STRUCTURE may hamper the inference of the correct population
structure (Lawson, van Dorp, & Falush, 2018).
Inferred ancestral groups (clusters) and their proportions of ancestry
were not fully congruent between marker types. The similarity in the
Q-matrices between markers varied for both species across K. This was
evidenced by ancestral groups arising at different K depending on the
marker type and the different characterization of ancestral groups
reflected in the amount of admixture and spatial extent of the clusters
(Figure 1). Also, genetic admixture was higher in microsatellites than
in SNPs only at larger K values for P. cultripes , driven by the
localities with higher genetic diversity (central and southern Iberia)
(Figures 1 and S5). Greater genetic admixture detected by
microsatellites, together with their greater variance in STRUCTURE
solutions at large values of K (Figure 2), suggest microsatellites have
reduced power to detect weaker or more complex signals of genetic
structure, as those reflected at larger values of K. For SNPs, even at
larger values of K (K > 6), the SSC fell into alternative
discrete solutions (Figure 2). These alternative solutions to the
optimal K problem deserve independent biological interpretations
(Kopelman et al., 2015; Pritchard et al., 2000; Wang et al., 2007), but
should be considered with caution to avoid over-interpretation (Lawson
et al., 2018). Previous studies comparing STRUCTURE results between SNPs
and microsatellites used datasets or approaches that were not fully
comparable between the two marker types, limiting the scope of their
conclusions. For instance, Bradbury et al. (2015) described different
levels of admixture between markers but used different biological
samples for each marker type, while Bohling et al., (2019) relied on
different clustering approaches, NGSadmix (for SNP data) and STRUCTURE
(for microsatellites), to conclude that microsatellites yielded less
precise and less consistent results. Of the few studies that used
exactly the same individuals and clustering approach across different
marker types, Lemopoulos et al. (2018) found nearly identical ancestry
memberships, whereas Malenfant et al. (2015) reported more admixed
ancestries for microsatellites, in agreement with our results.
Genetic diversity increased with latitude in SNPs for both species but
only in P. cultripes for microsatellites. Genetic diversity as
estimated from SNPs was spatially more coherent with genetic structure,
showing less variance between localities from the same cluster (e.g. the
southern group of P. cultripes ) (Figure 3). The differences in
genetic diversity between marker types resulted in a weak correlation of
the corresponding sMLH values (Supplementary Figure S6). This suggests
that microsatellites might be offering sub-optimal representative
measures of genomic diversity (Fischer et al., 2017; Väli, Einarsson,
Waits, & Ellegren, 2008) compared to SNPs.
The lack of full agreement between spatial patterns of genetic diversity
between marker types, even when comparable datasets and analytical
methods are used could be explained by the larger number of SNPs
compared to microsatellite loci. In this sense, the high number of SNPs
may overcome some of the limitations of using few loci as surrogates of
genome-wide variation, like stochasticity related to loci selection and
the associated ascertainment bias (Fischer et al., 2017; Guillot &
Foll, 2009; Lemopoulos et al., 2019; Morin et al., 2004). Different
marker discovery approaches (e.g. representation of functional genomic
regions) could be related to some of the differences between markers
(Clark, Hubisz, Bustamante, Williamson, & Nielsen, 2005; Dufresnes,
Brelsford, Béziers, & Perrin, 2014; Lachance & Tishkoff, 2014).
Additionally, differences in type and rate mutation could also account
for the differences in patterns between markers (Ellegren, 2004; Morin
et al., 2004). The better representation of loci covering a wider
evolutionary scale in SNPs compared to microsatellites (Haasl &
Payseur, 2011; Linck & Battey, 2019; Morin et al., 2004) could be
responsible to some loss of resolution with microsatellites when
representing older demographic processes.
Contribution of SNPs to the evolutionary history of H.
molleri and P.
cultripes
Our SNPs results on H. molleri are consistent with those of
Sánchez-Montes et al. (2019) in recovering two major clusters, southern
and northern, coinciding with the two major mitochondrial lineages and
the north/south microsatellite clusters. Patterns of genetic diversity
as measured with SNPs increase with latitude and decrease from coastal
localities of central Portugal towards the east (Figure 3).
Sánchez-Montes et al. (2019) also found greater mitochondrial and
microsatellite diversity in western localities, but no clear association
with latitude. Our findings support the existence of southwestern
refugia for H. molleri in Iberia, where it would have persisted
through glacial cycles in Atlantic central-south Portugal and Sierra
Morena, followed by two major historical dispersal axes, towards the
north and east.
For P. cultripes , analysis of the SNP dataset provided results
congruent with microsatellites and mitochondrial DNA from
Gutiérrez-Rodríguez et al. (2017), identifying three main lineages: a
southern one with high genetic diversity and complex genetic structure,
a second lineage in the Northern Plateau with low genetic diversity, and
a third lineage in the northeast, with very low genetic diversity
(Figures 1 and 3). The two latter groups were not further
sub-structured, but we found signs of finer substructure in the southern
lineage. Our results support northern and eastern colonization routes
from southern refugia, with the Northern-Plateau lineage probably
resulting from a relatively recent colonization event, contrasting with
the interpretation of Gutiérrez-Rodríguez et al. (2017), who suggested
the existence of a Northern-Plateau refugium. This study thus adds to
the growing body of evidence showing the importance of southern refugia
for a broad range of taxa in the Iberian Peninsula across glaciations in
the Pleistocene (Gómez & Lunt, 2007). Inferred trends of decreasing
genomic diversity towards northern latitudes provide valuable
information for the management of the genetically diverse populations
from southern refugia and their less diverse northern counterparts, both
of which face increased risk of extinction under future climatic
scenarios (Araújo, Guilhaumon, Neto, Ortego, & Calmaestra, 2011).