Genetic analyses
To test whether genetic differences existed among individuals expressing
different feeding strategies, the 79 lake trout classified by fatty acid
composition into four groups were genotyped to assess genetic
variation and structure within and among groups. To allow a sample size
sufficient for making a genetic comparison of giants to the other
dietary groups, 22 additional individuals selected non-randomly by
size (≥ 900 mm; giant subset) from the 2002-2015 collections
were added to the giants processed for fatty acids, for a total of 39
giants. Lake trout DNA was extracted from pectoral fin tissue preserved
in ethanol using DNEasy extraction kits (Qiagen Inc., Valencia, CA)
following manufacturer protocols. Piscivorous groups were assayed using
a suite of 23 putatively neutral microsatellite markers amplified in
four multiplexes previously described in Harris et al. (2015). Amplified
microsatellite fragments were analyzed using an automated sequencer (ABI
3130xl Genetic Analyzer; Applied Biosystems, Foster City, CA). The LIZ
600 size standard was incorporated for allele base-size determination.
All genotypes were scored using GeneMapper software ver. 4.0 (Applied
Biosystems) and then manually inspected to ensure accuracy.
The program MICROCHECKER ver. 2.2.0.3 (Van Oosterhout et al., 2004) was
used to identify genotyping errors, specifically null alleles and large
allele dropout. Observed and expected heterozygosity (HO and HE) were
calculated using GENEPOP ver. 4.2
(Rousset, 2008). The program HP-RARE ver. 1.1 (Kalinowski, 2005) was
used to determine the number of alleles, allelic richness, and private
allelic richness for each group, sampling 22 genes in each sample. Tests
of departure from Hardy-Weinberg equilibrium and genotypic linkage
disequilibrium within each sample (i.e., for each fatty acid grouping
and the Giant subset) were conducted in GENEPOP using default values for
both. Results from all tests were evaluated against a nominal alpha (α =
0.05) adjusted following the False Discovery Rate procedure (Narum, 2006).
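As an illustration of how such an adjusted threshold can be obtained, the following minimal sketch assumes the Benjamini-Yekutieli form of the False Discovery Rate correction discussed by Narum (2006), in which the critical value is α divided by the harmonic sum over the number of tests. The function name and the example test count are hypothetical and are not values reported in this study.

```python
def by_critical_alpha(n_tests, alpha=0.05):
    """Benjamini-Yekutieli critical value: alpha / sum_{i=1..n_tests} (1 / i).

    Assumption: this is the FDR variant intended; the text only cites
    the False Discovery Rate procedure of Narum (2006).
    """
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    harmonic_sum = sum(1.0 / i for i in range(1, n_tests + 1))
    return alpha / harmonic_sum


# Hypothetical example: 23 loci tested in each of 5 groups = 115 tests.
n_hypothetical_tests = 23 * 5
adjusted_alpha = by_critical_alpha(n_hypothetical_tests)
print(f"adjusted alpha for {n_hypothetical_tests} tests: {adjusted_alpha:.5f}")
```

With a single test the threshold stays at the nominal 0.05, and it shrinks slowly (roughly with the logarithm of the number of tests), which is why this correction is less conservative than a Bonferroni adjustment.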
We used POWSIM ver. 4.1 (Ryman et al., 2006) to assess the statistical
power of our microsatellite data set to detect significant genetic
differentiation between sampling groups, given the observed allelic
frequencies within our samples. For POWSIM analyses, we
assumed that lake trout within our study diverged from a common baseline
population with the same allelic frequencies as observed in our
contemporary samples. Simulations were performed with an effective
population size of 5000 to yield FST values of 0.01,
0.005 and 0.001. The significance of tests in POWSIM was evaluated
using Fisher’s exact test and the χ2 test, and the statistical power was
determined as the proportion of simulations for which these tests showed
a significant deviation from zero. All simulations were performed with
1000 iterations.
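The relationship between the simulation settings and the target FST values can be sketched as follows. This is a minimal illustration that assumes the standard drift expectation used in this kind of simulation, FST = 1 - (1 - 1/(2Ne))^t, where t is the number of generations of divergence; t is not reported in the text, and the function names and the p-values in the example are hypothetical.

```python
import math


def generations_for_fst(fst, ne):
    """Generations of drift t such that 1 - (1 - 1/(2*ne))**t equals fst.

    Assumption: pure drift at constant effective size ne, with no
    migration or mutation.
    """
    return math.log(1.0 - fst) / math.log(1.0 - 1.0 / (2.0 * ne))


def power(p_values, alpha=0.05):
    """Proportion of simulated tests whose p-value falls below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)


for target_fst in (0.01, 0.005, 0.001):
    t = generations_for_fst(target_fst, ne=5000)
    print(f"FST = {target_fst}: about {t:.1f} generations at Ne = 5000")

# Hypothetical p-values from four simulated data sets, not study results.
print(power([0.01, 0.20, 0.03, 0.50]))
```

Under these assumptions, the three target FST values correspond to roughly 100, 50 and 10 generations of drift at an effective population size of 5000.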
Genetic structuring was tested among lake trout groups using several
different methods. First, genotypic differentiation among lake trout
groups was calculated using log-likelihood (G) based exact tests (Goudet
et al., 1996) implemented in GENEPOP. Global FST (θ)
(Weir et al., 1984) was calculated in FSTAT ver. 2.9.3 (Goudet, 1995)
and pairwise comparisons of FST between groups were
calculated in ARLEQUIN ver. 3.5 (Excoffier et al., 2005) using 10,000
permutations. We then employed the Bayesian clustering program STRUCTURE
V. 2.3.2 (Pritchard et al., 2000) to resolve the putative number of
populations (i.e., genetic clusters (K)) within our samples. Owing to
the remarkably low levels of genetic differentiation among lake trout in
Great Bear Lake (Harris et al., 2015; Harris et al., 2013), we
employed the LOCPRIOR algorithm (Hubisz et al., 2009). The LOCPRIOR
algorithm uses the location/sampling information as a prior in the
model and may perform better than the traditional STRUCTURE model
when genetic structure is weak (Hubisz et al., 2009). We also
used an admixture model with correlated allelic frequencies, and
the model was run with a burn-in period of 500,000 iterations followed by
500,000 Markov chain Monte Carlo iterations. We varied the potential
number of populations (K) from 1 to 10 and ran 20 independent runs for
each value of K. The STRUCTURE output was first processed in the program
STRUCTURE HARVESTER (Earl, 2012), which combines the results of
independent runs and compiles lnP(D) and the post hoc ΔK statistic of
Evanno et al. (2005), to infer the most likely number of clusters. The
best alignment of replicate runs was determined with CLUMPP ver. 1.1
(Jakobsson et al., 2007), and DISTRUCT ver. 1.1 (Rosenberg, 2004) was
then used to visualize the results. For
STRUCTURE analyses, we reported both lnP(D) and the post hoc ΔK
statistic.
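The ΔK statistic can be sketched as follows. This minimal example assumes the form ΔK = |L(K+1) - 2L(K) + L(K-1)| / sd[L(K)], where L(K) is the mean lnP(D) across replicate runs at a given K and sd is the standard deviation across those runs. The function name and the lnP(D) values are hypothetical and are not results from this study.

```python
from statistics import mean, stdev


def delta_k(lnpd_by_k):
    """Return {K: delta K} from a mapping of K to replicate lnP(D) values.

    Delta K is undefined for the smallest and largest K tested (no
    neighbour on one side) and when replicate runs have zero variance,
    so those values of K are skipped.
    """
    means = {k: mean(runs) for k, runs in lnpd_by_k.items()}
    result = {}
    for k in sorted(lnpd_by_k):
        if (k - 1) not in means or (k + 1) not in means:
            continue
        sd = stdev(lnpd_by_k[k])
        if sd == 0:
            continue
        second_diff = abs(means[k + 1] - 2 * means[k] + means[k - 1])
        result[k] = second_diff / sd
    return result


# Hypothetical lnP(D) values for three replicate runs at K = 1..4.
example_runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4900.0, -4901.0, -4899.0],
    3: [-4895.0, -4905.0, -4900.0],
    4: [-4910.0, -4890.0, -4900.0],
}
scores = delta_k(example_runs)
best_k = max(scores, key=scores.get)
print(scores, best_k)
```

Because ΔK cannot be computed for K = 1, it cannot by itself support a single panmictic cluster, which is one reason to examine lnP(D) alongside it.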
Finally, Discriminant Analysis of Principal Components (DAPC) (Jombart
et al., 2010) was implemented in the Adegenet package (Jombart, 2008) in
R (R Core Team, 2015). The number of clusters was identified using the
find.clusters function (a sequential K-means clustering algorithm) and
the Bayesian Information Criterion (BIC), as suggested by Jombart et al.
(2010). Stratified cross-validation (carried out with the function
xvalDapc) was used to determine the optimal
number of principal components to retain in the analysis.