Introduction
Genomics promises exciting advances towards understanding adaptive
genetic variation and evolutionary potential of plants under a rapidly
changing and often increasingly variable environment (Hoffmann & Sgrò
2011; Savolainen et al. 2013; Harrisson et al. 2014).
Intraspecific genetic variation represents the potential for adaptive
change in response to new selective challenges, which is critical for
local species persistence under environmental change (Rice & Emery
2003; Bell & Gonzalez 2009). Adaptation to local climate conditions has
been considered typical for tree populations (Langlet 1971; Ying &
Liang 1994; Kitzmiller 2005; Wright 2007), but organisms with such long
generation times and a sessile lifestyle can become maladapted if
environmental shifts occur rapidly (Aitken et al. 2008; Anderson et al.
2012; Alberto et al. 2013). Plants also exhibit
plastic changes in their growth form and physiology in response to
stress, and the level of plasticity can itself be heritable
(Van Kleunen & Fischer 2005; Auld et al. 2010) and may be under
selection (Zettlemoyer & Peterson 2021). Understanding the distribution
of genetic variation related to environmental responses may help us
better predict changes and manage forests in a shifting climate (Neale
& Kremer 2011; Oney et al. 2013). This includes selecting seed
sources for restoration or breeding that have desirable characteristics
such as drought tolerance (Beaulieu et al. 2014; Isik 2014).
Landscape genomics offers enormous potential to discover genes
responsible for local adaptation by investigating the statistical
association between genetic variation at individual loci and the
causative environmental factors (Eckert et al. 2010, 2015; Sork et al.
2013; Lu et al. 2019). This approach is sometimes
known as Genotype-Environment Association (GEA) analysis. Prior studies
in Arabidopsis, the primary plant model organism, have found
that environmentally-associated SNPs can predict performance in common
gardens (Hancock et al. 2011). A Pinus pinaster study
suggests this could be true in trees as well, even when only a modest
number of the genetic variants involved have been identified
(Jaramillo-Correa et al. 2015). However, GEA studies do not by
themselves reveal why specific alleles are more prevalent in particular
environments – for example, are they responsible for selectively
favored traits? Genotype-Phenotype Association (GPA) analysis identifies
loci linked to a specific phenotype (Eckert et al. 2009; Holliday et al.
2010). In plant GPA studies, individuals are typically
grown in a common environment to eliminate the effects of environmental
variation on phenotypes. However, this approach does not reveal whether
a trait variant would be favored in the field. GEA and GPA analyses
are thus complementary, and combining them might better identify the
loci and traits that are selectively favored in particular conditions
than either could alone (Eckert et al. 2015; Mahony et al. 2020).
The large genome size of conifer trees (>19 Gbp) represents
a challenge for analysis. Most association studies in conifers have
focused on SNPs within a few hundred genes (Eckert et al. 2009,
2015; Holliday et al. 2010; Hamilton et al. 2013; Dillon et al. 2014;
Housset et al. 2018), or fewer than 2,000
genome-wide SNPs (Uchiyama et al. 2013). One notable exception is
a recent study on lodgepole pine that used a sequence capture dataset
created by mapping the Pinus contorta transcriptome to the P. taeda
genome sequence (Mahony et al. 2020). A
genome-wide SNP climate-association study was also recently completed
for P. lambertiana, one of the few other pine species with a
full genome sequence (Weiss et al. 2022). Still, most conifers
have neither a published genome sequence nor a complete transcriptome.
Though targeted sequencing is efficient, candidate gene approaches may
miss other vital genes with previously unsuspected roles in local
adaptation, and focusing solely on variants within genes may miss
significant variants within regulatory regions.
Several approaches to identifying more genetic variants for genome-wide
association studies (GWAS) utilizing next-generation sequencing (NGS)
have been proposed in recent years (Davey et al. 2011; Poland &
Rife 2012). Genotyping-by-Sequencing (GBS), which can generate tens of
thousands of SNP markers (Single Nucleotide Polymorphisms) without the
need for a reference genome or whole transcriptome, has emerged as a
cost-effective strategy (Elshire et al. 2011; Andrews et
al. 2016). By combining the power of multiplexed NGS with
restriction-enzyme-based genome complexity reduction, GBS can genotype
large populations of individuals for thousands of SNPs in an
increasingly rapid and inexpensive way (Poland et al. 2012;
Poland & Rife 2012).
Despite the high economic and ecological importance of ponderosa pine
(Pinus ponderosa) in the western United States (Graham & Jain
2005), no previous study has attempted to identify the relationship
between gene sequence variation and drought tolerance in this species.
Some studies have investigated P. ponderosa’s evolutionary
history and phylogeography using mitochondrial DNA markers; these
reflect the long-term biogeographical processes contributing to the modern
distribution of the species but have limited adaptive significance in
themselves (Johansen & Latta 2003; Potter et al. 2013). Other
studies have emphasized the importance of intraspecific variation of
P. ponderosa in environmental responses but focus on the
phenotypic variation within and among populations without identifying
the underlying genetic variation (Kolb et al. 2016; Maguire et al.
2018). California’s historic 2012–2016 drought may
represent an increasingly common condition as climate changes (Griffin
& Anchukaitis 2014; Berg & Hall 2015). Such “hot droughts” can lead
to mass tree mortality, even in relatively drought-tolerant species like
ponderosa pine, negatively impacting the sustainability of conifer
forests (Fettig et al. 2019). A deep understanding of the genetic
basis of adaptation in ponderosa pine and other western conifers is
critical for successful reforestation and conservation programs.
In this study, we conducted a GEA analysis on 223 ponderosa pine
genotypes from a range of climates across the central Sierra Nevada
mountains of California. We then planted seeds collected from a subset
of these trees in the greenhouse. The resulting seedlings provided the
basis of a GPA analysis of putative drought-response traits. We
annotated the genes containing or adjacent to the associated SNPs to
assign putative biological functions. We then assessed overlap in SNP
identity or gene function between the GEA and GPA analyses that
might indicate particular importance for local adaptation.