Data collection
We constructed an FST dataset through a systematic
search in Google Scholar (keywords: “genetic structure”, “population
differentiation”, “population genetics”, “genetic diversity”,
“population gene flow”) for articles published up until June 2018. The
search yielded 356 peer-reviewed publications on seed plants for which
measures of population genetic structure (FST) based on
nuclear markers were available. When multiple studies reported
FST values for the same species, we recorded the
FST from the study with the largest geographic range, as
this may better represent the genetic diversity found in the species
(Cavers et al., 2005). By this criterion, we compiled a dataset that
included 337 unique species. We extracted information for the predictor
variables directly from the publications and, in the few cases where this
was insufficient, supplemented it with information from peer-reviewed literature on
the studied species (see Appendix S1 and Table S1 in Supporting
Information). Predictor variables were included in multiple regressions
to explain variation in FST values (see section
FST models). We included three factors that pertained to
the sampling scheme of each study and that can potentially affect
FST (Nybom, 2004; Nybom & Bartish, 2000): genetic
marker used, maximum distance between populations, and mean sample size per
population. We used these three factors to construct a null model to be compared
against models with our factors of interest. Factors of interest
consisted of five categorical variables with 2–4 levels: mating system
(outcrossing, mixed-mating), growth form (non-woody, shrub, tree),
pollination mode (large insects, small insects, vertebrates, wind), seed
dispersal mode (animal, gravity, wind), and latitudinal region (tropics,
sub-tropics, temperate). Below we explain the FST estimates and all eight
factors used in this study in greater detail.
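
The rule of keeping one FST value per species, taken from the study with the largest geographic range, can be illustrated with a minimal Python sketch. This is not the procedure actually used to build the dataset. The record fields, the use of maximum distance between populations as a proxy for geographic range, the tie-breaking rule, and all values shown are hypothetical and serve only to illustrate the selection step.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """One FST estimate from one study (all field names are hypothetical)."""
    species: str
    study: str
    fst: float
    max_distance_km: float  # assumed proxy for the geographic range sampled


def select_widest_range(records):
    """Keep, for each species, the record with the largest geographic range.

    Ties are resolved in favour of the record encountered first
    (an arbitrary choice made for this sketch).
    """
    best = {}
    for rec in records:
        current = best.get(rec.species)
        if current is None or rec.max_distance_km > current.max_distance_km:
            best[rec.species] = rec
    # Sort by species name so the output order is deterministic.
    return sorted(best.values(), key=lambda r: r.species)


# Made-up records for illustration only; not taken from the dataset.
records = [
    Record("Species A", "Study 1", 0.12, 150.0),
    Record("Species A", "Study 2", 0.20, 900.0),
    Record("Species B", "Study 3", 0.05, 40.0),
]

selected = select_widest_range(records)
for rec in selected:
    print(rec.species, rec.study, rec.fst)
```

In this toy input, the two records for "Species A" collapse to the one from "Study 2", which spans the larger distance, so each species contributes a single FST value to the downstream models.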