Phylogenetic signal
To assess phylogenetic signal in seed mass, plant height, genome size, leaf area, growth form and leaf N, we constructed a phylogeny of the study species from a mega-tree (GBOTB.extended.tre) containing 10587 genera and 74533 vascular plant species. This mega-tree is the largest phylogeny for vascular plants to date (Zanne et al., 2014; Smith & Brown, 2018). We used the R package ‘V.PhyloMaker’ because it can generate very large phylogenies for vascular plants relatively quickly (Jin & Qian, 2019). Species names were standardized according to The Plant List v.1.1 (http://www.theplantlist.org/).
Pagel’s lambda (λ) estimates the strength of phylogenetic signal in a continuous trait. We therefore calculated Pagel’s λ to quantitatively estimate whether the similarity among species in seed mass, plant height, genome size, leaf area and leaf N is correlated with their phylogenetic relatedness. We tested the significance of λ with a randomization test using the R package ‘phytools’ (Revell, 2012). In our study, Pagel’s λ ranges from 0 to 1, with λ = 0 indicating no phylogenetic signal and λ = 1 indicating the strongest phylogenetic signal, as expected when a trait evolves under Brownian motion (Pagel, 1999).
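As an illustration of what λ does, the sketch below scales the off-diagonal elements of a phylogenetic variance-covariance matrix by λ, which is the transformation underlying Pagel’s λ. It is a minimal, language-agnostic sketch in Python rather than the ‘phytools’ code used in this study; the function name lambda_transform and the three-species matrix are hypothetical and chosen only for illustration.

```python
def lambda_transform(vcv, lam):
    """Scale the off-diagonal entries of a phylogenetic covariance matrix by lam.

    The diagonal (total root-to-tip variance of each species) is left unchanged.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam is expected to lie in [0, 1]")
    n = len(vcv)
    return [
        [vcv[i][j] if i == j else lam * vcv[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical tree ((A,B),C) with a root-to-tip depth of 1.0:
# A and B share 0.6 of their evolutionary history, C shares none with either.
vcv = [
    [1.0, 0.6, 0.0],
    [0.6, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]

no_signal = lambda_transform(vcv, 0.0)    # species treated as independent
half_signal = lambda_transform(vcv, 0.5)  # shared history down-weighted by half
full_signal = lambda_transform(vcv, 1.0)  # covariance as implied by the tree
```

With λ = 0 the matrix becomes diagonal, so trait values carry no phylogenetic structure, whereas with λ = 1 the covariances are exactly those implied by the tree.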
We tested the strength of the phylogenetic signal in growth form using the D statistic for binary traits (Fritz & Purvis, 2010), implemented in the R package ‘caper’ (Orme et al., 2013). If D is not significantly different from 0 (PBrownian > 0.05), growth form across the 1071 species is distributed as expected under a Brownian motion model of evolution, i.e. it is phylogenetically conserved. If D is equal to or not significantly different from 1 (Prandom > 0.05), interspecific differences in growth form are distributed randomly across the phylogenetic tree.
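The scaling of D can be made concrete with a small sketch. Following Fritz & Purvis (2010), D compares the observed sum of sister-clade differences in the binary trait with the mean sums obtained from random shuffling of tip values and from simulation under Brownian motion. The Python function below is a hedged illustration and not the ‘caper’ implementation: the name d_statistic, the simulated sums and the one-sided proportions used as P values are assumptions made for this example.

```python
from statistics import mean


def d_statistic(obs_sum, random_sums, brownian_sums):
    """Return (D, p_random, p_brownian) from sums of sister-clade differences.

    D is 0 when the observed sum equals the Brownian expectation and 1 when it
    equals the random expectation.
    """
    mean_random = mean(random_sums)
    mean_brownian = mean(brownian_sums)
    d = (obs_sum - mean_brownian) / (mean_random - mean_brownian)
    # Assumed one-sided tests: share of random sums below the observed sum,
    # and share of Brownian sums above it.
    p_random = sum(s < obs_sum for s in random_sums) / len(random_sums)
    p_brownian = sum(s > obs_sum for s in brownian_sums) / len(brownian_sums)
    return d, p_random, p_brownian


# Hypothetical simulation output (a real analysis would use e.g. 1000 permutations).
random_sums = [40.0, 42.0, 38.0, 41.0, 39.0]    # mean 40
brownian_sums = [20.0, 22.0, 18.0, 21.0, 19.0]  # mean 20

d, p_random, p_brownian = d_statistic(21.0, random_sums, brownian_sums)
```

In this made-up case D is close to 0, the observed sum lies below every random sum, and it falls within the range of the Brownian sums, which is the pattern expected for a phylogenetically conserved trait.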
Statistical analysis
All analyses were conducted in R (R Development Core Team, 2021). Because plant traits vary with growth form, we used general linear models to test for differences in seed mass, plant height, genome size, leaf area and leaf N between woody and non-woody species. We also fitted a generalized linear model (GLM) to examine the association of seed mass with plant height, growth form, genome size, leaf area and leaf N across plant species and groups, with seed mass as the dependent variable and the other plant traits as independent variables. To identify which plant traits contributed most to variation in seed mass across species, we fitted a multivariable phylogenetic generalized linear mixed model (PGLMM), which incorporates phylogenetic information and thereby corrects for the non-independence of species, as closely related organisms are more likely to share similar biological traits. The PGLMM used a Gaussian error distribution and the phylogenetic tree, and was implemented with the R packages ‘phyr’ and ‘ape’ (Paradis & Schliep, 2019; Li et al., 2020). Plant height, leaf area, genome size, growth form and leaf N were the predictor variables, seed mass was the response variable, and phylogeny was included as a random intercept.
To separate the relative contributions of plant traits and phylogeny to the variation in seed mass among species, we calculated partial R2 values (Ives, 2019) using the R package ‘rr2’ (Ives & Li, 2018). The partial R2lik for each factor was calculated by comparing the full model with a reduced model from which that factor was removed and measuring the resulting reduction in likelihood (Wang et al., 2022).
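The comparison of full and reduced models can be sketched as follows. The likelihood-based R2 of Ives (2019) is 1 - exp(-2/n × (logLik of the full model - logLik of the reduced model)), where n is the number of observations. The Python snippet below is a minimal illustration and not the ‘rr2’ code: the function name r2_lik, the sample size and all log-likelihood values are hypothetical.

```python
import math


def r2_lik(loglik_full, loglik_reduced, n):
    """Likelihood-based partial R2 for a full model versus a reduced model."""
    return 1.0 - math.exp(-2.0 / n * (loglik_full - loglik_reduced))


# Hypothetical log-likelihoods for a full model and for reduced models
# in which one factor at a time has been dropped.
n_species = 100
loglik_full = -120.0
loglik_reduced = {
    "plant height": -150.0,
    "phylogeny": -135.0,
}

partial_r2 = {
    factor: r2_lik(loglik_full, ll, n_species)
    for factor, ll in loglik_reduced.items()
}
```

A factor whose removal causes a larger drop in log-likelihood receives a larger partial R2lik, and a factor whose removal leaves the likelihood unchanged receives a value of 0.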