Comparative analyses
To calculate the phylogenetic signal of the Crabtree effect, a
categorical trait (presence/absence), we calculated the minimum number of
transitions in character state, across the nodes of the phylogeny, that
accounts for the observed distribution of the character at the tips
(Maddison & Maddison, 2000; Paleo-Lopez et al., 2016). This
value was then compared with the median of a randomized distribution of
the character assignment (1,000 randomizations were used). This is a
statistical test of whether the phylogenetic signal of a categorical
trait departs from zero: a significant phylogenetic signal is inferred when
the observed number of transitions falls within the lower 5% tail of the
randomized distribution. A significant outcome implies that
the innovation (i.e., Crabtree-positive yeasts) appeared at some point
in a given lineage and affected the lineages derived from it. If the
outcome is not significant, we conclude that Crabtree-positive species arose
randomly across the phylogeny. We also computed phylogenetic signal for
continuous traits using Blomberg's K statistic. This index varies from
zero to infinity, with K=1 being the expectation under a Brownian
motion model of evolution (Blomberg et al., 2003). To identify adaptive shifts in
fermentative traits, we applied an algorithm based on the
Ornstein-Uhlenbeck (OU) process. This approach was originally proposed
by Hansen (1996), who modeled the OU process as a statistical
formalization of the “common descent” assumption of evolution and
its deviations (see Fig 1 in Hansen & Martins, 1996). Here we briefly
explain the OU model.
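As an aside before the OU model is introduced, the randomization test for the categorical trait described above can be sketched in a few lines. This is a minimal illustration, not the software used in this study: it assumes a strictly bifurcating tree written as nested tuples, counts the minimum number of transitions with Fitch parsimony, and takes the proportion of randomizations with as few or fewer transitions as the one-tailed P-value. The tree, the tip names and the trait assignment are hypothetical.

```python
import random
import statistics

def fitch_steps(tree, states):
    """Return (possible ancestral states, minimum transitions) for a binary tree."""
    if isinstance(tree, str):                 # tip: observed state, no transitions
        return {states[tree]}, 0
    left_set, left_steps = fitch_steps(tree[0], states)
    right_set, right_steps = fitch_steps(tree[1], states)
    shared = left_set & right_set
    if shared:                                # descendants agree: no extra transition
        return shared, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1

def tip_names(tree):
    """List the tip labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return [tree]
    return tip_names(tree[0]) + tip_names(tree[1])

def signal_test(tree, states, n_rand=1000, seed=0):
    """Compare observed transitions with a null built by reshuffling tip states."""
    rng = random.Random(seed)
    names = tip_names(tree)
    observed = fitch_steps(tree, states)[1]
    values = [states[name] for name in names]
    null = []
    for _ in range(n_rand):
        shuffled = values[:]
        rng.shuffle(shuffled)
        null.append(fitch_steps(tree, dict(zip(names, shuffled)))[1])
    # One-tailed P-value: share of randomizations needing no more transitions
    # than the observed data (an assumption about how the tail is evaluated).
    p_value = sum(steps <= observed for steps in null) / n_rand
    return observed, statistics.median(null), p_value

# Hypothetical example: the trait is present (1) in one clade and absent (0) in the other.
tree = ((("A", "B"), ("C", "D")), (("E", "F"), ("G", "H")))
states = {"A": 1, "B": 1, "C": 1, "D": 1, "E": 0, "F": 0, "G": 0, "H": 0}
observed, null_median, p_value = signal_test(tree, states)
print(observed, null_median, p_value)
```

In this toy case a single transition explains the tip states, so the observed value sits at the lower extreme of the randomized distribution, which is the pattern interpreted as phylogenetic signal.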
The rate of change of mean trait values of a lineage is given by:
dX(t) = α[θ-X(t)]dt + σdB(t) (1)
This equation expresses the change in trait
X over an infinitesimal increment of time. The term dB(t) is “white
noise”, a random variable that is normally distributed with mean 0 and
variance dt, and σ represents the intensity of these random
fluctuations. The deterministic part of the model is given by the term
α[θ-X(t)]dt, in which α represents the strength with which selection
“pulls” lineages toward a phenotypic optimum, represented by θ. With α=0,
this model collapses to:
dX(t) = σdB(t) (2)
the Brownian motion model of trait evolution (Felsenstein, 1973;
Felsenstein, 1985). This model takes the basic assumption of comparative
studies as a null hypothesis for any pair of lineages: that the expected
phenotypic divergence between them is proportional to the time elapsed
since their last common ancestor (Felsenstein, 1973).
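The behaviour of equations (1) and (2) can be illustrated by simulating a single lineage numerically. The sketch below is only an illustration under stated assumptions: it uses a simple Euler-Maruyama discretization, and the parameter values, step count and function name are hypothetical choices, not estimates from our data.

```python
import math
import random

def simulate_ou(x0, theta, alpha, sigma, t_total, n_steps, rng):
    """Simulate dX = alpha * (theta - X) dt + sigma dB with Euler-Maruyama steps."""
    dt = t_total / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        d_b = rng.gauss(0.0, math.sqrt(dt))   # white noise: mean 0, variance dt
        x += alpha * (theta - x) * dt + sigma * d_b
        path.append(x)
    return path

# With alpha > 0 the lineage is pulled toward the optimum theta (equation 1).
ou_path = simulate_ou(x0=0.0, theta=1.0, alpha=2.0, sigma=0.3,
                      t_total=10.0, n_steps=1000, rng=random.Random(1))

# With alpha = 0 the deterministic term vanishes and only the random
# fluctuations remain, i.e. Brownian motion (equation 2).
bm_path = simulate_ou(x0=0.0, theta=1.0, alpha=0.0, sigma=0.3,
                      t_total=10.0, n_steps=1000, rng=random.Random(1))

# Without noise (sigma = 0) the trait converges deterministically on theta.
det_path = simulate_ou(x0=0.0, theta=1.0, alpha=2.0, sigma=0.0,
                       t_total=10.0, n_steps=1000, rng=random.Random(1))
print(ou_path[-1], bm_path[-1], det_path[-1])
```

Comparing the two stochastic paths shows the role of α: the OU trajectory fluctuates around θ, whereas the Brownian trajectory wanders with a variance that keeps growing with time.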
We applied the OU model in combination with an algorithm for the automatic
detection of adaptive shifts in the phylogeny, the “lasso-OU”
algorithm, implemented in the R package l1ou (Khabbazian et al.,
2016). This procedure treats the beginning of every branch as a candidate
location for a shift, and uses information criteria to test whether each
shift helps to explain the whole dataset. The
algorithm is implemented as a linear model (see Khabbazian et al., 2016,
eq. 1) and incorporates the lasso procedure for estimating the models
(Tibshirani, 1996). We used the Bayesian information criterion (BIC;
Wagenmakers & Farrell, 2004) to rank models that assume either a fixed
shift, by default located where the WGD is described (i.e., at the
common ancestor of the Vanderwaltozyma – Saccharomyces clade; see Fig 1a),
or shifts that are searched for automatically by
the algorithm. The program allows the maximum number of shifts
to be set, which in our case was three. This analysis was
performed for the four metric traits considered here: ethanol yield,
respiratory quotient, glycerol production and growth rate.
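The model-ranking step can be sketched as follows. This is a generic illustration of BIC-based ranking, not the l1ou implementation: it assumes the standard definition BIC = k ln(n) - 2 ln(L), and the model names, log-likelihoods, parameter counts and number of species are hypothetical placeholders rather than results of this study.

```python
import math

def bic(log_likelihood, n_params, n_obs):
    """Standard Bayesian information criterion; lower values are preferred."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

def rank_models(candidates, n_obs):
    """Sort candidate models by increasing BIC.

    candidates maps a model name to (log-likelihood, number of parameters).
    """
    scored = [(name, bic(log_lik, k, n_obs))
              for name, (log_lik, k) in candidates.items()]
    return sorted(scored, key=lambda item: item[1])

# Hypothetical placeholder values, for illustration only.
candidates = {
    "no_shift": (-42.0, 3),
    "fixed_shift_at_WGD": (-35.0, 4),
    "automatic_search_max_3_shifts": (-33.5, 6),
}
ranking = rank_models(candidates, n_obs=40)
for name, score in ranking:
    print(f"{name}: BIC = {score:.2f}")
```

Because the penalty term grows with the number of parameters, a model with additional shifts is preferred only when its gain in likelihood outweighs the added complexity.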