A numerical model for hydropathy tuning
To investigate hydropathy tuning by Leu, we used a simple model with
sequences composed of only 4 «amino acids»: A, B, C and D. These form
sequences of the type
AaBbCcDd,
with lowercase letters indicating the number of occurrences of each amino acid. A and B
were modeled on Ile and Leu, with «a» and «b» corresponding to
the occurrences of the two amino acids within the TMD sequences of class
A GPCRs. The hydropathy of Ile was assigned to A (hA = −0.81 kcal/mol)
and that of Leu was assigned to B (hB = −0.69 kcal/mol). C and D, and their counts «c» and «d», were modeled
to reflect all other amino acids with hydropathies smaller than zero and
amino acids with hydropathies larger than zero, respectively. C and D are thus generic
amino acids that represent the averages of all hydrophobic (except Ile
and Leu) and all hydrophilic amino acids, respectively. For C and D, the
average hydropathies of the amino acids they represent were used
(hC = −0.36 kcal/mol, hD = 0.8175 kcal/mol).
Amino acid compositions for simulated sequences were created by
drawing Gaussian-distributed random numbers for a-d based on the amino acid
occurrences in the class A GPCR TMD sequences (mean ± SD; A: 8.8 % ± 3.0 %,
B: 15.2 % ± 3.4 %, C: 28.0 % ± 3.0 %, D: 48.0 % ± 2.8 %). The
generated random numbers a-d were then multiplied by 220 and rounded to
the nearest integer to obtain sequence lengths that are comparable to
the lengths of the TMD sequences. To assess statistical features,
1,500 sequences were generated in each of 10,000 runs.
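
The sampling step described above can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the original implementation: the function name `sample_counts`, the fixed seed, and the clipping of negative draws to zero are choices made here for illustration, whereas the means, standard deviations, and the factor 220 are taken from the text.

```python
import numpy as np

# Composition statistics from the text, as fractions (order: A, B, C, D).
MEAN = np.array([0.088, 0.152, 0.280, 0.480])
SD = np.array([0.030, 0.034, 0.030, 0.028])
SCALE = 220  # factor used in the text to reach TMD-like sequence lengths


def sample_counts(n_sequences, rng):
    """Draw independent Gaussian fractions for a-d and convert them to counts."""
    fractions = rng.normal(MEAN, SD, size=(n_sequences, 4))
    # Assumption: negative draws are clipped to zero; the text does not
    # state how (rare) negative values were handled.
    fractions = np.clip(fractions, 0.0, None)
    # Scale and round to the nearest integer, as described in the text.
    return np.rint(fractions * SCALE).astype(int)


rng = np.random.default_rng(seed=0)  # seed chosen here only for reproducibility
counts = sample_counts(1500, rng)    # one run of 1,500 simulated compositions
lengths = counts.sum(axis=1)         # resulting sequence lengths
```

Because the four mean fractions sum to 100 %, the simulated lengths scatter around 220 residues.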
Driver residues were introduced to drive hydropathies towards a defined
optimum value hopt, which was set to −1.5 kcal/mol to
resemble the mean hydropathy of the TMD sequences (−1.47 kcal/mol). With
B as the driver, «a», «c» and «d» were drawn from Gaussian
distributions as described above. Then «b» was determined as shown by the
equation below, with g(B) being a Gaussian-distributed random number
and hB being the hydropathy of B. The first term
calculates the difference between the optimal hydropathy and the hydropathy
already contributed by A, C and D, and divides it by the hydropathy of B, yielding the
value of «b» needed to reach the optimal hydropathy. A defined degree
of noise was introduced using fdrive, which determines
the amount of drive towards the optimum value hopt, with
the rest (1- fdrive) being determined randomly by the
Gaussian distribution g(B). The value of fdrive used was
0.25. This does not mean, however, that 25 % of the final value of
«b» drives the hydropathy towards the desired value, since this
fraction also depends on the value of hopt.
Interestingly, the variances and correlations were identical between
runs with different values for hopt, indicating that the
effects of tuning can be observed regardless of the actual value
of hopt.
\begin{equation}
b=f_{\text{drive}}\times\frac{h_{\text{opt}}-(a\times h_{A}+c\times h_{C}+d\times h_{D})}{h_{B}}+(1-f_{\text{drive}})\times g(B)\nonumber
\end{equation}

Two different models were tested (Fig. 2). In the first model, all amino
acids were modeled independently of the resulting hydropathies by
generating a-d based on Gaussian distributions alone. This simulates a
case in which TMD hydropathy is not optimized (Fig. 2A-2C). In the
second model, «a», «c» and «d» were generated based on Gaussian
distributions, whereas «b» was chosen based on the equation shown above.
This simulates the case in which Leu is the driving force for
adjusting the hydropathy of the TMDs (Fig. 2D-2F).
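
The two models can be compared with a short Python sketch. It rests on assumptions that the text does not settle: the equation for «b» is applied to the values after scaling by 220, rounding is performed once at the end, and the names `draw`, `simulate` and `total_hydropathy` are hypothetical helpers introduced here. The hydropathies, composition statistics, hopt and fdrive are the values given in the text.

```python
import numpy as np

# Values from the text (hydropathies in kcal/mol, compositions as fractions).
H = {"A": -0.81, "B": -0.69, "C": -0.36, "D": 0.8175}
MEAN = {"A": 0.088, "B": 0.152, "C": 0.280, "D": 0.480}
SD = {"A": 0.030, "B": 0.034, "C": 0.030, "D": 0.028}
SCALE = 220
H_OPT = -1.5
F_DRIVE = 0.25


def draw(letter, n, rng):
    """Gaussian draw for one residue type, scaled to a (non-integer) count."""
    return rng.normal(MEAN[letter], SD[letter], size=n) * SCALE


def simulate(n, rng, driven):
    """Return an (n, 4) integer array of counts a, b, c, d.

    driven=False: all four counts are independent Gaussian draws (first model).
    driven=True:  b follows the driver equation with B as driver (second model).
    """
    a, c, d = (draw(x, n, rng) for x in "ACD")
    g_b = draw("B", n, rng)  # g(B), the random Gaussian part of b
    if driven:
        # Hydropathy already contributed by A, C and D.
        rest = a * H["A"] + c * H["C"] + d * H["D"]
        b = F_DRIVE * (H_OPT - rest) / H["B"] + (1 - F_DRIVE) * g_b
    else:
        b = g_b
    # Assumption: rounding to integers happens after b has been computed.
    return np.rint(np.column_stack([a, b, c, d])).astype(int)


def total_hydropathy(counts):
    """Summed hydropathy of each simulated composition."""
    return counts @ np.array([H["A"], H["B"], H["C"], H["D"]])


rng = np.random.default_rng(seed=1)  # seed chosen here only for reproducibility
independent = simulate(1500, rng, driven=False)
tuned = simulate(1500, rng, driven=True)

# Spread of the summed hydropathy and the b-d correlation in each model.
var_independent = total_hydropathy(independent).var()
var_tuned = total_hydropathy(tuned).var()
corr_bd_independent = np.corrcoef(independent[:, 1], independent[:, 3])[0, 1]
corr_bd_tuned = np.corrcoef(tuned[:, 1], tuned[:, 3])[0, 1]
```

In this sketch the driven model narrows the spread of the summed hydropathy and makes «b» co-vary with «d», whereas the independent model shows no such coupling. Since hopt enters the equation only as an additive constant in «b», changing `H_OPT` shifts «b» but leaves variances and correlations essentially unchanged, in line with the observation reported above.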