3.1 | To split or not to split
3.1.1 | Summary
Of the 77 CASP15 tertiary structure prediction targets, 43 were
one-domain targets, 21 had two domains, and the remaining 13 had three or
more domains (Table 1). For 52 targets no domain rearrangement was necessary,
and the targets were evaluated as whole-length structures (41) or
unchanged constituent domains (11). For the remaining 25 targets, in 20
cases we merged at least some domains according to Grishin plots, in two
cases we merged domains according to other considerations, and in three
cases we split targets into more evaluation units (EUs) than suggested by
the domain parsing programs. The domain splitting and re-joining procedure
(Methods) yielded 112 EUs, 109 of which were included in the final
tertiary structure evaluation [21], while three – T1114s1-D2, T1157s1-D2
and -D3 – were cancelled due to the low
resolution of the cryo-EM maps in their local areas.
Out of 34 multi-domain targets, 14 were evaluated as one EU and 20 were
split into multiple EUs (Table 1). Below we discuss different scenarios
of forming evaluation units and present case studies for some targets.
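The counts quoted in this summary can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: the variable names are illustrative, and the expansion of "-D3" to T1157s1-D3 is our reading of the text.

```python
# Bookkeeping check of the counts quoted in the summary (names are illustrative).
total_targets = 77
one_domain = 43
two_domain = 21
three_plus_domain = total_targets - one_domain - two_domain  # "the rest"

# Targets that needed no domain rearrangement
unchanged = {"whole_length": 41, "unchanged_constituent_domains": 11}

# Targets whose automatic domain definitions were modified
rearranged = {"merged_by_grishin_plots": 20, "merged_other": 2, "split_further": 3}

# Evaluation units (EUs); "-D3" is read here as T1157s1-D3 (assumption)
eus_total = 112
eus_cancelled = ["T1114s1-D2", "T1157s1-D2", "T1157s1-D3"]
eus_evaluated = eus_total - len(eus_cancelled)

# Multi-domain targets: evaluated as one EU vs. split into several EUs
multi_domain = two_domain + three_plus_domain
multi_as_one_eu, multi_split = 14, 20

assert sum(unchanged.values()) == 52
assert sum(rearranged.values()) == 25
assert sum(unchanged.values()) + sum(rearranged.values()) == total_targets
assert eus_evaluated == 109
assert multi_domain == multi_as_one_eu + multi_split == 34
```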
3.1.2 | Multi-domain targets not requiring splitting (14)
Fourteen multi-domain targets (as defined by the automatic parsers;
see section 2.1, Step 2) were proposed for evaluation without
splitting into substructures.
In two cases, T1131 and T1133, we disagreed with the automatic domain
parsing results and considered the targets as one-domain structures.
Target T1131 is a small protein where a long central helix holds two
parts of the structure together and is needed for the structural
integrity of the protein, while target T1133 (PDB: 8DYS) is a
nine-bladed beta-propeller that is fully and reliably covered by
templates (e.g., 3WJ9_B) and was well predicted as a whole.
For eleven targets, the decision to join domains into single EUs was
based on the analysis of Grishin plots. Two examples of such
targets are shown in Figure 1. Even though the targets are clearly
two-domain entities, their whole structures were predicted by most
groups as accurately as the constituent domains and thus did not require
splitting.
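The reasoning behind such a decision can be illustrated with a minimal sketch. It assumes that a Grishin plot compares, for each predictor group, the score of the whole-target model with a length-weighted mean of the per-domain scores, and that the slope of a zero-intercept least-squares fit summarizes the comparison. The function names, the example scores and the tolerance of 0.1 are hypothetical and are not the criteria used by the assessors.

```python
def weighted_domain_score(domain_scores, domain_lengths):
    """Length-weighted mean of per-domain scores for one model."""
    total = sum(domain_lengths)
    return sum(s * n for s, n in zip(domain_scores, domain_lengths)) / total


def zero_intercept_slope(xs, ys):
    """Least-squares slope of y = k * x (line forced through the origin)."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


def should_split(whole_scores, domain_based_scores, tolerance=0.1):
    """Suggest splitting when domain-based scores clearly exceed whole-target scores.

    `tolerance` is a hypothetical cut-off chosen for illustration only.
    """
    slope = zero_intercept_slope(whole_scores, domain_based_scores)
    return slope > 1.0 + tolerance


# Hypothetical per-group scores (e.g., GDT_TS) for two imaginary targets.
rigid_whole = [80.0, 70.0, 60.0]          # whole-target scores
rigid_domains = [82.0, 71.0, 62.0]        # domain-based scores, nearly identical
flexible_whole = [40.0, 35.0, 30.0]       # whole-target scores
flexible_domains = [80.0, 72.0, 60.0]     # domains predicted much better

print(should_split(rigid_whole, rigid_domains))        # slope close to 1
print(should_split(flexible_whole, flexible_domains))  # slope close to 2
```

In this sketch a slope near 1 corresponds to the situation described above, where the whole structure is predicted about as accurately as its constituent domains, so a single EU suffices.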