4 | CONCLUSIONS
A key objective of CASP is to monitor progress in predictive performance on different kinds of target protein. Thus, a robust and objective classification of targets is essential. Although previous classification has benefitted from detailed consideration by experts in protein evolution, the purely automatic method introduced here provides a labor-saving foundation for CASP-to-CASP comparisons. We show that it largely recapitulates previous classifications and, furthermore, may provide numerical estimates of difficulty beyond the current four classes, potentially facilitating future study of features correlating with target difficulty.
Much as a purely automatic division of targets into EUs would also be desirable, the CASP15 set illustrates why that does not yet seem possible. For example, a satisfactory EU definition for the ABC transporter T1158 was only achieved by manual reference to a set of structures and an understanding of the structure-function relationship of the target: none of the automated domain partitioning algorithms produced sensible results. Nevertheless, clear and objective guidelines were followed as far as possible relating, for example, to the gradients of the Grishin plots. Finally, it is worth noting that although a consistent policy is followed for EU definition, the resulting sets may still differ from CASP to CASP as predictions improve. Thus, as more groups accurately capture domain packing, there will be fewer instances of splitting and more where larger multi-domain units are retained as the EUs: this shift towards larger EUs could depress global quality metrics and should be borne in mind by future assessors.
Table 1. CASP15 tertiary structure prediction targets, their split into evaluation units (EUs), and their classification into homology-based prediction classes. Canceled targets are highlighted in red; targets that were released as auxiliary structures for other prediction categories (ligand, oligo, protein-RNA complex) are in yellow.