where sij is the step between atoms i and j, and bij is the type of chemical bond between atoms i and j (single, double, triple and benzene-ring bonds are 1, 2, 3 and 1.5, respectively).
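As an illustration of these two quantities, the following minimal sketch builds a step table and a bond-type table for a small H-suppressed fragment. It assumes that the "step" between two atoms is the number of bonds along the shortest path between them, and that non-bonded pairs take a bond type of 0; neither assumption is stated in the text. The function names and the toy input are hypothetical and are not the authors' implementation.

```python
from collections import deque


def bond_matrix(n_atoms, bonds):
    """b[i][j] = bond type (1, 2, 3 or 1.5); 0.0 for non-bonded pairs (assumption)."""
    b = [[0.0] * n_atoms for _ in range(n_atoms)]
    for i, j, bond_type in bonds:
        b[i][j] = b[j][i] = float(bond_type)
    return b


def step_matrix(n_atoms, bonds):
    """s[i][j] = number of bonds on the shortest path between atoms i and j (assumption)."""
    neighbours = [[] for _ in range(n_atoms)]
    for i, j, _ in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)

    s = [[0] * n_atoms for _ in range(n_atoms)]
    for start in range(n_atoms):
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        dist = {start: 0}
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in neighbours[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        for atom, d in dist.items():
            s[start][atom] = d
    return s


# Toy input (hypothetical): H-suppressed propene, C0=C1-C2,
# given as (atom index, atom index, bond type).
bonds = [(0, 1, 2), (1, 2, 1)]
S = step_matrix(3, bonds)
B = bond_matrix(3, bonds)
```

Here `S[0][2]` is 2 because the two terminal carbons are two bonds apart, while `B[0][2]` is 0.0 because they are not directly bonded.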
The atomic distribution matrices (MA) are formed from MS and MP, which reflect the relationship between atoms and the specific contribution of each atom. In addition, the MP and MS corresponding to the H-suppressed structure are used to obtain MA in the same way. The norm indexes (I) are the norms of the atomic distribution matrices, as listed in Eqs. (9)-(14). The MA used for ρ and η are shown in Tables C1-C2 of the Supporting Information (atomic-distribution-matrix.docx). An example of the prediction process for two ILs, which generates the norm indexes (I) and applies the ρ(T, P, I)-QSPR model, is shown in E1 of the Supporting Information (example.xlsx).
where λ is the eigenvalue of the matrix and MH is the Hermitian matrix.
2.5. Model validation
LOIO-CV method. The implementation process of LOO-CV is shown in Figure 1. LODPO-CV and LOILO-CV are two variants of LOO-CV that are often used to evaluate the robustness of IL-QSPR models. LODPO-CV is widely used because it is easy to implement. In LODPO-CV, a single data point is removed for validation and the remaining data points serve as the training set. Because the other data points of the same IL remain in the training set, the prediction is effectively an interpolation, which leads to “pseudo-high” accuracy. In LOILO-CV, all data points of one IL are removed for validation. Although LOILO-CV is a better method than LODPO-CV, it can also produce “pseudo-high” accuracy because both the cation and the anion of the removed IL may appear in the remaining set. In both internal and external validation of IL-QSPR models, an important criterion has been ignored: the cation and the anion of an IL in the testing set should not both reappear in the training set. If they do, the contributions of both ions are already present in the training set, so the predictive ability of the model is not properly reflected. Hence, to enhance the accuracy of model evaluation and verify the robustness of the model, the internal validation method LOIO-CV was proposed, as presented in Figure 1. It mainly includes two processes: (1) leave-one-cation-out cross-validation (LOCO-CV), in which all ILs with the same cation are treated as the testing set and the remaining ILs are used as the training set; (2) leave-one-anion-out cross-validation (LOAO-CV), in which all ILs with the same anion are treated as the validation set and the remaining ILs are used as the training set.
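The splitting logic of LOCO-CV and LOAO-CV can be sketched as follows. This is a minimal illustration of the data partitioning only, not the authors' code: the record layout, the ion labels (`cat_A`, `an_X`, ...), the placeholder property values `y`, and the function names are all hypothetical. Model fitting and error metrics are omitted.

```python
def leave_one_ion_out(records, ion_key):
    """Yield (held_out_ion, train, test), holding out all ILs that share one ion.

    ion_key="cation" corresponds to LOCO-CV, ion_key="anion" to LOAO-CV.
    """
    for ion in sorted({r[ion_key] for r in records}):
        test = [r for r in records if r[ion_key] == ion]
        train = [r for r in records if r[ion_key] != ion]
        yield ion, train, test


def ions_seen_in_training(train, cation, anion):
    """Return (cation_seen, anion_seen) for a held-out IL."""
    cations = {r["cation"] for r in train}
    anions = {r["anion"] for r in train}
    return cation in cations, anion in anions


# Hypothetical data set: labels and y values are placeholders, not measurements.
records = [
    {"cation": "cat_A", "anion": "an_X", "y": 1.0},
    {"cation": "cat_A", "anion": "an_Y", "y": 2.0},
    {"cation": "cat_B", "anion": "an_X", "y": 3.0},
    {"cation": "cat_B", "anion": "an_Y", "y": 4.0},
    {"cation": "cat_C", "anion": "an_X", "y": 5.0},
]

loco_folds = list(leave_one_ion_out(records, "cation"))  # one fold per cation
loao_folds = list(leave_one_ion_out(records, "anion"))   # one fold per anion

# LOILO-style hold-out of the single IL (cat_A, an_X), for comparison:
loilo_train = [r for r in records
               if not (r["cation"] == "cat_A" and r["anion"] == "an_X")]
loilo_seen = ions_seen_in_training(loilo_train, "cat_A", "an_X")

# LOCO-style hold-out of every IL containing cat_A:
_, loco_train, _ = loco_folds[0]
loco_seen = ions_seen_in_training(loco_train, "cat_A", "an_X")
```

In this toy set, `loilo_seen` is `(True, True)`: both ions of the held-out IL still occur in the training data, which is the situation the criterion above is meant to exclude. With the LOCO split, `loco_seen` is `(False, True)`, so the held-out cation is genuinely new to the model.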