where sij is the step between atoms i and j, and bij is the type of chemical bond
between atoms i and j (single, double, triple, and
benzene-ring bonds are 1, 2, 3, and 1.5, respectively).
The atomic distribution matrices (MA) are grouped by MS and MP, which reflect the relationship between atoms and the
special contribution of each atom. In addition, the MP and MS corresponding to the H-suppressed structure, and hence the MA, are also obtained in the same way. The norm indexes (I) are the norms of the
atomic distribution matrices, as listed in Eqs. (9)-(14). The MA used for ρ and η are shown in Tables C1-C2 of the Supporting
Information (atomic-distribution-matrix.docx). An example of the
prediction process for two ILs, generating the norm indexes (I) and
applying the ρ(T, P, I)-QSPR model, is shown in
E1 of the Supporting Information (example.xlsx).
where λ is an eigenvalue of the matrix and MH is the
Hermitian matrix.
2.5. Model validation
LOIO-CV method. The implementation process of LOO-CV is
shown in Figure 1. LODPO-CV and LOILO-CV are two variants of
LOO-CV that are often used to evaluate the robustness of
IL-QSPR models. LODPO-CV is widely used because of its ease of
implementation. In LODPO-CV, one data point is removed to serve as the
testing set and the remaining data points serve as the training
set. Because other data points of the same IL remain in the training
set, the model effectively interpolates, which leads to “pseudo-high”
accuracy. In LOILO-CV, all data points of one IL are removed to
serve as the testing set. Although LOILO-CV is a better method
than LODPO-CV, it also produces “pseudo-high” accuracy
because both the cation and the anion of the removed IL may appear in the remaining
set. In both internal and external validation of IL-QSPR models, an
important criterion has been overlooked: the cation and the anion of an IL in the
testing set should not both reappear in the training set. If the cation and anion of
an IL in the testing set reappear in the training set at the same time,
the contributions of both ions are already present in the
training set, so the predictive ability of the model cannot be
properly assessed. Hence, to make model evaluation more reliable and to verify
the robustness of the model, the internal validation method LOIO-CV
was proposed, as presented in Figure 1. It comprises two processes:
(1) leave-one-cation-out cross-validation (LOCO-CV), in which all ILs with
the same cation are treated as the testing set and the remaining ILs are
used as the training set; and (2) leave-one-anion-out cross-validation
(LOAO-CV), in which all ILs with the same anion are treated as the testing set and the
remaining ILs are used as the training set.
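
The two splitting schemes of LOIO-CV can be sketched in a few lines of Python. This is a minimal illustration only, not the implementation used in this work: the function name `leave_one_ion_out`, the dictionary layout of each sample, and the ion labels `C1`, `C2`, `A1`, `A2` are all hypothetical, and the model fitting step is omitted.

```python
from collections import defaultdict


def leave_one_ion_out(samples, ion):
    """Yield (held_out_ion, train_idx, test_idx) for each distinct ion.

    samples: list of dicts with 'cation' and 'anion' keys (assumed layout).
    ion: 'cation' gives LOCO-CV folds, 'anion' gives LOAO-CV folds.
    """
    # Group data-point indices by the chosen ion.
    groups = defaultdict(list)
    for idx, sample in enumerate(samples):
        groups[sample[ion]].append(idx)

    # Each fold holds out every data point that contains one ion.
    for key, test_idx in groups.items():
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        yield key, train_idx, test_idx


# Toy data set with made-up ion labels; rows 0 and 1 stand for the
# same IL measured at two different (T, P) conditions.
samples = [
    {"cation": "C1", "anion": "A1"},
    {"cation": "C1", "anion": "A1"},
    {"cation": "C1", "anion": "A2"},
    {"cation": "C2", "anion": "A1"},
    {"cation": "C2", "anion": "A2"},
]

loco_folds = list(leave_one_ion_out(samples, "cation"))  # LOCO-CV
loao_folds = list(leave_one_ion_out(samples, "anion"))   # LOAO-CV

# In every fold the held-out ion is absent from the training set,
# so a cation and an anion of a test IL never reappear together.
for key, train_idx, _ in loco_folds:
    assert all(samples[i]["cation"] != key for i in train_idx)
for key, train_idx, _ in loao_folds:
    assert all(samples[i]["anion"] != key for i in train_idx)
```

In this sketch, a model would be refitted on `train_idx` and evaluated on `test_idx` for each fold. Because the held-out ion never occurs in the training set, each fold satisfies the criterion stated above by construction.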