2. Methodology
2.1. Database
The IL data were collected from the National Institute of Standards and
Technology (NIST) database31. In total, 19335 ρ data
points for 972 ILs and 9238 η data points for 832 ILs were
included in the dataset. The ρ data cover temperatures of
221.314 ~ 473.15 K and pressures of 0.0815 ~ 251.5 MPa; the η data
cover temperatures of 253.15 ~ 438.15 K and pressures of 0.06
~ 300 MPa.
The total dataset contains 501
cations, including imidazolium (im), pyridinium (py), pyrrolidinium
(pyr), ammonium (N), phosphonium (P), piperidinium (pip), morpholinium
(mor), sulfonium (S), triazolium (Trl), and propylpyrazolium (pyra). It
contains 154 anions, such as bis[(trifluoromethyl)sulfonyl]imide
[(N(SO2CF3)2)-],
tetrafluoroborate [(BF4)-],
hexafluorophosphate [(PF6)-],
dicyanamide [(N(CN)2)-],
tetracyanoborate [(B(CN)4)-],
tricyanomethanide [(C(CN)3)-],
tris(pentafluoroethyl)trifluorophosphate
[(PF3(C2F5)3)-],
halide [(X)-], thiocyanate
[(SCN)-], alkoxy-alkylsulfates
[(RSO3)-], and
alkyl-sulfate
[(RSO4)-]. In
particular, geminal dicationic ILs (GDILs) were also collected in this
work (e.g., 1-methyl-3-(3-(trimethylammonio)propyl)-1H-imidazolium
bis(dicyanamide)). Information on these ILs, together with the
corresponding experimental values of ρ and η, is given in
Tables S1 ~ S2 of the Supporting Information
(exp-cal-values.xlsx).
2.2. Data pre-processing
In the NIST database, a single IL can have a very large number of data
points measured at different temperatures and pressures. If all of these
points were used for modeling, a few ILs would account for a large share
of the dataset. Because a least-squares fit32, 33 minimizes the summed
squared error over all data points, such over-represented ILs would
dominate the fit and reduce the reliability of the QSPR model for the
remaining ILs. Therefore, during data collection, the data points for each
IL were retained only at intervals of 5 K in temperature and 2.5 MPa in
pressure.
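The thinning rule above can be illustrated with a minimal sketch. It assumes one possible reading of the criterion: each IL's points are binned into cells of 5 K by 2.5 MPa and the first point in each cell is kept. The function name `thin_points`, the record layout, and the example values are hypothetical and are not taken from the original work.

```python
def thin_points(records, d_temp=5.0, d_press=2.5):
    """Keep the first point per IL in each (d_temp K, d_press MPa) cell.

    records: iterable of (il_name, T_in_K, P_in_MPa, value) tuples.
    Assumption: "intervals" is read as a regular binning grid.
    """
    kept = {}
    for il_name, temp, press, value in records:
        cell = (il_name, int(temp // d_temp), int(press // d_press))
        # setdefault keeps the first record seen for this cell.
        kept.setdefault(cell, (il_name, temp, press, value))
    return list(kept.values())


# Made-up example values, for illustration only.
records = [
    ("IL-A", 298.15, 0.1, 1000.0),
    ("IL-A", 299.15, 0.1, 999.0),   # same 5 K cell as the first point
    ("IL-A", 303.15, 0.1, 996.0),
    ("IL-A", 298.15, 5.0, 1003.0),
    ("IL-B", 298.15, 0.1, 1200.0),
]
thinned = thin_points(records)
print(len(records), "->", len(thinned))  # 5 -> 4
```

Binning per IL caps how many points any single IL can contribute over a given temperature and pressure range, which is the effect the criterion is meant to achieve.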
2.3. f(T, P, I)-QSPR model
f(T, P, I)-QSPR models were established to
describe the dependence of ρ and η on structure,
temperature, and pressure23. The preliminary f(T, P, I)-QSPR models are shown as Eqs.
(1)-(2).
ρ is the density of the ILs in units of kg∙m⁻³, η is the viscosity of the ILs in units of Pa∙s, T is the
temperature in K, and P is the pressure in kPa. α is a
variable related to the IL structure. In most studies, the parameters β, γ, and χ are treated as constants for all
ILs27, 34. Our previous works23,
29 showed that treating these three coefficients as variables for each IL makes
the model more accurate. This
strategy is therefore continued in the present work.
2.4. Proposed norm descriptors
Step matrices (MS), namely the full step matrix
(MS F), the adjacent step matrix
(MS A), the adjacent-interphase step matrix
(MS AB), and the adjacent-interphase-jump step
matrix (MS ABC), are used to describe the connectivity
of atoms, as given in Eqs. (3)-(6). On this basis, two further step matrices
(MS ABC_cyc and MS bon_cyc), given by Eqs. (7)-(8), are defined
to represent the interaction of adjacent-interphase-jump atoms on a ring
and the interaction of atoms on different bonds of a ring,
respectively. To better capture the properties of the atoms in a molecule,
the property matrices (MP) listed in Table 1 are used. The
properties of each atom are given in S1 of the Supporting Information (atom
properties.xlsx).
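As a rough illustration of how a step matrix encodes atom connectivity, the sketch below counts the number of bonds on the shortest path between every pair of atoms using breadth-first search. Eqs. (3)-(6) are not reproduced here, so the exact definitions are an assumption: the full step matrix is taken to hold the shortest-path step count, and the truncated variants are taken to keep only entries within a maximum number of steps. The names `full_step_matrix` and `truncate_steps` are hypothetical.

```python
from collections import deque


def full_step_matrix(n_atoms, bonds):
    """Shortest-path bond counts between all atom pairs (assumed definition)."""
    neighbors = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)
    steps = [[0] * n_atoms for _ in range(n_atoms)]
    for start in range(n_atoms):
        seen = {start: 0}
        queue = deque([start])
        while queue:  # breadth-first search from `start`
            cur = queue.popleft()
            for nxt in neighbors[cur]:
                if nxt not in seen:
                    seen[nxt] = seen[cur] + 1
                    queue.append(nxt)
        for atom, dist in seen.items():
            steps[start][atom] = dist
    return steps


def truncate_steps(steps, max_step):
    """Zero every entry farther than max_step bonds away (assumed definition)."""
    return [[d if 0 < d <= max_step else 0 for d in row] for row in steps]


# Four heavy atoms in a chain: 0-1-2-3 (illustrative skeleton only).
ms_full = full_step_matrix(4, [(0, 1), (1, 2), (2, 3)])
ms_adjacent = truncate_steps(ms_full, 1)    # directly bonded atoms only
ms_interphase = truncate_steps(ms_full, 2)  # up to two bonds apart
print(ms_full[0])        # [0, 1, 2, 3]
print(ms_adjacent[0])    # [0, 1, 0, 0]
print(ms_interphase[0])  # [0, 1, 2, 0]
```

Under this reading, each truncated matrix restricts the descriptor to interactions within a fixed topological range; the ring-specific matrices of Eqs. (7)-(8) would need additional ring-membership information that is not sketched here.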
Table 1. The property matrices (MP).