Bridging machine learning and COSMO-SAC for accurate prediction of infinite dilute activity coefficients of binary mixtures
Yuxin Qiu , Guzhong Chen , Qian Liu , Zhiwen Qi , Kake Zhu , Zhen Song
ENG. Chem. Eng., 2026, Vol. 20, Issue 1: 4
The infinite dilution activity coefficient (γ∞) is a key thermodynamic parameter in solvent design for chemical processes. Although the conductor-like screening model for segment activity coefficient (COSMO-SAC) has strong a priori predictive capability, its estimates are sometimes only qualitative rather than quantitative. A further limitation of COSMO-SAC is its reliance on time-intensive quantum chemistry calculations, which restricts its scalability for large-scale solvent screening. To overcome these issues, this study integrates COSMO-SAC with machine learning for accurate γ∞ prediction of binary mixtures. A multi-task machine learning model bypasses the quantum chemistry calculations and rapidly predicts the surface charge density distributions (σ-profiles) and molecular cavity volumes (VCOSMO) of molecules and ions, while accurately distinguishing isomers. Four adjustable parameters of COSMO-SAC are optimized using more than 20,000 experimental γ∞ data points, and residual systematic errors are further corrected with a boosting ensemble strategy to improve model performance. The resulting hybrid model reduces the mean absolute error from 0.944 to 0.102 (R² = 0.969), an 89% improvement, while preserving the physicochemical interpretability of the model. This accurate and efficient approach broadens the practical applicability of σ-profile and VCOSMO prediction, as well as COSMO-SAC-based γ∞ calculations, facilitating high-throughput solvent screening for diverse chemical engineering applications.
infinite dilution activity coefficient / COSMO-SAC / machine learning / solvent design / multi-task learning
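The hybrid idea summarized in the abstract, a physics-based prediction plus a boosted correction fitted to its residual errors, can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the one-dimensional descriptor `x`, the functions `fit_stump`, `boost_residuals`, and `predict_correction`, and the learning rate are all hypothetical, and plain NumPy decision stumps stand in for whatever boosting ensemble the study actually used. The first function only checks the reported arithmetic, (0.944 − 0.102) / 0.944 ≈ 0.89.

```python
import numpy as np

def relative_improvement(mae_before, mae_after):
    """Fractional reduction in mean absolute error."""
    return (mae_before - mae_after) / mae_before

def fit_stump(x, r):
    """Least-squares best single-threshold split of feature x for residuals r."""
    best = None
    for t in np.unique(x)[:-1]:          # exclude the max so both sides are non-empty
        left, right = r[x <= t], r[x > t]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if best is None or sse < best[0]:
            best = (sse, t, left.mean(), right.mean())
    return best[1:]                      # (threshold, left value, right value)

def boost_residuals(x, residual, n_rounds=30, lr=0.3):
    """Fit an additive sequence of stumps to the residual of a base model."""
    stumps, current = [], np.zeros_like(residual)
    for _ in range(n_rounds):
        t, lo, hi = fit_stump(x, residual - current)
        stumps.append((t, lo, hi))
        current += lr * np.where(x <= t, lo, hi)
    return stumps

def predict_correction(stumps, x, lr=0.3):
    """Sum the shrunken stump outputs to obtain the residual correction."""
    out = np.zeros_like(x, dtype=float)
    for t, lo, hi in stumps:
        out += lr * np.where(x <= t, lo, hi)
    return out

# Synthetic stand-ins (assumptions, not data from the paper):
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)              # hypothetical descriptor
y_true = 2.0 * x + 0.5 * np.sin(6.0 * x)     # hypothetical "experimental" target
y_base = 2.0 * x                             # base model with a systematic error

stumps = boost_residuals(x, y_true - y_base)
y_hybrid = y_base + predict_correction(stumps, x)

mae_base = np.mean(np.abs(y_true - y_base))
mae_hybrid = np.mean(np.abs(y_true - y_hybrid))
print(f"reported improvement: {relative_improvement(0.944, 0.102):.1%}")
print(f"toy MAE: base {mae_base:.3f} -> hybrid {mae_hybrid:.3f}")
```

Because the correction is added on top of the base prediction rather than replacing it, the physics-based term stays visible in the final estimate, which is the sense in which a hybrid of this kind can retain interpretability. The errors printed here are training errors on toy data; a real evaluation would use held-out mixtures.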
Higher Education Press