Predicting second virial coefficients of organic and inorganic compounds using Gaussian process regression

  • Imperial College London
  • Fritz Haber Institute of the Max Planck Society

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

We show that, using intuitive and accessible molecular features, it is possible to predict the temperature-dependent second virial coefficient of organic and inorganic compounds with Gaussian process regression. In particular, we built a low-dimensional representation of features based on intrinsic molecular properties, topology and physical properties relevant to the characterization of molecule-molecule interactions. The featurization was used to predict second virial coefficients in the interpolative regime with a relative error ≲1%, and to extrapolate the prediction to temperatures outside the training range of each compound in the dataset with a relative error of 2.1%. Additionally, the model's predictive abilities were extended to organic molecules unseen in the training process, yielding a prediction with a relative error of 2.7%. Test molecules must be well represented in the training set by instances of their families, which are high in variety. The method generally outperforms several semi-empirical procedures used to predict this quantity. Therefore, apart from being robust, the present Gaussian process regression model is extensible to a variety of organic and inorganic compounds.
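
As a rough illustration of the regression technique named in the abstract, the following is a minimal sketch of Gaussian process regression on a temperature-dependent second virial coefficient. It is not the authors' model: it uses temperature as the only feature (the paper also uses molecular features), a fixed squared-exponential kernel with hand-picked hyperparameters, and a synthetic target generated from the van der Waals expression B(T) = b - a/(RT). The constants `a` and `b`, the kernel settings, and all function names are assumptions chosen for illustration, not values from the paper.

```python
import math

R = 8.314  # gas constant, J/(mol K)

def b_vdw(T, a=0.55, b=6.0e-5):
    """Synthetic target: van der Waals B(T) = b - a/(R T), in cm^3/mol.
    a and b are illustrative assumptions, not data from the paper."""
    return (b - a / (R * T)) * 1.0e6

def rbf(x1, x2, length, var):
    """Squared-exponential kernel between two scalar inputs."""
    return var * math.exp(-0.5 * ((x1 - x2) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def forward(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def backward(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_fit(xs, ys, length, var, noise):
    """Factor K + noise*I and precompute alpha = (K + noise*I)^-1 (y - mean)."""
    mean_y = sum(ys) / len(ys)
    K = [[rbf(p, q, length, var) + (noise if i == j else 0.0)
          for j, q in enumerate(xs)] for i, p in enumerate(xs)]
    L = cholesky(K)
    alpha = backward(L, forward(L, [y - mean_y for y in ys]))
    return L, alpha, mean_y

def gp_predict(x_new, xs, model, length, var):
    """Posterior mean and variance of the latent function at one input."""
    L, alpha, mean_y = model
    k = [rbf(x_new, x, length, var) for x in xs]
    mean = mean_y + sum(ki * ai for ki, ai in zip(k, alpha))
    v = forward(L, k)
    return mean, max(var - sum(vi * vi for vi in v), 0.0)

# Hand-picked hyperparameters (assumptions; a real model would fit them).
LENGTH, VAR, NOISE = 100.0, 1.0e4, 1.0e-2

temps = [250.0 + 50.0 * i for i in range(8)]   # training grid, 250 K to 600 K
targets = [b_vdw(T) for T in temps]
model = gp_fit(temps, targets, LENGTH, VAR, NOISE)

mean_425, var_425 = gp_predict(425.0, temps, model, LENGTH, VAR)  # interpolation
mean_800, var_800 = gp_predict(800.0, temps, model, LENGTH, VAR)  # extrapolation

print("T=425 K: predicted", mean_425, "reference", b_vdw(425.0), "variance", var_425)
print("T=800 K: predicted", mean_800, "reference", b_vdw(800.0), "variance", var_800)
```

In this toy setting the posterior variance is larger at the extrapolated temperature than at the interpolated one, which is the kind of uncertainty signal that distinguishes the two regimes discussed in the abstract. The numbers it prints say nothing about the accuracy reported in the paper.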

Original language: English
Pages (from-to): 2891-2898
Number of pages: 8
Journal: Physical Chemistry Chemical Physics
Volume: 23
Issue number: 4
DOIs
State: Published - Jan 28 2021
