Prediction of hydrate and solvate formation using statistical models

Khaled Takieddin, Yaroslav Khimyak, Laszlo Fabian

Research output: Contribution to journal › Article › peer-review


Novel, knowledge-based models for the prediction of hydrate and solvate formation are introduced, which require only the molecular formula as input. A data set of more than 19 000 organic, nonionic, and nonpolymeric molecules was extracted from the Cambridge Structural Database. Molecules that formed solvates were compared with those that did not, using molecular descriptors and statistical methods, which allowed the identification of chemical properties that contribute to solvate formation. The study covered five types of solvates: ethanol, methanol, dichloromethane, chloroform, and water solvates (hydrates). The identified properties were all related to the size and branching of the molecules and to their hydrogen-bonding ability. The corresponding molecular descriptors were used to fit logistic regression models that predict the probability that a given molecule forms a solvate. The established models predicted the behavior of ∼80% of the data correctly using only two descriptors per model.
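The two-descriptor logistic regression described in the abstract can be illustrated with a minimal sketch. This is not the authors' code or data: the descriptor values, labels, learning rate, and the function names `fit_logistic` and `predict_proba` are all hypothetical, and the fit uses plain gradient descent as a stand-in for whatever fitting procedure the study used. The sketch only shows how a model of the form p = 1 / (1 + exp(-(b + w1·x1 + w2·x2))) turns two descriptors into a solvate-formation probability.

```python
import math

def sigmoid(z):
    """Logistic function mapping a linear score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, labels, lr=0.5, steps=2000):
    """Fit a two-descriptor logistic model by batch gradient descent.

    rows   : list of (x1, x2) descriptor pairs (hypothetical, scaled to 0..1)
    labels : 1 if the molecule forms the solvate, 0 otherwise
    """
    w1 = w2 = b = 0.0
    n = len(rows)
    for _ in range(steps):
        g1 = g2 = gb = 0.0
        for (x1, x2), y in zip(rows, labels):
            err = sigmoid(w1 * x1 + w2 * x2 + b) - y  # gradient of the log-loss
            g1 += err * x1
            g2 += err * x2
            gb += err
        w1 -= lr * g1 / n
        w2 -= lr * g2 / n
        b -= lr * gb / n
    return (w1, w2), b

def predict_proba(weights, bias, x1, x2):
    """Predicted probability of solvate formation for one molecule."""
    return sigmoid(weights[0] * x1 + weights[1] * x2 + bias)

# Made-up toy data: x1 stands in for a size/branching descriptor and
# x2 for a hydrogen-bonding descriptor. The labels are arbitrary.
rows = [(0.8, 0.7), (0.9, 0.8), (0.7, 0.9), (0.85, 0.75),
        (0.2, 0.3), (0.1, 0.2), (0.3, 0.1), (0.15, 0.25)]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

weights, bias = fit_logistic(rows, labels)
p_high = predict_proba(weights, bias, 0.9, 0.9)
p_low = predict_proba(weights, bias, 0.1, 0.1)
print(f"p(solvate | 0.9, 0.9) = {p_high:.3f}")
print(f"p(solvate | 0.1, 0.1) = {p_low:.3f}")
```

Classifying a molecule as a solvate former when the predicted probability exceeds 0.5 would give a yes/no prediction of the kind whose accuracy the abstract reports. The threshold here is an assumption.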
Original language: English
Pages (from-to): 70-81
Number of pages: 12
Journal: Crystal Growth & Design
Issue number: 1
Early online date: 17 Nov 2015
Publication status: Published - 6 Jan 2016