Crystal structure prediction for multicomponent systems: energy models and structure generation
Author(s)
Zhang, Yizu
Type
Thesis or dissertation
Abstract
Crystalline materials are widely used in the pharmaceutical and agrochemical sectors. The aim of Crystal Structure Prediction (CSP) is to conduct polymorph screening by predicting all possible polymorphs given the chemical diagram of a compound. Various computational programmes have been developed for both academic and industrial use. However, their application to hydrate systems remains challenging. The aim of this thesis is to explore and improve the applicability of CSP to hydrates.
In this thesis, I first examined the applicability of a current lattice energy model to hydrates. This model consists of anisotropic distributed multipole moments, derived from isolated-molecule quantum mechanical calculations, to model the electrostatic interactions, combined with an isotropic atom-atom exp-6 Buckingham potential with empirical parameters to model repulsion and dispersion interactions. It has been shown to be successful in determining the low-energy structures of small organic crystals. Using 107 experimental hydrates extracted from the Cambridge Structural Database as starting points, I found that the energy model is able to reproduce around 95% of the structural geometries with different quantum mechanical levels of theory. The relative stability ordering of the computed structures based on the lattice energy was, however, not always satisfactory and varied with the level of theory adopted. The energy model also underestimated the binding energy for hydrate and hydrogen-bonding systems. The accuracy of our current energy model was insufficient for modelling crystals with complex short-range interactions, especially hydrogen bonds. I postulated that this can be addressed by including an explicit induction energy correction in the model.
Hence I examined the use of the isolated-molecule assumption and the polarisable continuum model (PCM) corrections within hydrate prediction. The electrostatics derived from ab initio molecular charge densities in the gas phase are replaced by simulations within a field of the surrounding molecules represented by point charges. A distributed multipolar representation of the electron density perturbation was applied in the classical polarisation model for the evaluation of the induction energy. This process for modelling induction was integrated into a current CSP methodology. The implementation was based on the recently developed lattice energy minimisation programme known as Crystal Structure Optimizer – Rigid Molecules (CSO-RM) for rigid-body systems, and its companion Crystal Structure Optimizer – Flexible Molecules (CSO-FM) to account for conformational flexibility. I assessed the energy rankings of experimental matches before and after induction corrections for three small organic hydrate systems, namely 2,6-diamino-4(3H)-pyrimidinone, gallic acid and theophylline, and demonstrated the importance of induction in the carbamazepine and diglycine crystals. The contribution to the lattice energy from the explicit induction term was generally found to favour hydrogen-bonding systems, and to result in a significant improvement among polymorphic/computed forms.
Another aspect of this work focused on improving the global search efficiency of the initial structure generation. I modified the current methodology, which suffers from the frequent occurrence of molecular overlaps. The modification could increase the initial structure generation speed by up to four times while preserving the quality of the structures generated.
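As an illustration of the isotropic atom-atom exp-6 Buckingham term named in the abstract, the pairwise repulsion-dispersion energy has the standard form U(r) = A exp(-B r) - C / r^6. The following is a minimal sketch only, not the thesis implementation: the function names and the use of a single parameter set (A, B, C) for every atom pair are assumptions made for brevity, whereas a real force field assigns parameters per pair of atom types, and no parameter values from the thesis are reproduced here.

```python
import math


def buckingham_exp6(r, A, B, C):
    """Exp-6 Buckingham energy for one atom-atom pair at separation r.

    A and B control the exponential repulsion, C the r^-6 dispersion.
    Units are whatever the caller uses consistently (hypothetical here).
    """
    if r <= 0.0:
        raise ValueError("separation must be positive")
    return A * math.exp(-B * r) - C / r**6


def pair_sum(coords_a, coords_b, A, B, C):
    """Sum the exp-6 term over all atom pairs between two molecules.

    coords_a and coords_b are sequences of (x, y, z) tuples. A single
    (A, B, C) set is used for every pair, which is a simplification.
    """
    total = 0.0
    for pa in coords_a:
        for pb in coords_b:
            total += buckingham_exp6(math.dist(pa, pb), A, B, C)
    return total
```

In a lattice energy model of the kind described, a sum like this would run over molecules within a cutoff in the crystal and be added to the multipole electrostatic (and, where included, induction) contributions.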
Version
Open Access
Date Issued
2023-06
Date Awarded
2023-09
Copyright Statement
Creative Commons Attribution NonCommercial Licence
Advisor
Adjiman, Claire S.
Pantelides, Constantinos C.
Sponsor
Engineering and Physical Sciences Research Council
Eli Lilly and Company (Firm)
Grant Number
EP/T005556/1
EP/T518207/1
Publisher Department
Chemical Engineering
Publisher Institution
Imperial College London
Qualification Level
Doctoral
Qualification Name
Doctor of Philosophy (PhD)