# Focus on Recent Methodological Developments in ONETEP

## Predicting Solvatochromic shifts using LS-DFT

The colour of a chemical compound is just about top of the list of its most-easily measured properties, a basic measurement requiring no specialised equipment beyond one's eyes, and perhaps a colour chart for comparison. To put this on a more quantitative footing, one can carefully measure the experimental absorption spectrum of a compound, convolve it with the known response functions of each of the three colour receptors in the human eye (red, green and blue), and exactly reproduce a sample of this colour.

However, despite many decades of continual advancement in our ability to predict most properties of materials, for example predicting excitations via Time-Dependent DFT, it still remains extremely challenging to accurately predict colours, particularly in a complex environment such as a liquid solvent. This capability would be of great value in numerous fields, including the food manufacturing, pharmaceutical and textile industries. It would allow rapid identification of potential impurity molecules, and rapid screening of candidate molecules based on their colour, without needing to synthesise them.

One of the most challenging aspects of this kind of theoretical spectroscopy is understanding how the excitations of a molecule shift when it is placed in the environment of a solvent. The excitation will “hybridise” with excitations of the surrounding environment, and spread into it to some extent, resulting in a so-called “solvatochromic shift”, namely a shift of absorption energies compared to their value in vacuum. However, the calculations required to capture this effect need to be performed on a large, realistic cluster of molecule plus solvent, requiring a model that is very large by the usual standards quantum-mechanical modelling (thousands of atoms).

Using time-dependent DFT calculations on an unprecedented scale, a team of researchers from Cambridge, Imperial and Warwick, led by Dr Nicholas Hine, has used the ONETEP Linear-Scaling DFT code (www.onetep.org) to investigate how the solvatochromic shift of a widely-used red dye (Alizarin) behaves in aqueous solvent. This is a vital step towards accurate colour prediction from first principles.

This work has recently been published in the Journal of Chemical Theory and Computation:

## Natural Bond Orbital Analysis of Large Protein Systems

First principles electronic structure calculations are typically performed in terms of molecular orbitals (or bands), providing a straightforward theoretical avenue for approximations of increasing sophistication, but do not usually provide any qualitative chemical information about the system. In a recent Journal of Computational Chemistry article, Lee, Cole, Payne and Skylaris show how we can derive such information via post-processing of a ONETEP calculation using natural bond orbital (NBO) analysis, which produces a chemical picture of bonding in terms of localized Lewis-type bond and lone pair orbitals that we can use to understand molecular structure and interactions. They demonstrate the capabilities of this approach on large proteins, including an analysis of hydrogen bonding in an 8000-atom model of the avidin protein's binding pocket (pictured)

## Linear-Scaling DFT+U for strongly correlated systems

DFT+U is a simple, computationally inexpensive method for improving the description of on-site Coulomb interactions provided by conventional exchange-correlation (XC) functionals and, hence, for extending the range of applicability of DFT to strongly correlated materials.

The principle of DFT+U is to divide the system into a delocalized, free-electron-like part, the "bath," which is well-described by conventional XC functionals, and a set of "correlated sites" which is not. The XC functional for electrons associated with these sites is then explicitly augmented with screened Coulomb interactions, the form of which are inspired by the Hubbard model.

The correlated sites are typically defined by a set of 3d and/or 4f atomic-like orbitals, or "Hubbard projectors," that are chosen a priori. Projector functions that are commonly used include hydrogenic wave functions, maximally localized Wannier functions, and linear muffin-tin orbital-type orbitals.

In a series of developments over the past few years, we have implemented DFT+U in ONETEP. The correlated sites are defined in terms of non-orthogonal generalised Wannier functions (NGWFs) that are self-consistent with the DFT+U ground state itself [1] - to the best of our knowledge the first implementation of such a self-consistent approach. We have also addressed a long-standing problem of how to take account of the non-orthogonality of the projectors in a tensorially correct fashion [2]. Our implementation scales linearly with the system size and has been applied to systems of up to 7000 atoms [3,4]. We are currently using it to study the properties of doped magnetic semiconducting materials.

Optimised NGWFs in a metal porphyrin molecule. Left: initial hydrogenic projector; middle: optimised NGWF after DFT+U calculation using hydrogenic projectors for correlated subspace; right: DFT+U self-consistent optimised NGWF after further refinement with DFT+U using NGWFs themselves as projectors for correlated subspace.

- Projector self-consistent DFT+U using non-orthogonal generalized Wannier functions, D. D. O'Regan, N. D. M. Hine, M. C. Payne and A. A. Mostofi, Phys. Rev. B 82, 081102(R) (2010).
- Subspace representations in ab initio methods for strongly correlated systems, D. D. O'Regan, M. C. Payne and A. A. Mostofi, Phys. Rev. B 83, 245124 (2011).
- Linear-scaling DFT+U with full local orbital optimization, D. D. O'Regan, N. D. M. Hine, M. C. Payne and A. A. Mostofi, Phys. Rev. B 85 085107 (2012).
- Optimised Projections for the Ab Initio Simulation of Large and Strongly Correlated Systems, D. D. O'Regan, (Springer, Berlin, Heidelberg, 2012) 1st Ed., Springer Theses XVI, 225p., ISBN 978-3-642-23237-4..

## Accurate Electrostatics for large systems

Researchers at Southampton, Imperial and Cambridge have contributed to several major developments in the way ONETEP deals with the large-scale electrostatics problems encountered in LS-DFT simulations. To retain the predictive power of quantum mechanics in realistic environments for the determination of total energies, the electrostatics must be treated with very high accuracy.

### Electrostatic Artifacts affecting the HOMO-LUMO Gap

The energy difference between the highest occupied and the lowest unoccupied molecular orbitals in DFT is known as the HOMO-LUMO gap. It has been shown previously that calculated HOMO-LUMO gaps can close for large polarisable systems like proteins, or even water clusters. It has not been clear whether the gap closure relates to the approximate functionals used to treat exchange and correlation (which do indeed underestimate gaps), or whether it is a result of electrostatic effects relating to interfaces and solvation. Either way, lack of HOMO-LUMO gap seriously hinders convergence of large DFT calculations.

Lever, Cole, Hine, Haynes and Payne have recently used ONETEP to confirm the results of previous DFT calculations showing closure of the HOMO-LUMO gap in large water clusters and in proteins. However, they also conclusively demonstrated that the vanishing gap results not from the choice of density functional, but rather from the treatment of the interface between the system and the surrounding vacuum, particularly with regard to dielectric screening. Improper treatments produce large dipole moments and incorrect electrostatic potentials which lower the eigenenergies in some regions of the system and raise them in others, closing the HOMO-LUMO gap.

The paper also provides a range of practical solutions to maintain the gap as the system size is increased, including: use of classical structural optimisation of the interface between water and vacuum; screening the molecular dipole moments via the use of implicit solvation; and embedding the quantum mechanical system in the potential of classical point charges to represent a water environment. We demonstrate that these approaches restore the gap for clusters of water molecules of 1755 atoms and also for protein systems comprising 2386 atoms.

### Electrostatic Embedding

In a recent article in J. Chem. Phys, Fox, Pittock, Skylaris and co-workers detailed a new way to perform electrostatic embedding, whereby a quantum-mechanical system is embedded in a system of classical charges. They demonstrated this approach using LS-DFT simulations of large biomolecules embedded in water clusters where beyond a certain range, the water was described by classical charges. They showed that interaction energies of ligands with biomolecules in solvation can be made to converge fairly rapidly in this scheme to the answer that would be obtained if the full system was treated quantum mechanically.

### Electrostatics with Periodic Boundaries

Hine, Dziedzic, Haynes and Skylaris have recently used ONETEP to perform a comparison of various approaches to the problem of large-scale electrostatic interactions in finite systems. They showed that with three different approaches (truncation of the Coulomb potential, use of the minimum-image convention, and use of Open Boundary Conditions with a multigrid approach to the Poisson equation), it was possible to obtain the "isolated" limit even within a calculation performed under Periodic Boundary Conditions. However, the different methods approached this limit with very different efficiency and convergence. The results are compared in a recent J. Chem. Phys. paper.

### Implicit Solvation

Dziedzic and Helal (working with Skylaris, Mostofi and Payne) have recently published a new approach to performing calculations in an "implicit" solvent, ie one which is not included in terms of the individual atoms, but rather by a modification to the dielectric constant. This makes simulations of biological systems in an aqueous environment far more realistic, without the need to explicitly treat unmanageably large numbers of water molecules around a biomolecule.

## Conduction States using Localised Orbitals

Ratcliff, Hine and Haynes have developed within ONETEP a new method for calculating optical absorption spectra of large systems, with linear-scaling computational effort. This incorporates a scheme for optimizing a new set of localized orbitals to accurately represent the unoccupied Kohn-Sham states. In a recent paper, the method was applied to the calculation of the density of states and optical absorption spectrum for the metal-free phthalocyanine molecule (see figure) and the conjugated polymer poly(para-phenylene). While the description of the unoccupied states in terms of usual valence-optimised localised orbitals of a ONETEP calculation (red) was in poor agreement with a results from a traditional DFT code (black), using the new approach the agreement is seen to be excellent (blue). In future, we plan to extend the scope of these techniques by applying them to PAW calculations and to EELS spectroscopy.