*Featured image and figures used with permission via ACS AuthorChoice open access.*

Title: Machine Learning Adaptive Basis Sets for Efficient Large Scale Density Functional Theory Simulation

Authors: Ole Schütt and Joost VandeVondele

Year: 2018

Journal: Journal of Chemical Theory and Computation

DOI: 10.1021/acs.jctc.8b00378

How would you ask someone for directions to the washroom in a library? You would probably keep your voice as low as possible. And in a pub? There you would have to raise your voice just for the staff to hear you! In daily life, we learn to adapt to our environment. The same is true when chemists model atoms: the parameters that describe them have to be adjusted to the situation.

In the eyes of computational chemists, atoms are no longer the circles drawn in many chemistry textbooks. Instead, each atom is represented by a set of functions called a basis set. The basis set translates the shapes of the orbitals we met in lectures into mathematical functions that a computer can work with. This is what makes the calculations feasible.
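To make the idea of "an atom as a set of functions" concrete, here is a minimal sketch. It assumes Gaussian-type functions, which are one common choice (the summary above does not say which functions the authors use), and it uses the standard STO-3G parameters for hydrogen purely as an illustration. The function names are made up for this example.

```python
import math

# Standard STO-3G parameters for hydrogen: (exponent, contraction coefficient).
# Used here only as an illustration; they are not taken from the paper.
STO3G_HYDROGEN = [
    (3.42525091, 0.15432897),
    (0.62391373, 0.53532814),
    (0.16885540, 0.44463454),
]

def gaussian(alpha, r):
    """Normalized s-type Gaussian evaluated at distance r (atomic units)."""
    return (2.0 * alpha / math.pi) ** 0.75 * math.exp(-alpha * r * r)

def basis_function(r, primitives=STO3G_HYDROGEN):
    """Contracted basis function: a fixed weighted sum of Gaussians."""
    return sum(c * gaussian(a, r) for a, c in primitives)

def slater(r, zeta=1.24):
    """The Slater-type 1s function that the three Gaussians imitate."""
    return math.sqrt(zeta ** 3 / math.pi) * math.exp(-zeta * r)

# Compare the three-Gaussian approximation with its target at a few distances.
for r in (0.0, 0.5, 1.0, 2.0):
    print(f"r = {r:.1f}  gaussians = {basis_function(r):.4f}  slater = {slater(r):.4f}")
```

The two curves agree well away from the nucleus and differ most at r = 0. A larger basis set would add more such functions, at a higher computational cost.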

However, atoms behave differently in different situations. For example, some elements (such as halogens) polarize neighboring atoms more easily than others. The basis set of an atom therefore has to adapt to its atomic environment. The researchers propose adaptive basis sets, which are small enough for computers to handle easily and can be significantly more accurate than traditional basis sets of the same size.

They have devised a method to predict these adaptive basis sets by machine learning. Machine learning enables computers to make approximations or predictions from a given set of training data. This is often a challenge in quantum calculations, because large training sets are required and generating them is computationally expensive. The researchers use a descriptor that represents the atom positions and characterizes the chemical environment, which lets them derive the adaptive basis set from geometrical information alone. In their approach, the neighboring atoms are used to construct the functions and parameters that transform a conventional basis set into an adaptive basis set.
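The idea of going from geometry to a descriptor to a predicted basis-set parameter can be sketched in a few lines. Everything here is a toy assumption: the descriptor (sorted inverse distances), the regression (a kernel-weighted average), the function names, and the training numbers are invented for illustration and are not the authors' actual descriptor or learning method.

```python
import math

def descriptor(center, neighbors, size=4):
    """Toy descriptor: the `size` largest inverse distances to the neighbors.

    Distances do not change when the whole geometry is rotated or shifted,
    so the descriptor depends only on the local environment.
    """
    inverse = sorted((1.0 / math.dist(center, n) for n in neighbors), reverse=True)
    return (inverse + [0.0] * size)[:size]  # pad or truncate to a fixed length

def predict(query, training, width=0.5):
    """Kernel-weighted average of the training targets (Nadaraya-Watson)."""
    weights = []
    for known, _ in training:
        squared = sum((a - b) ** 2 for a, b in zip(query, known))
        weights.append(math.exp(-squared / (2.0 * width ** 2)))
    total = sum(weights)
    return sum(w * target for w, (_, target) in zip(weights, training)) / total

# Made-up training data: (descriptor of an environment, "optimal" parameter).
center = (0.0, 0.0, 0.0)
training = [
    (descriptor(center, [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]), 0.80),
    (descriptor(center, [(1.5, 0.0, 0.0), (0.0, 1.5, 0.0)]), 0.60),
    (descriptor(center, [(2.0, 0.0, 0.0), (0.0, 2.0, 0.0)]), 0.50),
]

# A new environment, oriented differently from any training geometry.
new_environment = descriptor(center, [(0.0, 1.2, 0.0), (0.0, 0.0, 1.2)])
print(predict(new_environment, training))
```

Because the descriptor only looks at distances, a rotated copy of a training geometry gives exactly the same input, so no extra training data is needed to cover different orientations.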

Chemistry tells us that an atom prefers to stay in its lowest-energy state, the ground state, just as we look for the most comfortable position when we rest. Interestingly, the researchers find that once the parameters of the adaptive basis set have been fully optimized, the chemical system is also at its lowest energy. They attribute this to the stability of their simulation.
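The link between optimizing basis-set parameters and reaching the lowest energy can be illustrated with a textbook case that is not taken from the paper: a hydrogen atom described by a single Gaussian with exponent alpha. For that trial function the energy in atomic units is E(alpha) = 3*alpha/2 - 2*sqrt(2*alpha/pi), and the sketch below finds the exponent that minimizes it with a simple golden-section search (the helper names are hypothetical).

```python
import math

def energy(alpha):
    """Energy (Hartree) of a hydrogen atom described by one Gaussian exp(-alpha r^2)."""
    return 1.5 * alpha - 2.0 * math.sqrt(2.0 * alpha / math.pi)

def minimize(f, lo, hi, tol=1e-10):
    """Golden-section search for the minimum of a single-valley function on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    while hi - lo > tol:
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if f(a) < f(b):
            hi = b  # the minimum lies in [lo, b]
        else:
            lo = a  # the minimum lies in [a, hi]
    return (lo + hi) / 2.0

best_alpha = minimize(energy, 0.01, 2.0)
print(best_alpha, energy(best_alpha))
# Analytic result: alpha = 8/(9*pi), about 0.283, with E = -4/(3*pi), about -0.424 Hartree.
# The exact ground-state energy of hydrogen is -0.5 Hartree.
```

The best single Gaussian still sits above the exact value of -0.5 Hartree. Any other exponent gives a higher energy, which is why tuning the parameters and lowering the energy go hand in hand.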

In their molecular simulations of water, they have also shown that small basis sets are sufficient to yield structural properties (Figure 1). They have demonstrated an effective 60-fold run-time speedup compared to the standard approach. This saves a great deal of computational power and allows chemists to obtain results much faster!

Figure 1. As the size of the training data set increases, the error of the basis set decreases (blue). The errors of an optimized basis set (green) and a traditional minimal basis set (red) are shown for comparison. Only a small training data set is required to reach the limit of the optimized basis set.

While the researchers have shown that their method is robust and requires only a small training set, they note that further studies are needed to improve it. These include the search for a good general-purpose descriptor and a refinement of the parametrization that reduces the number of parameters required for the basis set. Still, this work shows how machine learning can help with sophisticated computational chemistry calculations!