Abstractxx

Semiempirical electronic structure methods are increasingly parameterized and benchmarked against data obtained by DFT or wavefunction-based calculations using rather than experimental data (Stewart 2007, Scholten 2003, Gaus 2013). Using calculated data has the advantage that it represents the precise value (usually the electronic energy) that is being parameterized, with little random noise with good coverage of chemical space, including molecules that are difficult to synthesize or perform measurements on. Carefully curated benchmark sets, such as GMTKN30 (Goerigk 2011), are therefore an invaluable resource to the scientific community and heavily used.

For example, Korth and Thiel (Korth 2011) used the GMTKN24-hcno dataset (21 subsets of the GMTKN24 data set (Goerigk 2010), an earlier version of GMTKN30) to show that modern semi-empirical methods are approaching the accuracy of PBE/TZVP and B3LYP/TZVPcalculations. While this is encouraging one concern is whether the results obtained for the small systems that make up these data sets are representative of those one would obtain for the large systems. For example, Yalmazer and Korth (Yilmazer 2013) performed a benchmark study of hundreds of protein-ligand complexes that included protein atoms within up to 10 Å from the ligand and showed that, for example, that the mean absolute deviation (MAD) between interaction energies computed using PM6-DH+ and BP86-D2/TZVP was 14 kcal/mol. In comparison the MADs for the S22 interaction energy subset of GMTKN24 are <2 kcal/mol for both dispersion corrected PM6 and DFT/TZVP calculations in the Korth and Thiel study (Korth 2011). One likely explanation is that the systems in the S22 subset are too small to exhibit many-body polarization contributions to the binding energy that semi-empirical methods fail to capture. Another, or additional, reason is that the S22 subset does not include ionic groups, which are quite common in proteins and ligands.

The Yalmazer and Korth study (Yilmazer 2013) raises a similar question about whether benchmark results for semiempirical barrier height-predictions on small systems, such as the BH76 and BHPERI subsets of GMTK24/30, are transferable to barrier height predictions for enzymes. This towards answering this question is to create a benchmark set barriers computed for systems that are relatively large and representative to enzymatic reactions. This is a considerable challenge because, unlike for ligand-protein complexes, there is no large database of corresponding transition state (TS) structures (or even substrate-enzyme structures) to start from. Thus, TS structures must be computed which is time-consuming and hard to automate. There are a significant number of such structures in the literature but many are not computed at a high enough level of theory to serve as benchmarks. Furthermore, TS structures are known to dependent significantly on the level of theory used and it is therefore important that the benchmark set is computed using identical or very similar levels of theory. Creating such a benchmark set is this a considerable challenge for any one research group but can be addressed by a concerted effort from the community. This paper represents the first step in this process.

We have collected barrier heights and reaction energies (and associated structures) for five enzymes from studies published by Himo and co-workers (Chen 2007, Georgieva 2010, Hopmann 2008, Liao 2011, Sevastik 2007) on a Gitbub repository (github.com/jensengroup/db-enzymes). Using this data, obtained at the same level of theory, we then benchmark PM6, PM7, PM7-TS, and DFTB3 and discuss the influence of system size. bulk solvation, and geometry re-optimization on the error. We end by discuss steps needed to expand and improve the data set and how other researchers can contribute to the process.

Five systems are investigated: L-aspartate \(\alpha\)-decarboxylase (AspDC), 4-oxalocrotonate tautomerase (4-OT), phosphotriesterase (PTE), histone lysine methyltransferase (HKMT), and haloalcohol dehalogenase (HheC). The B3LYP/6-311+G(2d,2p)//B3LYP/6-31G(d,p) (LANL2DZ is used for Zn in 1HZY) barrier heights and reaction energies are taken from the literature: AspDC (Liao 2011), 4-OT (Sevastik 2007), PTE (Chen 2007), HKMT (Georgieva 2010), and HhecC (Hopmann 2008) and the corresponding atomic coordinates are taken from the supplementary information or supplied by Fahmi Himo. 4-OT and PTE have two step mechanism resulting in two barrier heights and reaction energies. All energies are taken relative to the reactant state which results in a negative barrier for the second step in the 4-OT mechanism (4-OT-2 in Table \ref{table1}). The largest model system for each study is used unless otherwise noted and PCM results for for a dielectric constant of 80. The PM6 (Stewart 2007) and PM7 single point calculations are performed using MOPAC2012 while the DFTB3 (xx) single point calculations are performed using GAMESS-US (Schmidt 1993, Nishimoto 2015). For PM6/COSMO (Klamt 1993) and DFTB/PCM (xx) calculations are performed using a dielectric constant of 80.

The B3LYP results include zero-point energy (ZPE) corrections and are therefore directly comparable to the relative enthalpy values predicted by PM6 and PM7. However, ZPE corrections are not included for the DFTB3 calculations and we note that the ZPE can contribute to the difference observed between the DFTB3 and B3LYP results.

Table \ref{table1} lists barrier heights and reaction energies computed using PM6, PM7 and DFTB3 single point energies computed using B3LYP/6-31G(d,p) geometries. For barrier heights the mean absolute differences (MADs) are 13, 15, and 17 kcal/mol for PM6, PM7 and DFTB3. The 13 kcal/mol MAD for PM6 is comparable to the 14 and 10 kcal/mol MADs computed for PM6 by Korth and Thiel (xx) for the BH76 and BHPERI datasets, respectively. For PM6 the accuracy best for AspDC and 4-OT, for which the models only consist of atoms in the first and second row of the periodic table. For PM6 and PM7 the MADs are dominated by the PTE system (the only system containing a transition metal, Zn) where the errors range from 30 to 41 kcal/mol. Removing these two entries reduces the MADs to 5 and 7 kcal/mol, respectively for PM6 and PM7. It is also worth noting that the second barrier in the PTE mechanism computed relative to the reaction intermediate is reproduced to within 3.4 and 1.4 kcal/mol with PM6 and PM7. For DFTB3 the MAD is dominated by AspDC and HKMT with errors of 32 and 27 kcal/mol, while the first and second barrier for PTE is reproduced very and reasonably well, respectively.

The errors computed for reaction energies have MADs of 9, 14, and 17 kcal/mol for PM6, PM7, and DFTB3. The lower MAD for PM6 compared to barrier heights is primarily due to the fact that the 10 error in the second step of the PTE mechanism (i.e. the difference between the product and the reactant) is considerably smaller than the 33 kcal/mol error in the corresponding barrier height. In the case of PM6 and PM7 there is generally a correlation between errors in the reaction energies and errors in the corresponding barrier heights. In the case of DFTB3 the correlation, if any, is inverse, i.e. errors in reaction energies are large for systems with small errors in barrier heights.