Semiempirical electronic structure methods are increasingly parameterized and benchmarked against data obtained by DFT or wavefunction-based calculations rather than experimental data (Stewart 2007, Scholten 2003, Gaus 2013). Using calculated data has the advantage that it represents the precise value (usually the electronic energy) that is being parameterized, with little random noise and with good coverage of chemical space, including molecules that are difficult to synthesize or perform measurements on. Carefully curated benchmark sets, such as GMTKN30 (Goerigk 2011), are therefore an invaluable and heavily used resource for the scientific community.
For example, Korth and Thiel (Korth 2011) used the GMTKN24-hcno dataset (21 subsets of the GMTKN24 data set (Goerigk 2010), an earlier version of GMTKN30) to show that modern semiempirical methods are approaching the accuracy of PBE/TZVP and B3LYP/TZVP calculations. While this is encouraging, one concern is whether the results obtained for the small systems that make up these data sets are representative of those one would obtain for large systems. For example, Yilmazer and Korth (Yilmazer 2013) performed a benchmark study of hundreds of protein-ligand complexes that included protein atoms within up to 10 Å of the ligand and showed that the mean absolute deviation (MAD) between interaction energies computed using PM6-DH+ and BP86-D2/TZVP was 14 kcal/mol. In comparison, the MADs for the S22 interaction energy subset of GMTKN24 are <2 kcal/mol for both dispersion-corrected PM6 and DFT/TZVP calculations in the Korth and Thiel study (Korth 2011). One likely explanation is that the systems in the S22 subset are too small to exhibit many-body polarization contributions to the binding energy that semiempirical methods fail to capture. Another, or additional, reason is that the S22 subset does not include ionic groups, which are quite common in proteins and ligands.
The Yilmazer and Korth study (Yilmazer 2013) raises a similar question about whether benchmark results for semiempirical barrier height predictions on small systems, such as the BH76 and BHPERI subsets of GMTKN24/30, are transferable to barrier height predictions for enzymes. A first step towards answering this question is to create a benchmark set of barriers computed for systems that are relatively large and representative of enzymatic reactions. This is a considerable challenge because, unlike for ligand-protein complexes, there is no large database of corresponding transition state (TS) structures (or even substrate-enzyme structures) to start from. Thus, TS structures must be computed, which is time-consuming and hard to automate. There are a significant number of such structures in the literature, but many are not computed at a high enough level of theory to serve as benchmarks. Furthermore, TS structures are known to depend significantly on the level of theory used, and it is therefore important that the benchmark set is computed using identical or very similar levels of theory. Creating such a benchmark set is thus a considerable challenge for any one research group but can be addressed by a concerted effort from the community. This paper represents the first step in this process.
We have collected barrier heights and reaction energies (and associated structures) for five enzymes from studies published by Himo and co-workers (Chen 2007, Georgieva 2010, Hopmann 2008, Liao 2011, Sevastik 2007) in a GitHub repository (github.com/jensengroup/db-enzymes). Using this data, obtained at the same level of theory, we then benchmark PM6, PM7, PM7-TS, and DFTB3 and discuss the influence of system size, bulk solvation, and geometry re-optimization on the error. We end by discussing the steps needed to expand and improve the data set and how other researchers can contribute to the process.
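The error measure quoted above for the earlier benchmark studies, the mean absolute deviation (MAD), can be illustrated with a minimal sketch. The function name and the numerical values below are hypothetical placeholders chosen for illustration only; they are not data from this study or from any of the cited work.

```python
def mean_absolute_deviation(method_values, reference_values):
    """Return the mean absolute deviation between two equal-length sequences."""
    if len(method_values) != len(reference_values):
        raise ValueError("Both sequences must have the same length.")
    if len(method_values) == 0:
        raise ValueError("At least one value is required.")
    # Average of |method - reference| over all entries
    total = sum(abs(m - r) for m, r in zip(method_values, reference_values))
    return total / len(method_values)


# Placeholder barrier heights in kcal/mol (illustrative numbers, not real results)
reference_barriers = [10.0, 15.0, 20.0]
semiempirical_barriers = [12.0, 14.0, 26.0]

mad = mean_absolute_deviation(semiempirical_barriers, reference_barriers)
print(f"MAD = {mad:.1f} kcal/mol")
```

Because the MAD averages unsigned errors, it reports the typical size of the deviation from the reference but not whether a method systematically over- or underestimates the barriers.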