Zhongyu Wan

and 2 more

This work investigates whether the accuracy of semiempirical quantum chemistry (SQC) methods for predicting the standard enthalpy of formation (Δ_f H^o) can be improved by using an artificial neural network (ANN) with molecular descriptors. A total of 142 organic compounds with sufficient structural diversity are included in the training set. Standard enthalpies of formation for the selected compounds at the PM3 and PM6 levels are collected from the literature, and those at the PM7 level are calculated in this work. Stepwise multiple regression is first employed to screen for effective molecular descriptors, i.e. those that are highly correlated with the errors of the calculated standard enthalpies of formation relative to experimental values. The 7 effective molecular descriptors obtained are then used as the input set to establish three 7-11-1 neural network-based correction models, one for each SQC method. With the developed correction models, the mean absolute errors (MAE) in Δ_f H^o for the PM3, PM6, and PM7 methods are reduced from 22.36, 18.60, and 17.27 kJ/mol to 9.86, 9.83, and 8.95 kJ/mol, respectively. Results on the test set indicate that the neural networks do not suffer from over-fitting. Detailed analysis of the 7 effective molecular descriptors indicates that the main contribution to the correction models comes from the electron-withdrawing effect. The developed ANN models for the three selected SQC methods provide an efficient route to the quick and accurate prediction of thermodynamic properties.
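To make the idea of a 7-11-1 correction model concrete, the following is a minimal sketch, not the authors' implementation. It assumes a tanh hidden layer, a linear output, plain stochastic gradient descent, and the sign convention error = calculated minus experimental, so that the corrected value is the SQC value minus the predicted error. The abstract specifies none of these details. The descriptors and enthalpies are synthetic placeholders, not data from the paper, and all function names are hypothetical.

```python
import math
import random

N_IN, N_HIDDEN = 7, 11  # 7 descriptors, 11 hidden units, 1 output (7-11-1)


def init_params(seed=0):
    """Small random weights; activation and initialisation are assumptions."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(N_IN)] for _ in range(N_HIDDEN)]
    b1 = [0.0] * N_HIDDEN
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)]
    return w1, b1, w2, 0.0


def forward(params, x):
    """Return hidden activations and the predicted SQC error for descriptors x."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2


def train(params, xs, targets, lr=0.01, epochs=200):
    """Per-sample gradient descent on the squared error 0.5 * (out - target)**2."""
    w1, b1, w2, b2 = params
    for _ in range(epochs):
        for x, target in zip(xs, targets):
            hidden, out = forward((w1, b1, w2, b2), x)
            d_out = out - target
            for j in range(N_HIDDEN):
                # Backpropagate through tanh before updating w2[j].
                d_hidden = d_out * w2[j] * (1.0 - hidden[j] ** 2)
                w2[j] -= lr * d_out * hidden[j]
                for i in range(N_IN):
                    w1[j][i] -= lr * d_hidden * x[i]
                b1[j] -= lr * d_hidden
            b2 -= lr * d_out
    return w1, b1, w2, b2


def mae(values, reference):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(v - r) for v, r in zip(values, reference)) / len(reference)


# Synthetic placeholder data: the "SQC error" depends on two of the descriptors.
rng = random.Random(1)
descriptors = [[rng.uniform(-1.0, 1.0) for _ in range(N_IN)] for _ in range(40)]
exp_values = [rng.uniform(-5.0, 5.0) for _ in descriptors]
errors = [2.0 * d[0] - 1.0 * d[1] + 0.5 for d in descriptors]
sqc_values = [e + err for e, err in zip(exp_values, errors)]

params = train(init_params(), descriptors, errors)

# Corrected value = SQC value minus the error predicted by the network.
corrected_values = [s - forward(params, d)[1]
                    for s, d in zip(sqc_values, descriptors)]

mae_before = mae(sqc_values, exp_values)
mae_after = mae(corrected_values, exp_values)
print(f"MAE before correction: {mae_before:.3f}")
print(f"MAE after correction:  {mae_after:.3f}")
```

In this setup the network is trained on the error term rather than on Δ_f H^o itself, which mirrors the abstract's description of descriptors being screened against the errors of the SQC values. A held-out test set, as mentioned in the abstract, would be evaluated with the same `mae` function on compounds not passed to `train`.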