Comparison of the results of regression and neural network methods

To compare the best FFSOLS regression and ANN models, one must first train both models and estimate their generalization error on the same k folds. If the folds differed, observations that are easier to predict could end up in the test folds of one model but not in those of the other, which would bias the comparison. Using identical folds for both models avoids this issue.
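The idea of sharing folds between two models can be sketched as follows. This is a minimal illustration and not the code used in this study: the data are synthetic, and `make_folds`, `cv_errors`, `ols_fit_predict` and `mean_fit_predict` are hypothetical names standing in for the actual FFSOLS and ANN pipelines. The only point it makes is that the fold indices are built once and reused.

```python
import numpy as np

def make_folds(n_samples, k=10, seed=0):
    """Return k disjoint arrays of test indices covering 0..n_samples-1."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    return np.array_split(shuffled, k)

def cv_errors(fit_predict, X, y, folds):
    """Mean squared test error on each fold for one model.

    fit_predict(X_train, y_train, X_test) must return predictions for X_test.
    """
    errors = []
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False
        y_pred = fit_predict(X[train_mask], y[train_mask], X[test_idx])
        errors.append(np.mean((y[test_idx] - y_pred) ** 2))
    return np.array(errors)

# Two stand-in models (hypothetical, for illustration only).
def ols_fit_predict(X_train, y_train, X_test):
    A_train = np.column_stack([np.ones(len(X_train)), X_train])
    coef, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    A_test = np.column_stack([np.ones(len(X_test)), X_test])
    return A_test @ coef

def mean_fit_predict(X_train, y_train, X_test):
    return np.full(len(X_test), y_train.mean())

# Synthetic placeholder data, not the data set of this study.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=200)

folds = make_folds(len(y), k=10, seed=0)             # built once
errors_a = cv_errors(ols_fit_predict, X, y, folds)   # reused for model A
errors_b = cv_errors(mean_fit_predict, X, y, folds)  # reused for model B
```

Because both calls to `cv_errors` receive the same `folds` object, the i-th entry of `errors_a` and the i-th entry of `errors_b` are measured on exactly the same observations.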

Once again, 10-fold cross-validation was chosen to estimate the generalization error. The 10 test errors of the two models can then be compared using a t-test to assess whether the two vectors of test errors come from the same population. This t-test is given in Figure \ref{ResidAnnTraining}, together with the corresponding box plots of the two vectors of test errors.
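As a rough sketch of such a comparison, the snippet below applies a t-test to two vectors of 10 fold errors using SciPy. The text does not state whether the test was paired or unpaired, so both variants are shown as an assumption. The error values are synthetic placeholders, not the results of this study, and the variable names are hypothetical.

```python
import numpy as np
from scipy import stats

# Synthetic placeholder fold errors (NOT the errors reported in this study).
rng = np.random.default_rng(0)
errors_ffsols = rng.gamma(shape=2.0, scale=1.0, size=10)
errors_ann = rng.gamma(shape=2.0, scale=1.2, size=10)

# Unpaired (Welch) t-test: treats the two vectors as independent samples.
welch = stats.ttest_ind(errors_ffsols, errors_ann, equal_var=False)

# Paired t-test: uses the fact that fold i is the same for both models.
paired = stats.ttest_rel(errors_ffsols, errors_ann)

alpha = 0.05  # significance threshold
for name, res in [("Welch", welch), ("paired", paired)]:
    verdict = "reject" if res.pvalue < alpha else "fail to reject"
    print(f"{name}: t = {res.statistic:.3f}, p = {res.pvalue:.3f} "
          f"-> {verdict} equal means")
```

When the folds are shared between the models, the paired variant is usually the more natural choice, since it tests the per-fold differences rather than the two pooled samples.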

Even though the box plots give the visual impression that the FFSOLS regression is more efficient overall than the ANN model, this difference is not statistically significant: the p-value of the test is much larger than 0.05 (i.e. the significance threshold corresponding to a 95 % confidence level).

We therefore cannot conclude that one model is more efficient than the other. However, one should be aware that one of the assumptions of the t-test, namely the normality of the two samples being compared, is far from being fulfilled; hence the conclusions of this test must be put into perspective.
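One possible way to examine the normality assumption is a Shapiro-Wilk test on each vector of fold errors. This test is introduced here purely as an illustration and is not claimed to be the check performed in this study. The values below are synthetic placeholders and the variable names are hypothetical.

```python
import numpy as np
from scipy import stats

# Synthetic placeholder fold errors (NOT the errors reported in this study).
rng = np.random.default_rng(0)
errors_ffsols = rng.gamma(shape=2.0, scale=1.0, size=10)
errors_ann = rng.gamma(shape=2.0, scale=1.2, size=10)

normality = {}
for name, errors in [("FFSOLS", errors_ffsols), ("ANN", errors_ann)]:
    # Null hypothesis of Shapiro-Wilk: the sample comes from a normal distribution.
    w_stat, p_value = stats.shapiro(errors)
    normality[name] = (w_stat, p_value)
    print(f"{name}: W = {w_stat:.3f}, p = {p_value:.3f}")
```

A small p-value is evidence against normality. With only 10 values per model, however, such a test has little power itself, so a large p-value should not be read as confirmation that the assumption holds.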