Figure 2. Performance of the Random Forest regression models in
predicting a) Jsc, b) Voc, c) ff and d) ECE.
As shown in Figure 2, the RF regression models were
used to predict the photovoltaic parameters,
including Jsc, Voc, ff and ECE. The performance of the RF model was evaluated by
using the coefficient of determination (R2) and mean
squared error (MSE). The R2 value indicates how much of the
variance in the predicted parameter is explained by the independent
variables,[21] while MSE evaluates the
quality of fit as the average squared distance between the regressor
and the actual training points.[22] The bound for
the R2 value is (-∞, 1], where values approaching -∞ indicate the worst
fit and 1 indicates the best fit. In other words, an
R2 value close to 1 suggests a good prediction by
the regression model. The MSE is bounded within [0, +∞),
and an MSE of 0 is obtained only if the regression model fits the data
perfectly.[23]
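The bounds described above can be illustrated with a minimal sketch. It assumes the standard definitions, MSE = (1/n) Σ (y_i − ŷ_i)² and R2 = 1 − SS_res/SS_tot. The function names `mse` and `r_squared` and all numerical values are hypothetical placeholders chosen for illustration. They are not data from this study and not the implementation used by the authors.

```python
def mse(y_true, y_pred):
    """Mean squared error: average squared residual, bounded within [0, +inf)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot, bounded within (-inf, 1]."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total variation
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the observed values are constant")
    return 1.0 - ss_res / ss_tot


# Illustrative values only (assumed for this sketch, not measured data).
observed = [1.0, 2.0, 3.0, 4.0]
perfect = [1.0, 2.0, 3.0, 4.0]    # every prediction matches the observation
mean_only = [2.5, 2.5, 2.5, 2.5]  # always predicts the mean of the observations
reversed_fit = [4.0, 3.0, 2.0, 1.0]  # systematically wrong predictions

for name, pred in [("perfect", perfect),
                   ("mean only", mean_only),
                   ("reversed", reversed_fit)]:
    print(f"{name}: R2 = {r_squared(observed, pred):.3f}, MSE = {mse(observed, pred):.3f}")
```

In this sketch the perfect predictor reaches the upper bound of R2 and the lower bound of MSE, a predictor that always returns the mean gives an R2 of 0, and a predictor that is worse than the mean gives a negative R2, which is why the lower bound of R2 is open.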
In these predictions, the R2 values obtained for Jsc, Voc, ff and
ECE were 0.693, 0.523, 0.839 and 0.694, respectively, showing that
the RF model can explain 69.3%, 52.3%, 83.9% and 69.4% of the
variation in Jsc, Voc, ff and ECE from the input features (nanopatterning depth of
the mp-TiO2 layer and wt% of PCBM). The higher
R2 values for Jsc, ff and ECE indicate that the RF model effectively captures the relationship
between the input features and these parameters, showing the ability of the model to predict them accurately.
Although the R2 value for the Voc prediction is the lowest among the four
parameters, this does not necessarily indicate that the RF model is less
accurate for predicting Voc, but rather
highlights the complex interplay between the different factors that
influence the photovoltaic parameters. Despite this,
the MSEs for all four parameters were relatively low, suggesting that the
model can still provide reliable predictions.