Figure 2. Performance of the Random Forest regression models in predicting a) Jsc, b) Voc, c) ff and d) ECE.
As shown in Figure 2, RF regression models were used to predict the photovoltaic parameters Jsc, Voc, ff and ECE. The performance of each RF model was evaluated using the coefficient of determination (R2) and the mean squared error (MSE). R2 indicates how much of the variance in the predicted parameter is explained by the independent variables,[21] while MSE measures the quality of fit as the average squared distance between the regressor's predictions and the actual data points.[22] R2 lies in the interval (-∞, 1]: a value of 1 indicates a perfect prediction, a value of 0 corresponds to a model that does no better than always predicting the mean, and increasingly negative values indicate increasingly poor fits. In other words, an R2 value close to 1 suggests a good prediction by the regression model. MSE lies in the interval [0, +∞), and an MSE of 0 is obtained only if the regression model fits the data perfectly.[23]
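The bounds of the two metrics can be illustrated with a minimal sketch in plain Python. The function names `r2_score` and `mean_squared_error` are chosen here for illustration only, and the numbers are made-up toy values, not data from this study. The sketch assumes the standard definitions R2 = 1 - SS_res/SS_tot and MSE = SS_res/n.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between observed and predicted values."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R2 is undefined when the observed values are constant")
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Toy observed values (illustrative only, not measured data).
    observed = [1.0, 2.0, 3.0, 4.0]

    cases = {
        "perfect fit": [1.0, 2.0, 3.0, 4.0],
        "always predicts the mean": [2.5, 2.5, 2.5, 2.5],
        "worse than the mean": [4.0, 3.0, 2.0, 1.0],
    }
    for label, predicted in cases.items():
        r2 = r2_score(observed, predicted)
        mse = mean_squared_error(observed, predicted)
        print(f"{label}: R2 = {r2:.3f}, MSE = {mse:.3f}")
```

The three cases give R2 values of 1, 0 and -3, showing that R2 has no lower bound, while the MSE stays non-negative and reaches 0 only for the perfect fit.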
In these predictions, the R2 values obtained for Jsc, Voc, ff and ECE were 0.693, 0.523, 0.839 and 0.694, respectively, showing that the RF model can explain 69.3%, 52.3%, 83.9% and 69.4% of the variation in Jsc, Voc, ff and ECE from the input features (nanopatterning depth of the mp-TiO2 layer and wt% of PCBM). The comparatively high R2 values for Jsc, ff and ECE indicate that the RF model effectively captures the relationship between the input features and these parameters, and can therefore predict them with good accuracy. The R2 value for the Voc prediction is the lowest of the four. This does not necessarily indicate that the RF model is less accurate for predicting Voc, but rather highlights the complex interplay between the different factors that influence the photovoltaic parameters. Despite this, the MSEs for all four parameters were relatively low, suggesting that the model can still provide reliable predictions.