INTRODUCTION

Electric load is irregular in nature (Fig. \ref{988369}), and there is as yet no system for storing energy on a wide scale that is cheap, sustainable, efficient and environmentally friendly. Statistical models can be used to predict electrical power demand in a given time frame using electrical data of domestic consumers and enterprises. Employing electric load density estimation will enable us to predict the electrical power required for the day, so that less energy needs to be stored for future use, which in turn diminishes the energy lost in the process of storage. Further applications of electric load density estimation include planning studies for optimal allocation of renewable distributed generation to minimize annual energy loss \cite{Atwa_2010}, evaluating the reliability and accuracy of power systems with PV power generation \cite{Li_2019a}, and integrating batteries with photovoltaic plants on a large scale \cite{Nor_2018}.
In the literature, electric load probability is usually modeled using parametric models, such as the Gaussian distribution \cite{Guo_2019} \cite{Sheng_2019} and the beta distribution \cite{Nor_2018}. However, careful inspection of electric load shows that neither a Gaussian nor a beta distribution models it accurately, because electric load is bimodal \cite{seppl1996} (Fig. \ref{998686}). In this paper, two nonparametric approaches, Root Transform Local Linear Regression (RTLLR) and Kernel Density Estimation (KDE), are proposed to provide an accurate model of electric load probability density functions. The former turns the probability estimation problem into a regression problem through binning and a variance-stabilizing transformation of the electric load data. The latter assigns a weight to each observation, so that more densely populated intervals receive higher probability density values. The performance of RTLLR and of two KDE models with Rule-of-Thumb bandwidth selectors is compared with that of two parametric models, the Gaussian and Gamma distributions, and is evaluated through the Kolmogorov-Smirnov goodness-of-fit test, the Coefficient of Determination (\(R^2\)) and four error metrics: Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE) and Mean Bias Error (MBE).
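To make the KDE idea concrete, the following is a minimal sketch of a Gaussian kernel density estimate with a rule-of-thumb bandwidth. It rests on several assumptions that are not taken from this paper: the bandwidth is Silverman's rule of thumb (the selectors actually used here may differ), the kernel is Gaussian, the function names `silverman_bandwidth` and `kde_gaussian` are hypothetical, and the bimodal sample is synthetic rather than real load data.

```python
import numpy as np

def silverman_bandwidth(x):
    """Silverman's rule of thumb: 0.9 * min(sd, IQR / 1.34) * n^(-1/5)."""
    x = np.asarray(x, dtype=float)
    sd = x.std(ddof=1)
    q75, q25 = np.percentile(x, [75, 25])
    spread = min(sd, (q75 - q25) / 1.34)
    return 0.9 * spread * x.size ** (-1 / 5)

def kde_gaussian(x, grid, h):
    """Average of Gaussian kernels of width h centred on each observation."""
    x = np.asarray(x, dtype=float)
    grid = np.asarray(grid, dtype=float)
    u = (grid[:, None] - x[None, :]) / h
    kernels = np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)
    return kernels.sum(axis=1) / (x.size * h)

# Synthetic bimodal sample standing in for electric load (illustrative only).
rng = np.random.default_rng(0)
load = np.concatenate([rng.normal(2.0, 0.4, 500), rng.normal(5.0, 0.6, 500)])

grid = np.linspace(0.0, 8.0, 401)
h = silverman_bandwidth(load)
density = kde_gaussian(load, grid, h)
```

Each observation contributes the same kernel mass, so regions containing many observations accumulate a higher estimated density, which is the behaviour described above.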
The major contributions of this work can be summarized as follows:
  1. For the first time, RTLLR is proposed for estimating the probability density of electric load data. It is based on converting the probability density estimation problem into a regression problem that is solved by local linear regression. The resulting procedure is easy to use and computationally efficient.
  2. We show that the RTLLR model avoids the boundary bias present in KDE models and is less sensitive to outliers in the data. The RTLLR model visually follows the shape of the electric load distribution and has the lowest error metric values and the highest \(R^{2}\) values when compared with the KDE models and the Gaussian and Gamma models.
  3. An interactive web application has been built to equip users with all the tools to replicate the analysis presented in this paper on any type of univariate data.
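The binning, root transform and local linear regression steps named in the first contribution can be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation used in this paper: the transform is taken to be the square root of the bin count plus 1/4 (the offset used in the root-unroot literature), the kernel is Gaussian, the bin count and bandwidth are picked by hand, the function names `local_linear` and `rtllr_density` are hypothetical, and the data are synthetic.

```python
import numpy as np

def local_linear(x, y, x0, h):
    """Local linear regression of y on x, evaluated at x0 (Gaussian weights)."""
    fitted = np.empty(len(x0))
    for j, c in enumerate(x0):
        d = x - c
        w = np.exp(-0.5 * (d / h) ** 2)
        s0, s1, s2 = w.sum(), (w * d).sum(), (w * d * d).sum()
        t0, t1 = (w * y).sum(), (w * d * y).sum()
        # Intercept of the weighted least-squares line fitted around c.
        fitted[j] = (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)
    return fitted

def rtllr_density(data, n_bins, h):
    """Bin, root-transform, smooth, square back, then normalise to a density."""
    counts, edges = np.histogram(data, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    width = edges[1] - edges[0]
    root_counts = np.sqrt(counts + 0.25)   # assumed variance-stabilising step
    root_fit = local_linear(centers, root_counts, centers, h)
    dens = np.maximum(root_fit, 0.0) ** 2  # undo the root transform
    dens /= dens.sum() * width             # rescale so the estimate integrates to 1
    return centers, dens

# Synthetic bimodal sample standing in for electric load (illustrative only).
rng = np.random.default_rng(0)
load = np.concatenate([rng.normal(2.0, 0.4, 500), rng.normal(5.0, 0.6, 500)])

centers, dens = rtllr_density(load, n_bins=50, h=0.3)
```

Because the smoothing is a regression on bin centres, the fit at the edges of the data range uses a local line rather than a kernel that spills past the boundary, which is the usual argument for reduced boundary bias in local linear methods.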
The remainder of the paper is organized as follows. A detailed description of the problem and electric load data is presented in Section 2. Then, the statistical model, assessment methods, and developed software are described in Section 3. Results are discussed in Section 4 and the paper then closes with a conclusion and acknowledgments.