INTRODUCTION
Electric load is irregular in nature (Fig. \ref{988369}), and no system yet exists for storing energy on a wide scale
that is cheap, sustainable, efficient, and environmentally friendly. Statistical models can be used to predict
electrical power demand over a given time frame from the electrical data of domestic consumers and
enterprises. Electric load density estimation makes it possible to predict the electrical power required for the day, so that less
energy needs to be stored for future use and, in turn, less energy is lost in the process of storage. Further applications of
electric load density estimation include planning studies for the optimal allocation of renewable distributed generation to minimize
annual energy loss \cite{Atwa_2010}, evaluating the reliability and accuracy of power systems with PV power generation \cite{Li_2019a}, and integrating batteries with photovoltaic plants on a large scale \cite{Nor_2018}.
In the literature, electric load probability is usually modeled using parametric models, such as the Gaussian
distribution \cite{Guo_2019} \cite{Sheng_2019} and the beta distribution \cite{Nor_2018}. However, careful inspection of
electric load shows that neither a Gaussian nor a beta distribution
models it accurately, because the load is bimodal \cite{seppl1996} (Fig. \ref{998686}). In
this paper, two nonparametric approaches, Root Transform Local Linear
Regression (RTLLR) and Kernel Density Estimation (KDE), are proposed to
provide an accurate model of electric load probability density functions.
The former turns the probability density estimation problem into a regression
problem through binning and a variance-stabilizing transformation of the
electric load data. The latter assigns a kernel weight to each observation,
so that more densely populated intervals receive higher probability
density values. The performance of RTLLR and of two KDE models
with rule-of-thumb bandwidth selectors is compared with that of two parametric
models, the Gaussian and gamma distributions, and is evaluated using the
Kolmogorov-Smirnov goodness-of-fit test, the coefficient of determination
(\(R^2\)), and four error metrics: Root Mean Squared Error (RMSE), Mean
Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), and Mean Bias
Error (MBE).
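The error metrics and \(R^2\) named above can be computed as in the following minimal Python sketch. It is an illustration under stated assumptions, not the evaluation code used in this paper: the function name `error_metrics`, the toy arrays, and the sign convention for MBE (estimate minus reference) are all hypothetical choices, and this section does not specify which reference values the estimated densities are compared against.

```python
import numpy as np

def error_metrics(reference, estimate):
    """Return RMSE, MAE, MAPE (%), MBE and R^2 for two equal-length arrays."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    err = estimate - reference  # assumed sign convention: positive = overestimate
    ss_res = np.sum(err ** 2)                              # residual sum of squares
    ss_tot = np.sum((reference - reference.mean()) ** 2)   # total sum of squares
    return {
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "MAE": np.mean(np.abs(err)),
        # MAPE is undefined where the reference is zero
        "MAPE": 100.0 * np.mean(np.abs(err / reference)),
        "MBE": np.mean(err),
        "R2": 1.0 - ss_res / ss_tot,
    }

# Hypothetical values, for illustration only
reference = np.array([1.0, 2.0, 4.0])
estimate = np.array([1.5, 2.0, 3.0])
metrics = error_metrics(reference, estimate)
print(metrics)
```

For these toy arrays the MAE is 0.5 and the MAPE is 25\%. The MBE is negative, which under the assumed sign convention indicates that the estimate lies below the reference on average.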
The major contributions of this work can be summarized as follows:
- For the first time, RTLLR is proposed for estimating the probability density of
electric load data. It is based on converting the probability density estimation
problem into a regression problem that is solved by local linear
regression. The resulting procedure is easy to use and computationally
efficient.
- We show that the RTLLR model avoids the boundary bias present in KDE
models and is less sensitive to outliers in the data. The RTLLR model
visually follows the shape of the electric load distribution and has the
lowest error metric values and the highest \(R^{2}\) values when compared
with the KDE models and the Gaussian and Gamma models.
- An interactive web application has been built to equip users with all
the tools to replicate the analysis presented in this paper on any
type of univariate data.
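The idea behind the first contribution, turning density estimation into a regression problem, can be illustrated with a short Python sketch placed next to a Gaussian KDE for comparison. This is not the implementation used in this paper. It assumes the square-root transform \(\sqrt{N_i + 1/4}\) of the bin counts \(N_i\) as the variance-stabilizing transformation, Gaussian kernel weights in the local linear fit, and Silverman's rule of thumb for the KDE bandwidth. The function names, the number of bins, the regression bandwidth, and the synthetic bimodal data are hypothetical.

```python
import numpy as np

def trapezoid_area(f, grid):
    """Area under f over grid by the trapezoidal rule."""
    return np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(grid))

def silverman_bandwidth(x):
    """Rule of thumb: 0.9 * min(std, IQR / 1.34) * n^(-1/5)."""
    q75, q25 = np.percentile(x, [75, 25])
    return 0.9 * min(x.std(ddof=1), (q75 - q25) / 1.34) * x.size ** (-0.2)

def kde(x, grid, h):
    """Gaussian KDE: average of kernels centred on each observation."""
    u = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * u ** 2).sum(axis=1) / (x.size * h * np.sqrt(2 * np.pi))

def local_linear(t, y, grid, h):
    """Local linear regression of y on t with Gaussian weights."""
    fitted = np.empty(grid.size)
    for j, g in enumerate(grid):
        d = t - g
        w = np.exp(-0.5 * (d / h) ** 2)
        s0, s1, s2 = w.sum(), (w * d).sum(), (w * d * d).sum()
        t0, t1 = (w * y).sum(), (w * d * y).sum()
        # Intercept of the weighted least-squares line at g
        fitted[j] = (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)
    return fitted

def rtllr(x, grid, n_bins, h):
    """Bin, root-transform, smooth by local linear regression, back-transform."""
    counts, edges = np.histogram(x, bins=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    root = np.sqrt(counts + 0.25)            # assumed variance-stabilizing transform
    dens = local_linear(centers, root, grid, h) ** 2
    return dens / trapezoid_area(dens, grid)  # normalise to integrate to one

# Synthetic bimodal sample standing in for electric load (hypothetical)
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(30, 5, 600), rng.normal(60, 8, 400)])
grid = np.linspace(x.min(), x.max(), 200)

n_bins = 40                                   # hypothetical choice
bin_width = (x.max() - x.min()) / n_bins
f_rtllr = rtllr(x, grid, n_bins, h=2 * bin_width)
f_kde = kde(x, grid, silverman_bandwidth(x))
```

Because the regression is fitted only to bins inside the observed range, the RTLLR estimate places no probability mass outside the data, whereas the KDE kernels centred near the extremes spill past them.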
The remainder of the paper is organized as follows. A detailed
description of the problem and electric load data is presented in
Section 2. Then, the statistical model, assessment methods, and
developed software are described in Section 3. Results are discussed in
Section 4 and the paper then closes with a conclusion and
acknowledgments.