Clayton Miller edited Decision_threshold1.tex almost 10 years ago

Commit id: 5d604c6b2103c01fdfb6870f52084dbfd689fcb1

We use k-means to cluster the daily profiles after removing the discord candidate day-types. This ensures that the resulting load profile patterns are not influenced by the less frequent discords. Time series clustering can be approached as a raw-data-based, feature-based, or model-based problem \cite{WarrenLiao:2005bq}. Numerous clustering techniques have been developed and evaluated for various contexts and optimization goals. The most common is the k-means clustering algorithm, which we use with the Euclidean distance measure because of its simplicity and demonstrated appropriateness for this application \cite{Iglesias:2013ja,MacQueen:1967uv}. In our application, the algorithm takes the daily chunks $(N_1, N_2, \ldots, N_n)$ and partitions these observations into $k$ sets, $S = \{S_1, S_2, \ldots, S_k\}$, so as to minimize the within-cluster sum of squares \cite{Rokach:2005ti}:
\begin{equation}
\argmin \sum_{i=1}^{k}\sum_{N_j\in S_i} \| N_j - \mu_i \|^2
\label{eq:kmeans}