Based on these observations of the \(A\) and \(W\) parameters, we found that setting \(A=3\) and \(W=4\) gave the best balance between the number of patterns generated and the resolution of detail needed to adequately filter discords in a 24-hour period. While these findings are specific to our case studies, we hypothesize that similar settings will be useful when analyzing other building performance data because of the generally recurring daily patterns in such data. These initial parameter settings may be used as defaults when implementing the DayFilter process and adjusted based on visualizations similar to those developed in this section.
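
To make the effect of these defaults concrete, the following is a minimal sketch of how one day of hourly readings could be reduced to a four-letter word over a three-letter alphabet. It assumes that \(A\) is the SAX alphabet size and \(W\) the number of equal-length windows per day, and that each day is z-normalized on its own; the helper name `sax_word` and the synthetic profile are hypothetical and are not taken from the DayFilter implementation.

```python
from statistics import NormalDist, mean, pstdev

def sax_word(values, alphabet_size=3, n_windows=4):
    """Hypothetical helper: turn one day of readings into a SAX word."""
    mu, sigma = mean(values), pstdev(values)
    # z-normalize (assumption: per-day normalization); a flat day maps to zeros
    z = [(v - mu) / sigma if sigma else 0.0 for v in values]
    # Piecewise aggregate approximation: mean of each equal-length window
    # (assumes len(values) is divisible by n_windows)
    step = len(z) // n_windows
    paa = [mean(z[i * step:(i + 1) * step]) for i in range(n_windows)]
    # Breakpoints that split a standard normal into equiprobable regions
    cuts = [NormalDist().inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]
    letters = "abcdefghijklmnopqrstuvwxyz"
    # Each window gets the letter of the region its mean falls into
    return "".join(letters[sum(x > c for c in cuts)] for x in paa)

# Synthetic hourly profile: low at night, high during working hours
day = [10] * 6 + [50] * 6 + [60] * 6 + [15] * 6
word = sax_word(day)   # one letter per 6-hour window
n_possible = 3 ** 4    # upper bound on the number of distinct daily words
```

For this synthetic day the resulting word is `acca`, and with \(A=3\) and \(W=4\) at most \(3^4 = 81\) distinct daily words can occur, which illustrates why larger values of either parameter quickly increase the number of patterns to inspect.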

Clustering parameters

The number of clusters to create in the clustering step is based primarily on the interpretation of the motif candidates by an analyst. However, to understand the quality of clustering from a statistical standpoint, we executed the clustering step on both case studies with the number of clusters, \(k\), ranging from 2 to 11. Equations \ref{eq:silhouette_eq} and \ref{eq:sumofsquare_eq} are used to calculate the silhouette score and the sum of squares error for each scenario. Figure \ref{fig:clusteringquality} illustrates these two quantitative clustering metrics across the range of \(k\) for the two case studies. Both metrics are relatively flat beyond 4 clusters, indicating that there is no significant difference between 4 and 11 clusters according to these metrics. Solutions with 2 and 3 clusters performed better on the silhouette score; however, so few clusters would usually aggregate the data too much to properly capture the important structure. The minimum sum of squares metric remains consistent throughout the experiments.
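
The sweep over \(k\) described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used in the case studies: it assumes a k-means style clustering with a deterministic farthest-first seeding, uses the standard textbook definitions of the silhouette score and the within-cluster sum of squared errors (which may differ in detail from Equations \ref{eq:silhouette_eq} and \ref{eq:sumofsquare_eq}), and runs on synthetic two-dimensional points instead of daily profiles. All function names are hypothetical.

```python
from math import dist
from statistics import mean

def assign(points, centers):
    # Index of the nearest center for every point
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j])) for p in points]

def kmeans(points, k, n_iter=50):
    """Lloyd's algorithm with farthest-first seeding (an assumed stand-in)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing seed
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    for _ in range(n_iter):
        labels = assign(points, centers)
        updated = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            # Keep the old center if a cluster happens to be empty
            updated.append(tuple(mean(axis) for axis in zip(*members)) if members else centers[j])
        if updated == centers:
            break
        centers = updated
    return assign(points, centers), centers

def sum_of_squares(points, labels, centers):
    # Within-cluster sum of squared distances to the assigned center
    return sum(dist(p, centers[l]) ** 2 for p, l in zip(points, labels))

def silhouette(points, labels):
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        others = set(labels) - {labels[i]}
        if not own or not others:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = mean(own)  # mean distance within the point's own cluster
        b = min(mean([dist(p, q) for q, l in zip(points, labels) if l == o]) for o in others)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return mean(scores)

# Synthetic data: three well-separated groups of three points each
points = [(0, 0), (1, 0), (0, 1), (10, 0), (11, 0), (10, 1), (0, 10), (1, 10), (0, 11)]
results = {}
for k in range(2, 6):
    labels, centers = kmeans(points, k)
    results[k] = (silhouette(points, labels), sum_of_squares(points, labels, centers))
best_k = max(results, key=lambda k: results[k][0])
```

On real daily profiles, each point would be a vector of readings for one day and the range would extend to \(k=11\). As noted above, the value of \(k\) favored by the silhouette score is only one input; the analyst's interpretation of the resulting clusters remains the primary criterion.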