Although Random Forest ranks total floor area among the main features for explaining the variation in building energy consumption, the R-squared value shows that a large part of the variation remains unexplained by the model, which can be attributed to differences in each building’s degree of efficiency. We therefore applied K-means clustering to two features, annual energy consumption and property floor area, for the year 2015. The resulting clusters group the buildings in a way that indicates how well they perform in terms of energy consumption. We obtained six clusters, two of which are very small (11 and 2 buildings, respectively) and are therefore more likely to be outliers. Figure 7 shows the four major clusters, with the best-performing group located in the bottom-right corner. Across all clusters, the energy consumption of most buildings falls in a common range of around 50 to 150 kBtu/ft². The range of property floor area, however, differs between clusters. Buildings that have a larger floor area but consume the same or less energy are more energy efficient. The best-performing group has the highest mean number of floors, the highest building floor area ratio, and the most recent construction years.
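The clustering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used in the study: it implements plain K-means (Lloyd's algorithm) with the Python standard library, whereas in practice a library implementation such as scikit-learn's `KMeans` would normally be used. The building records, the helper names `standardize` and `kmeans`, the choice to standardize the features, and k = 2 are all hypothetical.

```python
import math
import random
import statistics


def standardize(rows):
    """Scale each column to zero mean and unit variance so neither feature dominates."""
    cols = list(zip(*rows))
    means = [statistics.mean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]


def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd's algorithm: assign each point to its nearest centroid, then update centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids drawn from the data
    labels = None
    for _ in range(n_iter):
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = tuple(statistics.mean(c) for c in zip(*members))
    return labels, centroids


# Hypothetical buildings: (floor area in ft2, annual energy use in kBtu).
buildings = [
    (50_000, 5_000_000),
    (60_000, 6_600_000),
    (55_000, 4_950_000),
    (400_000, 28_000_000),
    (420_000, 25_200_000),
    (450_000, 29_250_000),
]

labels, _ = kmeans(standardize(buildings), k=2)

# Mean energy use intensity (kBtu/ft2) per cluster; a lower value means more efficient.
eui_by_cluster = {}
for lab in set(labels):
    members = [b for b, l in zip(buildings, labels) if l == lab]
    eui_by_cluster[lab] = statistics.mean([energy / area for area, energy in members])

for lab, eui in sorted(eui_by_cluster.items()):
    print(f"cluster {lab}: mean EUI = {eui:.1f} kBtu/ft2")
```

In this toy data the cluster of larger buildings has the lower mean energy use intensity, which is the kind of pattern the paragraph above reads as better energy performance.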

5.4.2 Yearly Trends in Energy Consumption (2012 - 2015)

Our silhouette analysis suggests that the appropriate number of clusters is two, with an average silhouette score of approximately 0.584. To examine the K-means result, we grouped the buildings by cluster label and computed the mean and standard deviation of energy consumption for each year, then plotted both statistics for the two clusters (Figure 8). As Figure 8 shows, one cluster has both a higher mean energy consumption across the years and a wider standard deviation. The cluster with the higher mean has 1191 members, while the other has 2274.
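The two checks described above, the silhouette score and the per-cluster yearly statistics, can be illustrated with a short sketch. It is an assumption-laden example rather than the analysis code itself: the yearly values and cluster labels are invented, the function name `silhouette_score` is defined here for illustration (scikit-learn provides a function of the same name in `sklearn.metrics`), and the sample standard deviation is used because the text does not say which estimator was applied. For each point, the silhouette coefficient is s = (b - a) / max(a, b), where a is the mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster.

```python
import math
import statistics


def silhouette_score(points, labels):
    """Mean silhouette coefficient s = (b - a) / max(a, b) over all points."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        # (the distance from p to itself is 0, so it does not affect the sum)
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items()
            if other != lab
        )
        scores.append((b - a) / max(a, b))
    return statistics.mean(scores)


# Hypothetical energy consumption per building for each year (kBtu/ft2).
years = [2012, 2013, 2014, 2015]
usage = [
    (60, 62, 61, 59),
    (64, 63, 65, 62),
    (58, 60, 59, 61),
    (140, 150, 135, 145),
    (170, 160, 165, 155),
]
labels = [0, 0, 0, 1, 1]  # hypothetical cluster labels, e.g. from K-means with k = 2

score = silhouette_score(usage, labels)
print(f"average silhouette score: {score:.3f}")

# Per-cluster (mean, sample standard deviation) of energy consumption for each year.
summary = {}
for lab in sorted(set(labels)):
    members = [row for row, l in zip(usage, labels) if l == lab]
    summary[lab] = {
        year: (statistics.mean(col), statistics.stdev(col))
        for year, col in zip(years, zip(*members))
    }

for lab, per_year in summary.items():
    for year, (mean, std) in per_year.items():
        print(f"cluster {lab}, {year}: mean = {mean:.1f}, std = {std:.1f}")
```

To choose the number of clusters, the same score would be computed for several candidate values of k and the value with the highest average score kept.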