Naets edited It_is_clear_from_Figure__.tex  about 8 years ago

Commit id: 37c61fb3eee6d9e9ba151ef49f54474cd06cfa3f


It is clear from Figure \ref{DendrogramPCAData} that cutting the dendrogram at the height where only two clusters remain appears to be an appropriate choice. However, Figure \ref{ClustersGGMCrossValidBIC}, which presents several criteria that can help select the optimal number of clusters, makes it much harder to judge what an appropriate number of clusters would be in this case. In the absence of clear evidence in favor of another number of clusters, we base our analysis on two clusters. Figure \ref{MedianImshowNormDataWardsPCA} shows the medians of each of the five principal components of the cars dataset for the two-cluster cut. Interestingly, this figure presents a pattern (i.e. high first PC and low second PC, and vice versa) very similar to the one seen in Figure \ref{MedianImshowNormDataGMMPCA}, which depicts the clustering based on the GMM. The only difference is that cluster 1 of Ward's method essentially corresponds to cluster 2 of the GMM, and vice versa. This similarity is reassuring in the sense that the general structure of our dataset is portrayed in a similar fashion by different algorithms.
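The comparison above can be made concrete with a short sketch. It is not the code used for our analysis: the helper names \texttt{cluster\_medians} and \texttt{best\_label\_mapping} are hypothetical, and the scores and labels are made-up toy values standing in for the PCA scores, the Ward's cut and the GMM assignment. The sketch computes the per-cluster median of each component and then searches for the relabelling of one partition that agrees most often with the other, which is how a swap of cluster 1 and cluster 2 would show up.

```python
from itertools import permutations
from statistics import median


def cluster_medians(scores, labels):
    """Return {label: [median of each column]} for rows grouped by label."""
    groups = {}
    for row, lab in zip(scores, labels):
        groups.setdefault(lab, []).append(row)
    return {lab: [median(col) for col in zip(*rows)]
            for lab, rows in groups.items()}


def best_label_mapping(labels_a, labels_b):
    """Find the relabelling of labels_a that agrees most often with labels_b.

    Assumes both partitions use the same number of clusters.
    """
    names_a = sorted(set(labels_a))
    names_b = sorted(set(labels_b))
    best_map, best_agree = None, -1.0
    for perm in permutations(names_b, len(names_a)):
        mapping = dict(zip(names_a, perm))
        matches = sum(mapping[a] == b for a, b in zip(labels_a, labels_b))
        agree = matches / len(labels_a)
        if agree > best_agree:
            best_map, best_agree = mapping, agree
    return best_map, best_agree


# Hypothetical toy scores on the first two principal components
# (illustrative values only, not the cars dataset).
pc_scores = [
    [2.1, -1.0], [1.8, -0.7], [2.4, -1.2],
    [-1.9, 0.9], [-2.2, 1.1], [-1.7, 0.8],
]
ward_labels = [1, 1, 1, 2, 2, 2]  # made-up labels standing in for the Ward's cut
gmm_labels = [2, 2, 2, 1, 1, 1]   # made-up labels standing in for the GMM fit

ward_medians = cluster_medians(pc_scores, ward_labels)
mapping, agreement = best_label_mapping(ward_labels, gmm_labels)
print(ward_medians)
print(mapping, agreement)
```

With real data, an agreement close to 1 under a swapped mapping would correspond to the situation described above, where the two methods recover essentially the same partition with the cluster numbers exchanged.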