Besides predicting alternative central points for subsets, and consequently grouping alternative subset members, the number of clusters predicted can also vary depending on the algorithm selected. Whereas K-means and K-medoids require the number of clusters to be specified in advance, hierarchical clustering approaches build a complete dendrogram of nested groupings, so the number of clusters does not need to be fixed beforehand and can instead be determined by the level at which the dendrogram is cut \cite{k-medoids_clustering,Dynamic_Time_Warping_Clustering}. Furthermore, as a form of unsupervised learning, clustering approaches may assign different group labels each time they are applied, even if the actual subset members remain unchanged, so a separate ‘subset mapping’ function based on the ‘Hamming distance’ is required to align generated clusters with expected groupings before any comparison (see Table \ref{table:types_of_time_series_classification_techniques} for a definition). Once again, it is worth noting that subsets defined using any clustering technique will only be valid if time series are compared on comparable features rather than on incomplete time series data. As such, time series segmentation based on shared features, or imputation of missing data, are again prerequisites for meaningful analysis, ensuring that only completed segments are used in defining subsets. Finally, if feature-based distance measures are used as the basis for clustering (assembled into a matrix of distances relating each technology time series to every other time series), then it is generally suggested that either hierarchical clustering or the ‘Partitioning Around Medoids’ (PAM) variant of K-medoids be applied to these precomputed distances \cite{k-medoids_clustering,Dynamic_Time_Warping_Clustering}.
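To make the label-alignment step concrete, the following is a minimal sketch of such a subset mapping in Python. It assumes that both labelings contain the same number of distinct clusters, and uses the Hungarian assignment from SciPy to find the relabelling that minimises the Hamming distance; the function name \texttt{map\_cluster\_labels} and the toy label vectors are purely illustrative.

\begin{verbatim}
import numpy as np
from scipy.optimize import linear_sum_assignment

def map_cluster_labels(predicted, expected):
    # Align the arbitrary labels of one clustering run with those of
    # another by finding the one-to-one relabelling that minimises the
    # Hamming distance (the proportion of mismatched labels). Assumes
    # both vectors contain the same number of distinct clusters.
    labels_pred = np.unique(predicted)
    labels_exp = np.unique(expected)
    agreement = np.array([[np.sum((predicted == p) & (expected == e))
                           for e in labels_exp] for p in labels_pred])
    # Hungarian assignment: maximising agreement between label pairs is
    # equivalent to minimising the Hamming distance after relabelling.
    rows, cols = linear_sum_assignment(-agreement)
    mapping = {labels_pred[r]: labels_exp[c] for r, c in zip(rows, cols)}
    return np.array([mapping[p] for p in predicted])

# Two runs that group the points identically but name clusters differently.
run_a = np.array([0, 0, 1, 1, 2, 2])
run_b = np.array([2, 2, 0, 0, 1, 1])
aligned = map_cluster_labels(run_b, run_a)
hamming = np.mean(aligned != run_a)   # 0.0 once the labels are aligned
\end{verbatim}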
\subsection{Distance measures that can be used in clustering and feature alignment}
A range of distance measures can be used as the basis of clustering and feature alignment techniques, each with its own specific focus when comparing data points. Some of the most commonly encountered distance metrics are summarised in Table \ref{table:types_of_time_series_classification_techniques} \cite{metrics,pdist}. Of these, the Euclidean distance is normally the default in most clustering software. However, the choice of distance metric is important, as it often has a strong influence on the resulting clusters. In this regard, distance metrics may measure the separation between points in either Cartesian or transformed coordinate systems, the percentage dissimilarity between coordinate values, or the correlation observed between vectors. The choice of distance metric ultimately depends on the type of data being considered and the research questions being addressed. For example, if the aim is to identify clusters based on the similarity of profiles regardless of their magnitudes, correlation-based distances are normally better suited as a dissimilarity measure, as they capture the similarity of shapes irrespective of the absolute (Euclidean) separation between values \cite{sthda}. However, Euclidean-based distance measures can be equally appropriate for gauging the similarity of time series, provided that the original time series have first been normalised \cite{berthold2016clustering}.
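As an illustration of how these choices interact, the sketch below z-normalises a set of toy time series, computes condensed Euclidean and correlation-based distance matrices with SciPy's \texttt{pdist}, and then applies hierarchical clustering directly to the precomputed distances; the random data and the cut at three clusters are purely illustrative assumptions.

\begin{verbatim}
import numpy as np
from scipy.spatial.distance import pdist, squareform
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
series = rng.normal(size=(10, 50))   # 10 toy 'technology' time series

# Z-normalise each series so that Euclidean distance compares shape
# rather than magnitude.
z = (series - series.mean(axis=1, keepdims=True)) \
    / series.std(axis=1, keepdims=True)

d_euclidean = pdist(z, metric='euclidean')           # magnitude-free after z-normalisation
d_correlation = pdist(series, metric='correlation')  # 1 - Pearson r: profile similarity only

# Hierarchical clustering on the precomputed (condensed) distance matrix;
# cutting the resulting dendrogram yields the cluster labels.
tree = linkage(d_correlation, method='average')
labels = fcluster(tree, t=3, criterion='maxclust')

# The square form of the same matrix can be passed to algorithms that
# accept precomputed distances, such as a PAM implementation.
d_square = squareform(d_correlation)
\end{verbatim}

The square form of the same matrix could similarly be supplied to a PAM implementation that accepts precomputed distances (for example, the \texttt{KMedoids} estimator in the scikit-learn-extra package), in line with the suggestion above.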