Preprocessing and statistical significance testing of time series classifications
Beyond the principal methods of classification outlined above, the preprocessing of time series datasets and the means of statistical significance testing must also be considered. Preprocessing in particular remains an area that divides opinion within the statistics community: some experts argue that transformation, smoothing, and normalisation of datasets are required for unbiased time series comparisons, whilst others contend that these steps remove information that could otherwise be captured in error terms, and that correlations may consequently be overstated \cite{Lucero_2000,Ramsay_2009,Smoothing_Data_Missing_Data_Nonparametric_Fitting,need_for_normalization,When_to_use_smoothing}. When focusing on long-term trends, it is often recommended that analysis is based on logarithmic or inverse hyperbolic sine transformations of time series data rather than raw data, in order to reduce the influence of short-term cyclic features \cite{Functional_Data_Analysis_data_transformation,logarithm_transformation,zero_transformation,zero_transformation_discussion}. Similarly, if smoothing is to be applied when studying long-term trends, simple moving averages are thought to be more appropriate than exponential smoothing \cite{Simple_Vs._Exponential_Moving_Averages}.
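As a minimal sketch of these preprocessing options, the following illustrates the two recommended transformations and a simple moving average on hypothetical annual publication counts (the data and window size are illustrative assumptions, not values from the study):

```python
import numpy as np

# Hypothetical annual publication counts for one technology (illustrative only)
counts = np.array([0, 2, 5, 12, 30, 55, 80, 95, 100, 98], dtype=float)

# Inverse hyperbolic sine transform: behaves like a logarithm for large
# values but, unlike np.log, is defined at zero
asinh_transformed = np.arcsinh(counts)

# A logarithm requires strictly positive values, so a small offset is
# commonly added when zeros are present
log_transformed = np.log(counts + 1)

# Simple moving average (window of 3) via convolution; 'valid' mode
# drops the partially covered end points
window = 3
sma = np.convolve(counts, np.ones(window) / window, mode="valid")
```

Note that the inverse hyperbolic sine avoids the arbitrary offset that a log transform needs when the series contains zero counts, which is one reason it is suggested for bibliometric data.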
A key data preparation requirement considered in the current study relates to the definition of shared curve features from bibliometric data that can be used to address the time series and Technology Life Cycle alignment issues highlighted in section \ref{284814}. These feature recognition and alignment processes are required to enable classification based on fair comparisons of dissimilar technologies. To ensure consistency, feature recognition processes should consider the relative height of plateaux observed between technology profiles from different industries, the rates of growth observed in the early stages of historical trends, and the influence of noise and incomplete time series data on the classifications being made. For these reasons it is assumed that unsmoothed, amplitude-normalised time series, subsequently segmented on common curve features, would enable these comparisons to be made. This approach ensures that all curve amplitudes are relative on a global scale, whilst segmentation on common features provides consistency in defining early growth phases and allows later, incomplete segments to be discarded from classifications. As a basis for these feature extraction stages it is assumed that the Technology Life Cycle model proposed by Little, being a well-established concept, provides a sensible candidate for the identification of common curve features \cite{little1981strategic}. However, identified curve features may still be unaligned in time, and consequently time transformation techniques, such as 'time warping' methods, are also recommended (discussed in more detail in section \ref{446824}).
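Amplitude normalisation can be sketched as follows, assuming a min-max scaling to the range [0, 1] (one common scheme; the study does not prescribe a specific formula, and the example series are hypothetical):

```python
import numpy as np

def amplitude_normalise(series):
    """Scale a time series to [0, 1] (min-max amplitude normalisation)."""
    series = np.asarray(series, dtype=float)
    span = series.max() - series.min()
    if span == 0:
        return np.zeros_like(series)  # a flat series maps to zero
    return (series - series.min()) / span

# Two technologies with very different absolute publication volumes
# become directly comparable after normalisation
tech_a = amplitude_normalise([10, 40, 90, 100])
tech_b = amplitude_normalise([1000, 4000, 9000, 10000])
```

After this step both profiles share a global amplitude scale, so plateau heights and early growth rates can be compared across industries before any segmentation or time warping is applied.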
In terms of determining correlations between groups of time series datasets, the Chi-squared statistic is commonly used to test the independence of descriptive statistics derived from time series (time series classifiers are discussed in more detail in section \ref{446824}). However, as a consequence of the probability distribution function used in its significance test, the Chi-squared approach is best suited to confusion matrices (i.e. cross-tabulated comparisons of predicted classifications against target classifications) in which all cell values are greater than or equal to five. When smaller sample sizes are considered (such as the 23 technologies considered in this analysis), Fisher's exact test is therefore more appropriate. Like the Chi-squared test, Fisher's exact test is able to determine the significance of outcomes for samples taken at random from a population, but is not necessarily able to provide a ranking of the most statistically robust predictors (i.e. predictors that are likely to be accurate when considering out-of-sample predictions). It is worth noting that in the current study technologies have been deliberately selected based on their observed performance trends; as such, Fisher's exact test cannot be used to reject the null hypothesis (since samples are not taken at random from a population) unless known time series classification labels are removed so that clustering is not based on human biases (i.e. an unsupervised learning approach).
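The contrast between the two tests can be illustrated with SciPy on a hypothetical 2x2 confusion matrix summing to 23 (the cell counts are invented for illustration):

```python
import numpy as np
from scipy.stats import chi2_contingency, fisher_exact

# Hypothetical confusion matrix for 23 technologies: predicted class
# (rows) against target class (columns). Two cells fall below five,
# so the Chi-squared approximation is unreliable here.
table = np.array([[9, 3],
                  [2, 9]])

# Chi-squared test of independence (approximate for small counts)
chi2, chi2_p, dof, expected = chi2_contingency(table)

# Fisher's exact test computes the exact p-value from the hypergeometric
# distribution, with no minimum-cell-count requirement
odds_ratio, fisher_p = fisher_exact(table)
```

For tables of this size the exact p-value is the one that should be reported, with the caveat noted above that the test's assumptions fail when the sample is not drawn at random.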
For the subsequent ranking of predictors based on small sample sizes, cross-validation approaches are required (discussed in more detail in section \ref{826884}). Histograms can also prove useful for identifying the most frequently occurring individual factors in these cross-validation 'bootstrapping' processes, but cannot indicate which combination of factors would work best together.
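One way to build such a histogram is sketched below, under the assumption of a simple bootstrap in which the best single indicator (by a threshold-at-median rule) is recorded on each resample; the data, the selection rule, and the sample sizes are all hypothetical illustrations rather than the study's actual procedure:

```python
import numpy as np
from collections import Counter

rng = np.random.default_rng(0)

# Hypothetical setup: 23 technologies, 4 candidate bibliometric
# indicators, binary class labels (purely illustrative data)
n_samples, n_predictors = 23, 4
X = rng.normal(size=(n_samples, n_predictors))
y = rng.integers(0, 2, size=n_samples)
X[:, 2] += 1.5 * y  # make indicator 2 genuinely informative

def best_single_predictor(X_boot, y_boot):
    """Pick the indicator whose threshold-at-median rule best separates classes."""
    scores = []
    for j in range(X_boot.shape[1]):
        pred = (X_boot[:, j] > np.median(X_boot[:, j])).astype(int)
        scores.append(max(np.mean(pred == y_boot), np.mean(pred != y_boot)))
    return int(np.argmax(scores))

# Bootstrap: resample with replacement, record the winning indicator
selections = Counter()
for _ in range(200):
    idx = rng.integers(0, n_samples, size=n_samples)
    selections[best_single_predictor(X[idx], y[idx])] += 1

# `selections` is the histogram: how often each indicator 'wins'
```

As the text notes, a histogram of this kind only ranks indicators individually; assessing combinations of indicators requires evaluating predictor subsets jointly within the cross-validation loop.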
Time series classification and feature alignment techniques
In order to identify and rank the predictive ability of different combinations of bibliometric indicators when used for classification purposes, an appropriate classifier first has to be selected that fits the data features being considered. In this sense time series classification procedures can be grouped based on the type of discriminatory features the techniques are attempting to find \cite{Bagnall_2016}: