Thomas Lin Pedersen edited R&D.tex
over 9 years ago
Commit id: 58cbd0f676dc6e09a6b957b2c5f4d1e99f740001
\subsubsection{One-class Support Vector Machine}
One of the most classical one-class outlier detection methods is the one-class support vector machine (oSVM), where a support vector machine (SVM) is trained to enclose a set of samples in the most efficient way. Outliers are then defined as samples lying outside the bounds of the support vectors.
oSVM is a hard classification technique, and the output will only be outlier/non-outlier for every sample. It is thus not useful for monitoring slow drifts in the output, as PCA is, but it can complement such a method by labelling suspicious samples that might hide themselves across multiple dimensions.
To investigate the use of oSVM on our data, an SVM was trained on the training data (using the kernlab package \citep{kernlab}). Different kernel transformations were investigated but eventually abandoned, as the high dimensionality of the data alleviated the need for them. On the contrary, using anything but the simplest polynomial kernels made the model overfit to an extent where every subsequent sample was labelled as an outlier. Augmenting the PCA control chart with outlier labelling from the oSVM model makes it obvious that the two approaches, used by themselves, would lead to very different conclusions. The PCA model will to a higher degree illuminate extreme samples that might still fit well within the model space, while the oSVM will reveal samples that more generally do not fit the training data (be they extreme or odd samples).
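The hard outlier/non-outlier labelling described above can be sketched as follows. Note that this is a hypothetical illustration in Python using scikit-learn's \texttt{OneClassSVM}, not the kernlab R code used for the actual analysis; the simulated data, dimensions, and parameter values are invented for the example.

```python
# Hypothetical sketch of one-class SVM outlier labelling; the report's actual
# analysis used the kernlab package in R, so data and settings here are invented.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(1)

# Simulated "in-control" training samples centred away from the origin.
train = rng.normal(loc=5.0, scale=1.0, size=(200, 5))

# New samples: five from the training distribution, five drifted towards 0.
test = np.vstack([
    rng.normal(5.0, 1.0, size=(5, 5)),
    rng.normal(0.0, 1.0, size=(5, 5)),
])

# A plain linear kernel mirrors the finding that flexible kernels overfit in
# high dimensions; nu bounds the fraction of training samples treated as outliers.
model = OneClassSVM(kernel="linear", nu=0.05).fit(train)

labels = model.predict(test)  # +1 = within the support, -1 = outlier
```

The output is strictly binary, which is the point made above: unlike a PCA control chart, no graded drift information is available, only a per-sample outlier flag.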