\subsubsection{One-class Support Vector Machine}

One of the most classical one-class outlier detection methods is the one-class SVM (OSVM), in which an SVM is trained to contain a set of samples as tightly as possible. Outliers are then defined as samples lying outside the bounds of the support vectors. OSVM is a hard classification technique, and its output for every sample is simply outlier/non-outlier. It is therefore not suited for monitoring slow drifts in the output the way PCA is, but it can complement such a method by labelling suspicious samples that might hide themselves across multiple dimensions. To investigate the use of OSVM on our data, an SVM was trained on the training data (using the kernlab package \citep{kernlab}). Different kernel transformations were investigated but eventually abandoned, as the high dimensionality of the data alleviated the need for them. On the contrary, using anything but the simplest polynomial kernels made the model overfit to such an extent that every subsequent sample was labelled as an outlier. Augmenting the PCA control chart with outlier labelling from the OSVM model makes it obvious that the two approaches, used by themselves, would lead to very different conclusions. The PCA model will to a higher degree illuminate extreme samples that might still fit well within the model space, while the OSVM will reveal samples that more generally do not fit the training data (be they extreme or odd samples).
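As a minimal sketch of this setup (the data objects and the \texttt{nu} value below are illustrative assumptions, not taken from the original analysis), a one-class SVM can be fitted with kernlab's \texttt{ksvm} using the linear (vanilla) kernel, after which \texttt{predict} flags each new sample as inside or outside the support region:

\begin{verbatim}
library(kernlab)

## Illustrative stand-ins for the process data: rows are samples,
## columns are the (many) measured variables.
set.seed(1)
train       <- matrix(rnorm(100 * 50), nrow = 100)
new_samples <- matrix(rnorm(20 * 50),  nrow = 20)

## Fit a one-class SVM with the simplest (linear) kernel, since more
## flexible kernels overfit on high-dimensional data as noted above.
## nu bounds the fraction of training samples allowed outside the
## boundary; 0.05 is an assumed value for illustration.
osvm <- ksvm(train, type = "one-svc", kernel = "vanilladot", nu = 0.05)

## predict() returns TRUE for samples inside the support region;
## negate it so TRUE marks an outlier.
is_outlier <- !predict(osvm, new_samples)
\end{verbatim}

The resulting binary flags could then be overlaid on the PCA control chart, for instance by colouring or marking the charted points where \texttt{is\_outlier} is \texttt{TRUE}, which is the kind of augmentation referred to above.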