Thomas Lin Pedersen edited Materials and methods.tex  over 9 years ago

Commit id: 18400bd38c0830e877a0126a574a9f650f8a1ad7


The data used in this study were provided by \citet{24494671} and match those used in their paper. They are a collection of metrics extracted using QuaMeter from samples across a range of US laboratories (Vanderbilt University Medical Center, Pacific Northwest National Laboratory, Broad Institute and Johns Hopkins University). The Velos data from Vanderbilt University Medical Center were used as example data, as they comprised the most samples over a long period of time. All samples were divided into runs based on the time difference between each sample and the next: a difference exceeding 2 hours marked the start of a new run. Using this approach, 37 runs were identified in the dataset, with a median run length of 15 samples. Two of the runs included only one sample and were subsequently removed. In each run the first, middle and last samples were designated as standards used to monitor between-run variation. A stable instrument period between February 25 and April 15, 2013 was identified, and the samples from that period were used as a training set for between-run variation. The training set thus included 89 samples. For within-run analysis, run 6 was chosen (August 31st 2012 ff.) as it comprised 15 samples, including one obvious and a few subtle outlier samples.

\subsection{Data analysis}

All analyses were done in the statistical computing environment R \citep{R} using a range of packages that are referenced where they are described. The code used for performing the analyses is available in an accompanying script file. The one exception is the calculation of Angle Based Outlier Detection, which was done using ELKI \citep{ELKI}, as this contained the only known implementation.
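The run segmentation described for the data set (a new run starts whenever the gap to the previous sample exceeds 2 hours, single-sample runs are dropped, and the first, middle and last samples of each run serve as standards) can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the accompanying R script. The function names \texttt{split\_into\_runs}, \texttt{drop\_single\_sample\_runs} and \texttt{standard\_indices}, the example timestamps, and the choice of the lower middle sample for runs of even length are all assumptions made here.

```python
from datetime import datetime, timedelta


def split_into_runs(timestamps, max_gap=timedelta(hours=2)):
    """Group acquisition times into runs.

    A gap strictly greater than max_gap between consecutive samples
    starts a new run (assumed reading of "exceeding 2 hours").
    """
    runs = []
    for t in sorted(timestamps):
        if runs and t - runs[-1][-1] <= max_gap:
            runs[-1].append(t)   # continue the current run
        else:
            runs.append([t])     # gap too large (or first sample): new run
    return runs


def drop_single_sample_runs(runs):
    """Remove runs that contain only one sample."""
    return [run for run in runs if len(run) > 1]


def standard_indices(run_length):
    """Zero-based positions of the first, middle and last sample in a run.

    For even run lengths the lower of the two middle samples is used;
    this tie-breaking rule is an assumption, not taken from the text.
    """
    return sorted({0, (run_length - 1) // 2, run_length - 1})


# Hypothetical acquisition times, given as minutes after an arbitrary start.
start = datetime(2012, 8, 31, 8, 0)
offsets = (0, 60, 120, 400, 1000, 1050, 1100, 1150, 1200)
times = [start + timedelta(minutes=m) for m in offsets]

runs = split_into_runs(times)
kept = drop_single_sample_runs(runs)
standards = [standard_indices(len(run)) for run in kept]
```

With these made-up timestamps, the sample at 400 minutes forms a run of its own and is removed, leaving two runs whose standards are taken at the positions returned by \texttt{standard\_indices}.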