% chengds edited Comments_of_the_above_graphs__.tex over 8 years ago
% Commit id: 4a742610d8f92c8d4f0ba7e576286bf87c51a304
Comments on the above graphs:
\begin{itemize}
\item the graph for NEU(S) is strikingly bimodal: the classifier does a near-perfect job for roughly 50\% of the
users and an awful job for the rest; does this imply a problem with the labels?
\item the graph for NEU(A) shows a very good classifier with a few ``bad''
users;
\item in AGR(S) almost every user is close to 50\% accuracy, so the classifier barely discriminates better than chance;
\end{itemize}
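The per-user accuracies behind these plots can be computed with a short script. This is only a sketch: the original analysis code is not part of this file, and the record layout (user id, true label, predicted label) and the function name are assumptions for illustration.

```python
from collections import defaultdict

def per_user_accuracy(records):
    """records: iterable of (user_id, true_label, predicted_label).
    Returns a dict mapping each user to the fraction of correct predictions."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for user, y_true, y_pred in records:
        total[user] += 1
        correct[user] += int(y_true == y_pred)
    return {u: correct[u] / total[u] for u in total}

# Toy data: user "a" is predicted perfectly, user "b" always wrongly,
# mimicking the bimodal pattern described for NEU(S).
records = [
    ("a", 1, 1), ("a", 0, 0), ("a", 1, 1),
    ("b", 1, 0), ("b", 0, 1), ("b", 1, 0),
]
print(per_user_accuracy(records))  # {'a': 1.0, 'b': 0.0}
```

A histogram of these values per trait reproduces the distributions discussed above; a bimodal histogram with mass at 0 and 1 is exactly the NEU(S) symptom.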
Average per-user accuracy: finding \emph{confounding} users. It should be possible to extract more information by cross-referencing these accuracies with the confusion matrices, that is, by checking the occurrences of false positives and false negatives.
Looking at those accuracies, it is natural to ask whether certain users are consistently predicted badly. These ``confounding'' users would either have unreliable labeling (especially when self-assessed) or an unreliable selection of images (some other hidden variables explaining a ``wider'' range of preferences). The following figure shows the average per-user accuracy across all 10 traits (top), only the self-assessed traits (middle), and only the attributed traits (bottom).
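The cross-referencing described above can be sketched as follows, assuming binary labels. The data layout, threshold, and both function names are illustrative assumptions, not the method actually used for the figure.

```python
from collections import defaultdict

def per_user_confusion(records):
    """records: iterable of (user_id, true_label, predicted_label) with
    binary labels. Returns per-user counts of tp/fp/fn/tn, which lets us
    see whether a badly predicted user fails via false positives or
    false negatives."""
    conf = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0, "tn": 0})
    for user, y, p in records:
        key = ("tp" if y and p else
               "fn" if y and not p else
               "fp" if not y and p else "tn")
        conf[user][key] += 1
    return dict(conf)

def confounding_users(acc_by_trait, threshold=0.5):
    """acc_by_trait: {user: [accuracy for each trait]}. Flags users whose
    average accuracy across all traits falls below the threshold."""
    return sorted(u for u, accs in acc_by_trait.items()
                  if sum(accs) / len(accs) < threshold)

# Toy per-user accuracies over two traits.
acc = {"a": [0.9, 0.8], "b": [0.3, 0.4], "c": [0.6, 0.2]}
print(confounding_users(acc))  # ['b', 'c']
```

Inspecting `per_user_confusion` for the flagged users would show whether their errors are systematically one-sided (e.g. all false negatives), which would point at label problems rather than noisy image selection.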