Xavier Holt edited The_Baseline_Model_as_a__.md
almost 8 years ago
Commit id: b2e2ee9138c0b33c5e28f2a7ef8061c94f47b426
diff --git a/The_Baseline_Model_as_a__.md b/The_Baseline_Model_as_a__.md
index 1d0ca95..350f05b 100644
--- a/The_Baseline_Model_as_a__.md
+++ b/The_Baseline_Model_as_a__.md
...
### Results
Our best AUC score, `0.84`, came from an rF model trained on the full set of features **(Fig. ?)**. We include the full ROC curve for this configuration **(Fig. ?)**.
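
For readers less familiar with the metric, AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that definition in plain Python. The function name `roc_auc` and the toy labels and scores are assumptions for illustration only and are not taken from our experiments.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 positives, 3 negatives.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(roc_auc(toy_labels, toy_scores))  # 7 of the 9 pairs are ordered correctly
```

The pairwise loop is quadratic and is meant only to make the definition concrete; a sort-based computation would be preferable on real data.
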
In fact, rF models uniformly outperformed their logReg counterparts. They were also particularly good at consolidating the different features: in contrast to the logReg model, adding a feature to the rF model never decreased performance. The logReg model made particularly poor use of the 'freshness/recency' feature, which was noisy and had several large outliers. As rF models are highly robust to noise and outliers, this finding is unsurprising **(Fig. ?)**.
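
One way to see why tree-based models tolerate large outliers is that a threshold split depends only on the ordering of a feature's values, not on their magnitude. The sketch below illustrates this with a single Gini-based decision stump, which is a simplification and not the rF implementation used in our experiments. The helper names and the toy `recency` values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of binary (0/1) labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2.0 * p * (1.0 - p)


def best_split(values, labels):
    """Return the set of indices sent left by the best single threshold split."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    best_score, best_left = float("inf"), None
    for k in range(1, len(order)):
        if values[order[k - 1]] == values[order[k]]:
            continue  # no threshold fits between two equal values
        left, right = order[:k], order[k:]
        score = (
            len(left) * gini([labels[i] for i in left])
            + len(right) * gini([labels[i] for i in right])
        ) / len(order)
        if score < best_score:
            best_score, best_left = score, frozenset(left)
    return best_left


# Hypothetical toy feature; the second version replaces the largest value
# with an extreme outlier.
recency = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
labels = [0, 0, 0, 1, 1, 1]
with_outlier = recency[:-1] + [1e6]

# The chosen partition is identical: only the ordering of values matters.
assert best_split(recency, labels) == best_split(with_outlier, labels)
```

A linear model such as logReg, by contrast, multiplies the raw feature value by a weight, so a single extreme value can dominate the fit unless the feature is clipped or rescaled first.
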
* Reasonable performance, but the model is very simplistic.
* Combining articles that are each independently likely to be useful gives no guarantee about the overall quality/coverage.
* We don’t directly model diversity of articles.
* We don’t account for the temporal aspect.
* However, the binary version of the model is a useful component in a more structured model.
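
The coverage and diversity limitations listed above can be illustrated with a toy example. In this sketch, each article has an independent usefulness score and a set of aspects it covers. The article names, scores, aspect labels, and helper functions are all hypothetical and serve only to show how selecting by independent score can yield a redundant set.

```python
# Hypothetical articles: name -> (independent usefulness score, aspects covered)
articles = {
    "a": (0.9, {"launch"}),
    "b": (0.8, {"launch"}),
    "c": (0.6, {"lawsuit"}),
}


def top_k(articles, k):
    """Pick the k articles with the highest independent scores."""
    ranked = sorted(articles, key=lambda name: articles[name][0], reverse=True)
    return ranked[:k]


def coverage(articles, chosen):
    """Union of the aspects covered by the chosen articles."""
    covered = set()
    for name in chosen:
        covered |= articles[name][1]
    return covered


by_score = top_k(articles, 2)  # the two highest-scoring articles
print(by_score, coverage(articles, by_score))
print(["a", "c"], coverage(articles, ["a", "c"]))
```

Here the two highest-scoring articles cover the same aspect, while a lower-scoring alternative would have added a new one. A per-article binary classifier has no way to express this, which is why it is better suited as a component of a more structured model.
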