Xavier Holt edited The_Baseline_Model_as_a__.md
almost 8 years ago
Commit id: 29d12114cc57f53b1b87715bf3c3b12fe6591127
Our experimental parameters were the set of features used and the type of classifier. We tested a range of feature-subset configurations (indicated below) and compared a logistic-regression (logReg) model against one based on random forests (rF). The hyperparameters of the logReg model were the penalty (\(\ell^1\), \(\ell^2\), or a mixed norm) and the regularisation strength. For the rF model we optimised over maximum tree depth.
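As a rough illustration, a hyperparameter sweep of this shape could be set up with scikit-learn along the following lines. The data, grid values, and solver choices here are our own assumptions for the sketch, not the original experimental setup:

```python
# Minimal sketch of the two hyperparameter searches described above.
# X and y are synthetic stand-ins for the real feature matrix and labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))                          # stand-in features
y = (X[:, 0] + rng.normal(size=200) > 0).astype(int)   # stand-in labels

# logReg: penalty type and regularisation strength C.
# The 'saga' solver supports l1, l2, and elastic-net penalties;
# l1_ratio is only meaningful for the elastic-net case.
log_reg_grid = [
    {"penalty": ["l1", "l2"], "C": [0.01, 0.1, 1.0, 10.0]},
    {"penalty": ["elasticnet"], "l1_ratio": [0.5], "C": [0.01, 0.1, 1.0, 10.0]},
]
log_reg = GridSearchCV(
    LogisticRegression(solver="saga", max_iter=5000),
    log_reg_grid,
    scoring="roc_auc",
    cv=3,
).fit(X, y)

# rF: maximum tree depth (None means grow trees fully).
r_f = GridSearchCV(
    RandomForestClassifier(n_estimators=100, random_state=0),
    {"max_depth": [2, 4, 8, None]},
    scoring="roc_auc",
    cv=3,
).fit(X, y)

print(log_reg.best_params_)
print(r_f.best_params_)
```

Scoring with `roc_auc` inside the grid search keeps model selection aligned with the AUC metric reported in the results.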
### Results
Our best AUC score, `0.84`, was achieved by an rF model trained on the full set of features **(Fig. ?)**. We include the full ROC curve for this configuration **(Fig. ?)**.
In fact, rF models outperformed their logReg counterparts uniformly. Additionally, rF models were particularly good at consolidating the different features: in contrast to the logReg model, adding a feature to the rF model trained on the full set never decreased performance. The logReg model also made particularly poor use of the 'freshness/recency' feature. This was a noisy feature with several large outliers; as rF models are highly robust to such outliers, we are unsurprised by this finding **(Fig. ?)**.
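The AUC and ROC-curve computation behind these comparisons can be sketched as follows. The synthetic data, including the injected outliers meant to mimic a noisy 'freshness/recency'-style feature, is an assumption for illustration only:

```python
# Sketch: fit an rF model and compute its ROC curve and AUC on held-out data.
# The data is synthetic; one feature gets large outliers to mimic a noisy
# 'freshness/recency'-style input.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_curve, roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 5))
X[:10, 4] += 100.0                       # inject large outliers into feature 4
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(size=400) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

scores = model.predict_proba(X_te)[:, 1]  # class-1 probabilities
fpr, tpr, _ = roc_curve(y_te, scores)     # points on the ROC curve
auc = roc_auc_score(y_te, scores)
print(f"AUC = {auc:.2f}")
```

Because each tree splits on thresholds rather than raw feature magnitudes, the injected outliers leave the rF decision boundaries largely unchanged, which is the robustness property noted above.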