Introduction

State of the art for cancer heterogeneity deconvolution from methylation datasets
+ Limitations of state-of-the-art methods

State of the art for benchmarking, e.g.:
- An examination of procedures for determining the number of clusters in a data set (Milligan et al.)
- Comprehensive benchmarking and ensemble approaches for metagenomic classifiers (McIntyre et al.)
- A comprehensive database for benchmarking imaging systems (Panetta et al.)
--> Importance of a robust database
--> Importance of reliable metrics to evaluate the different tested approaches

State of the art for data challenge organization (e.g. DREAM challenge, MVA Master yearly challenge, AMPS Hackathon, Codalab platform)
- CDS Saclay https://arxiv.org/abs/1705.07099
- SSMPG 2015 Aussois https://www.biorxiv.org/content/early/2016/05/24/055046
- Ghouila A, Genome Research, 2018
- Gönen M, Cell Systems, 2017 (DREAM challenge) https://doi.org/10.1016/j.cels.2017.09.004
- Seyednasrollah F, JCO Clinical Cancer Informatics, 2018 (DREAM challenge)

Material and methods

Data Challenge organisation
+ Program, location
+ Number of participants: 34, from 5 different countries and from different backgrounds (bioinformatics / applied mathematics / statistics / computer science)
+ The challenge consisted of two parts
+ Short courses on biology & statistics
+ Invited speakers to present methods

Data Challenge platform
+ Codalab: the organizers set up a platform for sharing the simulated DNA methylation data and for uploading and running scripts
+ Limitations / requirements: 3-minute running time for code in challenge 1; RData object in challenge 2, phase 1; 20-minute running time and slightly more noise in phase 2
+ Goal / introduction to the data

Data Challenge simulated dataset and scoring metric
+ Simulated dataset 1
+ Simulated dataset 2
+ Scoring metric (MAE type 1, MAE type 2)

Statistical methods for confounders consideration and feature selection

Statistical methods for k determination

Statistical methods for deconvolution
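
The outline names a mean absolute error (MAE) scoring metric without yet defining its two types. As a placeholder for that definition, the sketch below shows only the generic MAE between a true and an estimated cell-type proportion matrix. The matrix layout (cell types in rows, samples in columns), the function name `mean_absolute_error`, and the toy values are assumptions made for illustration; they are not the challenge's actual scoring code, and the distinction between "type 1" and "type 2" is not modelled here.

```python
import numpy as np


def mean_absolute_error(a_true, a_est):
    """Generic MAE between two proportion matrices of identical shape.

    Assumption: rows are cell types, columns are samples, and the rows of
    a_est are already in the same order as the rows of a_true.
    """
    a_true = np.asarray(a_true, dtype=float)
    a_est = np.asarray(a_est, dtype=float)
    if a_true.shape != a_est.shape:
        raise ValueError("a_true and a_est must have the same shape")
    # Average the absolute difference over every (cell type, sample) entry.
    return float(np.mean(np.abs(a_true - a_est)))


# Toy example (hypothetical values): 3 cell types x 2 samples,
# each column sums to 1.
A_true = np.array([[0.6, 0.2],
                   [0.3, 0.3],
                   [0.1, 0.5]])
A_est = np.array([[0.5, 0.3],
                  [0.3, 0.2],
                  [0.2, 0.5]])

score = mean_absolute_error(A_true, A_est)
print(f"MAE = {score:.4f}")
```

A lower score means the estimated proportions are closer to the simulated ones. Because deconvolution methods typically return components in arbitrary order, a real scoring script would likely need to match estimated components to true cell types before computing the error; that step is left out of this sketch.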

Results
+ Main table
+ Meta-results

The Data Challenge provided an environment for:
+ constant feedback for developers of R packages --> possible improvements of methods
+ feedback and discussion for participants --> gain of knowledge and improvement of working skills
+ discussion about the need for good state-of-the-art, independent benchmark data in this research field

Discussion

Benchmarking
+ Pilot benchmarking in this data challenge workshop
+ Advantage: the data were simulated by individuals who were not competing or developing solutions, so the benchmark data were independent of the methods being tested
+ Will lead to a more comprehensive benchmarking project that follows a similar methodology but in a more structured format

Other similar benchmarking efforts (literature review)
+ However, the extent, the number of methods included, and the scope of scenarios considered in those studies are not satisfactory (Suppl. Table XX).