Authorea

David Koes edited Discussion.tex over 8 years ago

Commit id: dc693dbf53c5ddb52ec907f631f459e6c87def25

The MUV dataset, with its focus on eliminating analogue bias, is particularly resistant to single-query shape-based virtual screens \cite{Tiikkainen_2009}. This is reflected in our overall results, shown in Figure~\ref{aucs}, where only two targets (Rho and PKA) achieve AUCs whose 95\% confidence intervals do not overlap 0.5 (random performance). The remaining targets likely lack meaningful whole-molecule shape complementarity between the query ligand and the active compounds of the benchmark. One exception may be HIV-rt, where there is clear early enrichment, which indicates that a subset of the actives may be compatible with the query molecule.

FOMS dramatically outperformed other methods for the Rho and PKA targets due to the correct positioning of a fragment with key, conserved interactions. FOMS essentially provides a rapid means of template docking \cite{Ruiz_Carmona_2014,abagyan2015icm,Koes_2012} using shape-based scoring. The disadvantage of fragment-oriented approaches is that they are critically dependent on the choice of fragment and its proper positioning in defining the query. Provided these requirements can be met, shape-based fragment alignment search has several advantages. First, by enforcing the fragment alignment, key interactions are guaranteed to be conserved. Previous results have demonstrated the importance of adding pharmacophoric properties (or `color') to shape similarity \cite{Hawkins_2007}. Fragment alignment introduces a hard bias toward the specific fragment alignment without the additional computation that color methods require. Second, as we have shown, pre-alignment substantially improves performance: pre-alignment, whether to fragments (FOMS) or canonical internal coordinates (VAMS), is orders of magnitude faster. This holds true even if the cost to create the search database is taken into account.
The time to create the databases scales with the number of molecular shapes (about 10 shapes per second on our system) and compares favorably with RDKit search (2 molecules per second). Choice of fragment critical
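
As a rough illustration of the database creation cost quoted above, consider a library of $N$ molecules with an average of $s$ shapes (for example, conformers) per molecule. Both symbols are hypothetical and introduced only for this estimate; the calculation further assumes that creation time scales linearly with the number of shapes and that the RDKit rate is per molecule regardless of $s$. At about 10 shapes per second, creating the database takes approximately
\begin{equation}
t_{\mathrm{create}} \approx \frac{N s}{10}\ \mathrm{s},
\end{equation}
whereas a single RDKit search at 2 molecules per second takes approximately
\begin{equation}
t_{\mathrm{RDKit}} \approx \frac{N}{2}\ \mathrm{s}.
\end{equation}
Under these assumptions, $t_{\mathrm{create}} < t_{\mathrm{RDKit}}$ whenever $s < 5$. For larger $s$ the one-time creation cost exceeds that of a single RDKit search, but it is paid once and amortized over every subsequent query against the same database.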