David Koes edited Discussion.tex over 8 years ago

Commit id: 85a5fdca1ff8ffb8e532f6da61cbd39d0b2847a3

\section*{Discussion}

The MUV dataset, with its focus on eliminating analogue bias, is particularly resistant to single-query shape-based virtual screens \cite{Tiikkainen_2009}. This is reflected in our overall results, shown in Figure~\ref{aucs}, where only two targets (Rho and PKA) achieve AUCs whose 95\% confidence interval does not overlap with 0.5 (random performance). The remaining targets likely lack meaningful whole-molecule shape complementarity between the query ligand and the active compounds of the benchmark. One exception may be HIV-rt, where there is clear early enrichment, which indicates that a subset of the actives may be compatible with the query molecule.

FOMS dramatically outperformed other methods for the Rho and PKA targets due to correct positioning of a fragment with key, conserved interactions. FOMS essentially provides a rapid means of template docking \cite{Ruiz_Carmona_2014,abagyan2015icm,Koes_2012} using shape-based scoring. The disadvantage of fragment-oriented approaches is that they are critically dependent on the choice of fragment and its proper positioning in defining the query.

Provided these requirements can be met, there are several advantages to shape-based fragment alignment search. By enforcing the fragment alignment, key interactions are guaranteed to be conserved. Previous studies have demonstrated the importance of adding pharmacophoric properties (or `color') to shape similarity \cite{Hawkins_2007}. Fragment alignment introduces a hard bias toward matching a key portion of the query molecule without the additional computation required by more general methods. In fact, as we have shown, pre-alignment substantially reduces the computational overhead. Pre-alignment, whether to fragments (FOMS) or canonical internal coordinates (VAMS), is orders of magnitude faster than methods that dynamically optimize the alignment.
This holds true even if the cost to create the search database is taken into account. The time to create the databases scales with the number of molecular shapes (about 10 shapes a second on our system) and compares favorably with RDKit search (2 molecules a second). As the common case is for a fragment-oriented database to be re-used for queries by multiple users investigating multiple targets, in practice the cost of database creation gets amortized into insignificance.

A major advantage of fragment alignment is that it enables the use of shape constraints. Shape constraint search generally tracked or improved upon the performance of FOMS similarity ranking (e.g., Figure~\ref{cathg}). As shown in Table~\ref{pvaltable}, shape constraints were able to generate statistically significant ($p < 0.01$) enriched subsets for six of the ten targets.
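
The amortization argument for database creation can be made explicit with a rough estimate. The symbols below are illustrative assumptions, not quantities reported here: $M$ molecules with an average of $c$ shapes each, a database creation rate $r_c \approx 10$ shapes per second, a dynamic alignment search rate $r_s \approx 2$ molecules per second (assumed to apply per molecule regardless of the number of shapes), a per-query pre-aligned search time $t_q$, and $Q$ queries issued against the same database. The amortized cost per query of the pre-aligned approach is then
\[
T_{\mathrm{pre}}(Q) = \frac{cM}{r_c\,Q} + t_q ,
\]
which approaches $t_q$ as $Q$ grows, whereas each dynamically aligned search costs roughly $T_{\mathrm{dyn}} = M / r_s$. If $t_q$ is negligible relative to $T_{\mathrm{dyn}}$, as the orders-of-magnitude speed difference suggests, pre-alignment is cheaper overall once
\[
Q > \frac{c\,r_s}{r_c} \approx \frac{c}{5} .
\]
Under these assumptions, a database with a single shape per molecule pays for itself on the first query, and a hypothetical database with $c = 25$ shapes per molecule would break even at about five queries.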