David Koes edited Discussion.tex  about 8 years ago

Commit id: a2b0111a7b2d79946ea855c70b11bc256e0018dc

The MUV dataset, with its focus on eliminating analogue bias, is particularly resistant to single-query shape-based virtual screens \cite{Tiikkainen_2009}. This is reflected in our overall results, shown in Figure~\ref{aucs}, where only two targets (Rho and PKA) achieve AUCs whose 95\% confidence intervals do not overlap 0.5 (random performance). The remaining targets likely lack meaningful whole-molecule shape complementarity between the query ligand and the active compounds of the benchmark. One exception may be HIV-rt, where clear early enrichment indicates that a subset of the actives may be compatible with the query molecule.

FOMS dramatically outperformed the other methods for the Rho and PKA targets due to correct positioning of a fragment with key, conserved interactions. For comparison, Figure~\ref{aucs} also shows the performance of a 2D fingerprint, the path-based OpenBabel FP2 fingerprint \cite{O_Boyle_2011}. 2D information is more successful for three targets (ER$\beta$, FXIa, HSP90), and has comparable or worse performance for the remaining targets, illustrating the orthogonality of 2D and 3D approaches.

Our assumption in the design of this study was that, since the evaluated methods utilize rigid conformers, it would be necessary to generate a large sampling of conformers to ensure that the biologically relevant conformer was screened. Surprisingly, when we tested this assumption with the Rho Kinase target (results were similar for PKA), we found that the virtual screening performance of all three shape methods was relatively insensitive to the number of conformers used (Figure~\ref{confs}). Interestingly, this is not because the top ranked conformer is the most representative conformer. Instead, as demonstrated in Figure~\ref{confs}, reducing the number of conformers does result in a reduction in similarity scores for active compounds.
However, a compensating reduction in scores is observed in the decoy set as the number of conformers sampled is decreased, resulting in similar virtual screening results.

FOMS essentially provides a rapid means of template docking \cite{Ruiz_Carmona_2014,abagyan2015icm,Koes_2012} using shape-based scoring. The disadvantage of fragment-oriented approaches is that they are critically dependent on the choice of fragment and its proper positioning in defining the query. Provided these requirements can be met, there are several advantages to shape-based fragment alignment search. By enforcing the fragment alignment, key interactions are guaranteed to be conserved. Previous studies have demonstrated the importance of adding pharmacophoric properties (or `color') to shape similarity \cite{Hawkins_2007}. Fragment alignment introduces a hard bias toward matching a key portion of the query molecule without the additional computation that more general methods require. In fact, as we have shown, pre-alignment substantially reduces the computational overhead.

Pre-alignment, whether to fragments (FOMS) or to canonical internal coordinates (VAMS), is orders of magnitude faster than methods that dynamically optimize the alignment. This holds true even if the cost of creating the search database is taken into account. The time to create the databases scales with the number of molecular shapes (about 10 shapes a second on our system) and compares favorably with RDKit search (2 molecules a second).
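
The insensitivity to conformer count seen in Figure~\ref{confs} can be made explicit with a short sketch. The notation here is illustrative and rests on two assumptions not stated above: that each library molecule $m$ is scored by its best-scoring conformer among the $k$ retained conformers $c_1(m), \ldots, c_k(m)$, and that the AUC is computed in the standard rank-based form over the active set $A$ and the decoy set $D$ for a query $q$:
\begin{equation}
s_k(m) = \max_{1 \le i \le k} \mathrm{sim}\left(q, c_i(m)\right)
\end{equation}
\begin{equation}
\mathrm{AUC}_k = \frac{1}{|A|\,|D|} \sum_{a \in A} \sum_{d \in D} \left( \mathbf{1}\left[s_k(a) > s_k(d)\right] + \frac{1}{2}\,\mathbf{1}\left[s_k(a) = s_k(d)\right] \right)
\end{equation}
If the smaller conformer sets are subsets of the larger ones, $s_k(m)$ can only decrease or stay the same as $k$ is reduced, for actives and decoys alike. Since $\mathrm{AUC}_k$ depends only on the pairwise ordering of active and decoy scores, it changes only for those pairs whose order actually flips, which is consistent with score reductions in both sets largely cancelling.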