David Koes edited section_Results_Consistent_with_previous__.tex  over 8 years ago


\section*{Results}

Consistent with previous studies \cite{Tiikkainen_2009}, we find the MUV dataset to be a challenging target for shape-based screening, with few targets demonstrating AUCs far from random performance. Overall, FOMS either matched or exceeded the virtual screening performance of VAMS while retaining most of the benefits of pre-aligned molecules. Specifically, FOMS is orders of magnitude faster than the optimizing alignment of RDKit, suggesting, at a minimum, that it is a viable method for rapidly pre-screening large libraries. In general, the Pareto frontier of the shape constraint queries (highlighted with solid circles in the provided ROC plots) delineates virtual screening performance equivalent to or better than a full shape similarity search, while the queries run orders of magnitude faster.

For each target, we plot the ROC curves for FOMS, VAMS, and RDKit similarity search as well as the results for every interaction point shape constraint query. Shape constraint results along the Pareto frontier are highlighted as solid circles, and the most statistically significant result is annotated with its Bonferroni-corrected p-value. We also plot the total time required for the FOMS, VAMS, and RDKit similarity searches and provide a box plot of the distribution of times for the shape constraint searches.

Shape constraint search time varies considerably depending on the number of hits retrieved. Most queries are highly selective, return few or no results, and take less than a hundredth of a second. We do not include the time needed to generate the shape constraints in the query time, since the receptor-based constraints can be generated once and cached for future searches. We plot times only for those queries that generate results. For these queries the median time, as shown in the box plots, remains below a tenth of a second.
A few queries are non-selective and return subsets of compounds that are comparable in size to the full database. These queries approach or exceed the running times of a full similarity search. The statistically most significant shape constraint query, annotated with a p-value in the ROC plot, has its running time plotted as a circle. These informative queries typically exceed the median time, but still take less than a second.

\begin{itemize}
\item cathg: all random, but shape constraints do significantly better.
\item eralpha: all random.
\item eralpha-pot: all random (or worse).
\item erbeta: RDKit wins; FOMS = VAMS (0.43); lack of significance for shape constraints.
\item hivrt: almost random or worse; RDKit (0.57); shape constraints match RDKit; FOMS $>$ VAMS; only one with good early enrichment.
\item fxia: RDKit wins; all others perform poorly.
\item hsp90: FOMS and VAMS basically random, with VAMS worse; shape constraints not impressive.
\item rho: excellent; FOMS $\gg$ VAMS $>$ RDKit; shape constraints even better.
\item pka: excellent; FOMS $\gg$ VAMS $>$ RDKit; shape constraints even better, but totally different results for pka.f5.
\end{itemize}
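
The Pareto frontier of shape constraint queries referred to above can be illustrated with a short sketch. It assumes each query is summarized by a single ROC point (false positive rate, true positive rate), and that a query is Pareto-optimal when no other query has both a false positive rate that is no higher and a true positive rate that is no lower. The function name \texttt{pareto\_frontier} and the example points are hypothetical and are not taken from the study's code or data.

```python
from typing import List, Tuple

# A query result summarized as (false positive rate, true positive rate).
Point = Tuple[float, float]


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the ROC points that are not dominated by any other point.

    A point dominates another if its FPR is no higher and its TPR is no
    lower, with at least one of the two strictly better.
    """
    # Sort by ascending FPR; ties are broken by descending TPR so the best
    # point at a given FPR is visited first. set() drops exact duplicates.
    ordered = sorted(set(points), key=lambda p: (p[0], -p[1]))

    frontier: List[Point] = []
    best_tpr = float("-inf")
    for fpr, tpr in ordered:
        # Every point seen so far has FPR <= this one, so this point is
        # non-dominated only if it strictly improves on the best TPR so far.
        if tpr > best_tpr:
            frontier.append((fpr, tpr))
            best_tpr = tpr
    return frontier


# Made-up ROC points for illustration only (not results from the paper).
example_points = [
    (0.0, 0.0),
    (0.01, 0.10),
    (0.02, 0.05),
    (0.05, 0.30),
    (0.05, 0.20),
    (0.50, 0.30),
    (0.60, 0.90),
]
frontier = pareto_frontier(example_points)
print(frontier)
```

In this sketch a query that retrieves nothing, at (0, 0), is trivially on the frontier; whether such points are drawn in the actual plots is not stated in the text.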