David Koes edited subsection_Dataset_The_specific_modality__.tex
over 8 years ago
Commit id: 288d3fa75329a8ed2105915b228b5496ab37bd3f
diff --git a/subsection_Dataset_The_specific_modality__.tex b/subsection_Dataset_The_specific_modality__.tex
index 5bdeb34..d85c326 100644
--- a/subsection_Dataset_The_specific_modality__.tex
+++ b/subsection_Dataset_The_specific_modality__.tex
...
The specific modality of fragment-oriented molecular shapes requires the creation of a custom benchmark for assessing the virtual screening performance of the method. In order to construct the shape constraints, a receptor-ligand structure is required, and, in order to screen a library, all compounds in the library must contain the desired anchor fragment.
We use the Maximum Unbiased Validation (MUV) benchmark \cite{Rohrer2009} as a starting point. MUV includes sets of 30 active and 15000 property-matched decoy compounds for each of 17 targets. Compounds are selected from PubChem bioactivity data using a methodology that both reduces the similarity of actives (to avoid analogue bias \cite{Good2008}) and increases the similarity between actives and decoys (which helps prevent artificial enrichment \cite{Verdonk2004}). MUV is also noteworthy in that the decoys were all assessed to be inactive in the initial high-throughput screen against the target, providing some measure of experimental evidence that the negatives are true negatives (as opposed to schemes that generate decoys through random sampling \cite{Mysinger_2012}). The construction of the MUV dataset makes it particularly challenging for ligand similarity approaches. In an evaluation of several screening protocols (including molecular shape), poor enrichments were obtained for all protocols for the majority of query molecules \cite{Tiikkainen_2009}. The use of such a challenging benchmark allows us to critically evaluate FOMS and shape constraints in the context of a dataset where traditional approaches often fail.
Of the 17 targets in the MUV dataset, we identified 10 that had a receptor-ligand structure in the Protein Data Bank (PDB) where the ligand had sub-$\mu$M affinity. The interaction diagrams of these structures and their PDB codes are shown in Figure~\ref{targets}. For each of these structures, we identified interacting fragments that could potentially serve as anchor fragments. For each target, we selected relatively generic functional groups (at most 7 atoms) that clearly formed interactions with the receptor and that were sufficiently common among both the actives and decoys to yield meaningful results. Chosen fragments are shown in Table~\ref{fragtable}, which includes the resulting number of matching actives and decoys for each benchmark. A second fragment is chosen for the PKA target to illustrate the effect of fragment specificity.
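
As an illustrative formalization of this filtering step (the notation here is introduced only for exposition and is not part of the MUV specification), let $A_t$ and $D_t$ denote the MUV actives and decoys for target $t$, with $|A_t| = 30$ and $|D_t| = 15000$, and let $f$ be the anchor fragment chosen for $t$. Writing $f \preceq c$ to mean that compound $c$ contains $f$ as a substructure, the fragment-specific benchmark retains only the matching compounds:
\[
A_t^{f} = \{\, a \in A_t : f \preceq a \,\}, \qquad
D_t^{f} = \{\, d \in D_t : f \preceq d \,\}.
\]
Under this reading, $|A_t^{f}|$ and $|D_t^{f}|$ correspond to the numbers of matching actives and decoys reported in Table~\ref{fragtable}. If a fragment $f'$ itself contains $f$, then $A_t^{f'} \subseteq A_t^{f}$ and $D_t^{f'} \subseteq D_t^{f}$, which is one way to interpret the effect of fragment specificity examined with the second PKA fragment.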