David Koes edited subsection_Shape_Indexing_and_Similarity__.tex  over 8 years ago

Commit id: 20b6f88aa5a776cfa08d189cea29529702a7ced2


\subsection*{Shape Indexing}

As with VAMS\cite{VAMS}, we use a matching and packing\cite{Koes_2014} bulk-loading algorithm to initialize an efficient data structure for volumetric shape constraint searches. Briefly, shapes are stored in a GSS-tree\cite{keim1999} where each leaf of the tree is a single molecular shape and each internal node stores a maximum included volume (MIV) and a minimum surrounding volume (MSV). The MSV is the union of all the molecular shapes beneath the node in the tree, while the MIV is their intersection. By applying the minimum and maximum shape constraints to the MIV and MSV, it can be determined whether any of the shapes lower in the tree have the potential to match the constraints. If not, the entire subtree of molecular shapes can be eliminated from consideration, resulting in a sub-linear running time.

Shape constraints combined with shape indexing provide a rapid way to filter a virtual library. Alternatively, instead of serving as hard constraints, they can be used to rank molecular shapes by similarity to the shape constraint query.

\subsection*{Shape Similarity}

We use the shape Tanimoto\cite{RushIII2005} to compute the similarity of two voxelized shapes:
$$\delta(A,B) = \frac{|A \cap B|}{|A \cup B|}$$
where $|\cdot|$ denotes the volume (number of voxels) of a shape and a larger score indicates a greater degree of similarity. We consider both similarity with a single query ligand and similarity with shape constraints. The similarity of a shape, $A$, with the minimum, $MIN$, and maximum, $MAX$, shape constraints is computed by averaging their shape Tanimotos:
\begin{equation*}
\begin{aligned}
\delta(A,MIN,MAX) & = \frac{\delta(A,MIN) + \delta(A,MAX)}{2} \\
 & = \frac{1}{2}\left(\frac{|A \cap MIN|}{|A \cup MIN|} + \frac{|A \cap MAX|}{|A \cup MAX|}\right)
\end{aligned}
\end{equation*}
The average ranges from zero to one, where a value of one is only achieved when all three shapes are identical. For evaluations we consider only similarity with a single query ligand.
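
The pruning test and the similarity scores described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work. It assumes that a voxelized shape is represented as a set of integer grid coordinates, that a shape matches when it contains the minimum constraint and lies within the maximum constraint, and that two empty shapes have similarity one. The names \texttt{Node}, \texttt{search}, \texttt{tanimoto}, and \texttt{constraint\_similarity} are hypothetical, and the tree is built by hand; no matching and packing bulk loader is shown.

```python
class Node:
    """Hypothetical GSS-tree node: a leaf holds one shape, an internal node holds children."""

    def __init__(self, shape=None, children=()):
        self.children = list(children)
        if self.children:
            # MSV: union of everything below; MIV: intersection of everything below.
            self.msv = frozenset().union(*(c.msv for c in self.children))
            self.miv = frozenset.intersection(*(c.miv for c in self.children))
        else:
            # For a leaf, MIV and MSV are both the molecular shape itself.
            self.msv = self.miv = frozenset(shape)


def search(node, min_shape, max_shape):
    """Return leaf shapes A with MIN <= A <= MAX (assumed constraint semantics)."""
    # Every shape A below this node satisfies MIV <= A <= MSV, so a match
    # is only possible if MIN fits inside the MSV and the MIV fits inside MAX.
    if not (min_shape <= node.msv and node.miv <= max_shape):
        return []  # prune the whole subtree
    if not node.children:
        return [node.miv]  # for a leaf the test above is exact
    hits = []
    for child in node.children:
        hits.extend(search(child, min_shape, max_shape))
    return hits


def tanimoto(a, b):
    """Shape Tanimoto of two voxel sets: overlap volume divided by union volume."""
    union = len(a | b)
    if union == 0:
        return 1.0  # assumption: two empty shapes count as identical
    return len(a & b) / union


def constraint_similarity(a, min_shape, max_shape):
    """Average of the shape Tanimotos with the minimum and maximum constraints."""
    return 0.5 * (tanimoto(a, min_shape) + tanimoto(a, max_shape))
```

With this representation, ranking a library by similarity instead of filtering it amounts to sorting the leaf shapes by \texttt{constraint\_similarity} (or by \texttt{tanimoto} against a single query ligand) in descending order.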