Authorea

David Koes edited subsection_Evaluation__.tex over 8 years ago

Commit id: fa2481068b9eb6f2e75d2b723631ab2d1d7f0f76


\subsection*{Evaluation}
\subsection*{Shape Indexing and Similarity}
As with VAMS\cite{Koes_2014}, we use a matching and packing\cite{Koes_2014} bulk-loading algorithm to initialize an efficient data structure for volumetric shape constraint searches. Briefly, shapes are stored in a GSS-tree\cite{keim1999} in which each leaf is a single molecular shape and each internal node stores a maximum included volume (MIV) and a minimum surrounding volume (MSV). The MIV of a node is the intersection of all the molecular shapes beneath the node in the tree, while the MSV is their union. The MIV and MSV determine whether any shape beneath a node can match the specified shape constraints: the MIV must not overlap the excluded shape constraint, and the included shape constraint must be fully contained within the MSV. If a node high in the tree fails this test, a large fraction of the molecular shapes is eliminated by a single comparison.

Shape constraints combined with shape indexing provide a rapid way to filter a virtual library. Alternatively, instead of serving as hard constraints, shape constraints can be used to rank molecular shapes by their similarity to the query. We use the shape Tanimoto\cite{RushIII2005} to compute the similarity of two shapes: $$\delta(A,B) = \frac{A \cap B}{A \cup B}$$ where the numerator and denominator denote the volumes of the intersection and union, respectively, and a larger score indicates a greater degree of similarity.
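
The pruning test and the shape Tanimoto described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each shape is a set of integer voxel coordinates; the names \texttt{Node}, \texttt{leaf}, \texttt{internal}, and \texttt{search} are hypothetical and do not correspond to the actual GSS-tree implementation, and the tree here is assembled by hand rather than by the matching and packing bulk loader.

```python
from dataclasses import dataclass, field


def tanimoto(a, b):
    """Shape Tanimoto of two voxel sets: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


@dataclass
class Node:
    miv: frozenset  # maximum included volume: intersection of shapes below
    msv: frozenset  # minimum surrounding volume: union of shapes below
    children: list = field(default_factory=list)


def leaf(shape):
    """A leaf holds one shape, so its MIV and MSV are the shape itself."""
    s = frozenset(shape)
    return Node(miv=s, msv=s)


def internal(children):
    """Build an internal node from one or more child nodes."""
    miv = frozenset.intersection(*(c.miv for c in children))
    msv = frozenset.union(*(c.msv for c in children))
    return Node(miv, msv, list(children))


def search(node, included, excluded):
    """Return every shape that contains `included` and avoids `excluded`."""
    # Prune: the MIV touches the excluded region (so every shape below does),
    # or the MSV cannot cover the included region (so no shape below can).
    if node.miv & excluded or not included <= node.msv:
        return []
    if not node.children:
        return [node.msv]  # a leaf that passes the test is a match
    hits = []
    for child in node.children:
        hits.extend(search(child, included, excluded))
    return hits
```

For a leaf the MIV and MSV coincide with the shape, so the same test that prunes internal nodes also decides whether an individual shape satisfies the constraints.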
The similarity of a shape, $A$, to the included, $I$, and excluded, $E$, shape constraints is computed by summing the shape Tanimoto of $A$ and the included constraint and the shape Tanimoto of $A$ and the \textit{inverse} of the excluded constraint: $$\delta(A,I,E) = \delta(A,I) + \delta(A,\overline{E}) = \frac{A \cap I}{A \cup I} + \frac{A \cap \overline{E}}{A \cup \overline{E} }$$ The closer a shape is to meeting the included constraint, the larger the value of $\delta(A,I)$, while the more a shape violates the excluded constraint, the smaller the value of $\delta(A,\overline{E})$. The more a shape exceeds the included constraint, the more it is penalized by the $\delta(A,I)$ term, but, to the extent that its volume avoids conflicting with the excluded shape constraint, it is rewarded by the $\delta(A,\overline{E})$ term.
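
As a rough illustration of the combined score, the sketch below evaluates $\delta(A,I,E)$ on voxel sets. It assumes that the inverse of the excluded constraint is taken relative to a finite bounding grid, here called \texttt{domain}; that choice, the voxel-set representation, and the function name \texttt{constraint\_similarity} are assumptions of this example and are not specified in the text.

```python
def tanimoto(a, b):
    """Shape Tanimoto of two voxel sets: |A & B| / |A | B|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def constraint_similarity(shape, included, excluded, domain):
    """delta(A, I, E) = delta(A, I) + delta(A, complement of E).

    `domain` is the assumed bounding grid used to form the complement of
    the excluded constraint; the score depends on how it is chosen.
    """
    not_excluded = domain - excluded
    return tanimoto(shape, included) + tanimoto(shape, not_excluded)


# Toy one-dimensional example on a grid of ten voxels.
domain = frozenset((x, 0, 0) for x in range(10))
included = frozenset((x, 0, 0) for x in (0, 1, 2))
excluded = frozenset((x, 0, 0) for x in (7, 8))

fits = frozenset((x, 0, 0) for x in (0, 1, 2))         # matches I, avoids E
violates = frozenset((x, 0, 0) for x in (0, 1, 2, 7))  # spills into E

score_fits = constraint_similarity(fits, included, excluded, domain)
score_violates = constraint_similarity(violates, included, excluded, domain)
```

In this toy case the shape that exactly fills the included constraint scores higher than the one that extends into the excluded region, consistent with the behavior of the two terms described above.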