Authorea

David Koes edited subsection_Shape_Indexing_and_Similarity__.tex over 8 years ago

Commit id: a5fff82a8cd413ab1f8c2649638c536c5ee3a25a


\subsection*{Shape Indexing and Similarity}

As with VAMS\cite{VAMS}, we use a matching and packing\cite{Koes_2014} bulk-loading algorithm to initialize an efficient data structure for volumetric shape constraint searches. Briefly, shapes are stored in a GSS-tree\cite{keim1999} where each leaf of the tree is a single molecular shape and each internal node includes a maximum included volume (MIV) and a minimum surrounding volume (MSV). The MIV of a node is the intersection of all the molecular shapes beneath the node in the tree, while the MSV is their union. By appropriately applying the specified shape constraints to the MIV and MSV, it can be determined whether any of the shapes lower in the tree have the potential to match: the MIV must not overlap the excluded shape constraint, and the included shape constraint must be fully contained within the MSV. If a node fails either test, the entire subtree of molecular shapes beneath it is eliminated from consideration with a single comparison, resulting in sub-linear running time.

Shape constraints combined with shape indexing provide a rapid way to filter a virtual library. Alternatively, instead of serving as hard constraints, they can be used to rank molecular shapes by similarity to the shape constraint query. We use the shape Tanimoto\cite{RushIII2005} to compute the similarity of two shapes $A$ and $B$, where $|\cdot|$ denotes volume: $$\delta(A,B) = \frac{|A \cap B|}{|A \cup B|}$$
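The MIV/MSV pruning tests and the shape Tanimoto described in this subsection can be illustrated with a minimal sketch. It assumes shapes are represented as sets of integer voxel coordinates, and it builds a tiny tree by hand rather than with the matching and packing bulk loader. The names \texttt{Node}, \texttt{leaf}, \texttt{internal}, \texttt{search}, and \texttt{tanimoto} are hypothetical and are not taken from the actual implementation.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List, Optional, Tuple

Voxel = Tuple[int, int, int]
Shape = FrozenSet[Voxel]  # assumption: a shape is a set of occupied voxels


@dataclass
class Node:
    miv: Shape                      # intersection of all shapes beneath the node
    msv: Shape                      # union of all shapes beneath the node
    children: List["Node"] = field(default_factory=list)
    name: Optional[str] = None      # set only on leaves


def leaf(name: str, shape: Shape) -> Node:
    # For a single shape the MIV and MSV coincide with the shape itself.
    return Node(miv=shape, msv=shape, name=name)


def internal(children: List[Node]) -> Node:
    miv = frozenset.intersection(*(c.miv for c in children))
    msv = frozenset.union(*(c.msv for c in children))
    return Node(miv=miv, msv=msv, children=list(children))


def search(node: Node, included: Shape, excluded: Shape,
           visited: Optional[List[Node]] = None) -> List[str]:
    """Return names of leaves that contain `included` and avoid `excluded`."""
    if visited is not None:
        visited.append(node)
    # Every shape below lies inside the MSV, so none can contain `included`
    # unless the MSV does.
    if not included <= node.msv:
        return []
    # Every shape below contains the MIV, so all overlap `excluded` if the
    # MIV does.
    if node.miv & excluded:
        return []
    if node.name is not None:
        return [node.name]          # on a leaf the two tests are exact
    hits: List[str] = []
    for child in node.children:
        hits.extend(search(child, included, excluded, visited))
    return hits


def tanimoto(a: Shape, b: Shape) -> float:
    """Shape Tanimoto: intersection volume over union volume (voxel counts)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def line(xs) -> Shape:
    # Toy one-dimensional "shapes" laid out along the x axis.
    return frozenset((x, 0, 0) for x in xs)


a, b = line(range(0, 4)), line(range(1, 5))
c, d = line(range(10, 14)), line(range(11, 15))
root = internal([internal([leaf("a", a), leaf("b", b)]),
                 internal([leaf("c", c), leaf("d", d)])])

visited: List[Node] = []
hits = search(root, included=line([1, 2]), excluded=line([4]), visited=visited)
```

In this toy tree the right-hand subtree is rejected at its internal node because its MSV does not contain the included constraint, so leaves \texttt{c} and \texttt{d} are never examined. A production index would store volumes in a compact form rather than as Python sets; the sets here only keep the logic easy to follow.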