Authorea

David Koes edited Introduction.tex over 8 years ago

Commit id: e733c70d0ed6f85f25b22e3eed288d6b7f17d7f0


Shape-based virtual screening typically attempts to identify the molecules in a virtual library that are most similar to a known active molecule or to a pseudo-ligand derived from the desired binding site\cite{Ebalunode2008}. Shape similarity is usually assessed either through alignment methods, which seek to maximize the three-dimensional overlap of two shapes, or through feature vector methods, which transform shapes into a low-dimensional vector of features that can be efficiently compared. As part of the similarity calculation, molecular shapes may be further annotated with electrostatic or pharmacophore features.\cite{Vainio2009,Cheeseright2006,Thorner1996,Tervo2005,Marin2008,Sastry2011} Alignment methods try to find the optimal overlay of two molecules to maximize either the overlapping volume or the correspondence between identified feature points, such as molecular field extrema\cite{Vainio2009,Cheeseright2006}. The predominant method of maximizing volume overlap is to represent the molecular shapes as a collection of Gaussians,\cite{Good1993,Grant1996} sample several starting points, and use numerical optimization to find a local maximum. Alternatively, the molecule may be decomposed into a set of features, such as pharmacophore features\cite{Sastry2011}, field points\cite{Thorner1996,Vainio2009,Cheeseright2006}, or hyperbolic paraboloid representations of patches of molecular surface\cite{Proschak2008}, and various point correspondence algorithms may be used to generate an alignment. Although a number of performance improvements to alignment methods have been described,\cite{Grant1996,RushIII2005,Sastry2011,Fontaine2007} the task remains computationally intensive. An alternative, computationally less demanding approach is to reduce molecular shapes to a simple vector of Boolean or numerical features.
For example, a small set of reference shapes may be used to define a Boolean shape-fingerprint\cite{Haigh2005,Putta2002}, or translation- and rotation-invariant properties such as geometric moments\cite{Ballester2007} or ray-tracing histograms\cite{Zauhar2003} may be used to create a numeric vector. Shape similarity is then computed by comparing these feature vectors with an appropriate metric, such as Euclidean distance. The simplicity of the feature vector representation results in very fast screening (millions of shape comparisons per second\cite{Ballester2007}), but comes at the cost of accuracy and interpretability. These approaches have been shown to correlate poorly with more rigorous shape similarity computations and have been deemed to be too fragile or blunt to be useful for virtual screening.\cite{Nicholls2010}
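
The difference in cost between the two families can be illustrated with a short sketch. The notation here is introduced only for illustration and is not taken from the cited works. Suppose each atom $i$ of a molecule is modeled as a spherical Gaussian $\rho_i(\mathbf{r}) = p_i \exp\left(-\alpha_i \lVert \mathbf{r} - \mathbf{R}_i \rVert^2\right)$ with center $\mathbf{R}_i$, amplitude $p_i$, and width parameter $\alpha_i$. Keeping only pairwise terms, which is a first-order approximation, the overlap volume of molecules $A$ and $B$ then has the closed form
\begin{equation}
V_{AB} \approx \sum_{i \in A} \sum_{j \in B} p_i \, p_j \left( \frac{\pi}{\alpha_i + \alpha_j} \right)^{3/2} \exp\left( -\frac{\alpha_i \alpha_j}{\alpha_i + \alpha_j} \lVert \mathbf{R}_i - \mathbf{R}_j \rVert^2 \right).
\end{equation}
Because the centers $\mathbf{R}_j$ of $B$ depend on its rigid-body pose, $V_{AB}$ must be re-evaluated at every step of the numerical optimization and for every starting point. By contrast, if each shape is reduced to an alignment-independent feature vector $\mathbf{a}, \mathbf{b} \in \mathbb{R}^n$, a single comparison is just
\begin{equation}
d(\mathbf{a}, \mathbf{b}) = \sqrt{\sum_{k=1}^{n} (a_k - b_k)^2},
\end{equation}
which costs $O(n)$ operations and requires no optimization. If a bounded score is preferred, the distance can be mapped to a similarity, for example through the assumed convention $s = 1/(1 + d)$, which equals 1 for identical vectors and decreases toward 0 as they diverge.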