David Koes edited Introduction.tex  over 8 years ago

Commit id: e733c70d0ed6f85f25b22e3eed288d6b7f17d7f0

Shape-based virtual screening typically attempts to identify the molecules in a virtual library that are most similar to a known active molecule or to a pseudo-ligand derived from the desired binding site\cite{Ebalunode2008}. Shape similarity is usually assessed either through alignment methods, which seek to maximize the three-dimensional overlap of two shapes, or through feature vector methods, which transform shapes into a low-dimensional vector of features that can be efficiently compared. As part of the similarity calculation, molecular shapes may be further annotated with electrostatic or pharmacophore features.\cite{Vainio2009,Cheeseright2006,Thorner1996,Tervo2005,Marin2008,Sastry2011} Alignment methods try to find the optimal overlay of two molecules to maximize either the overlapping volume or the correspondence between identified feature points, such as molecular field extrema\cite{Vainio2009,Cheeseright2006}. The predominant method of maximizing volume overlap is to represent the molecular shapes as a collection of Gaussians,\cite{Good1993,Grant1996} sample several starting points, and use numerical optimization to find a local maximum. Alternatively, the molecule may be decomposed into a set of features, such as pharmacophore features\cite{Sastry2011}, field points\cite{Thorner1996,Vainio2009,Cheeseright2006}, or hyperbolic paraboloid representations of patches of molecular surface\cite{Proschak2008}, and various point correspondence algorithms may be used to generate an alignment. Although a number of performance improvements to alignment methods have been described,\cite{Grant1996,RushIII2005,Sastry2011,Fontaine2007} the task remains computationally intensive. An alternative, computationally less demanding approach is to reduce molecular shapes to a simple vector of Boolean or numerical features.
For example, a small set of reference shapes may be used to define a Boolean shape fingerprint\cite{Haigh2005,Putta2002}, or translation- and rotation-invariant properties such as geometric moments\cite{Ballester2007} or ray-tracing histograms\cite{Zauhar2003} may be used to create a numeric vector. Shape similarity is then computed by comparing these feature vectors with an appropriate metric, such as Euclidean distance. The simplicity of the feature vector representation results in very fast screening (millions of shape comparisons per second\cite{Ballester2007}) but comes at the cost of accuracy and interpretability. These approaches have been shown to correlate poorly with more rigorous shape similarity computations and have been deemed to be too fragile or blunt to be useful for virtual screening.\cite{Nicholls2010}
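
As a minimal illustration of the feature vector comparison step, suppose two molecules are summarized by numeric feature vectors $\mathbf{u}$ and $\mathbf{v}$ of length $n$. This notation is introduced here only for illustration and is not taken from the cited works. Their Euclidean distance is
\begin{equation}
d(\mathbf{u},\mathbf{v}) = \sqrt{\sum_{i=1}^{n} \left(u_i - v_i\right)^2} .
\end{equation}
One possible way to map this distance to a similarity score in $(0,1]$ is $s = 1/(1 + d)$, which assigns a score of 1 to identical vectors; this mapping is an assumption made for the example, and the cited methods may use other metrics or normalizations. Evaluating $d$ takes a number of arithmetic operations proportional to $n$ and requires no alignment of the two molecules, which is consistent with the high screening throughput noted above.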