Unique to Pharmit is the ability to select from a number of provided compound libraries or to submit a custom library for screening. The library to screen is selected through a pull-down menu in the search button (see Figure~\ref{pharmfig}).
Large libraries corresponding to compound catalogs from a variety of sources are provided and periodically updated to ensure continued relevance, especially with regard to compound availability from commercial sources. Currently, Pharmit has pre-built libraries generated from ChEMBL21 \cite{Gaulton_2011}, with \(>1.4\) million compounds; ChemDiv (www.chemdiv.com), with \(>1.4\) million compounds; MolPort (www.molport.com), with \(>6.5\) million compounds; the NCI Open Chemical Repository (dtp.cancer.gov), with \(>108{,}000\) compounds; and PubChem \cite{Kim_2015}, with \(>66\) million compounds. Although a search is limited to the compounds of the selected library, all compounds within these provided libraries are cross-annotated. It is therefore possible, for example, to look up the PubChem record of a compound found by searching the commercial MolPort library to check for known bioactivities.
Users may submit their own libraries for screening. In keeping with the open-access and open-source nature of Pharmit, users are encouraged to make their submitted libraries publicly accessible, in which case they are available to all users for screening as user-contributed libraries. Registered users may, however, create private libraries, and may also remove or update previously submitted libraries.
To create a library, compounds may be provided in either the two-dimensional SMILES format or the three-dimensional SDF format. If the user uploads compounds in the SMILES format, duplicated canonical SMILES are removed, the molecules are protonated with OpenBabel \cite{O_Boyle_2011} using default settings, and only the largest component of each molecule is retained (e.g., salts are removed). RDKit (rdkit.org) and the UFF force field \cite{Rappe_1992} are then used to generate up to 10 3D conformers for each compound resulting from this procedure. This approach has been shown to generate high-quality conformations \cite{Ebejer_2012}. Alternatively, if the user provides compounds in the SDF format, the provided structures are assumed to be valid conformers and are used directly, with protonation states determined by OpenBabel.
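
The two string-level steps of the SMILES preparation described above (duplicate removal and retention of the largest component) can be illustrated with a minimal sketch. This is not Pharmit's implementation: it uses only the Python standard library, assumes the input strings are already canonical SMILES, estimates component size with a crude regular-expression atom count, and omits protonation and conformer generation, which Pharmit performs with OpenBabel and RDKit. The function names \texttt{heavy\_atom\_count}, \texttt{largest\_component}, and \texttt{prepare\_library} are hypothetical.

```python
import re

# Crude SMILES atom tokenizer (an assumption of this sketch, not a full parser):
# bracket atoms first, then two-letter halogens, then single-letter
# organic-subset atoms and aromatic atoms.
_ATOM = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")


def heavy_atom_count(fragment: str) -> int:
    """Approximate the number of atoms written in one SMILES fragment."""
    return len(_ATOM.findall(fragment))


def largest_component(smiles: str) -> str:
    """Keep the largest '.'-separated component, e.g. drop a counter-ion."""
    return max(smiles.split("."), key=heavy_atom_count)


def prepare_library(smiles_list):
    """Drop duplicate input SMILES, then reduce each entry to its largest component.

    Duplicates are detected by exact string match, so the inputs are assumed
    to be canonical SMILES already.
    """
    seen = set()
    prepared = []
    for smi in smiles_list:
        if smi in seen:
            continue  # duplicate entry, already processed
        seen.add(smi)
        prepared.append(largest_component(smi))
    return prepared


if __name__ == "__main__":
    library = ["CC(=O)[O-].[Na+]", "c1ccccc1", "c1ccccc1", "CCN.Cl"]
    print(prepare_library(library))
```

The sketch follows the order given in the text, removing duplicates before stripping components, so two entries that differ only in their counter-ion would both survive. A production pipeline would rely on a cheminformatics toolkit for canonicalization and fragment selection rather than on string matching.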