David Koes edited paragraph_Toolkits_The_Biochemical_Algorithms__.tex  about 8 years ago

Commit id: 20ab0acf2edda99edf7a0568a790564f877ff49b

The Chemistry Development Kit (CDK) \cite{Steinbeck_2006} is a cheminformatics toolkit written in Java. Its capabilities include support for reading and writing a variety of chemical formats, descriptor and fingerprint calculation, force field calculations, substructure search, and structure generation.  Chem$^f$ \cite{H_ck_2012} is a minimal cheminformatics toolkit written in the functional language Scala.  ChemmineR \cite{Cao_2008} is a cheminformatics package for the R statistical programming language that is built using Open Babel. Its capabilities include property calculations, similarity search, and classification and clustering of compounds.  Cinfony \cite{cinfony} provides a single, simple standardized interface to other cheminformatics toolkits, including Open Babel, RDKit, the CDK, Indigo, JChem, OPSIN, and several web services.  CurlySMILES \cite{Drefahl_2011} provides parsing functionality for an extension of the SMILES format that supports the description of complex molecular systems.  fmcsR \cite{Goecks_2010} is an R package that efficiently performs flexible maximum common substructure matching, allowing minor mismatches between atoms and bonds in the common substructure.  Frowns is a cheminformatics toolkit mostly written in Python that provides basic support for SMILES and SD files, SMARTS search, fingerprint generation, and property perception.
Helium is a cheminformatics toolkit written using modern C++ idioms that provides support for SMILES files, fingerprint generation, and SMARTS and SMIRKS.

LICSS \cite{Lawson_2012} integrates with the CDK to provide representations and analysis of chemical data embedded within Microsoft Excel.  MayChemTools is a collection of Perl scripts for manipulating chemical data, interfacing with databases, generating fingerprints, performing similarity search, and computing molecular properties.  Mychem is built using Open Babel and provides an extension to the MySQL database package that adds the ability to search, analyze, and convert chemical data within a MySQL database.  The Open Drug Discovery Toolkit (ODDT) \cite{W_jcikowski_2015} is written entirely in Python, is built on top of RDKit and Open Babel, and focuses on managing and implementing drug discovery workflows, for example by making it easy to implement a docking pipeline.  Open Babel \cite{O_Boyle_2011} is a substantial cheminformatics toolkit written in C++ with Python, Perl, Java, Ruby, R, PHP, and Scala bindings. Its capabilities include support for more than 100 chemical file formats, fingerprint generation, property determination, similarity and substructure search, structure generation, and molecular force fields. It has absorbed the Confab \cite{confab} conformer generator, which produces 3D structures through the systematic enumeration of torsions and energy minimization.  OPSIN \cite{Lowe_2011}, the Open Parser for Systematic IUPAC Nomenclature, converts plain-text chemical nomenclature to machine-readable CML or InChI formats.

Ouch (Ouch Uses Chemical Haskell) is a minimal cheminformatics toolkit written in the functional language Haskell.  Pybel \cite{O_Boyle_2008} provides the full functionality of Open Babel, but common routines are provided in a simplified, more `pythonic' interface.  RDKit is a substantial cheminformatics toolkit written in C++ with Python, Java, and C\# bindings. Its capabilities include file handling, manipulation of molecular data, chemical reactions, substantial support for fingerprinting, substructure and similarity search, 3D conformer generation, property determination, force field support, shape-based alignment and screening, and integration with PyMOL, KNIME, and PostgreSQL.  The Small Molecule Subgraph Detector (SMSD) \cite{Rahman_2009} is a Java library for calculating the maximum common subgraph between small molecules.  Som-it\textsuperscript{TM} is an R package for creating and visualizing self-organizing maps from large datasets.

\paragraph{Standalone Programs}  The utilities checkmol and matchmol \cite{Haider_2010} decompose and compare functional groups of input molecules.  ConvertMAS is a utility for converting between formats and for merging and splitting multi-molecule files.  Filter-it\textsuperscript{TM} filters a set of molecules based on their properties, such as physicochemical parameters and graph-based properties.
Frog2 \cite{Miteva_2010} uses a two-stage Monte Carlo approach coupled with energy minimization to rapidly generate conformers.  The Lilly MedChem Rules (LMR) \cite{Bruns_2012} apply filters to avoid reactive and promiscuous compounds.  The Open Molecule Generator (OMG) \cite{Peironcely_2012} enumerates all possible chemical structures given constraints on their composition.  sdf2xyz2sdf \cite{Tosco_2011} converts between SDF and TINKER XYZ files.  Shape \cite{Rosen_2009} employs a genetic algorithm to generate conformations of carbohydrates.  Strip-it\textsuperscript{TM} is built using Open Babel and extracts molecular scaffolds.

The Konstanz Information Miner (KNIME) is a general workflow environment that includes a number of plugins for cheminformatics, such as CDK \cite{Beisken_2013} and RDKit modules, as well as bioinformatics and machine learning modules.  Screening Assistant 2 (SA2) \cite{Guilloux_2012} is a GUI written in Java that integrates with other toolkits to help manage, analyze, and visualize libraries of compounds.  Weka \cite{Hall_2009} is a platform for data mining and machine learning that can be adapted for cheminformatics.   \paragraph{Molecular Editors}