AntND: A Java framework for visualization, curation and optimization of metabolic networks

Abstract

Metabolic modeling is a widely used tool for the study and prediction of cell metabolism. High-quality metabolic models are needed for accurate metabolic simulations and predictions. The BioModels database (July 2015) hosts 2641 whole-genome stoichiometric models created using pathway information from KEGG or MetaCyc. Curating a whole-genome metabolic model so that it is functional and able to predict experimental data is a tedious process (Thiele 2010). AntND is a software framework that supports the manual curation, analysis, visualization and optimization of metabolic models. It contains modules for reading and writing files in SBML format, editing models, extracting subnetworks based on pathway information or on a shortest-path search, performing flux balance analysis (FBA) and flux optimizations, and visualizing and analyzing metabolic networks. Visualization is one of the most powerful modules in AntND: it allows the user to extract and create layouts in GML (Graph Modeling Language) format, expand the network and color the nodes.

Availability and implementation: AntND is implemented in Java and is distributed freely under the GNU General Public License. Its code is hosted on GitHub.

Introduction

Understanding the metabolism of organisms is essential in many fields, such as systems biology, metabolic engineering and disease research. Whole-genome metabolic models are one of the most important tools for modeling the metabolism of an organism mathematically. They can be seen as a set of interconnected chemical reactions that represents our knowledge about the metabolism of the organism.

The reconstruction of a metabolic model is often done in an automatic or semi-automatic way, which in many cases produces incomplete and inaccurate models. Once a metabolic model has been created, it needs to be curated, tested and visualized so that these problems can be found and solved in a short time frame.

Visualizing a metabolic network is still a challenge because of the large number of highly interconnected nodes (the nodes represent reactions and compounds). Current visualization methods rely on the knowledge of the curator to extract specific sub-networks from the model, but the model may contain unexpected pathways or reactions wrongly added by the reconstruction software. There is a need for automatic sub-network extraction that takes into account the constraints and characteristics of this heterogeneous network. Many visualization tools exist, either as part of metabolic reconstruction software (Rocha 2010) (Dias 2015) or independent of it (Shannon 2003). With these tools the user can visualize part of a metabolic network, but they cannot extract sub-networks automatically unless all of the components are already known.

In this work we present an algorithm that extracts a sub-network taking into account the reaction stoichiometry and its constraints. The algorithm takes as input the initial compound or compounds where the network starts, a final compound where the network ends, and a list of compounds that are not taken into account when the stoichiometry is calculated, such as water, ATP or some cofactors. The returned sub-network contains all the reactions needed for the production of the final compound, including all the intermediates, starting from the source or sources. If the pathway to some intermediate is incomplete, a report showing the network gap is returned.
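The inputs and outputs described above can be illustrated with a minimal Java sketch. It is not the AntND implementation: the class, the `Reaction` and `Result` types and the `extract` method are hypothetical names, the search is a plain backward traversal from the final compound, and it only checks which compounds a reaction consumes and produces (stoichiometric coefficients and reaction reversibility are left out for brevity).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class SubNetworkSketch {

    /** Hypothetical reaction type: an id plus substrate and product compound ids. */
    static final class Reaction {
        final String id;
        final Set<String> substrates;
        final Set<String> products;

        Reaction(String id, Set<String> substrates, Set<String> products) {
            this.id = id;
            this.substrates = substrates;
            this.products = products;
        }
    }

    /** Selected reaction ids plus the compounds that no reaction produces (gaps). */
    static final class Result {
        final Set<String> reactionIds = new LinkedHashSet<>();
        final Set<String> gaps = new LinkedHashSet<>();
    }

    /** Walks backwards from the target, collecting every producing reaction. */
    static Result extract(List<Reaction> model, Set<String> sources,
                          String target, Set<String> ignored) {
        Result result = new Result();
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(target);

        while (!pending.isEmpty()) {
            String compound = pending.poll();
            // Sources and ignored cofactors need no producer; skip repeats too.
            if (sources.contains(compound) || ignored.contains(compound)
                    || !visited.add(compound)) {
                continue;
            }
            boolean produced = false;
            for (Reaction r : model) {
                if (r.products.contains(compound)) {
                    produced = true;
                    // Queue the substrates only the first time a reaction is added.
                    if (result.reactionIds.add(r.id)) {
                        pending.addAll(r.substrates);
                    }
                }
            }
            if (!produced) {
                result.gaps.add(compound); // nothing in the model makes this compound
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Toy model; compound "X" has no producing reaction and is reported as a gap.
        List<Reaction> model = List.of(
            new Reaction("R1", Set.of("glucose", "ATP"), Set.of("G6P", "ADP")),
            new Reaction("R2", Set.of("G6P"), Set.of("F6P")),
            new Reaction("R3", Set.of("F6P", "X"), Set.of("product")));
        Result res = extract(model, Set.of("glucose"), "product", Set.of("ATP", "ADP"));
        System.out.println("Reactions: " + res.reactionIds);
        System.out.println("Gaps: " + res.gaps);
    }
}
```

In this toy model the three reactions are selected and X is listed as the only gap, because no reaction produces it and it is neither a source nor an ignored compound. Collecting every producer of each intermediate keeps the sketch short; a real extraction would probably need to choose among alternative producers.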

Once the sub-network is extracted, it can be extended by clicking a specific compound: the list of reactions connected to it is shown, and the new reactions can be expanded one by one or all at once. The nodes can be dragged and positioned, which allows the user to create specific layouts that can be saved and used later to extract the same sub-network from other models.

Apart from the visualization-specific algorithm, the software contains modules for flux balance analysis, automatic reporting of the fluxes, network optimization based on a genetic algorithm, and basic tools for model modification.

Other tools used for metabolic network curation and visualization (maybe I can make a table). Curation:

Semi-automated Curation of Metabolic Models via Flux Balance Analysis: A Case Study with Mycoplasma gallisepticum <- more an algorithm than a tool.

Use of a global metabolic network to curate organismal metabolic networks <- again an algorithm?

Enhancing genetic algorithm-based genome-scale metabolic network curation efficiency <- algorithm

RAVEN -> tool to create and curate metabolic networks.

Transparency in metabolic network reconstruction enables scalable biological discovery -> explains a bit about curation in metabolic models

http://biocuration.org/ <- The ISB is a non-profit organisation for biocurators (it is more general, covering all kinds of biological data)

Network Thermodynamic Curation of Human and Yeast Genome-Scale Metabolic Models <- algorithm to curate the reaction directions using thermodynamics.

Fifteen years of large scale metabolic modeling of yeast: Developments and impacts <- Figure 1 shows the basic steps of model reconstruction. Explain which of these steps can be improved with AntND

The SuBliMinaL Toolbox: automating steps in the reconstruction of metabolic networks. <- Tool for the creation of metabolic models. It can also be considered a curation tool because it contains many modules.

PathwayBooster: a tool to support the curation of metabolic pathways <- an actual tool for curation. It has more tools related to gene annotation. It works with the SuBliMinaL Toolbox. http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-014-0447-2

OptFlux -> http://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-014-0420-0 Major competitor of AntND but very different. In AntND pathways can be extracted without any knowledge of the reactions involved. This is a requirement during curation if you don’t know what reactions are included in the model. In OptFlux the user has to create a layout or give a list of reactions.

Reconstruction and curation of a metabolic model can’t be considered two independent steps. In many cases they overlap in the tools. We use CoReCo as a reconstruction tool, which already provides the gene annotation steps. One important asset of AntND is the possible connection to our database. Is it possible to include the database curation?