INGA

INGA was developed by Damiano Piovesan, Manuel Giollo, Emanuela Leonardi, Carlo Ferrari, and Silvio Tosatto from the Department of Biomedical Sciences at the University of Padua, Italy. It is a web-based tool that combines three predictor types, based on sequence similarity, protein-protein interaction (PPI) networks, and domain assignments, to produce GO annotations for proteins \cite{Piovesan_2015}. CAFA results show that advanced predictors that combine several methods consistently outperform individual methods. INGA follows this paradigm by generating a consensus from its three predictors.

The INGA algorithm consists of four parts: 1) A BLAST sequence similarity search is executed against the UniProt database. Hits are retained only if they have more than 40\% sequence identity and an overlap of more than 80\%; any hit with an e-value greater than $10^{-3}$ is removed. 2) A protein-protein interaction network search is executed on the STRING database \cite{Franceschini_2012}; the GO terms associated with the interacting proteins are collected and used to calculate a new set of GO terms for the target protein. 3) Pfam \cite{Finn_2013} domains are identified using the HMMER \cite{Finn_2011} web service. The UniProt database is then queried for proteins with the same set of domains, and a new set of GO terms is calculated for the target protein. 4) A joint probability of the results from the previous three steps is calculated to produce the consensus measure.
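The thresholds in step 1 and the combination in step 4 can be illustrated with a minimal Python sketch. It is not INGA's implementation. The \texttt{BlastHit} record, its field names, and both function names are hypothetical. The text above does not give the joint-probability formula, so the sketch assumes a noisy-OR combination, $1 - \prod_i (1 - p_i)$, which treats the three predictors as independent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlastHit:
    """One BLAST hit. Field names are hypothetical, not INGA's own."""
    accession: str
    identity_pct: float  # percent sequence identity with the target
    overlap_pct: float   # percent overlap of the alignment
    evalue: float        # BLAST e-value


def keep_hit(hit, min_identity=40.0, min_overlap=80.0, max_evalue=1e-3):
    """Apply the step 1 filter: identity > 40%, overlap > 80%,
    and drop hits whose e-value is greater than 1e-3."""
    return (
        hit.identity_pct > min_identity
        and hit.overlap_pct > min_overlap
        and hit.evalue <= max_evalue
    )


def noisy_or(probabilities):
    """ASSUMPTION: combine per-predictor probabilities as 1 - prod(1 - p).
    The source only says a joint probability is calculated."""
    remaining = 1.0
    for p in probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")
        remaining *= 1.0 - p
    return 1.0 - remaining


if __name__ == "__main__":
    hits = [
        BlastHit("hit_a", 55.0, 90.0, 1e-10),  # passes all three thresholds
        BlastHit("hit_b", 35.0, 95.0, 1e-20),  # identity too low
        BlastHit("hit_c", 60.0, 85.0, 1e-2),   # e-value too high
    ]
    print([h.accession for h in hits if keep_hit(h)])
    # Hypothetical scores for one GO term from the three predictors.
    print(noisy_or([0.5, 0.5, 0.0]))
```

Under this assumed combination, a GO term supported by any single predictor keeps at least that predictor's score, and agreement between predictors raises the consensus further.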