Results

Ten protein sequences, nine selected from CAFA-2 and one additional, were used as input for this evaluation (Appendix A). Proteins were entered individually into each tool, and the outputs were compared to BLAST search results (Table 1). Several of the tools performed comparably to one another and returned valid output that matched the BLAST results.

Input formats varied: INGA and ARGOT2 accepted a standard FASTA file; SIFTER required the protein identification code, which includes taxonomy information; and EVEX input was limited to a four-digit protein code with no species identifier.

Time for each tool to return a result was also used as a comparison metric. ARGOT2, EVEX, and SIFTER were all reasonably fast, averaging less than five minutes per query. INGA showed a major disadvantage in that it was unpredictably slow, sometimes taking more than 12 hours to return a consensus for a single protein sequence.

Output from all four tools, as with BLAST, was standardized to GO terminology. While standardized output is valuable, the GO terms returned, although accurate, were often vague or over-generalized. Judged against the BLAST results, SIFTER predicted 8 of 10 sequences correctly, ARGOT2 predicted 10 of 10, and INGA predicted 6 of 10 correctly with one additional partially correct prediction. EVEX performed poorly in this evaluation: none of the input sequences were human phenotype proteins, and the tool's data bank references appeared to be limited to human and mouse species, as those were the only results returned. EVEX made zero correct predictions.
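The scoring above can be sketched in code. The following is a minimal, hypothetical illustration of how a tool's prediction might be classified as correct, partial, or incorrect against BLAST-derived GO terms; the function name, comparison rule, and GO terms are illustrative assumptions, not the actual evaluation procedure or data.

```python
def score_prediction(predicted_terms, blast_terms):
    """Classify a tool's GO-term prediction against a BLAST-derived reference.

    Returns 'correct' if the prediction covers every reference term,
    'partial' if it covers some, and 'incorrect' if it covers none.
    (Illustrative rule only; the real evaluation may differ.)
    """
    overlap = set(predicted_terms) & set(blast_terms)
    if not overlap:
        return "incorrect"
    if overlap == set(blast_terms):
        return "correct"
    return "partial"


# Made-up example: BLAST reference has two GO terms, the tool returns one.
blast_reference = ["GO:0003677", "GO:0006355"]  # DNA binding; regulation of transcription
tool_output = ["GO:0003677"]

print(score_prediction(tool_output, blast_reference))  # → "partial"
```

A tally of such per-sequence scores across the ten test proteins would reproduce summary counts of the form reported here (e.g., 8 of 10 correct).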

Conclusion

As researchers continue to expand experimental annotation of protein data banks, and computational scientists expand in-silico annotations, streamlined tools are needed to help manage the flood of information reaching the scientific community. The original CAFA challenge was conceived to encourage the development of such tools; CAFA-2 improved upon it, and a third challenge is already in development to continue the tradition. The tools developed for the CAFA-2 challenge, though some have pitfalls, are an important step toward standardizing protein function prediction input and output and toward growing a field that is vital to researchers worldwide. While CAFA-2 relied on statisticians and computer scientists to evaluate the newly developed tools, it appeared to lack a functionality assessment from a user's perspective. This limited investigation aimed to fill that gap. Three of the four tools made accurate predictions on ten proteins, while one fell short; prediction turnaround time was also a concern for one of the tools. Although biologists are unlikely to switch entirely from mainstream tools like BLAST, additional high-performance tools like the ones evaluated here are powerful additions to their arsenals. With further development and publicity, some of these tools have the potential to outpace the typical workflows currently in place.