# Orthology conjecture rotation project

The following is a rough list of my immediate and stretch goals for the upcoming project:

# Primary goals

• Use Ensembl Compara to determine orthologs and paralogs for zebrafish and mouse

• Stick to the pipeline outlined in the Vilella et al. paper

• Use Gene Ontology to obtain Biological Process and Molecular Function annotations for mouse and zebrafish?

• Use the same cutoffs to include only experimentally inferred annotations

• Rework the Clark code and then use it on my data set

• Create similar graphs and compare the results to the Clark paper

• My theory: a purely mouse-to-zebrafish comparison should eliminate the experimental bias found in human vs. mouse comparisons, since mouse and zebrafish can be used for more similar experiments
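
The evidence-code cutoff above could be sketched roughly as follows. This is a minimal illustration, assuming the annotations arrive as tab-separated GO Annotation File (GAF 2.x) lines, where column 3 is the gene symbol, column 4 the qualifier, column 5 the GO ID, column 7 the evidence code, and column 9 the aspect (P, F, or C). The function name `experimental_annotations` is made up for this sketch, and the code set is the GO Consortium's standard experimental evidence codes, which may or may not match the cutoffs used in the Clark paper.

```python
# Standard GO experimental evidence codes.
# High-throughput codes (e.g. HDA) are deliberately left out of this sketch.
EXPERIMENTAL_CODES = {"EXP", "IDA", "IPI", "IMP", "IGI", "IEP"}


def experimental_annotations(gaf_text, aspects=("P", "F")):
    """Return {gene_symbol: set of GO IDs}, keeping only experimental annotations.

    Hypothetical helper: assumes GAF 2.x column order and skips NOT-qualified rows.
    """
    annotations = {}
    for line in gaf_text.splitlines():
        # Header and comment lines in a GAF file start with "!".
        if not line or line.startswith("!"):
            continue
        cols = line.split("\t")
        if len(cols) < 9:
            continue
        symbol, qualifier, go_id = cols[2], cols[3], cols[4]
        evidence, aspect = cols[6], cols[8]
        # A NOT qualifier means the gene is annotated as lacking the function.
        if "NOT" in qualifier.split("|"):
            continue
        if evidence in EXPERIMENTAL_CODES and aspect in aspects:
            annotations.setdefault(symbol, set()).add(go_id)
    return annotations
```

By default only Biological Process (P) and Molecular Function (F) rows are kept, matching the two aspects named in the goals; pass a different `aspects` tuple to change that.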

# Stretch goals

• Find RNA-seq data to work with (if it's already out there) as a further check

• Fully eliminate authorship bias

• Normalize measures of functional similarity with respect to background similarity

• Estimate frequencies of GO terms separately for each species?

• Find a way to incorporate phenoscape data into comparison

• Find a good source of similar data for mice

• Figure out how to accurately and consistently compare features in an automated fashion
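
One possible reading of the per-species frequency goal above: estimate how often each term occurs within a single species and convert that to information content, so that a term carried by nearly every annotated gene counts for less. The sketch below assumes annotations are already loaded as `{gene: set of terms}` for one species. The names `term_frequencies` and `information_content` are illustrative, and the sketch ignores the ontology hierarchy (no propagation to ancestor terms), which a real analysis would probably need.

```python
import math
from collections import Counter


def term_frequencies(annotations):
    """Fraction of annotated genes in one species that carry each term."""
    n_genes = len(annotations)
    counts = Counter(term for terms in annotations.values() for term in terms)
    return {term: count / n_genes for term, count in counts.items()}


def information_content(frequencies):
    """IC = -log2(frequency), so rarer terms score higher."""
    return {term: -math.log2(freq) for term, freq in frequencies.items()}


# Placeholder data: two genes, one shared term and one rarer term.
mouse = {"geneA": {"T1", "T2"}, "geneB": {"T1"}}
mouse_freqs = term_frequencies(mouse)          # T1 -> 1.0, T2 -> 0.5
mouse_ic = information_content(mouse_freqs)    # T2 -> 1.0 bit; T1 carries no information
```

Running the same two functions on the zebrafish annotations would give a second, independent frequency table, which is the point of estimating the frequencies separately.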

# Goal changes

The above goals were created in early January 2014, but they changed during the course of the project. The final goals, set around early February, were:

• Obtain a sample set of genes that relate a mouse ortholog to a set of zebrafish paralogs that resulted from the teleost genome duplication

• Obtain a full set of that data (possibly from Yves Van de Peer)

• Use Phenoscape to obtain ontological annotations for each gene

• Use scripts from Prishanti to calculate the functional similarity between the mouse ortholog and each paralog in its set, as well as between the paralogs themselves
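
The comparison in the last goal can be pictured with a small sketch. Prishanti's scripts are not reproduced here; as a stand-in, this uses plain Jaccard similarity on annotation sets, which may well differ from the measure those scripts compute. The function names and the placeholder gene and term labels are assumptions made for illustration.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two annotation sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def compare_family(ortholog_terms, paralog_terms):
    """Score the mouse ortholog against each paralog, and every pair of paralogs.

    ortholog_terms: set of annotation terms for the mouse gene.
    paralog_terms: {paralog_name: set of terms} for the zebrafish paralogs.
    """
    ortholog_vs_paralog = {
        name: jaccard(ortholog_terms, terms)
        for name, terms in paralog_terms.items()
    }
    paralog_vs_paralog = {
        (x, y): jaccard(paralog_terms[x], paralog_terms[y])
        for x, y in combinations(sorted(paralog_terms), 2)
    }
    return ortholog_vs_paralog, paralog_vs_paralog


# Placeholder example: one mouse gene and two zebrafish paralogs.
mouse_terms = {"T1", "T2", "T3"}
fish_terms = {"para_a": {"T1", "T2"}, "para_b": {"T3", "T4"}}
ortho_scores, para_scores = compare_family(mouse_terms, fish_terms)
```

In this toy case `para_a` shares two of three terms with the mouse gene, `para_b` shares one of four, and the two paralogs share nothing with each other.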