twchrist edited Feb 21 pre meeting update.tex
about 10 years ago
Commit id: c00787f1b30bf3ba36dcc7d754b3becd244efb54
diff --git a/Feb 21 pre meeting update.tex b/Feb 21 pre meeting update.tex
index 389a8a1..73853fa 100644
--- a/Feb 21 pre meeting update.tex
+++ b/Feb 21 pre meeting update.tex
...
\Section{Feb 20 pre-meeting update}
\subsection*{phenoscape}
\begin{verbatim}
As of the time of writing, Jim is still compiling a starter package for me so I can
learn how phenoscape is set up. Prishanti will also be giving me scripts after the lab
meeting. I have also been reading up on OWL and its main tutorial, though it is a bit
vague. I expect to get a lot more help from Jim's scripts, since they will have examples
specific to my research.
\end{verbatim}
\subsection*{ensembl access}
\begin{verbatim}
Every now and again I work with Ensembl's API to learn about gene trees and similar things,
but my scripts are always very slow, so I got in contact with Steven Fishback, one of the
people who oversees killdevil. We did a bit of brainstorming and now I have several ways
to speed up data retrieval. I'm trying one of them now and it seems to be helping.
I'm pretty excited about future Ensembl data retrieval being less of a hassle.
\end{verbatim}
\subsection*{teleost duplication}
\begin{verbatim}
I think I have found some good papers/datasets. This paper,
http://genome.cshlp.org/content/13/3/382.full, is specifically about identifying ~50 paralog pairs
in zebrafish from the teleost duplication. Table 1 lists all of these genes. This seems like a
great test set that I could use for my first forays into phenoscape.
I also found http://genomebiology.com/2006/7/5/R43, which is a very good overview of the three
WGDs in vertebrates, with special focus on the teleost WGD. Judging by the experiments they perform
and their methods, they almost certainly have a very simple way of determining which paralogs
are a result of the teleost WGD. However, that data is not posted online. The paper states that
I would have to email the authors to get it. So now the question is: do I take my chances with
them, or do I keep looking to see if there is a more easily accessible dataset?
Side note: I have also found two databases that specialize in duplicated genes, but they do not
assign duplications to any specific time, so they don't appear any more useful than Ensembl.
\end{verbatim}
\subsection*{clark scripts}
\begin{verbatim}
I went over the scripts that Clark sent us and noticed something interesting. He only works with
homologs that have over 50 percent identity. I'm not sure if that changes anything; it just
seemed noteworthy.
\end{verbatim}
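\begin{verbatim}
For reference, a minimal Python sketch of what a cutoff like that might look like. This is
not Clark's actual code: the HomologPair record, the gene names, and the identity values are
all made up for illustration, and I am assuming "over 50 percent" means strictly greater than.

```python
from typing import List, NamedTuple


class HomologPair(NamedTuple):
    """Hypothetical record for one homolog pair (not Clark's data layout)."""
    gene_a: str
    gene_b: str
    percent_identity: float


def filter_by_identity(pairs: List[HomologPair],
                       threshold: float = 50.0) -> List[HomologPair]:
    """Keep only the pairs whose percent identity is strictly above threshold."""
    return [p for p in pairs if p.percent_identity > threshold]


# Made-up example values, only here to exercise the filter.
pairs = [
    HomologPair("geneA1", "geneA2", 72.5),
    HomologPair("geneB1", "geneB2", 50.0),  # exactly at the cutoff, so dropped
    HomologPair("geneC1", "geneC2", 31.0),
]

kept = filter_by_identity(pairs)
dropped = len(pairs) - len(kept)
print(f"kept {len(kept)} pairs, dropped {dropped}")
```

Counting how many pairs get dropped on our own gene set would be one way to check
whether the cutoff actually changes anything.
\end{verbatim}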
\end{document}