auastro edited Problems1.tex  almost 10 years ago

Commit id: 4ceecc45894ea4d98b250e727e5339e2756157d8


The answer is that each cell type turns on, or uses, a distinct set of genes. This means that each cell type makes its own complement of protein products, which help determine the cell type’s function.

So what controls whether a gene is turned on or off? We know this is influenced by how accessible the gene is to the factors required to turn it on: the more tightly packaged the gene is, the less accessible it is to the factors that bind and activate it, and the less likely it is to be turned on. We now know that differences in the packaging of the DNA correlate with various marks made to the DNA, called ‘epigenetic marks’. These epigenetic marks can be thought of as punctuation marks in the genome: they allow the cell to interpret how to read the information contained in the DNA sequence.

Due to transformational changes in the way we examine these marks throughout the whole genome, biologists are creating a wealth of data that reports not only the DNA sequence, but also the abundance of various epigenetic marks and of the different factors bound to the DNA throughout the whole genome. This landslide of data is highly complex and difficult for bench biologists to interrogate.

\subsection{Here is a bunch of problems that geneticists would love help with!}

Inside each cell in our body are more than 30,000 genes (the ‘genome’) that are expressed at different levels to control a wide range of processes. One example is the differentiation of blood cells: a single blood stem cell in the bone marrow can give rise to many different mature blood cells, developing progressively from progenitors with the potential to become many different cell types into mature cells with very specialised functions. What key genes are turned on and off at various stages of differentiation? Can we associate functions of these cells with their genomic profiles?
Using whole-genome techniques to measure the expression levels of all the genes in a cell, we can gather the expression values of every gene across a variety of cell types into a matrix. We can then analyse this data in multiple ways to obtain biological meaning and generate hypotheses. As a simple example, suppose that a gene is expressed highly in only one cell type out of many. Perhaps this implies that the gene is crucial in creating these cells, and that blocking its action may have a therapeutic effect if too many of these cells cause disease. Hence we need tools that let us look at this data in different ways. We also want to empower biologists without a programming background to carry out some analyses in intuitive ways. So here are some questions, based on a matrix of gene expression values, which may have tractable answers over a weekend of coding: