Joel Bangalan edited Methodology_Information_and_Resources_necessary__.md
about 8 years ago
Commit id: 6b59d62915235c34ef106b790dbb5064e787c6f3
diff --git a/Methodology_Information_and_Resources_necessary__.md b/Methodology_Information_and_Resources_necessary__.md
index 09eeb20..cbf6145 100644
--- a/Methodology_Information_and_Resources_necessary__.md
+++ b/Methodology_Information_and_Resources_necessary__.md
...
# Methodology
## Information and Resources Necessary
Previous research on gene expression and cancer classification, as applied to the colon cancer and leukemia data sets, will serve as the foundation for this research. A similar data set involving lung tissue will be used, and classification algorithms will be developed from it. Feature selection methods will be considered, with a focus on how R and its existing packages can be used in a high-dimensional setting.
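As an illustration of what feature selection means when genes far outnumber samples, the following is a minimal sketch of a filter method: score each gene by the absolute Welch t-statistic between two classes and keep the top-ranked genes. It is written in Python for brevity, although the project plans to use R. The function names (`welch_t`, `top_genes`), the choice of the t-statistic, and the toy values are assumptions made for this example; they are not the method or data of the research described here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t-statistic for two samples (each needs at least two values)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def top_genes(expression, labels, k):
    """Return the k gene names with the largest |t| between the two classes.

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list of 0/1 class labels in the same sample order
    """
    scores = {}
    for gene, values in expression.items():
        group0 = [v for v, y in zip(values, labels) if y == 0]
        group1 = [v for v, y in zip(values, labels) if y == 1]
        scores[gene] = abs(welch_t(group0, group1))
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Synthetic toy values, invented for illustration only (not real expression data).
toy_expression = {
    "gene_a": [1.0, 1.2, 0.9, 5.1, 4.8, 5.3],
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],
    "gene_c": [3.0, 3.5, 2.5, 4.0, 4.5, 3.5],
}
toy_labels = [0, 0, 0, 1, 1, 1]

selected = top_genes(toy_expression, toy_labels, k=2)
print(selected)
```

A filter of this kind scores each gene independently, which keeps it cheap when there are thousands of genes, but it ignores correlations between genes.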
## Data Collection
The colon and leukemia data sets are available as described in the previous research (see the preliminary bibliography list).
The lung cancer data set is proprietary and will be provided by a Cancer Research Organization.
## Overview of the Analytical Approach
1. **Previous Research Analysis**.
The author will analyze the existing research on gene expression cancer classification, particularly the methods used and the accuracy measures reported for each.
...
The results will be summarized and the accuracy measures compared across data sets.
## How the Analysis Relates to the Topic or Research Question
The objective is to develop machine learning algorithms for classifying cancers in the lung cancer data set and to examine how these algorithms compare with the methods used on the leukemia and colon data sets. A comparative analysis of the classification results across the data sets will show which algorithm is best suited to each data type.
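The comparison across data sets could be organized roughly as follows. This is a minimal Python sketch, assuming that each algorithm's predictions on each data set are already available and that plain accuracy is the measure being compared. The names `accuracy`, `best_per_dataset`, `dataset_x`, and `algo_1` are hypothetical placeholders, and the labels are invented for illustration; they are not results from any of the data sets discussed.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def best_per_dataset(results):
    """Pick the most accurate algorithm for each data set.

    results: {dataset: {algorithm: (y_true, y_pred)}}
    returns: {dataset: (best_algorithm, its_accuracy)}
    """
    summary = {}
    for dataset, runs in results.items():
        scores = {algo: accuracy(t, p) for algo, (t, p) in runs.items()}
        best = max(scores, key=scores.get)
        summary[dataset] = (best, scores[best])
    return summary


# Invented placeholder labels and predictions, for illustration only.
truth_x = [1, 0, 1, 1, 0]
truth_y = [0, 0, 1, 1]
toy_results = {
    "dataset_x": {
        "algo_1": (truth_x, [1, 0, 1, 0, 0]),
        "algo_2": (truth_x, [1, 0, 1, 1, 0]),
    },
    "dataset_y": {
        "algo_1": (truth_y, [0, 0, 1, 1]),
        "algo_2": (truth_y, [0, 1, 1, 1]),
    },
}

summary = best_per_dataset(toy_results)
for dataset, (algo, acc) in summary.items():
    print(f"{dataset}: best = {algo}, accuracy = {acc:.2f}")
```

With small sample sizes, which are typical of gene expression studies, accuracy estimated on a single split can vary widely, so a resampling scheme such as cross-validation would normally sit behind each `(y_true, y_pred)` pair.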