Abstract
Distributed computing and parallel processing are often used to offload large data-processing workloads, as in BOINC. Projects such as the Decentralized-Internet SDK also allow people to build cluster computing instances for offloading data or for decentralized architectures. Generative Adversarial Networks (GANs) are currently used by AI practitioners to generate data that would otherwise not exist. Certain biomedical datasets have only a small number of donors or case studies available, so more data would allow for a higher degree of accuracy. Since certain diseases may not have enough donors or resources to collect that data, one method may be to create viable artificial data mathematically. This, however, requires large amounts of processing. Using a regression-based model that allows a GAN to recursively build medical datasets from pre-existing data, in order to enlarge the statistical pool and improve accuracy, should be feasible with distributed computing. This approach should also be worth trying in cases of absolute unknowns and false positives.
1.0 Problem Statement
Statistical findings are only useful when they reach a high degree of
significance.[1] In general, the more data available, the more
precise the resulting estimates. For example, Diffuse Intrinsic Pontine
Glioma (DIPG)[2], a very rare cancer, involves many statistical
unknowns. Given its rarity and its low survival rate, there is likely
too little data readily available to build a full understanding of DIPG.
Other conditions that could benefit from more accurate data include
diseases with genetic variants and cardiac diseases. Grid computing
architectures such as BOINC[3], the “Berkeley Open Infrastructure
for Network Computing”, allow large workloads to be offloaded
through parallel processing. Other projects such as the Decentralized
Internet SDK[4] allow people to build distributed computing
clusters and instances in support of decentralization. The proposal is
to create a distributed processing program that runs a regression-based
GAN to increase the amount of biomedical data available for analysis,
so that researchers can draw conclusions with a higher degree of
accuracy.
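
The link between sample size and accuracy stated above can be made concrete with a minimal sketch. It assumes a hypothetical, normally distributed measurement (the mean of 50 and standard deviation of 10 are illustrative values, not taken from any dataset) and shows how the standard error of the mean shrinks as the number of samples grows. The function names `standard_error` and `simulate` are introduced here for illustration only. The sketch illustrates the sample-size effect alone; it does not model a GAN or distributed execution.

```python
import math
import random
import statistics


def standard_error(sample):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def simulate(sizes, mean=50.0, sd=10.0, seed=0):
    """Draw one sample per size from a hypothetical normal population.

    The population parameters are assumptions chosen for illustration.
    Returns a dict mapping each sample size to its standard error.
    """
    rng = random.Random(seed)
    results = {}
    for n in sizes:
        sample = [rng.gauss(mean, sd) for _ in range(n)]
        results[n] = standard_error(sample)
    return results


if __name__ == "__main__":
    for n, se in simulate([10, 100, 1000, 10000]).items():
        print(f"n={n:>6}  standard error={se:.3f}")
```

Under these assumptions the standard error falls roughly in proportion to 1/sqrt(n), so each tenfold increase in sample size narrows the uncertainty of the estimated mean by a factor of about three.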