Yashpal Singh

and 1 more

Cancer diagnosis using gene expression data is an important research area that facilitates early treatment and prevention of cancer. Classifying gene expression data is challenging because of its high dimensionality and small number of samples. Every classification algorithm aims to create well-defined class boundaries. The Fuzzy min-max (FMM) neural network classifier is known to create good decision boundaries using hyperboxes constructed for each class. In this paper, we explore the General Fuzzy min-max (GFMM) and Enhanced Fuzzy min-max (EFMM) neural network architectures for the classification of lung cancer subtypes from microarray gene expression data. Both GFMM and EFMM are advanced versions of Simpson’s FMM neural network classifier. GFMM is efficient because it uses only simple operations for hyperbox manipulation, and it can handle both labeled and unlabeled data. EFMM, in turn, introduces three heuristic rules, covering hyperbox expansion, the overlap test and contraction, that enhance the learning algorithm. We classify gene expression data with these two algorithms, analyze their performance by visualizing the hyperboxes obtained after training, and compare their accuracies. LASSO is used to select the important genes from the high-dimensional gene expression data. Our results show that EFMM with LASSO performs best compared with GFMM, FMM and other machine learning algorithms.
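To make the hyperbox idea concrete, the following is a minimal sketch of how a single hyperbox scores and absorbs a sample. It follows the membership function and expansion criterion as commonly stated for Simpson's original FMM, not the GFMM or EFMM variants studied in the paper. The function names (`membership`, `can_expand`, `expand`), the sensitivity parameter `gamma`, the size bound `theta`, and the assumption that every feature is scaled to [0, 1] are illustrative choices and are not taken from the abstract.

```python
from typing import List, Sequence, Tuple


def membership(x: Sequence[float], v: Sequence[float], w: Sequence[float],
               gamma: float = 4.0) -> float:
    """Degree (0..1) to which sample x belongs to the hyperbox [v, w].

    v is the min point and w the max point of the box. The value is 1.0
    inside the box and decays linearly with distance outside it, at a
    rate set by the sensitivity parameter gamma (an assumed default).
    """
    n = len(x)
    total = 0.0
    for a, lo, hi in zip(x, v, w):
        # Penalty for exceeding the max point in this dimension.
        total += max(0.0, 1.0 - max(0.0, gamma * min(1.0, a - hi)))
        # Penalty for falling below the min point in this dimension.
        total += max(0.0, 1.0 - max(0.0, gamma * min(1.0, lo - a)))
    return total / (2 * n)


def can_expand(x: Sequence[float], v: Sequence[float], w: Sequence[float],
               theta: float) -> bool:
    """Check whether the box may grow to include x without its summed
    side lengths exceeding n * theta (the size bound theta is assumed)."""
    n = len(x)
    span = sum(max(hi, a) - min(lo, a) for a, lo, hi in zip(x, v, w))
    return span <= n * theta


def expand(x: Sequence[float], v: Sequence[float],
           w: Sequence[float]) -> Tuple[List[float], List[float]]:
    """Return the new (min point, max point) after absorbing x."""
    new_v = [min(lo, a) for a, lo in zip(x, v)]
    new_w = [max(hi, a) for a, hi in zip(x, w)]
    return new_v, new_w
```

In a full classifier, each class would own several such boxes, a sample would be assigned to the class of the box with the highest membership, and overlap between boxes of different classes would be detected and removed by contraction. Those steps are omitted here.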