
Breast Cancer Prediction Using Machine Learning With Feature Engineering
  • Hasan Abdulkader,
  • Ali AL-QAZZAZ
Hasan Abdulkader
Altinbas Holding

Corresponding Author: [email protected]

Ali AL-QAZZAZ
Altinbas Holding

Abstract

Breast cancer is the second leading cause of cancer death in women, after lung cancer. Although it remains a significant public health issue, advances in screening, diagnosis, and treatment have improved outcomes. Early detection through regular mammograms and self-examinations increases the chances of successful treatment, and early recognition followed by proper treatment can reduce the risk of mortality, since survival becomes harder in the advanced stages of tumor development. Machine learning is valuable in medicine because it helps specialists make accurate diagnoses at an early stage. In this paper, machine learning algorithms are trained on the Coimbra breast cancer dataset (CBCD), which comprises 116 patients, to diagnose breast cancer. Four algorithms are used: Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), and Logistic Regression (LR). We propose a model that combines the results of statistical analysis with feature engineering to improve the data. The statistical analysis allowed us to create two new features, which improved accuracy by 4.13%. The accuracy of SVM, RF, and KNN reaches 95.8%, while that of LR reaches 75%. To the best of our knowledge, these results outperform the state of the art.
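The workflow summarized in the abstract (derive new features from existing ones, then compare classifier accuracy with and without them) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration: the abstract does not say which two features were engineered, so the ratio feature below is hypothetical, the rows are synthetic toy values rather than CBCD records, and the helper names (`add_ratio_feature`, `standardize`, `knn_predict`, `accuracy`) are invented. Only KNN is shown, written with the standard library so the example stays self-contained.

```python
import math
from collections import Counter


def add_ratio_feature(rows):
    """Append a hypothetical engineered feature: column 0 divided by column 1."""
    return [row + [row[0] / row[1]] for row in rows]


def standardize(train, test):
    """Z-score every column using statistics from the training rows only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
            for c, m in zip(cols, means)]

    def scale(rows):
        return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

    return scale(train), scale(test)


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def accuracy(train_x, train_y, test_x, test_y, k=3):
    """Fraction of test rows whose predicted label matches the true label."""
    train_s, test_s = standardize(train_x, test_x)
    hits = sum(knn_predict(train_s, train_y, q, k) == y
               for q, y in zip(test_s, test_y))
    return hits / len(test_y)


if __name__ == "__main__":
    # Synthetic toy rows (NOT CBCD data): two raw columns per row.
    train_x = [[2.0, 4.0], [3.0, 6.5], [4.0, 8.5],
               [4.0, 2.0], [6.0, 3.5], [8.0, 4.0]]
    train_y = [0, 0, 0, 1, 1, 1]
    test_x = [[5.0, 10.0], [5.0, 2.5]]
    test_y = [0, 1]

    raw = accuracy(train_x, train_y, test_x, test_y)
    engineered = accuracy(add_ratio_feature(train_x), train_y,
                          add_ratio_feature(test_x), test_y)
    print(f"raw features: {raw:.2f}  with ratio feature: {engineered:.2f}")
```

Scaling statistics are computed from the training rows only, so no information from the test rows leaks into the model. This matters when the dataset is as small as 116 patients, since each test row carries a large share of the reported accuracy.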