
An Enhanced Dictionary based approach for identify the biomedical entities from PubMed Articles
  • SUGANYA GOVINDARAJ,
  • Porkodi R
SUGANYA GOVINDARAJ
Bharathiar University

Corresponding Author:[email protected]

Porkodi R
Bharathiar University School of Computer Science and Engineering

Abstract

Data mining is the process of transforming data into human-comprehensible forms such as rules, formulas, and algorithms. Bioinformatics applies data mining techniques to biological problems. Identifying biomedical entities is a difficult task, and this work proposes an enhanced model to classify entities from full-text biomedical articles in the PubMed database. The enhanced model has three stages: pre-processing; entity identification using a dictionary-based approach; and verification and validation against benchmark databases. DisGeNET and PubTator, both benchmark databases, serve as the dictionary. Entities are identified with the en_ner_bionlp13cg_md model from the spaCy package. For the experiments, 99 full-text articles related to Alzheimer's disease were downloaded from NCBI. The results show that the enhanced dictionary-based model performs better in terms of accuracy, precision, and retrieval, achieving 82% accuracy overall, which is higher than that of the state-of-the-art method it was compared with. These results suggest that the enhanced model performs well at extracting biomedical entities from PubMed articles. The improvement is mostly attributed to the use of DisGeNET and PubTator as the dictionary.
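
The three stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy dictionary, the function names, and the reference annotations are all hypothetical, and plain string matching stands in for both the spaCy model and the DisGeNET and PubTator lookups, which the abstract does not describe in detail.

```python
import re

# Hypothetical toy dictionary standing in for terms taken from
# DisGeNET or PubTator; real entries and labels would come from those resources.
TOY_DICTIONARY = {
    "alzheimer's disease": "DISEASE",
    "apoe": "GENE",
    "dementia": "DISEASE",
}


def preprocess(text):
    """Stage 1: lowercase, normalize curly apostrophes, collapse whitespace."""
    text = text.lower().replace("\u2019", "'")
    return re.sub(r"\s+", " ", text).strip()


def identify_entities(text, dictionary):
    """Stage 2: return sorted (term, label) pairs for dictionary terms
    that occur in the text as whole words or phrases."""
    found = set()
    for term, label in dictionary.items():
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text):
            found.add((term, label))
    return sorted(found)


def evaluate(predicted, gold):
    """Stage 3: precision and recall against a reference annotation set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example sentence and reference annotations.
sentence = "APOE and PSEN1 are associated with Alzheimer\u2019s  disease."
entities = identify_entities(preprocess(sentence), TOY_DICTIONARY)
gold = [
    ("apoe", "GENE"),
    ("psen1", "GENE"),
    ("alzheimer's disease", "DISEASE"),
]
precision, recall = evaluate(entities, gold)
print(entities, precision, recall)
```

In this sketch PSEN1 is missing from the toy dictionary, so it is not detected and recall drops while precision stays high. That is the typical trade-off of a dictionary-based approach, where coverage depends on how complete the dictionary is.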