An Enhanced Dictionary-Based Approach for Identifying Biomedical
Entities from PubMed Articles
Data mining is characterized as the process of transforming data into a
human-comprehensible form such as rules, formulas, or algorithms.
Bioinformatics applies data mining techniques to solve biological
problems. Identifying
biomedical domain entities is a difficult task, and the enhanced model is
used to classify entities from full-text biomedical literature articles
in the PubMed database. The enhanced model has three stages:
pre-processing, identification of entities using a dictionary-based
approach, and verification and validation against benchmarking
databases. The
model treats DisGeNET and PubTator, which are benchmarking databases, as
the dictionary. This approach identifies entities using the
en_ner_bionlp13cg_md model from the spaCy package. For experimental
purposes,
99 full-text articles related to Alzheimer’s disease, downloaded from
NCBI, are considered. Finally, we demonstrate that our enhanced
dictionary-based model achieves superior accuracy, precision, and
retrieval value. The model achieved 82% accuracy overall. Compared to
the state-of-the-art method, the model obtained better accuracy.
These results suggest that the enhanced model achieves high performance
in extracting biomedical entities from PubMed articles. The improvement
is mostly attributable to the dictionary, since DisGeNET and PubTator
serve as the dictionary.
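
The three stages described above (pre-processing, dictionary-based
identification, and validation against a reference set) can be
illustrated with a minimal Python sketch. This is not the authors'
implementation: the toy dictionary, the function names, and the
set-based scoring are assumptions made for illustration, and the sketch
uses neither the en_ner_bionlp13cg_md model nor real DisGeNET or
PubTator data.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical toy dictionary standing in for terms that would be drawn
# from benchmarking databases; the entries and labels are illustrative only.
DICTIONARY: Dict[str, str] = {
    "alzheimer's disease": "DISEASE",
    "apoe": "GENE",
    "amyloid beta": "PROTEIN",
}


def preprocess(text: str) -> str:
    """Stage 1: normalize apostrophes, lowercase, and collapse whitespace."""
    text = text.replace("\u2019", "'").lower()
    return re.sub(r"\s+", " ", text).strip()


def find_entities(text: str, dictionary: Dict[str, str]) -> List[Tuple[str, str]]:
    """Stage 2: return (term, label) pairs for dictionary terms found in text."""
    clean = preprocess(text)
    found = []
    for term, label in dictionary.items():
        # Match whole terms only, so "apoe" does not match inside a longer word.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, clean):
            found.append((term, label))
    return sorted(found)


def precision_recall(predicted: Set[str], reference: Set[str]) -> Tuple[float, float]:
    """Stage 3: score predicted terms against a reference (benchmark) set."""
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall
```

In this sketch, calling find_entities on a sentence returns the
dictionary terms it contains together with their labels, and
precision_recall compares the set of predicted terms with a set of
reference annotations.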