
GETS: Sentence Scoring Scheme in Graph-based Extractive Text Summarization for Text Mining Applications
  • Pronaya Bhattacharya,
  • Jai Prakash Verma,
  • Shir Bhargav,
  • Madhuri Bhavsar
Pronaya Bhattacharya
Amity University - Kolkata Campus

Corresponding Author:[email protected]

Jai Prakash Verma
Nirma University Institute of Technology
Shir Bhargav
Nirma University Institute of Technology
Madhuri Bhavsar
Nirma University Institute of Technology

Abstract

Recently, there has been an exponential influx of textual data in big data applications, which creates a need for text mining tools to analyze that data. In Text Mining (TM) applications, Text Summarization (TS) has emerged as an active field in Natural Language Processing (NLP). Most TS work presents abstractive approaches, which build complex models; a shift toward graph-based extractive text summarization models is therefore envisioned. Such models support review and feedback analysis of a service or product, and they have the benefits of being less complex, flexible, and light on computational resources. This makes them an effective fit for modern text-mining-based big data and Internet-of-Things (IoT) applications. Thus, in the proposed work, we present a scheme, GETS, which exploits a graph-based model to establish relations between words and sentences based on statistical operations. The scheme includes a post-processing phase that uses sentence clustering based on graph preparation. To make the scheme scalable and fit for real-world applications, we use the Apache Spark environment for parallel execution of graph-based operations. In the experimental setup, the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics are used to evaluate the proposed graph-based model, with a comparative analysis using the ROUGE-1, ROUGE-2, and ROUGE-L measures. The comparative analysis covers clustered and non-clustered approaches. The obtained results render the scheme effective as a backend for Artificial Intelligence (AI) models in crowdsourcing applications and decision-analytics models.
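For readers unfamiliar with the ROUGE measures named in the abstract, the following is a minimal sketch of the recall variants of ROUGE-1, ROUGE-2, and ROUGE-L. It is an illustration under stated assumptions, not the evaluation code used for GETS: tokenization is plain lowercased whitespace splitting, no stemming or stopword removal is applied, and the function names (`rouge_n_recall`, `rouge_l_recall`) are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the multiset (Counter) of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, reference, n=1):
    """Clipped n-gram overlap divided by the number of reference n-grams.

    Assumption: whitespace tokenization on lowercased text.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Each reference n-gram is matched at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total


def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_recall(candidate, reference):
    """LCS length divided by the reference length (recall form of ROUGE-L)."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return lcs_length(cand, ref) / len(ref) if ref else 0.0


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print(rouge_n_recall(candidate, reference, n=1))
    print(rouge_n_recall(candidate, reference, n=2))
    print(rouge_l_recall(candidate, reference))
```

Recall is shown because ROUGE is recall-oriented by design; published evaluations often also report precision and F-scores, which would need the candidate-side counts as the denominator.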