Analysis and Detection of Multilingual Hate Speech Using Transformer Based Deep Learning

Dr Arijit Das; Somashree Nandy; Rupam Saha; Srijan Das; Diganta Saha

doi:10.36227/techrxiv.170629868.84167256/v1


Abstract

Hate speech is harmful content that directly attacks or promotes hatred against individuals or groups based on actual or perceived aspects of identity, such as race, religion, or sexual orientation. Such content can disrupt social life on social media platforms, because hateful content shared there can harm both individuals and communities. As hate speech becomes more prevalent online, the demand for automated detection as an NLP task is growing. This work proposes a transformer-based model to detect hate speech on social media platforms such as Twitter, Facebook, WhatsApp, and Instagram. The proposed model is language-independent and has been tested on Italian, English, German, and Bengali. The gold-standard datasets were collected from the renowned researchers Zeerak Talat, Sara Tonelli, Melanie Siegel, and Rezaul Karim. The proposed model detects hate speech with higher accuracy than the existing baseline and state-of-the-art models: 89% on the Bengali dataset, 91% on the English dataset, 91% on the German dataset, and 77% on the Italian dataset. The proposed algorithm shows substantial improvement over the benchmark method.

19 Jan 2024: Submitted to TechRxiv

26 Jan 2024: Published in TechRxiv
