A RAG-based Question Answering System Proposal for Understanding Islam: MufassirQAS LLM

Ahmet Yusuf Alan; Enis Karaarslan; Omer Aydin

doi:10.36227/techrxiv.170723304.41988020/v2

Abstract

Learning and understanding a religion is challenging because of the complexity and depth of its doctrines and teachings. Chatbots, used as question-answering systems, can help address these challenges. Chatbots based on large language models (LLMs) use natural language processing (NLP) techniques to establish connections between topics and respond accurately to complex questions. These capabilities make them well suited to serve as question-answering chatbots for learning about religion. However, LLMs also tend to generate false information, a problem known as hallucination. In addition, a chatbot's responses can include content that insults personal religious beliefs, inflames interfaith conflicts, or mishandles controversial or sensitive topics. A chatbot must avoid such cases and must not promote hate speech or offend particular groups of people or their beliefs. This study uses a vector database-based Retrieval Augmented Generation (RAG) approach to enhance the accuracy and transparency of LLMs. Our question-answering system is called "MufassirQAS". We created a database consisting of several open-access books with Turkish content. These books contain Turkish translations and interpretations of Islam. The database is used to answer religion-related questions and to make the answers trustworthy. The relevant part of the dataset, which the LLM also uses, is presented along with the answer. We carefully designed system prompts that instruct the model to avoid harmful, offensive, or disrespectful responses, so that it respects people's values and provides reliable results. Along with each answer, the system shares additional information, such as the page number in the respective book and the articles referenced to obtain the information. We also tested MufassirQAS and ChatGPT with sensitive questions, and our system performed better. The study and its enhancements are still in progress. Results and future work are presented.
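The retrieval flow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Passage` class, the toy bag-of-words `embed` function, the wording of `SYSTEM_PROMPT`, and the sample passages are all assumptions made for illustration. A real system would use a neural embedding model and a vector database, and would send the assembled prompt to an LLM.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass


@dataclass
class Passage:
    book: str  # hypothetical source title
    page: int  # page number reported alongside the answer
    text: str


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; stands in for a neural embedding model."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, store: list[Passage], k: int = 2) -> list[Passage]:
    """Return the k passages most similar to the question."""
    q = embed(question)
    ranked = sorted(store, key=lambda p: cosine(q, embed(p.text)), reverse=True)
    return ranked[:k]


# Illustrative wording only; the paper's actual system prompts are not shown here.
SYSTEM_PROMPT = (
    "Answer only from the sources provided. Cite the book and page. "
    "Be respectful and avoid offensive or hateful content."
)


def build_prompt(question: str, passages: list[Passage]) -> str:
    """Combine instructions, retrieved sources, and the question into one prompt."""
    sources = "\n".join(f"[{p.book}, p. {p.page}] {p.text}" for p in passages)
    return f"{SYSTEM_PROMPT}\n\nSources:\n{sources}\n\nQuestion: {question}"


# Hypothetical sample data, not taken from the paper's dataset.
store = [
    Passage("Book A", 12, "Fasting during Ramadan is observed from dawn until sunset."),
    Passage("Book A", 40, "Zakat is an obligatory form of almsgiving."),
    Passage("Book B", 7, "The pilgrimage to Mecca is called Hajj."),
]
question = "What is zakat?"
top = retrieve(question, store, k=1)
prompt = build_prompt(question, top)
```

Keeping the book title and page number attached to each passage is what lets the retrieved source be shown next to the answer, which is the transparency property the abstract emphasizes.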

08 Feb 2024: Submitted to TechRxiv

13 Feb 2024: Published in TechRxiv
