EVALUATING AN AI MODEL’S QUR’ANIC RECALL AND CITATION IN ARABIC AND SAHIH INTERNATIONAL ENGLISH: A REAL-DATA PILOT ETHICAL AND ANALYTICAL REVIEW

Alexandra V. Maragha

Keywords: Artificial Intelligence (AI), Qur’an, Large Language Models (LLMs), Arabic, English.

Abstract

Large Language Models (LLMs), a class of Artificial Intelligence (AI) systems, are commonly used to query religious texts, including the Holy Qur’an. This raises questions about accuracy, reliability, and contextual fidelity, because even minor deviations in Qur’anic recall can carry significant theological, ethical, and educational implications. This quantitative study presents a reproducible, bilingual, grounded pilot evaluation of ChatGPT-5’s Qur’anic recall, citation, and thematic search abilities in Arabic and in the Sahih International English translation, across three task categories: (a) Verse Completion, (b) Citation-from-Text, and (c) Thematic Retrieval for the themes of Patience, Charity, Fasting, and Mercy. Accuracy was perfect or near-perfect in both languages: verse completion scored 1.00 in Arabic and English, and citation-from-text scored 0.97 in Arabic and 1.00 in English. In thematic retrieval, evaluated by Precision@10, Arabic scored 1.00 for Patience, Charity, and Mercy and 0.90 for Fasting, while English scored 1.00. These findings suggest that LLM-based AI chatbots such as ChatGPT-5 can provide accurate Qur’anic recall when grounded. However, Arabic syntax and morphology remain key challenges for AI systems, and ethical and bias considerations mean that AI-generated Qur’anic thematic analysis (tafsīr) should be avoided.
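For readers unfamiliar with the thematic retrieval metric, the following is a minimal sketch of how a Precision@10 score can be computed. It uses the standard definition (relevant items among the top k results, divided by k) and is not necessarily the exact scoring procedure used in this study. The function name `precision_at_k` and the placeholder identifiers `v1` to `v10` are hypothetical and stand in for verse references and a human-validated relevance set.

```python
def precision_at_k(retrieved, relevant, k=10):
    """Return the fraction of the top-k retrieved items that are relevant.

    retrieved: ranked list of item identifiers returned by the model.
    relevant:  set of identifiers judged relevant to the theme (assumed
               to come from a human-validated reference list).
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    top_k = retrieved[:k]
    hits = sum(1 for item in top_k if item in relevant)
    # Conventional Precision@k divides by k even if fewer items were returned.
    return hits / k


# Hypothetical example: 10 returned items, 9 of them judged relevant.
returned = [f"v{i}" for i in range(1, 11)]   # placeholder identifiers
judged_relevant = {f"v{i}" for i in range(1, 10)}
score = precision_at_k(returned, judged_relevant, k=10)
print(score)
```

Under this definition, a score of 0.90 corresponds to nine of the ten returned verses being judged relevant to the theme, and 1.00 to all ten.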