RAG4DS: Retrieval-Augmented Generation for Data Spaces - A Unified Lifecycle, Challenges, and Opportunities

Majjed Al-Qatf, Rafiqul Haque, Saeed Hamood Alsamhi, Samuele Buosi, Muhammad Asif Razzaq, Mohan Timilsina, Ammar Hawbani, Edward Curry

Research output: Contribution to a Journal (Peer & Non Peer) › Article › peer-review

Abstract

Retrieval-Augmented Generation (RAG) has attracted significant research attention as an effective way to address the hallucination issue of foundational models (FMs), particularly large language models (LLMs). Although the RAG framework is considered a successful approach for enhancing LLMs by providing a suitable retrieval mechanism to obtain appropriate external knowledge, it still has limitations in acquiring high-quality knowledge and diverse data sources. The complementary integration of RAG and data spaces is proposed to exploit RAG's capabilities within data spaces. Data spaces provide RAG with the ability to obtain diverse and high-quality data sources from several data providers under secure data-sharing mechanisms and direct data exchange negotiations. At the same time, RAG enhances the support services of data spaces. In this paper, we present a high-level architecture for RAG data space models (RAG-DSM) with a unified lifecycle for RAG and data spaces, and we highlight the possible challenges of the proposed integration along with its potential opportunities. Moreover, we present two use cases for leveraging RAG-DSM in the mobility and health domains.

Original language: English
Journal: IEEE Access
Publication status: Accepted/In press - 2025

Keywords

  • Artificial Intelligence
  • Data Spaces
  • Foundational Models
  • Lifecycle
  • Retrieval-Augmented Generation
