TY - JOUR
T1 - RAG4DS
T2 - Retrieval-Augmented Generation for Data Spaces - A Unified Lifecycle, Challenges, and Opportunities
AU - Al-Qatf, Majjed
AU - Haque, Rafiqul
AU - Alsamhi, Saeed Hamood
AU - Buosi, Samuele
AU - Razzaq, Muhammad Asif
AU - Timilsina, Mohan
AU - Hawbani, Ammar
AU - Curry, Edward
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2025
Y1 - 2025
N2 - Retrieval-Augmented Generation (RAG) has gained significant attention from many researchers as an effective solution to address the hallucination issue of foundational models (FMs), particularly large language models (LLMs). Although the RAG framework is considered a successful approach for enhancing LLMs by providing a suitable retrieval mechanism to obtain appropriate external knowledge, it still has limitations in acquiring high-quality knowledge and diverse data sources. The complementary integration of RAG and data spaces is proposed to exploit RAG's capabilities within data spaces. Data spaces provide RAG with the ability to obtain diverse and high-quality data sources from several data providers under secure data-sharing mechanisms and direct data exchange negotiations. At the same time, RAG enhances the support services of data spaces. In this paper, we present a high-level architecture for RAG data space models (RAG-DSM) with a unified lifecycle for RAG and data spaces, highlight the possible challenges of the proposed integration, while presenting potential opportunities. Moreover, we present two use cases for leveraging RAG-DSM in the mobility and health domains.
AB - Retrieval-Augmented Generation (RAG) has gained significant attention from many researchers as an effective solution to address the hallucination issue of foundational models (FMs), particularly large language models (LLMs). Although the RAG framework is considered a successful approach for enhancing LLMs by providing a suitable retrieval mechanism to obtain appropriate external knowledge, it still has limitations in acquiring high-quality knowledge and diverse data sources. The complementary integration of RAG and data spaces is proposed to exploit RAG's capabilities within data spaces. Data spaces provide RAG with the ability to obtain diverse and high-quality data sources from several data providers under secure data-sharing mechanisms and direct data exchange negotiations. At the same time, RAG enhances the support services of data spaces. In this paper, we present a high-level architecture for RAG data space models (RAG-DSM) with a unified lifecycle for RAG and data spaces, highlight the possible challenges of the proposed integration, while presenting potential opportunities. Moreover, we present two use cases for leveraging RAG-DSM in the mobility and health domains.
KW - Artificial Intelligence
KW - Data Spaces
KW - Foundational Models
KW - Lifecycle
KW - Retrieval-Augmented Generation
UR - https://www.scopus.com/pages/publications/85219112335
U2 - 10.1109/ACCESS.2025.3545387
DO - 10.1109/ACCESS.2025.3545387
M3 - Article
AN - SCOPUS:85219112335
SN - 2169-3536
JO - IEEE Access
JF - IEEE Access
ER -