TY - GEN
T1 - Optimizing the performance of concurrent RDF stream processing queries
AU - Le Van, Chan
AU - Gao, Feng
AU - Ali, Muhammad Intizar
N1 - Publisher Copyright:
© Springer International Publishing AG 2017.
PY - 2017
Y1 - 2017
N2 - With the growing popularity of Internet of Things (IoT) and sensing technologies, a large number of data streams are being generated at a very rapid pace. To explore the potentials of the integration of IoT and semantic technologies, a few RDF Stream Processing (RSP) query engines are made available which are capable of processing, analyzing and reasoning over semantic data streams in real-time. This way, RSP mitigates data interoperability issues and promotes knowledge discovery and smart decision making for time-sensitive applications. However, a major hurdle in the wide adoption of RSP systems is their query performance. Particularly, the ability of RSP engines to handle a large number of concurrent queries is very limited which refrains large scale stream processing applications (e.g. smart city applications) to adopt RSP. In this paper, we propose a shared-join based approach to improve the performance of an RSP engine for concurrent queries. We also leverage query federation mechanisms to allow distributed query processing over multiple RSP engine instances in order to gain performance for concurrent and distributed queries. We apply load balancing strategies to distribute queries and further optimize the concurrent query performance. We provide a proof of concept implementation by extending CQELS RSP engine and evaluate our approach using existing benchmark datasets for RSP. We also compare the performance of our proposed approach with the state of the art implementation of CQELS RSP engine.
AB - With the growing popularity of Internet of Things (IoT) and sensing technologies, a large number of data streams are being generated at a very rapid pace. To explore the potentials of the integration of IoT and semantic technologies, a few RDF Stream Processing (RSP) query engines are made available which are capable of processing, analyzing and reasoning over semantic data streams in real-time. This way, RSP mitigates data interoperability issues and promotes knowledge discovery and smart decision making for time-sensitive applications. However, a major hurdle in the wide adoption of RSP systems is their query performance. Particularly, the ability of RSP engines to handle a large number of concurrent queries is very limited which refrains large scale stream processing applications (e.g. smart city applications) to adopt RSP. In this paper, we propose a shared-join based approach to improve the performance of an RSP engine for concurrent queries. We also leverage query federation mechanisms to allow distributed query processing over multiple RSP engine instances in order to gain performance for concurrent and distributed queries. We apply load balancing strategies to distribute queries and further optimize the concurrent query performance. We provide a proof of concept implementation by extending CQELS RSP engine and evaluate our approach using existing benchmark datasets for RSP. We also compare the performance of our proposed approach with the state of the art implementation of CQELS RSP engine.
KW - Linked data
KW - Query optimization
KW - RDF stream processing
UR - http://www.scopus.com/inward/record.url?scp=85019979326&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-58068-5_15
DO - 10.1007/978-3-319-58068-5_15
M3 - Conference Publication
SN - 9783319580678
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 238
EP - 253
BT - The Semantic Web - 14th International Conference, ESWC 2017, Proceedings
A2 - Maynard, Diana
A2 - Gangemi, Aldo
A2 - Hoekstra, Rinke
A2 - Blomqvist, Eva
A2 - Hartig, Olaf
A2 - Hitzler, Pascal
PB - Springer-Verlag
T2 - 14th Extended Semantic Web Conference, ESWC 2017
Y2 - 28 May 2017 through 1 June 2017
ER -