Mining Cardinalities from Knowledge Bases

Research output: Chapter in Book or Conference Publication/Proceeding › Conference Publication › peer-review

Abstract

Cardinality is an important structural aspect of data that has not received enough attention in the context of RDF knowledge bases (KBs). Information about cardinalities can be useful for data users and knowledge engineers when writing queries and when reusing or engineering KBs. Such cardinalities can be declared using OWL and RDF constraint languages as constraints on the usage of properties over instance data. However, their declaration is optional, and consistency with the instance data is not ensured. In this paper, we address the problem of mining cardinality bounds for properties to discover structural characteristics of KBs, and use these bounds to assess completeness. Because KBs are incomplete and error-prone, we apply statistical methods for filtering property usage and for finding accurate and robust patterns. Accuracy of the cardinality patterns is ensured by properly handling equality axioms (owl:sameAs), and robustness by filtering outliers. We report an implementation of our algorithm in two variants, one using SPARQL 1.1 and one using Apache Spark, and their evaluation on real-world and synthetic data.
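
To make the idea in the abstract concrete, the following is a minimal Python sketch of mining per-property cardinality bounds from a list of triples. It is not the authors' SPARQL 1.1 or Apache Spark implementation. The function names (`find`, `mine_bounds`), the union-find treatment of owl:sameAs, and the median-absolute-deviation outlier filter with threshold `k` are all assumptions chosen for illustration; the abstract does not specify which statistical method the paper uses.

```python
from collections import defaultdict
from statistics import median

SAME_AS = "owl:sameAs"


def find(parent, x):
    """Return the canonical representative of x (union-find with path halving)."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def mine_bounds(triples, k=3.0):
    """Hypothetical helper: map each property to (min, max) values per subject.

    Assumptions (not taken from the paper):
    - owl:sameAs links are merged into equivalence classes before counting;
    - counts further than k * MAD from the median are dropped as outliers.
    Subjects that never use a property are not counted, so the lower bound
    describes only subjects that use the property at least once.
    """
    # 1. Merge entities connected by owl:sameAs.
    parent = {}
    for s, p, o in triples:
        if p == SAME_AS:
            parent[find(parent, s)] = find(parent, o)

    # 2. Collect distinct values per (property, canonical subject).
    values = defaultdict(set)
    for s, p, o in triples:
        if p != SAME_AS:
            values[(p, find(parent, s))].add(find(parent, o))

    # 3. Gather the per-subject counts for each property.
    counts = defaultdict(list)
    for (p, _subject), objs in values.items():
        counts[p].append(len(objs))

    # 4. Filter outliers with the median absolute deviation, then take bounds.
    bounds = {}
    for p, cs in counts.items():
        med = median(cs)
        mad = median(abs(c - med) for c in cs)
        kept = [c for c in cs if abs(c - med) <= k * mad]
        bounds[p] = (min(kept), max(kept))
    return bounds


# Toy data: ex:a and ex:a2 denote the same entity, ex:d is an outlier.
triples = [
    ("ex:a", "ex:name", "A"),
    ("ex:a2", SAME_AS, "ex:a"),
    ("ex:a2", "ex:name", "A."),
    ("ex:b", "ex:name", "B"),
    ("ex:c", "ex:name", "C"),
    ("ex:d", "ex:name", "D1"),
    ("ex:d", "ex:name", "D2"),
    ("ex:d", "ex:name", "D3"),
    ("ex:d", "ex:name", "D4"),
    ("ex:d", "ex:name", "D5"),
]
print(mine_bounds(triples))
```

On this toy data, merging ex:a with ex:a2 raises the count for that entity to two names, and the subject with five names is filtered out, so the mined bounds for ex:name are (1, 2). Without the owl:sameAs step, the two aliases would each be counted with a single name and the upper bound would be understated.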
Original language: English (Ireland)
Title of host publication: Database and Expert Systems Applications (DEXA 2017)
Publication status: Published - 1 Aug 2017

Authors

  • Emir Muñoz
  • Matthias Nickles
