TY - GEN
T1 - How hard is this query? Measuring the semantic complexity of schema-agnostic queries
AU - Freitas, André
AU - Sales, Juliano Efson
AU - Handschuh, Siegfried
AU - Curry, Edward
N1 - Publisher Copyright: © 2015 Association for Computational Linguistics.
PY - 2015
Y1 - 2015
N2 - The growing size, heterogeneity and complexity of databases demand the creation of strategies to facilitate users and systems to consume data. Ideally, query mechanisms should be schema-agnostic, i.e. they should be able to match user queries in their own vocabulary and syntax to the data, abstracting data consumers from the representation of the data. This work provides an information-theoretical framework to evaluate the semantic complexity involved in the query-database communication, under a schema-agnostic query scenario. Different entropy measures are introduced to quantify the semantic phenomena involved in the user-database communication, including structural complexity, ambiguity, synonymy and vagueness. The entropy measures are validated using natural language queries over Semantic Web databases. The analysis of the semantic complexity is used to improve the understanding of the core semantic dimensions present at the query-data matching process, allowing the improvement of the design of schema-agnostic query mechanisms and defining measures which can be used to assess the semantic uncertainty or difficulty behind a schema-agnostic querying task.
AB - The growing size, heterogeneity and complexity of databases demand the creation of strategies to facilitate users and systems to consume data. Ideally, query mechanisms should be schema-agnostic, i.e. they should be able to match user queries in their own vocabulary and syntax to the data, abstracting data consumers from the representation of the data. This work provides an information-theoretical framework to evaluate the semantic complexity involved in the query-database communication, under a schema-agnostic query scenario. Different entropy measures are introduced to quantify the semantic phenomena involved in the user-database communication, including structural complexity, ambiguity, synonymy and vagueness. The entropy measures are validated using natural language queries over Semantic Web databases. The analysis of the semantic complexity is used to improve the understanding of the core semantic dimensions present at the query-data matching process, allowing the improvement of the design of schema-agnostic query mechanisms and defining measures which can be used to assess the semantic uncertainty or difficulty behind a schema-agnostic querying task.
KW - Database queries
KW - Databases
KW - Entropy
KW - Schema-agnostic queries
KW - Semantic complexity
UR - https://www.scopus.com/pages/publications/85035797266
M3 - Conference Publication
AN - SCOPUS:85035797266
T3 - IWCS 2015 - Proceedings of the 11th International Conference on Computational Semantics
SP - 294
EP - 304
BT - IWCS 2015 - Proceedings of the 11th International Conference on Computational Semantics
PB - Association for Computational Linguistics (ACL)
T2 - 11th International Conference on Computational Semantics, IWCS 2015
Y2 - 15 April 2015 through 17 April 2015
ER -