TY - GEN
T1 - Representing texts as contextualized entity-centric linked data graphs
AU - Freitas, Andre
AU - Oriain, Sean
AU - Curry, Edward
AU - Da Silva, Joao C.P.
AU - Carvalho, Danilo S.
PY - 2013
Y1 - 2013
AB - The integration of a small fraction of the information present in the Web of Documents into the Linked Data Web can provide a significant shift in the amount of information available to data consumers. However, information extracted from text does not easily fit into the usually highly normalized structure of ontology-based datasets. While the representation of structured data assumes a high level of regularity and relatively simple, consistent conceptual models, the representation of information extracted from texts needs to take into account large terminological variation, complex contextual/dependency patterns, and fuzzy or conflicting semantics. This work focuses on bridging the gap between structured and unstructured data, proposing the representation of text as structured discourse graphs (SDGs), targeting an RDF representation of unstructured data. The representation focuses on a semantic best-effort information extraction scenario, where information from text is extracted under a pay-as-you-go data quality perspective, trading terminological normalization for domain independence, context capture, wider representation scope and maximization of textual information capture.
KW - Discourse Graphs
KW - Discourse Representation
KW - Linked Data
KW - Semantic Web
UR - https://www.scopus.com/pages/publications/84887954558
U2 - 10.1109/DEXA.2013.21
DO - 10.1109/DEXA.2013.21
M3 - Conference Publication
AN - SCOPUS:84887954558
SN - 9780769550701
T3 - Proceedings - International Workshop on Database and Expert Systems Applications, DEXA
SP - 133
EP - 137
BT - Proceedings - 24th International Workshop on Database and Expert Systems Applications, DEXA 2013
T2 - 24th International Workshop on Database and Expert Systems Applications, DEXA 2013
Y2 - 26 August 2013 through 29 August 2013
ER -