A semantic best-effort approach for extracting structured discourse graphs from Wikipedia

  • André Freitas
  • Danilo S. Carvalho
  • João C. P. da Silva
  • Seán O'Riain
  • Edward Curry

Research output: Contribution to a Journal (Peer & Non Peer) › Conference article › peer-review

6 Citations (Scopus)

Abstract

Most information extraction approaches available today have focused either on the extraction of simple relations or on scenarios where data extracted from texts should be normalized into a database schema or ontology. Some relevant information present in natural language texts, however, can be irregular, highly contextualized, with complex semantic dependency relations, poorly structured, and intrinsically ambiguous. These characteristics should also be supported by an information extraction approach. To cope with this scenario, this work introduces a semantic best-effort information extraction approach, which targets an information extraction scenario where text information is extracted under a pay-as-you-go data quality perspective, trading high accuracy, schema consistency, and terminological normalization for domain independence, context capture, wider extraction scope, and maximization of the extraction and representation of text semantics. A semantic information extraction framework (Graphia) is implemented and evaluated over the Wikipedia corpus.

Original language: English
Pages (from-to): 70-81
Number of pages: 12
Journal: CEUR Workshop Proceedings
Volume: 906
Publication status: Published - 2012
Event: Web of Linked Entities, WoLE 2012 - Workshop in Conjunction with the 11th International Semantic Web Conference, ISWC 2012 - Boston, MA, United States
Duration: 11 Nov 2012 to 11 Nov 2012

Keywords

  • Information extraction
  • Linked data
  • RDF
  • Semantic best-effort extraction
  • Semantic networks
  • Semantic web
