Abstract
Most information extraction approaches available today have focused either on the extraction of simple relations or on scenarios where data extracted from texts should be normalized into a database schema or ontology. Some relevant information present in natural language texts, however, can be irregular, highly contextualized, poorly structured, and intrinsically ambiguous, with complex semantic dependency relations. These characteristics should also be supported by an information extraction approach. To cope with this scenario, this work introduces a semantic best-effort information extraction approach, which targets an information extraction scenario where text information is extracted under a pay-as-you-go data quality perspective, trading high accuracy, schema consistency and terminological normalization for domain independence, context capture, wider extraction scope and maximal extraction and representation of the text semantics. A semantic information extraction framework (Graphia) is implemented and evaluated over the Wikipedia corpus.
| Original language | English |
|---|---|
| Pages (from-to) | 70-81 |
| Number of pages | 12 |
| Journal | CEUR Workshop Proceedings |
| Volume | 906 |
| Publication status | Published - 2012 |
| Event | Web of Linked Entities, WoLE 2012 - Workshop in Conjunction with the 11th International Semantic Web Conference, ISWC 2012 - Boston, MA, United States. Duration: 11 Nov 2012 → 11 Nov 2012 |
Keywords
- Information extraction
- Linked data
- RDF
- Semantic best-effort extraction
- Semantic networks
- Semantic web