Abstract
Standardization efforts in financial reporting have produced a large number of machine-interpretable vocabularies that attempt to model complex accounting practices in XBRL (eXtensible Business Reporting Language). Because reporting agencies do not require fine-grained semantic and terminological representations, these vocabularies cannot be easily reused. Ontology-based Information Extraction, in particular, requires considerably richer semantic and terminological structure, as well as a linguistic structure that XBRL currently lacks. To facilitate such reuse, we propose a three-faceted methodology that analyzes and enriches the XBRL vocabulary: (1) transform the semantic structure by analyzing the semantic relationships between terms (e.g., taxonomic, meronymic); (2) enhance the terminological structure by drawing on several domain-specific (XBRL), domain-related (SAPTerm, etc.) and domain-independent (GoogleDefine, Wikipedia, etc.) terminologies; and (3) add linguistic structure at the term level (e.g., part-of-speech, morphology, syntactic arguments). This paper outlines a first experiment towards implementing this methodology on the International Financial Reporting Standards (IFRS) XBRL vocabulary.
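As a rough illustration of the three facets, the following Python sketch shows one way an enriched term record might be laid out. It is not the representation used in the paper: the `EnrichedTerm` class, its field names, and the `split_element_name` helper are hypothetical, and the element name `PropertyPlantAndEquipment` serves only as an illustrative input. Only the tokenization step is implemented; the semantic and terminological slots are left empty because filling them would require the analyses and external terminologies described above.

```python
import re
from dataclasses import dataclass, field


@dataclass
class EnrichedTerm:
    """Hypothetical container for one vocabulary term and its three facets."""
    element_name: str
    # (1) Semantic structure: relation type (e.g. "taxonomic", "meronymic")
    #     mapped to the names of related elements.
    semantic_relations: dict[str, list[str]] = field(default_factory=dict)
    # (2) Terminological structure: terminology source mapped to a definition.
    definitions: dict[str, str] = field(default_factory=dict)
    # (3) Linguistic structure: here only the tokens of the term; part-of-speech
    #     or morphological annotations would need an external NLP tool.
    tokens: list[str] = field(default_factory=list)


def split_element_name(name: str) -> list[str]:
    """Split a CamelCase element name into lowercase tokens.

    Runs of capitals are kept together as one token when they are not
    followed by a lowercase letter, so acronyms are not broken up.
    """
    pattern = r"[A-Z]+(?![a-z])|[A-Z][a-z]+|[a-z]+|\d+"
    return [token.lower() for token in re.findall(pattern, name)]


def make_term(element_name: str) -> EnrichedTerm:
    """Create a term record with only the token list filled in."""
    return EnrichedTerm(element_name=element_name,
                        tokens=split_element_name(element_name))


if __name__ == "__main__":
    term = make_term("PropertyPlantAndEquipment")
    print(term.tokens)
```

Keeping the three facets as separate fields means each one can be populated independently, for example by adding definitions from one terminology source without touching the semantic relations.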
| Original language | English |
|---|---|
| Journal | CEUR Workshop Proceedings |
| Volume | 673 |
| Publication status | Published - 2010 |
| Event | EKAW 2010 Workshop 6: "Reuse and Adaptation of Ontologies and Terminologies 2010", EKAW-WS6 2010, Lisbon, Portugal, 15 Oct 2010 |
Keywords
- Accounting language
- Linguistic analysis
- Ontology-based information extraction
- XBRL semantics
- XBRL terminology