TY - GEN
T1 - Wikipedia-based distributional semantics for entity relatedness
AU - Aggarwal, Nitish
AU - Buitelaar, Paul
N1 - Publisher Copyright: Copyright © 2014, Association for the Advancement of Artificial Intelligence.
PY - 2014
Y1 - 2014
N2 - Wikipedia provides an enormous amount of background knowledge to reason about the semantic relatedness between two entities. We propose Wikipedia-based Distributional Semantics for Entity Relatedness (DiSER), which represents the semantics of an entity by its distribution in the high-dimensional concept space derived from Wikipedia. DiSER measures the semantic relatedness between two entities by quantifying the distance between the corresponding high-dimensional vectors. DiSER builds the model from annotated entities only; it therefore improves over existing approaches, which do not distinguish between an entity and its surface form. We evaluate the approach on a benchmark that contains the relative entity relatedness scores for 420 entity pairs. Our approach improves the accuracy by 12% over state-of-the-art methods for computing entity relatedness. We also evaluate DiSER on the Entity Disambiguation task, using a dataset of 50 sentences with highly ambiguous entity mentions. It shows an improvement of 10% in precision over the best-performing methods. To provide a resource that can be used to find all the related entities for a given entity, a graph is constructed in which the nodes represent Wikipedia entities and the edges reflect the relatedness scores. Wikipedia contains more than 4.1 million entities, which required efficient computation of the relatedness scores between the corresponding 17 trillion entity pairs.
AB - Wikipedia provides an enormous amount of background knowledge to reason about the semantic relatedness between two entities. We propose Wikipedia-based Distributional Semantics for Entity Relatedness (DiSER), which represents the semantics of an entity by its distribution in the high-dimensional concept space derived from Wikipedia. DiSER measures the semantic relatedness between two entities by quantifying the distance between the corresponding high-dimensional vectors. DiSER builds the model from annotated entities only; it therefore improves over existing approaches, which do not distinguish between an entity and its surface form. We evaluate the approach on a benchmark that contains the relative entity relatedness scores for 420 entity pairs. Our approach improves the accuracy by 12% over state-of-the-art methods for computing entity relatedness. We also evaluate DiSER on the Entity Disambiguation task, using a dataset of 50 sentences with highly ambiguous entity mentions. It shows an improvement of 10% in precision over the best-performing methods. To provide a resource that can be used to find all the related entities for a given entity, a graph is constructed in which the nodes represent Wikipedia entities and the edges reflect the relatedness scores. Wikipedia contains more than 4.1 million entities, which required efficient computation of the relatedness scores between the corresponding 17 trillion entity pairs.
UR - https://www.scopus.com/pages/publications/84987680576
M3 - Conference Publication
AN - SCOPUS:84987680576
T3 - AAAI Fall Symposium - Technical Report
SP - 2
EP - 9
BT - Natural Language Access to Big Data - Papers from the AAAI Fall Symposium, Technical Report
PB - AI Access Foundation
T2 - 2014 AAAI Fall Symposium
Y2 - 13 November 2014 through 15 November 2014
ER -