TY - GEN
T1 - Non-orthogonal explicit semantic analysis
AU - Aggarwal, Nitish
AU - Asooja, Kartik
AU - Bordea, Georgeta
AU - Buitelaar, Paul
PY - 2015
Y1 - 2015
N2 - Explicit Semantic Analysis (ESA) uses the Wikipedia knowledge base to represent the semantics of a word as a vector in which every dimension refers to an explicitly defined concept, such as a Wikipedia article. ESA inherently assumes that Wikipedia concepts are orthogonal to each other; it therefore considers two words to be related only if they co-occur in the same articles. However, two words can be related even if they appear separately in related articles rather than co-occurring in the same articles. This motivates extending the ESA model to take into account the relatedness between the explicit concepts (i.e., Wikipedia articles in a Wikipedia-based implementation) when computing textual relatedness. In this paper, we present Non-Orthogonal ESA (NESA), which represents the more fine-grained semantics of a word as a vector of explicit concept dimensions, where each concept dimension in turn constitutes a semantic vector built in another vector space. Thus, NESA takes concept correlations into account when computing the relatedness between two words. We explore different approaches to computing the concept correlation weights and compare these approaches with other existing methods. Furthermore, we evaluate our model NESA on several word relatedness benchmarks, showing that it outperforms state-of-the-art methods.
AB - Explicit Semantic Analysis (ESA) uses the Wikipedia knowledge base to represent the semantics of a word as a vector in which every dimension refers to an explicitly defined concept, such as a Wikipedia article. ESA inherently assumes that Wikipedia concepts are orthogonal to each other; it therefore considers two words to be related only if they co-occur in the same articles. However, two words can be related even if they appear separately in related articles rather than co-occurring in the same articles. This motivates extending the ESA model to take into account the relatedness between the explicit concepts (i.e., Wikipedia articles in a Wikipedia-based implementation) when computing textual relatedness. In this paper, we present Non-Orthogonal ESA (NESA), which represents the more fine-grained semantics of a word as a vector of explicit concept dimensions, where each concept dimension in turn constitutes a semantic vector built in another vector space. Thus, NESA takes concept correlations into account when computing the relatedness between two words. We explore different approaches to computing the concept correlation weights and compare these approaches with other existing methods. Furthermore, we evaluate our model NESA on several word relatedness benchmarks, showing that it outperforms state-of-the-art methods.
UR - https://www.scopus.com/pages/publications/84954522191
U2 - 10.18653/v1/s15-1010
DO - 10.18653/v1/s15-1010
M3 - Conference Publication
AN - SCOPUS:84954522191
T3 - Proceedings of the 4th Joint Conference on Lexical and Computational Semantics, *SEM 2015
SP - 92
EP - 100
BT - Proceedings of the 4th Joint Conference on Lexical and Computational Semantics, *SEM 2015
PB - Association for Computational Linguistics (ACL)
T2 - 4th Joint Conference on Lexical and Computational Semantics, *SEM 2015
Y2 - 4 June 2015 through 5 June 2015
ER -