TY - GEN
T1 - DERI&UPM: Pushing corpus based relatedness to similarity: Shared task system description
T2 - 1st Joint Conference on Lexical and Computational Semantics, *SEM 2012
AU - Aggarwal, Nitish
AU - Asooja, Kartik
AU - Buitelaar, Paul
N1 - Publisher Copyright:
© 2012 Association for Computational Linguistics.
PY - 2012
Y1 - 2012
N2 - In this paper, we describe our system submitted for the semantic textual similarity (STS) task at SemEval 2012. We implemented two approaches to calculate the degree of similarity between two sentences. First approach combines corpus-based semantic relatedness measure over the whole sentence with the knowledge-based semantic similarity scores obtained for the words falling under the same syntactic roles in both the sentences. We fed all these scores as features to machine learning models to obtain a single score giving the degree of similarity of the sentences. Linear Regression and Bagging models were used for this purpose. We used Explicit Semantic Analysis (ESA) as the corpus-based semantic relatedness measure. For the knowledgebased semantic similarity between words, a modified WordNet based Lin measure was used. Second approach uses a bipartite based method over the WordNet based Lin measure, without any modification. This paper shows a significant improvement in calculating the semantic similarity between sentences by the fusion of the knowledge-based similarity measure and the corpus-based relatedness measure against corpus based measure taken alone.
AB - In this paper, we describe our system submitted for the semantic textual similarity (STS) task at SemEval 2012. We implemented two approaches to calculate the degree of similarity between two sentences. First approach combines corpus-based semantic relatedness measure over the whole sentence with the knowledge-based semantic similarity scores obtained for the words falling under the same syntactic roles in both the sentences. We fed all these scores as features to machine learning models to obtain a single score giving the degree of similarity of the sentences. Linear Regression and Bagging models were used for this purpose. We used Explicit Semantic Analysis (ESA) as the corpus-based semantic relatedness measure. For the knowledgebased semantic similarity between words, a modified WordNet based Lin measure was used. Second approach uses a bipartite based method over the WordNet based Lin measure, without any modification. This paper shows a significant improvement in calculating the semantic similarity between sentences by the fusion of the knowledge-based similarity measure and the corpus-based relatedness measure against corpus based measure taken alone.
UR - https://www.scopus.com/pages/publications/84995353007
M3 - Conference Publication
AN - SCOPUS:84995353007
T3 - *SEM 2012 - 1st Joint Conference on Lexical and Computational Semantics
SP - 643
EP - 647
BT - Proceedings of the 6th International Workshop on Semantic Evaluation, SemEval 2012
PB - Association for Computational Linguistics (ACL)
Y2 - 7 June 2012 through 8 June 2012
ER -