TY - GEN
T1 - Few shot transfer learning between word relatedness and similarity tasks using a gated recurrent Siamese network
AU - O'Neill, James
AU - Buitelaar, Paul
N1 - Publisher Copyright:
Copyright © 2018, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2018
Y1 - 2018
N2 - Word similarity and word relatedness are fundamental to natural language processing and, more generally, to understanding how humans relate concepts in semantic memory. A growing number of datasets are being proposed as evaluation benchmarks; however, the heterogeneity and focus of each dataset make it difficult to draw plausible conclusions as to how a unified semantic model would perform. Additionally, we want to identify the transferability of knowledge obtained from one task to another, within the same domain and across domains. Hence, this paper first presents an evaluation and comparison of eight chosen datasets tested using the best performing regression models. As a baseline, we present regression models that incorporate both lexical features and word embeddings to produce consistent and competitive results compared to the state of the art. We present our main contribution, the best performing model across seven of the eight datasets: a Gated Recurrent Siamese Network that learns relationships between lexical word definitions. A parameter transfer learning strategy is employed for the Siamese Network. Subsequently, we present a secondary contribution, which is the best performing non-sequential model: an Inductive and Transductive Transfer Learning strategy for transferring decision trees within a Random Forest to a target task that is learned from only a few instances. The method involves measuring semantic distance between hidden factored matrix representations of decision tree traversal matrices.
UR - https://www.scopus.com/pages/publications/85060466841
M3 - Conference Publication
T3 - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
SP - 5342
EP - 5349
BT - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
PB - AAAI Press
T2 - 32nd AAAI Conference on Artificial Intelligence, AAAI 2018
Y2 - 2 February 2018 through 7 February 2018
ER -