Multi-label Classification using BERT and Knowledge Graphs with a Limited Training Dataset

Research output: Contribution to a Journal (Peer & Non Peer) › Conference article › peer-review

Abstract

This paper presents an approach that combines BERT with Knowledge Graphs (KGs) to solve a multi-label classification problem with limited training data. It introduces a method that uses taxonomies and a dataset of 518 entries and 340 concepts to fine-tune BERT. It also introduces a new data augmentation technique, Perfect Binary Tree (PBT)-Flow, for limited or imbalanced training data. The proposed approach obtained a recall@10 of 61.12%, a precision@10 of 11.86%, and an F1-score@10 of 18.83%. Although these results appear low, they are promising given the simple architecture of the model (BERT+2xFC), the limited size of the training data, and the large number of output concepts.
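
For readers unfamiliar with the @10 metrics quoted above, the following is a minimal sketch of how precision@k, recall@k and F1@k are commonly computed for a ranked list of predicted labels. It is not the authors' evaluation code: the function names `metrics_at_k` and `mean_metrics_at_k`, the label strings, and the choice to average per-example scores are all assumptions made for illustration, since the abstract does not state the averaging scheme.

```python
from typing import Dict, Sequence, Set


def metrics_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 10) -> Dict[str, float]:
    """Precision@k, recall@k and F1@k for a single example.

    `ranked` is the model's label ranking (best first); `relevant` is the
    gold label set. Both names are hypothetical, chosen for this sketch.
    """
    top_k = list(ranked)[:k]
    hits = sum(1 for label in top_k if label in relevant)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def mean_metrics_at_k(
    rankings: Sequence[Sequence[str]],
    gold: Sequence[Set[str]],
    k: int = 10,
) -> Dict[str, float]:
    """Average the per-example scores over a dataset (assumed averaging scheme)."""
    per_example = [metrics_at_k(r, g, k) for r, g in zip(rankings, gold)]
    n = len(per_example)
    if n == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    return {
        name: sum(m[name] for m in per_example) / n
        for name in ("precision", "recall", "f1")
    }


if __name__ == "__main__":
    # Toy example: 10 ranked concepts, 3 gold concepts, 2 of them retrieved.
    ranked = [f"c{i}" for i in range(1, 11)]
    print(metrics_at_k(ranked, {"c2", "c5", "c99"}, k=10))
```

For a single example, precision@k can never exceed the number of gold labels divided by k. If each entry carries only a few gold concepts (an assumption; the abstract does not say), precision@10 is therefore capped well below 100% even for a perfect ranking, which is one way a modest precision@10 can coexist with a much higher recall@10. Note also that the mean of per-example F1 scores generally differs from the harmonic mean of the averaged precision and recall.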

Original language: English
Journal: CEUR Workshop Proceedings
Volume: 3342
Publication status: Published - 2022
Externally published: Yes
Event: 2022 Workshop on Deep Learning for Knowledge Graphs, DL4KG 2022 - Virtual, Online
Duration: 24 Oct 2022 → …

Keywords

  • BERT
  • Data augmentation
  • Knowledge graphs
  • Multi-label classification
