Homophobia and transphobia detection for low-resourced languages in social media comments

Research output: Contribution to a Journal (Peer & Non Peer) › Article › peer-review

Abstract

People increasingly share and express their emotions on online social media platforms such as Twitter, Facebook, and YouTube. Homophobia and transphobia are abusive, hateful, threatening, or discriminatory acts that target and cause discomfort to gay, lesbian, transgender, or bisexual individuals. Detecting such acts on social media is called homophobia and transphobia detection, a task that has recently gained interest among researchers. Identifying homophobic and transphobic content in under-resourced languages is challenging, and so far no resources exist for categorizing this type of content in Malayalam and Hindi. This paper presents a new high-quality dataset for detecting homophobia and transphobia in Malayalam and Hindi. The dataset consists of 5,193 comments in Malayalam and 3,203 comments in Hindi. We also report experiments performed with traditional machine learning and transformer-based deep learning models on the Malayalam, Hindi, English, Tamil, and Tamil-English datasets.
Original language: English (Ireland)
Number of pages: 100041
Journal: Natural Language Processing Journal
Volume: 5
Publication status: Published - 1 Jan 2023

Authors

  • Prasanna Kumar Kumaresan
  • Rahul Ponnusamy
  • Ruba Priyadharshini
  • Paul Buitelaar
  • Bharathi Raja Chakravarthi
