TY - GEN
T1 - Detecting Hate Speech Towards the LGBT+ Population in Mexican Spanish Using Transformer Architectures
AU - Subburaj, Arunraj
AU - Kathiresan, Amirthagadeshwaran
AU - Ponnusamy, Rahul
AU - Buitelaar, Paul
AU - Chakravarthi, Bharathi Raja
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2026.
PY - 2026
Y1 - 2026
AB - In the context of increasing hate speech on social media platforms, this paper examines the effectiveness of various transformer-based models for detecting hate speech towards the LGBT+ community in Mexican Spanish. By focusing on tweets related to the LGBT+ community, we aim to identify the most effective model architecture for analyzing nuanced hate speech and complex language patterns. We compare the performance of pre-trained multilingual transformer models, including mBERT and XLM-RoBERTa, and address the challenges posed by class imbalance and linguistic diversity. Our findings demonstrate that DistilBERT, fine-tuned for Spanish, achieves the highest macro F1-score of 0.89, outperforming other models in accurately detecting hate speech. We also discuss strategies for handling data imbalance and provide an error analysis to highlight the limitations and potential biases of the models. Our research advocates for the deployment of these models to create safer online environments, enhancing user interaction and inclusivity across digital platforms.
KW - Hate Speech Detection
KW - LGBT+ Community
KW - Mexican Spanish
KW - Online Safety
KW - Social Media
KW - Transformer Models
UR - https://www.scopus.com/pages/publications/105021842213
U2 - 10.1007/978-3-032-05855-3_27
DO - 10.1007/978-3-032-05855-3_27
M3 - Conference Publication
AN - SCOPUS:105021842213
SN - 9783032058546
T3 - Communications in Computer and Information Science
SP - 397
EP - 407
BT - Speech and Language Technologies for Low-Resource Languages - 3rd International Conference, SPELLL 2024, Revised Selected Papers
A2 - Chakravarthi, Bharathi Raja
A2 - B, Bharathi
A2 - Rajiakodi, Saranya
A2 - García Cumbreras, Miguel Ángel
A2 - Jiménez Zafra, Salud María
A2 - Kovács, György
A2 - Eger, Steffen
A2 - Wahyu Pamungkas, Endang
A2 - Dobrovoljc, Kaja
PB - Springer Science and Business Media Deutschland GmbH
T2 - 3rd International Conference on Speech and Language Technologies for Low-Resource Languages, SPELLL 2024
Y2 - 4 December 2024 through 6 December 2024
ER -