Multilingual Racial Hate Speech Detection Using Transfer Learning

145
Not scheduled
20m
Von-Melle-Park 4

Poster

Description

The rise of social media eases the spread of hateful content, especially racist content, which can have severe consequences. In this paper, we analyze French-language tweets about the death of George Floyd in May 2020, an event that accelerated debates on racism globally. Using the Yandex Toloka platform, we annotate each tweet as hate, offensive, or normal. Tweets labeled offensive or hateful are further annotated as racial or non-racial. We build French hate speech detection models based on multilingual BERT and CamemBERT, and apply transfer learning by fine-tuning the HateXplain model. We compare different approaches to resolving annotation ties and find that the detection model based on CamemBERT yields the best results in our experiments.
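The abstract does not say which tie-resolution approaches were compared, so the following is only a minimal sketch of what resolving annotation ties can look like. The function name `resolve_label`, the two strategies (`"drop"` and `"most_severe"`), and the severity ordering are assumptions made for illustration, not the authors' method.

```python
from collections import Counter

# Assumed severity ordering, used only by the hypothetical "most_severe" strategy.
SEVERITY = {"normal": 0, "offensive": 1, "hate": 2}


def resolve_label(votes, tie_strategy="drop"):
    """Aggregate one tweet's annotator votes into a single label.

    Returns the majority label when one exists. On a tie, either
    discards the tweet (returns None) or keeps the most severe of
    the tied labels, depending on tie_strategy.
    """
    if not votes:
        raise ValueError("at least one vote is required")

    counts = Counter(votes)
    top = max(counts.values())
    winners = [label for label, n in counts.items() if n == top]

    # A single winner means there is a clear majority (or plurality).
    if len(winners) == 1:
        return winners[0]

    if tie_strategy == "drop":
        return None
    if tie_strategy == "most_severe":
        return max(winners, key=lambda label: SEVERITY[label])
    raise ValueError(f"unknown tie strategy: {tie_strategy}")
```

With three annotators per tweet, a tie arises only when all three disagree. Dropping such tweets shrinks the training set, while keeping the most severe label retains them at the risk of adding label noise.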

Keywords

Racial hate speech, offensive speech, transfer learning, Toloka

Find me @ my poster 2: Monday afternoon

Primary authors

Abinew Ali Ayele (Language Technology Group, University of Hamburg, Hamburg, Germany)
Prof. Chris Biemann (Language Technology Group, University of Hamburg, Hamburg, Germany)
Dr. Seid Muhie Yimam (House of Computing and Data Science)
Ms. Skadi Dinter (Language Technology Group, University of Hamburg, Hamburg, Germany)
