Detecting offensive language by integrating multiple linguistic phenomena

PLAZA DEL ARCO, FLOR MIRIAM

Supervised by:

Luis Alfonso Ureña López Director
María Teresa Martín Valdivia Co-director

Defence university: Universidad de Jaén

Defence date: 30 January 2023

Committee:

Mariona Taulé Delor Chair
Eugenio Martínez Cámara Secretary
José Camacho Collados Committee member

Type: Thesis

Teseo: 819869 DIALNET

Abstract

Social media have grown into the primary means of communication between people, allowing users to hold conversations and share their opinions. The rise in digital social connections has led to the dissemination of harmful communication. Natural Language Processing is concerned with the development of computational systems that interpret human language. Giving computers this skill offers a plethora of benefits, including the potential to moderate harmful conduct on social media. This thesis relies on advanced methods based on transfer learning to tackle the offensive language detection problem. We have generated appropriate resources to enable us to train Machine Learning systems, particularly for Spanish, for which we discovered a significant lack of resources. Moreover, we have identified different linguistic phenomena that can occur in the expression of offensive language and proposed a novel methodology that integrates these phenomena into a Multi-Task Learning system to detect offensive language more accurately.