Design and development of a hate speech detector in social networks based on Deep Learning technologies

Benito Sánchez, Diego (2019). Design and development of a hate speech detector in social networks based on Deep Learning technologies. Thesis (Master thesis), E.T.S.I. Telecomunicación (UPM).

Description

Title: Design and development of a hate speech detector in social networks based on Deep Learning technologies
Author/s:
  • Benito Sánchez, Diego
Contributor/s:
  • Araque Iborra, Óscar
Item Type: Thesis (Master thesis)
Masters title: Ingeniería de Telecomunicación
Date: 2019
Subjects:
Freetext Keywords: Social Networks, Hate Speech, Machine Learning, Natural Language Processing, SemEval, Transfer Learning, Scikit-learn, NLTK.
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería de Sistemas Telemáticos [until 2014]
Creative Commons Licenses: Attribution - No Derivative Works - Non-Commercial

Abstract

In recent years, Internet activity and social network connectivity have increased. Social networks are platforms that ease communication between users through different kinds of interaction. Unfortunately, they have also become places where hate speech proliferates. Hate speech has become a prominent topic in recent years, as reflected both by increased media coverage of the problem and by the growing political attention it receives. Given the steady growth of this phenomenon, institutions, international minority associations, researchers and social networks are trying to react as quickly as possible. Because of the massive scale of social networks, methods that detect hate speech automatically are required. Natural Language Processing (NLP) focused specifically on this phenomenon is needed because basic word filters are not a sufficient remedy: a hate speech utterance may be influenced by aspects such as the domain, the context, co-occurring media objects (images, video, audio), etc. This thesis is the result of a project whose main aim has been to obtain a hate speech detector with a multilingual perspective, in order to remove all forms of hate speech that can occur in social networks, regardless of the source language. During the development phase, supervised machine learning tools and NLP techniques were used, with Python as the programming language. The proposed system is evaluated in two case studies: participation in an internationally recognized competition, SemEval, and a Transfer Learning challenge across languages and hate speech traits. The extensive experimentation carried out has resulted in a very respectable position in the SemEval competition and in a demonstration of the benefits that applying Transfer Learning can bring to the hate speech detection problem.
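
The abstract's point that basic word filters are not a sufficient remedy can be illustrated with a minimal sketch. It is not the thesis's system: the training sentences, the `BLOCKLIST` contents, and all function names are invented for illustration. The multinomial Naive Bayes classifier is a stand-in chosen only because it fits in a few lines of standard-library Python. The record does not state which models the author used; its keywords mention Scikit-learn and NLTK.

```python
import math
from collections import Counter

# Toy, invented training data (NOT from the thesis): 1 = hateful, 0 = not hateful.
TRAIN = [
    ("those people should go back home", 1),
    ("we do not want those people here", 1),
    ("get those people out of our country", 1),
    ("i want to go back home early today", 0),
    ("we love our country and our people", 0),
    ("great match today what a game", 0),
]

# Hypothetical blocklist for the word-filter baseline.
BLOCKLIST = {"idiot"}


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def word_filter(text):
    """Baseline: flag a message only if it contains a blocklisted word."""
    return int(any(tok in BLOCKLIST for tok in tokenize(text)))


def train_naive_bayes(samples):
    """Collect per-class word counts for a multinomial Naive Bayes model."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter()
    for text, label in samples:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def predict(model, text):
    """Return the class with the highest log-posterior (add-one smoothing)."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


model = train_naive_bayes(TRAIN)
message = "those people should get out"
print("word filter:", word_filter(message))
print("classifier: ", predict(model, message))
```

On this toy data the word filter does not flag the message, because it contains no blocklisted word, while the supervised classifier picks up the word combinations it saw in the hateful training examples. A realistic detector would need far more data, proper tokenization and evaluation on held-out examples.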

More information

Item ID: 55618
DC Identifier: http://oa.upm.es/55618/
OAI Identifier: oai:oa.upm.es:55618
Deposited by: Biblioteca ETSI Telecomunicación
Deposited on: 27 Jun 2019 11:34
Last Modified: 27 Jun 2019 11:34