Development of an event detector in Twitter streams based on mention-anomaly detection for the city of Madrid

López García, Luis Cristobal (2019). Development of an event detector in Twitter streams based on mention-anomaly detection for the city of Madrid. Proyecto Fin de Carrera / Trabajo Fin de Grado, E.T.S.I. Telecomunicación (UPM), Madrid.

Description

Title: Development of an event detector in Twitter streams based on mention-anomaly detection for the city of Madrid
Author/s:
  • López García, Luis Cristobal
Contributor/s:
  • Iglesias Fernández, Carlos Ángel
Item Type: Final Project
Degree: Grado en Ingeniería de Tecnologías y Servicios de Telecomunicación
Date: 2019
Subjects:
Freetext Keywords: event, detection, detector, cluster, clustering, mentions, redundancy, filter, visualization, Machine Learning, Python, Twitter
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería de Sistemas Telemáticos [until 2014]
Creative Commons Licenses: Attribution - No derivative works - Non commercial


Abstract

Event detection was a field of research long before social networks reached the impact they have today. Events were tracked from traditional news websites, blogs, and other information channels. However, when microblogging emerged as a form of social media, this landscape changed. In this project we have developed a system capable of detecting the most important events that occur in a city by analyzing data published on social networks. To do so, we have adapted and improved an existing clustering approach named MABED, which relies on the number of interactions between users to measure the impact of an event. Our main contributions to this model have been to improve the accuracy of that impact algorithm and to provide a new definition of redundancy that leads to better handling of duplicated events. The social network our detector reads is Twitter, considered a valuable source of what is known as Social Data. Information is provided by short documents posted by users, called tweets. These publications are collected by our Streamer, which gathers posts that have just been published in the city of Madrid. In addition to the clustering component, we have also developed an architecture that turns our project into a complete system. The Streamer is in charge of collecting the data that we feed to our detector. However, the data first passes through a preprocessing module, which filters out spam and lemmatizes the text in order to achieve better performance. Once the detection task is finished, the results are saved in a persistence subsystem. These results are finally visualized in a dashboard, which interacts with the user and makes the analysis easier to interpret. This whole data flow is supervised by an orchestrator, which ensures the correct interaction between modules. The process is repeated every half hour and shows the three events with the highest impact that took place in the city of Madrid in the last 24 hours.
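
The idea of measuring impact from user interactions can be illustrated with a small sketch. The Python code below is not taken from the project and does not reproduce its improved impact algorithm. It assumes a simplified mention-anomaly score: for each time slice, the number of tweets that contain a word together with at least one @mention is compared with the count expected if such tweets were spread evenly over all slices, and the contiguous run of slices with the largest total anomaly is taken as the event interval. The names `mention_anomaly` and `best_interval`, the whitespace tokenization, and the sample tweets are all hypothetical.

```python
def mention_anomaly(slices, word):
    """Per-slice anomaly of `word` among tweets that contain an @mention.

    `slices` is a list of time slices, each a list of tweet texts.
    Assumption: anomaly = observed count - expected count, where the
    expected count is the slice size times the overall share of tweets
    containing both the word and a mention.
    """
    total_tweets = sum(len(s) for s in slices)
    if total_tweets == 0:
        return [0.0 for _ in slices]

    observed = []
    for tweets in slices:
        count = 0
        for text in tweets:
            tokens = text.lower().split()
            has_mention = any(tok.startswith("@") for tok in tokens)
            if has_mention and word in tokens:
                count += 1
        observed.append(count)

    mentions_total = sum(observed)
    return [
        obs - len(tweets) * mentions_total / total_tweets
        for obs, tweets in zip(observed, slices)
    ]


def best_interval(anomaly):
    """Return (magnitude, start, end) of the contiguous run of slices
    with the largest summed anomaly (a maximum-subarray scan)."""
    best = (float("-inf"), 0, 0)
    running, start = 0.0, 0
    for i, value in enumerate(anomaly):
        if running <= 0:
            # A non-positive prefix cannot help: restart the run here.
            running, start = value, i
        else:
            running += value
        if running > best[0]:
            best = (running, start, i)
    return best


if __name__ == "__main__":
    # Hypothetical, already preprocessed tweets grouped into four slices.
    slices = [
        ["buenos dias madrid", "hola @ana"],
        ["derbi hoy @realmadrid", "gran derbi @atleti", "derbi @ana vamos"],
        ["derbi terminado", "cena en casa"],
        ["lluvia en madrid"],
    ]
    anomaly = mention_anomaly(slices, "derbi")
    print(anomaly)
    print(best_interval(anomaly))
```

In a periodic run such as the one described above, a score of this kind would be computed for each candidate word over the slices covering the last 24 hours, and the candidates with the largest magnitudes would be kept. How the project actually ranks and deduplicates events is not specified in this abstract.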

More information

Item ID: 55806
DC Identifier: http://oa.upm.es/55806/
OAI Identifier: oai:oa.upm.es:55806
Deposited by: Biblioteca ETSI Telecomunicación
Deposited on: 15 Jul 2019 07:50
Last Modified: 15 Jul 2019 07:51