Design and development of a system for sleep disorder characterization using Social Media Mining

Suárez Souto, Daniel (2018). Design and development of a system for sleep disorder characterization using Social Media Mining. Proyecto Fin de Carrera / Trabajo Fin de Grado, E.T.S.I. Telecomunicación (UPM), Madrid.

Description

Title: Design and development of a system for sleep disorder characterization using Social Media Mining
Author/s:
  • Suárez Souto, Daniel
Contributor/s:
  • Iglesias Fernández, Carlos Ángel
Item Type: Final Project
Degree: Grado en Ingeniería de Tecnologías y Servicios de Telecomunicación
Date: 2018
Subjects:
Freetext Keywords: Insomnia, Machine Learning, Big Data, Python, NLP, Sentiments, Emotions, Twitter, Analysis
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería de Sistemas Telemáticos [until 2014]
Creative Commons Licenses: Recognition - No derivative works - Non commercial

Full text

PDF - Requires a PDF viewer, such as GSview, Xpdf or Adobe Acrobat Reader

Abstract

The wide range of sleep disorders is one of the main problems that medicine faces today. The percentage of people suffering from one of these disorders is 31% in Western Europe, 56% in the USA and 23% in Japan. However, it is estimated that only some of these people are following some form of medical treatment. Nowadays, social networks have become platforms used by millions of users who communicate with each other. This also makes them a valuable source of what is known as Social Data: all the information that social network users share publicly, including metadata such as user location, spoken language, biographical data and/or shared links. In this project we have analysed information about insomnia shared by Spanish-speaking users on the social network Twitter. Our objective has been to develop a machine learning classifier capable of identifying messages related to insomnia, and a second classifier capable of sorting these messages into 5 different themes according to the type of information they contain. To develop these classifiers, we built a dataset of tweets containing the word "insomnia" published between December 14 and January 4, 2018. From this dataset, we conducted a geographical study, from which we concluded that the Spanish-speaking countries with the most tweets on insomnia are Argentina, Mexico and Spain. For Spain specifically, the data collected allowed us to estimate that approximately 1.21% of users in the country have at some point written about insomnia. Another conclusion we drew from this dataset is that there is a big difference in the proportion of users who report the symptom of Difficulty at the beginning of sleep compared to the other two symptoms, Short sleep duration and Difficulty sleeping and low energy during the day, all defined by the ICSD-3.
The algorithm that gave the best results when training the insomnia classifier and the theme classifier was Logistic Regression, with an accuracy and F1 score of 0.84 and 0.82 for the former and 0.75 and 0.72 for the latter. Finally, we developed an insomnia monitoring service that visualizes the analysis of themes, sentiments and emotions performed with Senpy on the captured tweets.
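
As an illustration of the two metrics reported in the abstract, the following minimal Python sketch computes accuracy and F1 score for a binary classifier. The labels are hypothetical and are not taken from the thesis dataset, and the sketch assumes a binary F1 computed on the positive class; the abstract does not state which averaging the author used.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are both zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels (assumption): 1 = tweet is about insomnia, 0 = it is not.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

acc = accuracy(y_true, y_pred)
f1 = f1_score(y_true, y_pred)
print(f"accuracy={acc:.2f} f1={f1:.2f}")
```

Accuracy counts every correct prediction equally, while F1 focuses on how well the positive class is recovered, which is why both are commonly reported together for classifiers trained on unbalanced data.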

More information

Item ID: 51351
DC Identifier: http://oa.upm.es/51351/
OAI Identifier: oai:oa.upm.es:51351
Deposited by: Biblioteca ETSI Telecomunicación
Deposited on: 26 Jun 2018 05:28
Last Modified: 26 Jun 2018 05:28