Design and development of a system for sleep disorder characterization using Social Media Mining

Suárez Souto, Daniel (2018). Design and development of a system for sleep disorder characterization using Social Media Mining. Final Degree Project / Bachelor's Thesis, E.T.S.I. Telecomunicación (UPM), Madrid.

Description

Title: Design and development of a system for sleep disorder characterization using Social Media Mining
Author(s):
  • Suárez Souto, Daniel
Supervisor(s):
  • Iglesias Fernández, Carlos Ángel
Document type: Final Degree Project
Degree: Bachelor's Degree in Telecommunication Technologies and Services Engineering
Date: 2018
Subjects:
Informal keywords: Insomnia, Machine Learning, Big Data, Python, NLP, Sentiments, Emotions, Twitter, Analysis
School: E.T.S.I. Telecomunicación (UPM)
Department: Telematic Systems Engineering [until 2014]
Creative Commons licenses: Attribution - No Derivative Works - Non-commercial

Full text

PDF (Portable Document Format), 1MB. A PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader is required.

Abstract

The range of sleep disorders is one of the main problems that medicine faces today. The percentage of people suffering from any of these disorders is 31% in Western Europe, 56% in the USA and 23% in Japan. However, it is estimated that only some of these people are following some form of medical treatment. Nowadays, social networks have become platforms used by millions of users who communicate with each other. This also makes them a valuable source of what is known as Social Data: all the information that social network users share publicly, including metadata such as user location, spoken language, biographical data and/or shared links. In this project we have analysed information shared by Spanish-speaking users about insomnia on the social network Twitter. Our objective has been to develop a machine learning classifier capable of identifying messages related to insomnia, and a second classifier capable of sorting these messages into 5 different themes according to the type of information they contain. To develop these classifiers, we built a dataset of tweets containing the word "insomnia" published between December 14 and January 4, 2018. From this dataset, we conducted a geographical study, from which we concluded that the Spanish-speaking countries with the most tweets on insomnia are Argentina, Mexico and Spain. For Spain specifically, the data collected allowed us to estimate that approximately 1.21% of users in this country have at some point written about insomnia. Another conclusion drawn from this dataset is that there is a large difference between the proportion of users who report the symptom of Difficulty at the beginning of sleep and the proportion who report the other two symptoms, Short sleep duration and Difficulty sleeping and low energy during the day, all defined by the ICSD-3.
The algorithm that gave us the best results when training both classifiers was Logistic Regression, with an accuracy and F1 score of 0.84 and 0.82 for the insomnia classifier and 0.75 and 0.72 for the theme classifier. Finally, we developed an insomnia monitoring service that visualizes the analysis of themes, sentiments and emotions performed through Senpy on the captured tweets.
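
As an illustration of the two metrics reported above, the following minimal sketch computes accuracy and F1 score for a binary classifier in plain Python. The labels and function names are hypothetical and are not taken from the project. The abstract does not say how the F1 score was averaged for the 5-theme classifier, so only the binary case (insomnia-related or not) is shown.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision and recall are both zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = tweet is about insomnia, 0 = it is not.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

print("accuracy:", accuracy(y_true, y_pred))
print("F1 score:", f1_score(y_true, y_pred))
```

Accuracy counts every correct prediction equally, whereas F1 only rewards correct detection of the positive class, which is why the two figures can differ when classes are imbalanced.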

More information

Record ID: 51351
DC identifier: http://oa.upm.es/51351/
OAI identifier: oai:oa.upm.es:51351
Deposited by: Biblioteca ETSI Telecomunicación
Deposited on: 26 Jun 2018 05:28
Last modified: 26 Jun 2018 05:28