Machine Learning as a helper component of an email client

Briwa, Houda (2020). Machine Learning as a helper component of an email client. Thesis (Master thesis), E.T.S. de Ingenieros Informáticos (UPM).

Description

Title: Machine Learning as a helper component of an email client
Author/s:
  • Briwa, Houda
Contributor/s:
  • Serrano Fernández, Emilio
  • Zanardini, Damiano
Item Type: Thesis (Master thesis)
Masters title: Ciencia de Datos
Date: July 2020
Freetext Keywords: Deep learning; ELMo; BERT; Recommendation system; Email clients
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Inteligencia Artificial
Creative Commons Licenses: Recognition - No derivative works - Non commercial

Abstract

Email is an important means of communication and is widely used in corporations and businesses because of its efficiency, low cost, and practical asynchrony. To automate email-management tasks and make email easier and more efficient to use, researchers in machine learning and data mining have proposed and applied many intelligent techniques. This research presents a survey of the techniques used for recommendation systems and information retrieval in email. A case study of text classification on email content was carried out using pre-trained language models (BERT, ELMo). Three baseline models were used for comparison: Random Forest, Support Vector Machine, and Naive Bayes. The classic classification metrics (precision, recall, and F-score) were used to assess model performance. The experimental results show that ELMo performed poorly in binary classification, with an accuracy of 38%, whereas the lack of sufficient resources (GPU) and the high computational cost of the BERT language model made it impossible to obtain an accuracy figure for it. The baseline models achieved good accuracy, with Random Forest reaching the highest value, 81%. This work does not claim to advance the state of the art; it only applies methods learned in the Master of Data Science to a problem of interest.
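
As an illustration of the metrics named in the abstract, the following minimal sketch computes accuracy, precision, recall, and F-score for a binary classifier in plain Python. It is not taken from the thesis: the function name `binary_metrics` and the example labels are hypothetical, and the F-score is assumed here to be the balanced F1 variant.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Dict[str, float]:
    """Return accuracy, precision, recall and F1 for one positive class.

    Hypothetical helper for illustration; not the thesis implementation.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts relative to the chosen positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0

    return {
        "accuracy": correct / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up labels (1 = positive class, 0 = negative class); not thesis data.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
    for name, value in binary_metrics(y_true, y_pred).items():
        print(f"{name}: {value:.2f}")
```

Accuracy counts every correct prediction, while precision and recall look only at the positive class, so the two kinds of figure can diverge on imbalanced email data.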

More information

Item ID: 63640
DC Identifier: http://oa.upm.es/63640/
OAI Identifier: oai:oa.upm.es:63640
Deposited by: Biblioteca Facultad de Informatica
Deposited on: 07 Sep 2020 12:39
Last Modified: 07 Sep 2020 12:39