Briwa, Houda (2020). Machine Learning as a helper component of an email client. Thesis (Master thesis), E.T.S. de Ingenieros Informáticos (UPM).
Title: | Machine Learning as a helper component of an email client |
---|---|
Author/s: | Briwa, Houda |
Contributor/s: | |
Item Type: | Thesis (Master thesis) |
Masters title: | Ciencia de Datos |
Date: | July 2020 |
Subjects: | |
Freetext Keywords: | Deep learning; ELMo; BERT; Recommendation system; Email clients |
Faculty: | E.T.S. de Ingenieros Informáticos (UPM) |
Department: | Inteligencia Artificial |
Creative Commons Licenses: | Attribution - No derivative works - Non-commercial |
Email is an important means of communication and is widely used in corporations and businesses because of its efficiency, low cost and practical asynchrony. To automate email management tasks and make email easier and more efficient to use, researchers in machine learning and data mining have proposed and applied many intelligent techniques. This research presents a survey of the techniques used for recommendation systems and information retrieval in email. A case study of text classification on email content was carried out using pre-trained language models (BERT, ELMo). Three baseline models were used to evaluate the performance: Random Forest, Support Vector Machine and Naive Bayes. The classic classification metrics (precision, recall and F score) were used to assess the performance of the models. The experimental results show that ELMo performed poorly in binary classification, with an accuracy of 38%, whereas the lack of sufficient resources (GPU) and the high computational cost of the BERT language model prevented its accuracy from being measured. The baseline models achieved good accuracy, with Random Forest reaching the highest value, 81%. This work does not claim to advance the state of the art; it only applies methods learned in the Master of Data Science to a problem of interest.
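As an illustration of the metrics named in the abstract, the following minimal sketch computes precision, recall, F1 and accuracy for a binary classifier in plain Python. The function names and the label lists are hypothetical and chosen only for the example; they are not taken from the thesis or its data.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for the class labelled `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels for eight emails (1 = positive class, 0 = negative class).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
print(f"accuracy={accuracy(y_true, y_pred):.3f}")
```

Accuracy counts every correct prediction equally, while precision and recall focus on a single class, which is why the two kinds of figure can differ noticeably on the same predictions.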
Item ID: | 63640 |
---|---|
DC Identifier: | https://oa.upm.es/63640/ |
OAI Identifier: | oai:oa.upm.es:63640 |
Deposited by: | Biblioteca Facultad de Informatica |
Deposited on: | 07 Sep 2020 12:39 |
Last Modified: | 07 Sep 2020 12:39 |