Labroski, Aleksandar (2018). Multi-view versus single-view machine learning for disease diagnosis in primary healthcare. Thesis (Master thesis), E.T.S. de Ingenieros Informáticos (UPM).
Title: | Multi-view versus single-view machine learning for disease diagnosis in primary healthcare |
---|---|
Author/s: | Labroski, Aleksandar |
Contributor/s: |
|
Item Type: | Thesis (Master thesis) |
Masters title: | Data Science |
Date: | 16 September 2018 |
Subjects: | |
Faculty: | E.T.S. de Ingenieros Informáticos (UPM) |
Department: | Other |
Creative Commons Licenses: | Attribution - Non commercial - No derivative works |
The work presented in this report compares two machine learning approaches to disease diagnosis prediction in primary healthcare: single-view and multi-view machine learning. Disease diagnosis prediction here means predicting a (possible) diagnosis for a given patient based on her past medical history. The problem space is large: there are over 14,400 unique possible diagnoses (grouped into 22 high-level categories) that can be considered as prediction targets. This work uses the high-level categories as prediction targets and applies both paradigms in an attempt to approach an optimal solution. The multi-view machine learning paradigm was chosen because it can improve the predictive performance of classifiers in settings with multiple heterogeneous data sources (different views of the same data), which is exactly the case here. To compare the single-view and multi-view paradigms (both in a supervised learning setting), several experiments are devised that explore the possible solution space under each paradigm. The work also touches on related machine learning concepts such as ensemble learning, stacked generalization and dimensionality reduction-based learning. The results show that multi-view stacked generalization is a powerful paradigm that can significantly improve predictive performance in a supervised learning setting. Model performance was evaluated using F1 scores, and we observed an average increase of 0.04 and a maximum increase of 0.114 F1 score points.
The findings also show that the multi-view stacked ensemble learning approach is particularly well suited as a noise reduction technique and works well in cases where the feature data is expected to contain a notable amount of noise. This can be of particular interest to projects where the features are not manually chosen by domain experts.
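For readers unfamiliar with the terminology, the following is a minimal sketch of multi-view stacked generalization compared against a single-view baseline, scored with macro-averaged F1. It is not the thesis code. The synthetic two-view data, the nearest-centroid base learner, the helper names (`make_data`, `stack_views`, `macro_f1`) and the choice to treat "single-view" as the concatenation of all views are assumptions made only for illustration.

```python
import random
from collections import defaultdict

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

class NearestCentroid:
    """Toy learner (an assumption): class score = negative squared distance to centroid."""
    def fit(self, X, y):
        groups = defaultdict(list)
        for row, label in zip(X, y):
            groups[label].append(row)
        self.classes_ = sorted(groups)
        self.centroids_ = [[sum(col) / len(col) for col in zip(*groups[c])]
                           for c in self.classes_]
        return self

    def scores(self, X):
        return [[-sum((a - b) ** 2 for a, b in zip(row, cen))
                 for cen in self.centroids_] for row in X]

    def predict(self, X):
        return [self.classes_[max(range(len(s)), key=s.__getitem__)]
                for s in self.scores(X)]

def stack_views(views_train, y_train, views_test, n_folds=5):
    """One base learner per view; a meta-learner is trained on their out-of-fold scores.
    Assumes every class appears in every training fold."""
    n = len(y_train)
    folds = [list(range(i, n, n_folds)) for i in range(n_folds)]
    meta_train = [[] for _ in range(n)]
    meta_test = [[] for _ in range(len(views_test[0]))]
    for X_tr, X_te in zip(views_train, views_test):
        out_of_fold = [None] * n
        for held_out in folds:
            held = set(held_out)
            fit_idx = [i for i in range(n) if i not in held]
            model = NearestCentroid().fit([X_tr[i] for i in fit_idx],
                                          [y_train[i] for i in fit_idx])
            for i, s in zip(held_out, model.scores([X_tr[i] for i in held_out])):
                out_of_fold[i] = s
        full_model = NearestCentroid().fit(X_tr, y_train)
        for row, s in zip(meta_train, out_of_fold):
            row.extend(s)
        for row, s in zip(meta_test, full_model.scores(X_te)):
            row.extend(s)
    return NearestCentroid().fit(meta_train, y_train).predict(meta_test)

def make_data(n, rng):
    """Synthetic data: each view carries signal for a different class, plus a noise column."""
    y = [i % 3 for i in range(n)]
    view_a = [[(label == 0) * 2.0 + rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]
    view_b = [[(label == 1) * 2.0 + rng.gauss(0, 1), rng.gauss(0, 1)] for label in y]
    return [view_a, view_b], y

rng = random.Random(0)
views_train, y_train = make_data(300, rng)
views_test, y_test = make_data(150, rng)

# Single-view baseline: all views concatenated into one feature table.
concat_train = [a + b for a, b in zip(*views_train)]
concat_test = [a + b for a, b in zip(*views_test)]
single = NearestCentroid().fit(concat_train, y_train).predict(concat_test)
stacked = stack_views(views_train, y_train, views_test)

print("single-view macro F1:", round(macro_f1(y_test, single), 3))
print("multi-view stacked macro F1:", round(macro_f1(y_test, stacked), 3))
```

The meta-learner only ever sees out-of-fold scores for the training rows, which is the core idea of stacked generalization: it learns how much to trust each view without being fed predictions the base learners made on their own training data. The scores printed for this toy data say nothing about the gains reported in the thesis.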
Item ID: | 56719 |
---|---|
DC Identifier: | https://oa.upm.es/56719/ |
OAI Identifier: | oai:oa.upm.es:56719 |
Deposited by: | Biblioteca Facultad de Informatica |
Deposited on: | 07 Oct 2019 08:34 |
Last Modified: | 07 Oct 2019 08:34 |