Venn Prediction for Survival Analysis: experimenting with survival data and Venn Predictors

Aparicio Vázquez, Ignacio (2020). Venn Prediction for Survival Analysis: experimenting with survival data and Venn Predictors. Thesis (Master thesis), E.T.S. de Ingenieros Informáticos (UPM).

Description

Title: Venn Prediction for Survival Analysis: experimenting with survival data and Venn Predictors
Author/s:
  • Aparicio Vázquez, Ignacio
Contributor/s:
  • Girdzijauskas, Šarūnas
  • Boström, Henrik
Item Type: Thesis (Master thesis)
Master's title: Data Science
Date: 2020
Subjects:
Freetext Keywords: Venn Predictors; Random Forests; Survival Modelling; Machine Learning; Well Calibrated Probabilities; Out-of-bag Calibration; Anomaly detection
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Other
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

The goal of this work is to expand knowledge in the field of Venn Prediction applied to survival data. Standard Venn Predictors have been used with Random Forests for binary classification tasks. However, they have not been used to predict events with survival data, nor in combination with Random Survival Forests. Using a data transformation, the survival task is converted into several binary classification tasks. A key aspect of Venn Prediction is the choice of categories. The standard number of categories is two, one for each class to predict. In this work, the use of ten categories is explored, and the performance differences between two and ten categories are investigated. Seven data sets are evaluated, and their results are presented for both two and ten categories. For the Brier Score and Reliability Score metrics, two categories gave the best results, while Quality was better with ten categories. Occasionally, the models are too optimistic. Venn Predictors correct this and produce well-calibrated probabilities.
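
To make the role of categories concrete, the following is a minimal sketch of a Venn predictor for a single binary task. It is not the thesis's implementation. It assumes a held-out calibration set (the thesis keywords mention out-of-bag calibration instead), and it assumes a hypothetical taxonomy, `category_from_score`, that places an example into one of `n_categories` equal-width bins according to the underlying model's score in [0, 1]. The function names and the toy data are illustrative only.

```python
def category_from_score(score, n_categories=2):
    """Hypothetical taxonomy: equal-width bins over a score in [0, 1]."""
    return min(int(score * n_categories), n_categories - 1)


def venn_interval(calib_categories, calib_labels, test_category):
    """Return (lower, upper) probability of label 1 for one test object.

    The test object is tentatively added to its category once with
    label 0 and once with label 1; each time the fraction of label-1
    examples in that category is recorded.
    """
    in_category = [
        y for k, y in zip(calib_categories, calib_labels) if k == test_category
    ]
    n = len(in_category)
    n_pos = sum(in_category)
    lower = n_pos / (n + 1)        # test object assumed to have label 0
    upper = (n_pos + 1) / (n + 1)  # test object assumed to have label 1
    return lower, upper


# Toy calibration data (made up for illustration): model scores and true labels.
calib_scores = [0.1, 0.2, 0.4, 0.6, 0.7, 0.9]
calib_labels = [0, 0, 1, 1, 1, 0]

calib_categories = [category_from_score(s, 2) for s in calib_scores]
test_category = category_from_score(0.8, 2)
interval = venn_interval(calib_categories, calib_labels, test_category)
print(interval)
```

With more categories, each category holds fewer calibration examples, so the interval between the two probabilities tends to widen; a category with no calibration examples yields the uninformative interval (0, 1).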

More information

Item ID: 64515
DC Identifier: http://oa.upm.es/64515/
OAI Identifier: oai:oa.upm.es:64515
Deposited by: Biblioteca Facultad de Informatica
Deposited on: 08 Oct 2020 16:11
Last Modified: 08 Oct 2020 16:11