Random forests for regression as a weighted sum of k-Potential Nearest Neighbors

Fernández González, Pablo and Bielza Lozoya, María Concepción and Larrañaga Múgica, Pedro María (2019). Random forests for regression as a weighted sum of k-Potential Nearest Neighbors. "IEEE Access", v. 7 ; pp. 25660-25672. ISSN 2169-3536. https://doi.org/10.1109/ACCESS.2019.2900755.

Description

Title: Random forests for regression as a weighted sum of k-Potential Nearest Neighbors
Author/s:
  • Fernández González, Pablo
  • Bielza Lozoya, María Concepción
  • Larrañaga Múgica, Pedro María
Item Type: Article
Journal/Publication Title: IEEE Access
Date: 2019
ISSN: 2169-3536
Volume: 7
Freetext Keywords: Random forests; Regression; Bagging; Bootstrap; Nearest neighbors; K-Potential Nearest Neighbors
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Inteligencia Artificial
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

In this paper, we tackle the problem of random forests for regression expressed as weighted sums of datapoints. We study the theoretical behavior of k-potential nearest neighbors (k-PNNs) under bagging and obtain an upper bound on the weights of a datapoint for random forests with any type of splitting criterion, provided that we use unpruned trees that stop growing only when there are k or less datapoints at their leaves. Moreover, we use the previous bound together with the concept of b-terms (i.e., bootstrap terms) introduced in this paper, to derive the explicit expression of weights for datapoints in a random (k-PNNs) selection setting, a datapoint selection strategy that we also introduce and to build a framework to derive other bagged estimators using a similar procedure. Finally, we derive from our framework the explicit expression of weights of a regression estimate equivalent to a random forest regression estimate with the random splitting criterion and demonstrate its equivalence both theoretically and practically.
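
The weighted-sum view named in the abstract can be illustrated with a minimal sketch. It is not the paper's k-PNN derivation or its explicit weight expression; it only checks numerically that a bagged forest of unpruned trees, which stop splitting when k or less datapoints remain, predicts the same value as a weighted sum of the training responses. The splitting rule (a uniformly chosen feature and a uniformly chosen observed value as threshold), the function names leaf_indices and forest_predict_and_weights, and the synthetic data are all assumptions made for illustration. Each tree is grown only along the query's path, which is a simplification that does not change the leaf the query falls into.

```python
import random

def leaf_indices(idx, X, x, k, rng):
    """Follow query x down one unpruned, randomly split tree (grown lazily
    along x's path) and return the bootstrap indices sharing its leaf."""
    while len(idx) > k:  # stop only when k or fewer datapoints remain
        # Features that still take more than one value among these points.
        splittable = [d for d in range(len(x))
                      if len({X[i][d] for i in idx}) > 1]
        if not splittable:  # only repeated copies of one point are left
            break
        d = rng.choice(splittable)
        values = sorted({X[i][d] for i in idx})
        t = rng.choice(values[:-1])  # threshold keeps both children non-empty
        go_left = x[d] <= t
        idx = [i for i in idx if (X[i][d] <= t) == go_left]
    return idx

def forest_predict_and_weights(X, y, x, k=2, n_trees=200, seed=0):
    """Return the forest estimate at x and the weight of every datapoint."""
    rng = random.Random(seed)
    n = len(X)
    weights = [0.0] * n
    prediction = 0.0
    for _ in range(n_trees):
        boot = [rng.randrange(n) for _ in range(n)]  # bootstrap resample
        leaf = leaf_indices(boot, X, x, k, rng)
        # Direct forest estimate: average of the per-tree leaf means.
        prediction += sum(y[i] for i in leaf) / len(leaf) / n_trees
        # Same computation, booked per datapoint instead of per tree.
        for i in leaf:
            weights[i] += 1.0 / (len(leaf) * n_trees)
    return prediction, weights

# Synthetic data (assumption): 50 points in the unit square, noisy linear response.
data_rng = random.Random(1)
X = [[data_rng.random(), data_rng.random()] for _ in range(50)]
y = [2.0 * a - b + data_rng.gauss(0.0, 0.1) for a, b in X]
query = [0.5, 0.5]

prediction, weights = forest_predict_and_weights(X, y, query)
weighted_sum = sum(w * t for w, t in zip(weights, y))
print(abs(prediction - weighted_sum) < 1e-9, abs(sum(weights) - 1.0) < 1e-9)
```

The weights are non-negative, sum to one, and depend only on the datapoints' positions and the bootstrap draws, not on the responses, which is what makes bounds on the weight of an individual datapoint meaningful.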

Funding Projects

Type | Code | Acronym | Leader | Title
Government of Spain | C080020-09 | Unspecified | Unspecified | Cajal Blue Brain Project
Government of Spain | TIN2016-79684-P | Unspecified | Universidad Politécnica de Madrid | Avances en clasificación multidimensional y detección de anomalías con redes bayesianas

More information

Item ID: 63472
DC Identifier: http://oa.upm.es/63472/
OAI Identifier: oai:oa.upm.es:63472
DOI: 10.1109/ACCESS.2019.2900755
Official URL: https://ieeexplore.ieee.org/document/8648334
Deposited by: Memoria Investigacion
Deposited on: 05 Nov 2020 12:23
Last Modified: 05 Nov 2020 12:23