A Selective Under-Sampling (SUS) Method For Imbalanced Regression

Aleksic, Jovana ORCID: https://orcid.org/0000-0002-3366-8379 and García Remesal, Miguel ORCID: https://orcid.org/0000-0002-5948-8691 (2025). A Selective Under-Sampling (SUS) Method For Imbalanced Regression. "Journal of Artificial Intelligence Research", v. 82; pp. 111-136. ISSN 10769757. https://doi.org/10.1613/jair.1.16062.

Description

Title: A Selective Under-Sampling (SUS) Method For Imbalanced Regression
Author(s):
Document Type: Article
Journal/Publication Title: Journal of Artificial Intelligence Research
Date: 1 January 2025
ISSN: 10769757
Volume: 82
Subjects:
SDGs:
Informal Keywords: Adversarial machine learning; Collection Methods; Contrastive learning; Data Set; Imbalanced data; Machine learning approaches; Neural-Networks; Normal events; Performance; Real-World; Sampling method; under-sampling
School: E.T.S. de Ingenieros Informáticos (UPM)
Department: Artificial Intelligence
Creative Commons Licenses: Attribution - NoDerivatives - NonCommercial

Full text

PDF (Portable Document Format), 801 kB

Abstract

Many mainstream machine learning approaches, such as neural networks, are not well suited to imbalanced data. Yet imbalance is common in real-world data sets. Collection methods are imperfect and are often unable to capture enough data in a specific range of the target variable. Furthermore, in certain tasks the data is inherently imbalanced, with many more normal events than edge cases. This problem is well studied in the classification context. However, only a few methods have been proposed for regression tasks. In addition, the proposed methods often do not perform well on high-dimensional data, and imbalanced high-dimensional regression has scarcely been explored. In this paper we present a selective under-sampling (SUS) algorithm for imbalanced regression, together with its iterative version, SUSiter. We assessed this method on 15 regression data sets from different imbalanced domains, 5 synthetic high-dimensional imbalanced data sets, and 2 more complex imbalanced image data sets for age estimation. Our results suggest that SUS and SUSiter typically outperform other state-of-the-art techniques, such as SMOGN and random under-sampling, when used with neural networks as learners.
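
For orientation, the random under-sampling baseline named in the abstract can be sketched as follows. This is a minimal illustration and not the SUS or SUSiter algorithm, whose details are not given in this record. It also may not match the baseline configuration used in the paper. The function name `random_under_sample`, the equal-width binning of the target, and the default cap (the median bin size) are all assumptions made for this example.

```python
import random
from collections import defaultdict


def random_under_sample(targets, n_bins=10, max_per_bin=None, seed=0):
    """Return sorted indices of the samples kept after under-sampling.

    Hypothetical helper: bins the target into equal-width intervals and
    randomly drops samples from bins holding more than `max_per_bin` items.
    """
    if not targets:
        return []
    rng = random.Random(seed)
    lo, hi = min(targets), max(targets)
    # Fall back to a width of 1.0 when all targets are identical.
    width = (hi - lo) / n_bins or 1.0

    # Group sample indices by the bin their target value falls into.
    bins = defaultdict(list)
    for i, y in enumerate(targets):
        b = min(int((y - lo) / width), n_bins - 1)  # clamp the maximum into the last bin
        bins[b].append(i)

    # Default cap (an assumption): the upper median size of the non-empty bins.
    if max_per_bin is None:
        sizes = sorted(len(members) for members in bins.values())
        max_per_bin = sizes[len(sizes) // 2]

    kept = []
    for members in bins.values():
        if len(members) > max_per_bin:
            members = rng.sample(members, max_per_bin)  # drop the surplus at random
        kept.extend(members)
    return sorted(kept)


# Usage: 90 common targets, 8 less common, 2 rare.
targets = [0.1] * 90 + [0.5] * 8 + [0.9] * 2
kept = random_under_sample(targets, n_bins=3, max_per_bin=8)
print(len(kept))  # 18: the common bin is cut to 8, the other two are kept whole
```

Because the dropped samples are chosen uniformly at random within each over-represented bin, this baseline ignores the feature values entirely. The abstract describes SUS as selective, which suggests it chooses which samples to keep less blindly, but the selection criterion is not stated here.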

More information

Record ID: 92679
DC Identifier: https://oa.upm.es/92679/
OAI Identifier: oai:oa.upm.es:92679
Portal Científico URL: https://portalcientifico.upm.es/es/ipublic/item/10316658
DOI: 10.1613/jair.1.16062
Official URL: https://www.jair.org/index.php/jair/article/view/1...
Deposited by: iMarina Portal Científico
Deposited on: 10 Jan 2026 08:04
Last Modified: 10 Jan 2026 08:04