An L1-Regularized naïve Bayes-inspired classifier for discarding redundant and irrelevant predictors

Vidaurre Henche, Diego and Bielza Lozoya, María Concepción and Larrañaga Múgica, Pedro María (2013). An L1-Regularized naïve Bayes-inspired classifier for discarding redundant and irrelevant predictors. "International Journal on Artificial Intelligence Tools", v. 22 (n. 4). ISSN 1793-6349. https://doi.org/10.1142/S021821301350019X.

Description

Title: An L1-Regularized naïve Bayes-inspired classifier for discarding redundant and irrelevant predictors
Author/s:
  • Vidaurre Henche, Diego
  • Bielza Lozoya, María Concepción
  • Larrañaga Múgica, Pedro María
Item Type: Article
Journal/Publication Title: International Journal on Artificial Intelligence Tools
Date: August 2013
ISSN: 1793-6349
Volume: 22
Freetext Keywords: Lasso, Regularization, Naïve Bayes, Redundancy
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Inteligencia Artificial
Creative Commons Licenses: Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND)

Full text

LARRANAGA_2013_15_3.pdf (PDF, 834kB)

Abstract

The naïve Bayes model is a simple but often satisfactory supervised classification method. The original naïve Bayes scheme does, however, have a serious weakness: the harmful effect of redundant predictors. In this paper, we study how to apply a regularization technique to learn a computationally efficient classifier that is inspired by naïve Bayes. The proposed formulation, combined with an L1-penalty, is capable of discarding harmful, redundant predictors. A modification of the LARS algorithm is devised to solve this problem. We tackle both real-valued and discrete predictors, ensuring that our method is applicable to a wide range of data. In the experimental section, we empirically study the effect of redundant and irrelevant predictors. We also test the method on a high-dimensional data set from the neuroscience field, where there are many more predictors than data cases. Finally, we run the method on a real data set that combines categorical with numeric predictors. Our approach is compared with several naïve Bayes variants and other classification algorithms (SVM and kNN), and is shown to be competitive.
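
The abstract does not state the formulation itself, so the following is only a rough sketch of the general idea of attaching L1-penalized weights to naïve Bayes-style per-predictor terms. It is not the authors' method: it assumes Gaussian class-conditional densities and a logistic loss, and it swaps the modified LARS algorithm for plain proximal gradient descent (ISTA). All function names, the toy data, and the value of lam are hypothetical.

```python
import numpy as np

def gaussian_llr(X, y):
    """Per-predictor log-likelihood ratios log p(x_j | y=1) - log p(x_j | y=0),
    assuming (for illustration only) Gaussian class-conditional densities."""
    X1, X0 = X[y == 1], X[y == 0]
    mu1, mu0 = X1.mean(axis=0), X0.mean(axis=0)
    var1 = X1.var(axis=0) + 1e-9  # small constant avoids division by zero
    var0 = X0.var(axis=0) + 1e-9
    ll1 = -0.5 * (np.log(2 * np.pi * var1) + (X - mu1) ** 2 / var1)
    ll0 = -0.5 * (np.log(2 * np.pi * var0) + (X - mu0) ** 2 / var0)
    return ll1 - ll0

def soft_threshold(w, t):
    """Proximal operator of the L1 norm: shrink towards zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def fit_l1_weights(Z, y, lam, n_iter=2000):
    """L1-penalized logistic fit on the log-likelihood-ratio features Z,
    solved by proximal gradient descent (ISTA), NOT by LARS."""
    n, p = Z.shape
    # Step size 1/L, with L an upper bound on the Lipschitz constant
    # of the gradient of the mean logistic loss (intercept included).
    Z_aug = np.hstack([Z, np.ones((n, 1))])
    step = 4.0 * n / np.linalg.norm(Z_aug, 2) ** 2
    w, b = np.zeros(p), 0.0
    for _ in range(n_iter):
        margin = np.clip(Z @ w + b, -30.0, 30.0)  # numerical safety
        resid = 1.0 / (1.0 + np.exp(-margin)) - y
        w = soft_threshold(w - step * (Z.T @ resid) / n, step * lam)
        b -= step * resid.mean()  # the intercept is not penalized
    return w, b

# Toy data (hypothetical): one informative predictor, one noisy copy of it
# (redundant), and three irrelevant noise predictors.
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n)
x_inf = 2.0 * y + rng.normal(size=n)
x_red = x_inf + 0.3 * rng.normal(size=n)
X = np.column_stack([x_inf, x_red, rng.normal(size=(n, 3))])

Z = gaussian_llr(X, y)
w, b = fit_l1_weights(Z, y, lam=0.05)
accuracy = np.mean(((Z @ w + b) > 0) == (y == 1))
w_all_zero, _ = fit_l1_weights(Z, y, lam=1e6)  # very strong penalty

print("weights:", np.round(w, 3))
print("training accuracy:", accuracy)
```

In this sketch, increasing lam drives more weights to exactly zero. Which predictors are dropped at a given lam depends on the data, so the value 0.05 above should be read as a placeholder rather than a recommendation.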

Funding Projects

Type                 Code                  Acronym      Leader       Title
Government of Spain  TIN2010-20900-C04-04  Unspecified  Unspecified  Unspecified
Government of Spain  2010-CSD2007-00018    Unspecified  Unspecified  Unspecified

More information

Item ID: 72794
DC Identifier: https://oa.upm.es/72794/
OAI Identifier: oai:oa.upm.es:72794
DOI: 10.1142/S021821301350019X
Official URL: https://www.worldscientific.com/doi/10.1142/S02182...
Deposited by: Biblioteca Facultad de Informatica
Deposited on: 17 Mar 2023 11:34
Last Modified: 17 Mar 2023 11:34