Estimation of distribution algorithms as logistic regression regularizers of microarray classifiers

Bielza Lozoya, María Concepción ORCID: https://orcid.org/0000-0001-7109-2668, Robles Forcada, Víctor ORCID: https://orcid.org/0000-0003-3937-2269 and Larrañaga Múgica, Pedro María ORCID: https://orcid.org/0000-0002-1885-4501 (2009). Estimation of distribution algorithms as logistic regression regularizers of microarray classifiers. "Methods of Information in Medicine", v. 48 (n. 3); pp. 236-241. ISSN 0026-1270. https://doi.org/10.3414/ME9223.

Description

Title: Estimation of distribution algorithms as logistic regression regularizers of microarray classifiers
Author/s: Bielza Lozoya, María Concepción; Robles Forcada, Víctor; Larrañaga Múgica, Pedro María
Item Type: Article
Journal/Publication Title: Methods of Information in Medicine
Date: March 2009
ISSN: 0026-1270
Volume: 48
Subjects:
Freetext Keywords: Logistic regression, Regularization, Estimation of distribution algorithms, DNA micro-arrays
Faculty: Facultad de Informática (UPM)
Department: Inteligencia Artificial
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

Objectives: The “large k (genes), small N (samples)” phenomenon complicates the problem of microarray classification with logistic regression. The indeterminacy of the maximum likelihood solutions, the multicollinearity of the predictor variables and data over-fitting cause unstable parameter estimates. Moreover, computational problems arise due to the large number of predictor variables (genes). Regularized logistic regression excels as a solution. However, it brings difficulties of its own: an objective function that is hard to optimize mathematically and regularization parameters that require careful tuning.
Methods: Those difficulties are tackled by introducing a new way of regularizing logistic regression. Estimation of distribution algorithms (EDAs), a kind of evolutionary algorithm, emerge as natural regularizers. Obtaining the regularized estimates of the logistic classifier amounts to maximizing the likelihood function via our EDA, without having to penalize it. Likelihood penalties add a number of difficulties to the resulting optimization problems, which vanish in our case. New estimates are simulated during the evolutionary process of the EDA in a way that guarantees their shrinkage while maintaining the probabilistic dependence relationships learnt among them. The EDA process is embedded in an adapted recursive feature elimination procedure, thereby providing the genes that are the best markers for the classification.
Results: The consistency with the literature and excellent classification performance achieved with our algorithm are illustrated on four microarray data sets: Breast, Colon, Leukemia and Prostate. Details on the last two data sets are available as supplementary material.
Conclusions: We have introduced a novel EDA-based logistic regression regularizer. It implicitly shrinks the coefficients during the EDA evolution process while optimizing the usual likelihood function. The approach is combined with a gene subset selection procedure and automatically tunes the required parameters. Empirical results on microarray data sets yield sparse models that contain confirmed genes and perform better in classification than competing regularized methods.
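
Illustrative sketch (not taken from the paper): the abstract's core idea, maximizing the usual unpenalized logistic likelihood with an EDA, can be illustrated roughly as follows. This is a minimal toy, assuming a univariate Gaussian EDA on synthetic data. Unlike the authors' method, it does not model dependencies between coefficients, does not implement their shrinkage-guaranteeing simulation step, and omits the recursive feature elimination wrapper. The names `log_likelihood` and `gaussian_eda` and the parameters `pop_size`, `n_best` and `n_gen` are hypothetical.

```python
import numpy as np

def log_likelihood(beta, X, y):
    """Unpenalized logistic log-likelihood; beta[0] is the intercept."""
    z = beta[0] + X @ beta[1:]
    # log(1 + exp(z)) evaluated stably via logaddexp(0, z)
    return float(np.sum(y * z - np.logaddexp(0.0, z)))

def gaussian_eda(X, y, pop_size=200, n_best=50, n_gen=40, seed=0):
    """Toy univariate Gaussian EDA (an assumption, not the paper's EDA)."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] + 1
    # Start the search distribution at zero so early candidates stay small.
    mean, std = np.zeros(dim), np.ones(dim)
    best_beta, best_ll = mean.copy(), log_likelihood(mean, X, y)
    for _ in range(n_gen):
        # Simulate a population of candidate coefficient vectors.
        pop = rng.normal(mean, std, size=(pop_size, dim))
        fitness = np.array([log_likelihood(b, X, y) for b in pop])
        order = np.argsort(fitness)[::-1]
        if fitness[order[0]] > best_ll:
            best_ll = float(fitness[order[0]])
            best_beta = pop[order[0]].copy()
        # Re-estimate the distribution from the best individuals.
        elite = pop[order[:n_best]]
        mean = elite.mean(axis=0)
        std = elite.std(axis=0) + 1e-6  # small floor keeps sampling valid
    return best_beta, best_ll

# Synthetic stand-in for expression data: 40 samples, 5 "genes".
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(40, 5))
true_beta = np.array([0.0, 2.0, -2.0, 0.0, 0.0, 0.0])
prob = 1.0 / (1.0 + np.exp(-(true_beta[0] + X @ true_beta[1:])))
y = (data_rng.random(40) < prob).astype(float)

beta_hat, ll_hat = gaussian_eda(X, y)
print("estimated coefficients:", np.round(beta_hat, 2))
print("log-likelihood:", round(ll_hat, 3))
```

Because the search is stopped after a fixed number of generations and starts from a distribution centred at zero, the estimates tend to stay finite even when the maximum likelihood solution is indeterminate. How the paper actually guarantees shrinkage is not specified in the abstract, so this sketch makes no attempt to reproduce it.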

Funding Projects

Type                 Code                Acronym      Leader       Title
Government of Spain  TIN2007-62626       Unspecified  Unspecified  Unspecified
Government of Spain  TIN2007-67148       Unspecified  Unspecified  Unspecified
Government of Spain  TIN2008-06815-C02   Unspecified  Unspecified  Unspecified
Government of Spain  2010-CSD2007-00018  Unspecified  Unspecified  Unspecified

More information

Item ID: 73026
DC Identifier: https://oa.upm.es/73026/
OAI Identifier: oai:oa.upm.es:73026
DOI: 10.3414/ME9223
Official URL: https://www.thieme-connect.com/products/ejournals/...
Deposited by: Biblioteca Facultad de Informatica
Deposited on: 22 Mar 2023 14:11
Last Modified: 22 Mar 2023 14:11