Peakbin selection in Mass Spectrometry data using a consensus approach with estimation of distribution algorithms

Armañanzas Arnedillo, Ruben and Saeys, Yvan and Inza Cano, Iñaki and García Torres, Miguel and Bielza Lozoya, María Concepción and Peer, Yves van de and Larrañaga Múgica, Pedro María (2011). Peakbin selection in Mass Spectrometry data using a consensus approach with estimation of distribution algorithms. "IEEE/ACM Transactions on Computational Biology and Bioinformatics", v. 8 (n. 3); pp. 760-774. ISSN 1557-9964. https://doi.org/10.1109/TCBB.2010.18

Description

Title: Peakbin selection in Mass Spectrometry data using a consensus approach with estimation of distribution algorithms
Author/s:
  • Armañanzas Arnedillo, Ruben
  • Saeys, Yvan
  • Inza Cano, Iñaki
  • García Torres, Miguel
  • Bielza Lozoya, María Concepción
  • Peer, Yves van de
  • Larrañaga Múgica, Pedro María
Item Type: Article
Journal/Publication Title: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Date: May 2011
ISSN: 1557-9964
Volume: 8
Subjects:
Freetext Keywords: Mass spectrometry, EDA, Feature selection, Biomarker discovery
Faculty: Facultad de Informática (UPM)
Department: Inteligencia Artificial
Creative Commons Licenses: Attribution - Non commercial - No derivative works

Abstract

Progress is continuously being made in the quest for stable biomarkers linked to complex diseases. Mass spectrometers are one of the devices for tackling this problem. The data profiles they produce are noisy and unstable. In these profiles, biomarkers are detected as signal regions (peaks), where control and disease samples behave differently. Mass spectrometry (MS) data generally contain a limited number of samples described by a high number of features. In this work, we present a novel class of evolutionary algorithms, estimation of distribution algorithms (EDA), as an efficient peak selector in this MS domain. There is a trade-off between the reliability of the detected biomarkers and the low number of samples for analysis. For this reason, we introduce a consensus approach, built upon the classical EDA scheme, that improves stability and robustness of the final set of relevant peaks. An entire data workflow is designed to yield unbiased results. Four publicly available MS data sets (two MALDI-TOF and another two SELDI-TOF) are analyzed. The results are compared to the original works, and a new plot (peak frequential plot) for graphically inspecting the relevant peaks is introduced. A complete online supplementary page, which can be found at http://www.sc.ehu.es/ccwbayes/members/ruben/ms, includes extended info and results, in addition to Matlab scripts and references.
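
Illustrative sketch (not taken from the article): the abstract's idea of running an EDA as a feature selector several times and keeping only the peaks that are chosen consistently can be outlined as follows. Everything here is an assumption made for illustration: the univariate EDA variant (UMDA), the bootstrap resampling between runs, the toy mean-difference fitness, the 0.8 consensus threshold, and all function and parameter names. The authors' actual workflow and Matlab scripts are described in the paper and its supplementary page.

```python
import random


def umda_select(fitness, n_features, rng, pop_size=40, n_gen=25, keep=0.5):
    """Univariate EDA (UMDA) over binary feature masks; returns the best mask seen."""
    probs = [0.5] * n_features              # initial marginal P(feature selected)
    n_keep = max(2, int(pop_size * keep))
    best_mask, best_fit = None, float("-inf")
    for _ in range(n_gen):
        # Sample a population of masks from the current marginals.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        ranked = sorted(pop, key=fitness, reverse=True)
        top_fit = fitness(ranked[0])
        if top_fit > best_fit:
            best_mask, best_fit = ranked[0], top_fit
        elite = ranked[:n_keep]
        # Re-estimate each marginal from the elite, clamped to avoid fixation.
        probs = [min(0.95, max(0.05, sum(m[j] for m in elite) / n_keep))
                 for j in range(n_features)]
    return best_mask


def consensus_select(make_fitness, n_features, n_runs=20, threshold=0.8, seed=0):
    """Run the EDA n_runs times; keep features selected in >= threshold of runs."""
    rng = random.Random(seed)
    counts = [0] * n_features
    for _ in range(n_runs):
        mask = umda_select(make_fitness(rng), n_features, rng)
        for j, bit in enumerate(mask):
            counts[j] += bit
    freqs = [c / n_runs for c in counts]    # per-feature selection frequency
    selected = [j for j, f in enumerate(freqs) if f >= threshold]
    return selected, freqs


def make_toy_fitness_factory(n_features=12, informative=(0, 1, 2),
                             n_per_class=30, shift=3.0, penalty=1.0, seed=1):
    """Hypothetical toy data: a few shifted 'peaks' among pure-noise features."""
    data_rng = random.Random(seed)
    control = [[data_rng.gauss(0, 1) for _ in range(n_features)]
               for _ in range(n_per_class)]
    disease = [[data_rng.gauss(shift if j in informative else 0, 1)
                for j in range(n_features)]
               for _ in range(n_per_class)]

    def make_fitness(rng):
        # Each run sees a bootstrap resample of both classes.
        c = [rng.choice(control) for _ in range(n_per_class)]
        d = [rng.choice(disease) for _ in range(n_per_class)]
        # Gain of a feature: |difference of class means| minus a size penalty.
        gains = [abs(sum(r[j] for r in d) - sum(r[j] for r in c)) / n_per_class
                 - penalty for j in range(n_features)]
        return lambda mask: sum(g for g, bit in zip(gains, mask) if bit)

    return make_fitness


if __name__ == "__main__":
    selected, freqs = consensus_select(make_toy_fitness_factory(), n_features=12)
    print("consensus features:", selected)
    print("selection frequencies:", [round(f, 2) for f in freqs])
```

The per-feature frequencies returned by consensus_select are the kind of quantity a frequency plot over peaks would display; the toy fitness stands in for whatever classifier-based or filter-based score a real study would use.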

Funding Projects

Type                 Code                  Acronym      Leader       Title
Government of Spain  TIN2010-20900-C04-04  Unspecified  Unspecified  Unspecified
Government of Spain  TIN2010-14931         Unspecified  Unspecified  Unspecified
Government of Spain  TIN2008-68084-C02-00  Unspecified  Unspecified  Unspecified
Government of Spain  2010-CSD2007-00018    Unspecified  Unspecified  Unspecified

More information

Item ID: 72874
DC Identifier: https://oa.upm.es/72874/
OAI Identifier: oai:oa.upm.es:72874
DOI: 10.1109/TCBB.2010.18
Official URL: https://ieeexplore.ieee.org/document/5438984
Deposited by: Biblioteca Facultad de Informatica
Deposited on: 17 Mar 2023 11:54
Last Modified: 17 Mar 2023 11:54