Tractable learning of Bayesian networks from partially observed data

Benjumeda Barquita, Marco Alberto and Luengo Sánchez, Sergio and Larrañaga Múgica, Pedro María and Bielza Lozoya, María Concepción (2019). Tractable learning of Bayesian networks from partially observed data. "Pattern Recognition", v. 91; pp. 190-199. ISSN 0031-3203. https://doi.org/10.1016/j.patcog.2019.02.025.

Description

Title: Tractable learning of Bayesian networks from partially observed data
Author/s:
  • Benjumeda Barquita, Marco Alberto
  • Luengo Sánchez, Sergio
  • Larrañaga Múgica, Pedro María
  • Bielza Lozoya, María Concepción
Item Type: Article
Journal/Publication Title: Pattern Recognition
Date: July 2019
ISSN: 0031-3203
Volume: 91
Subjects:
Freetext Keywords: Structural expectation-maximization; Bayesian network; Incomplete data; Inference complexity; Structure learning
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Inteligencia Artificial
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

The majority of real-world problems require addressing incomplete data. The use of the structural expectation-maximization algorithm is the most common approach toward learning Bayesian networks from incomplete datasets. However, its main limitation is its demanding computational cost, caused mainly by the need to make an inference at each iteration of the algorithm. In this paper, we propose a new method with the purpose of guaranteeing the efficiency of the learning process while improving the performance of the structural expectation-maximization algorithm. We address the first objective by applying an upper bound to the treewidth of the models to limit the complexity of the inference. To achieve this, we use an efficient heuristic to search the space of the elimination orders. For the second objective, we study the advantages of directly computing the score with respect to the observed data rather than an expectation of the score, and provide a strategy to efficiently perform these computations in the proposed method. We perform exhaustive experiments on synthetic and real-world datasets of varied dimensionalities, including datasets with thousands of variables and hundreds of thousands of instances. The experimental results support our claims empirically.
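
The treewidth bound mentioned in the abstract can be illustrated with a small example. The abstract does not say which heuristic the authors use to search the space of elimination orders, so the sketch below assumes a greedy min-fill ordering, a common choice that may differ from theirs. The function names (`moralize`, `fill_in`, `greedy_min_fill_width`, `within_bound`) and the `{node: [parents]}` input format are hypothetical, chosen for illustration only. The width induced by any elimination order is an upper bound on the treewidth, so a structure whose induced width is at most k is guaranteed to have treewidth at most k.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected moral graph of a DAG given as {node: [parents]}."""
    adj = {v: set() for v in parents}
    for child, pas in parents.items():
        for p in pas:
            adj.setdefault(p, set())
            adj[child].add(p)
            adj[p].add(child)
        # "Marry" every pair of parents that share a child.
        for a, b in combinations(pas, 2):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def fill_in(adj, v):
    """Number of edges that eliminating v would add among its neighbours."""
    return sum(1 for a, b in combinations(adj[v], 2) if b not in adj[a])


def greedy_min_fill_width(adj):
    """Eliminate nodes greedily by min fill-in (assumed heuristic).

    Returns (order, width), where width is the largest neighbourhood seen
    at elimination time, an upper bound on the treewidth of the graph.
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    order, width = [], 0
    while adj:
        # Ties are broken by degree, then by name, to keep the result deterministic.
        v = min(adj, key=lambda u: (fill_in(adj, u), len(adj[u]), str(u)))
        neighbours = adj.pop(v)
        width = max(width, len(neighbours))
        for a in neighbours:
            adj[a].discard(v)
        # Connect the remaining neighbours of v into a clique.
        for a, b in combinations(neighbours, 2):
            adj[a].add(b)
            adj[b].add(a)
        order.append(v)
    return order, width


def within_bound(parents, k):
    """True if the greedy order certifies that the treewidth is at most k."""
    _, width = greedy_min_fill_width(moralize(parents))
    return width <= k


# Example: A -> C <- B, C -> D. Moralization adds the edge A - B.
dag = {"A": [], "B": [], "C": ["A", "B"], "D": ["C"]}
order, width = greedy_min_fill_width(moralize(dag))
```

In a bounded-treewidth structure search, a check such as `within_bound(candidate, k)` could be used to reject candidate structures before any inference is run on them. Because the heuristic only gives an upper bound, it may reject some structures whose true treewidth is within the limit.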

Funding Projects

Type | Code | Acronym | Leader | Title
Government of Spain | C080020-09 | Unspecified | Unspecified | The Cajal Blue Brain
Government of Spain | TIN2016-79684-P | Unspecified | Universidad Politécnica de Madrid | Avances en clasificación multidimensional y detección de anomalías con redes bayesianas
Madrid Regional Government | S2013/ICE-2845 | CASI-CAM | Unspecified | Conceptos y aplicaciones de los sistemas inteligentes
Horizon 2020 | 785907 | HBP SGA2 | École Polytechnique Fédérale de Lausanne | Human Brain Project Specific Grant Agreement 2
Government of Spain | BES-2014-068637 | Unspecified | Universidad Politécnica de Madrid | Predoctoral contract for the formation of doctors

More information

Item ID: 63470
DC Identifier: http://oa.upm.es/63470/
OAI Identifier: oai:oa.upm.es:63470
DOI: 10.1016/j.patcog.2019.02.025
Official URL: https://www.sciencedirect.com/science/article/abs/pii/S0031320319300974?via%3Dihub
Deposited by: Memoria Investigacion
Deposited on: 27 Oct 2020 07:59
Last Modified: 27 Oct 2020 07:59