Learning Bayesian classifiers from positive and unlabeled examples

Calvo, Borja, Larrañaga Múgica, Pedro María ORCID: https://orcid.org/0000-0002-1885-4501 and Lozano Alonso, José Antonio (2007). Learning Bayesian classifiers from positive and unlabeled examples. "Pattern Recognition Letters", v. 28 (n. 16); pp. 2375-2384. ISSN 1872-7344. https://doi.org/10.1016/j.patrec.2007.08.003.

Description

Title: Learning Bayesian classifiers from positive and unlabeled examples
Author/s: Calvo, Borja; Larrañaga Múgica, Pedro María; Lozano Alonso, José Antonio
Item Type: Article
Journal/Publication Title: Pattern Recognition Letters
Date: December 2007
ISSN: 1872-7344
Volume: 28
Subjects:
Freetext Keywords: Positive unlabeled learning, Bayesian classifiers, Naive Bayes, Tree augmented naive Bayes, Bayesian approach
Faculty: Facultad de Informática (UPM)
Department: Inteligencia Artificial
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

The term positive unlabeled learning refers to the binary classification problem in the absence of negative examples. When only positive and unlabeled instances are available, semi-supervised classification algorithms cannot be applied directly, so new algorithms are required. One such algorithm is positive naive Bayes (PNB), an adaptation of the naive Bayes induction algorithm that does not require negative instances. In this work we propose two ways of enhancing this algorithm. First, we take the concept behind PNB one step further and propose a procedure for building more complex Bayesian classifiers in the absence of negative instances: a new algorithm, positive tree augmented naive Bayes (PTAN), that obtains tree augmented naive Bayes models in the positive unlabeled domain. Second, we propose a new Bayesian approach to the a priori probability of the positive class, which models the uncertainty over this parameter by means of a Beta distribution. This approach is applied to both PNB and PTAN, resulting in two new algorithms. The four algorithms are compared empirically on positive unlabeled learning problems based on real and synthetic databases. The results suggest that, when the predictor variables are not conditionally independent given the class, extending PNB to more complex networks increases classification performance. They also show that our Bayesian approach to the a priori probability of the positive class can improve the results obtained by PNB and PTAN.
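
To make the idea in the abstract concrete, the following is a minimal sketch of how a naive Bayes classifier can be built without negative instances. It is not the authors' implementation. It assumes that the unlabeled set is a sample from the whole population, so that each unlabeled feature distribution is a mixture P(x|U) = p P(x|+) + (1 - p) P(x|-), where p is the a priori probability of the positive class. The function names, the Laplace smoothing, the clipping constant `eps`, and the Monte Carlo averaging over a Beta(a, b) prior on p are all illustrative assumptions; the paper's own treatment of the Beta distribution may differ.

```python
import math
import random
from collections import Counter


def estimate_freqs(rows, n_values, alpha=1.0):
    """Laplace-smoothed P(X_i = v) for every feature i, from a list of rows."""
    freqs = []
    for i, k in enumerate(n_values):
        counts = Counter(row[i] for row in rows)
        total = len(rows) + alpha * k
        freqs.append([(counts[v] + alpha) / total for v in range(k)])
    return freqs


def negative_freqs(pos, unl, p, eps=1e-6):
    """Solve P(x|U) = p*P(x|+) + (1-p)*P(x|-) for P(x|-), feature by feature.

    Estimates that come out non-positive are clipped to eps (an assumption
    of this sketch) and each distribution is renormalized to sum to 1.
    """
    neg = []
    for f_pos, f_unl in zip(pos, unl):
        raw = [max((u - p * q) / (1.0 - p), eps) for q, u in zip(f_pos, f_unl)]
        z = sum(raw)
        neg.append([r / z for r in raw])
    return neg


def posterior_positive(x, pos, unl, p):
    """P(+ | x) under naive Bayes for a fixed positive-class prior p."""
    neg = negative_freqs(pos, unl, p)
    log_pos = math.log(p) + sum(math.log(pos[i][v]) for i, v in enumerate(x))
    log_neg = math.log(1.0 - p) + sum(math.log(neg[i][v]) for i, v in enumerate(x))
    m = max(log_pos, log_neg)  # subtract the max for numerical stability
    e_pos, e_neg = math.exp(log_pos - m), math.exp(log_neg - m)
    return e_pos / (e_pos + e_neg)


def posterior_positive_beta(x, pos, unl, a, b, n_samples=2000, seed=0):
    """Average P(+ | x) over p ~ Beta(a, b) by simple Monte Carlo sampling."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        p = min(max(rng.betavariate(a, b), 1e-6), 1.0 - 1e-6)
        total += posterior_positive(x, pos, unl, p)
    return total / n_samples


# Toy data (hypothetical): two binary features, values encoded as 0 or 1.
positive_rows = [(1, 1)] * 8 + [(1, 0)] * 2
unlabeled_rows = [(1, 1)] * 4 + [(0, 0)] * 6
pos = estimate_freqs(positive_rows, [2, 2])
unl = estimate_freqs(unlabeled_rows, [2, 2])

print(posterior_positive((1, 1), pos, unl, p=0.3))
print(posterior_positive_beta((1, 1), pos, unl, a=3, b=7))
```

With a fixed p the classifier depends on a single guess of the class prior; averaging over a Beta distribution replaces that guess with a weighted range of plausible values, which is the motivation the abstract gives for the Bayesian variant.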

Funding Projects

Type: Government of Spain
Code: TIN2005-03824
Acronym: Unspecified
Leader: Unspecified
Title: Unspecified

More information

Item ID: 73118
DC Identifier: https://oa.upm.es/73118/
OAI Identifier: oai:oa.upm.es:73118
DOI: 10.1016/j.patrec.2007.08.003
Official URL: https://www.sciencedirect.com/science/article/pii/...
Deposited by: Biblioteca Facultad de Informatica
Deposited on: 28 Mar 2023 13:17
Last Modified: 28 Mar 2023 13:17