Predicting incorrect mappings: a data-driven approach applied to DBpedia

Rico-Almodóvar, Mariano and Mihindukulasooriya, Nandana and Kontokostas, Dimitris and Paulheim, Heiko and Hellmann, Sebastian and Gómez-Pérez, A. (2018). Predicting incorrect mappings: a data-driven approach applied to DBpedia. In: "SAC '18: Symposium on Applied Computing", 9-13 Apr 2018, Pau, France. ISBN 978-1-4503-5191-1. pp. 323-330. https://doi.org/10.1145/3167132.3167164.

Description

Title: Predicting incorrect mappings: a data-driven approach applied to DBpedia
Author/s:
  • Rico-Almodóvar, Mariano
  • Mihindukulasooriya, Nandana
  • Kontokostas, Dimitris
  • Paulheim, Heiko
  • Hellmann, Sebastian
  • Gómez-Pérez, A.
Item Type: Presentation at Congress or Conference (Article)
Event Title: SAC '18: Symposium on Applied Computing
Event Dates: 9-13 Apr 2018
Event Location: Pau, France
Title of Book: SAC '18: Proceedings of the 33rd Annual ACM Symposium on Applied Computing
Date: April 2018
ISBN: 978-1-4503-5191-1
Volume: 1
Subjects:
Freetext Keywords: Linked Data, Data Quality, Mappings, DBpedia, Machine Learning
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Inteligencia Artificial
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Full text

PDF (562kB) - Users in campus UPM only

Abstract

DBpedia releases consist of more than 70 multilingual datasets that cover data extracted from different language-specific Wikipedia instances. The data extracted from those Wikipedia instances are transformed into RDF using mappings created by the DBpedia community. Nevertheless, not all the mappings are correct and consistent across the distinct language-specific DBpedia datasets. Because the incorrect mappings are scattered among a large number of mappings, it is not feasible to inspect all of them manually to ensure their correctness. Thus, the goal of this work is to propose a data-driven method that detects incorrect mappings automatically by analyzing information from both instance data and ontological axioms. We propose a machine-learning-based approach for building a predictive model that can detect incorrect mappings. We have evaluated different supervised classification algorithms for this task, and our best model achieves 93% accuracy. These results help us to detect incorrect mappings and achieve a high-quality DBpedia.
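As a rough illustration of the supervised setting the abstract describes, the sketch below trains a one-feature threshold classifier (a decision stump) on labeled mappings and reports its accuracy. It is not the authors' model or feature set: the `range_conformance` feature, the `MappingExample` type, and the toy data are all hypothetical, chosen only to show how a signal derived from instance data and an ontological axiom could feed a classifier.

```python
from dataclasses import dataclass


@dataclass
class MappingExample:
    # Hypothetical feature: share of extracted object values whose type
    # matches the range declared for the mapped ontology property.
    range_conformance: float
    # Label assumed to come from manual annotation.
    is_incorrect: bool


def train_stump(examples):
    """Return the threshold that maximizes training accuracy.

    A mapping is predicted incorrect when its conformance is below
    the threshold.
    """
    values = sorted({e.range_conformance for e in examples})
    # Candidates: just outside the extremes, plus midpoints between
    # consecutive distinct feature values.
    candidates = [values[0] - 1e-9]
    candidates += [(a + b) / 2 for a, b in zip(values, values[1:])]
    candidates.append(values[-1] + 1e-9)

    best_threshold, best_correct = candidates[0], -1
    for t in candidates:
        correct = sum(
            (e.range_conformance < t) == e.is_incorrect for e in examples
        )
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


def predict_incorrect(threshold, range_conformance):
    """Predict whether a mapping is incorrect from its feature value."""
    return range_conformance < threshold


def accuracy(threshold, examples):
    """Fraction of examples whose prediction matches the label."""
    hits = sum(
        predict_incorrect(threshold, e.range_conformance) == e.is_incorrect
        for e in examples
    )
    return hits / len(examples)


# Toy, made-up training data (not taken from DBpedia).
train = [
    MappingExample(0.95, False),
    MappingExample(0.90, False),
    MappingExample(0.85, False),
    MappingExample(0.40, True),
    MappingExample(0.30, True),
    MappingExample(0.10, True),
]

threshold = train_stump(train)
print(f"threshold={threshold:.3f}, "
      f"train accuracy={accuracy(threshold, train):.2f}")
```

In practice one would use several features, compare multiple classifiers, and measure accuracy on held-out data rather than on the training set as this toy example does.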

Funding Projects

Type                 Code                   Acronym      Leader       Title
Government of Spain  RTC-2016-4952-7        Unspecified  Unspecified  Unspecified
Government of Spain  TIN2013-46238-C4-2-R   Unspecified  Unspecified  Unspecified
Government of Spain  TIN2016-78011-C4-4-R   Unspecified  Unspecified  Unspecified
Government of Spain  BES-2014-068449        Unspecified  Unspecified  Unspecified

More information

Item ID: 72464
DC Identifier: https://oa.upm.es/72464/
OAI Identifier: oai:oa.upm.es:72464
DOI: 10.1145/3167132.3167164
Official URL: https://dl.acm.org/doi/10.1145/3167132.3167164
Deposited by: Biblioteca Facultad de Informatica
Deposited on: 25 Jan 2023 13:58
Last Modified: 25 Jan 2023 13:58