Belkoura, Seddik, Zanin, Massimiliano and Latorre De La Fuente, Antonio (ORCID: https://orcid.org/0000-0002-8718-5735) (2019). Fostering interpretability of data mining models through data perturbation. "Expert Systems with Applications", v. 137; pp. 191-201. ISSN 0957-4174. https://doi.org/10.1016/j.eswa.2019.07.001.
Title: | Fostering interpretability of data mining models through data perturbation |
---|---|
Author/s: | Belkoura, Seddik; Zanin, Massimiliano; Latorre De La Fuente, Antonio |
Item Type: | Article |
Journal/Publication Title: | Expert Systems with Applications |
Date: | 2019 |
ISSN: | 0957-4174 |
Volume: | 137 |
Subjects: | |
Freetext Keywords: | Interpretability, Data mining, Random forest, Artificial neural networks |
Faculty: | E.T.S. de Ingenieros Informáticos (UPM) |
Department: | Arquitectura y Tecnología de Sistemas Informáticos |
Creative Commons Licenses: | Attribution - NonCommercial - NoDerivatives |
With the widespread adoption of data mining models to solve real-world problems, the scientific community faces the need to increase their interpretability and comprehensibility. This is especially relevant for black box models, in which inputs and outputs are usually connected by highly complex and nonlinear functions; in applications requiring interaction between the user and the model; and when the machine’s solution disagrees with human experience. In this contribution we present a new methodology that simplifies the process of understanding the rules behind a classification model, even a black box one. It is based on perturbing the features describing one instance and finding the minimal variation required to change the forecasted class. It thus yields simplified rules describing the circumstances under which the solution would have been different, and allows these to be compared with human expectation. We show that this methodology is well defined, model-agnostic, easy to implement and modular, and demonstrate its usefulness with several synthetic and real-world data sets.
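The core idea in the abstract, perturbing the features of one instance until the forecasted class changes, can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify the search procedure, so a brute-force grid search over single-feature changes is assumed here. The names `minimal_flip` and `black_box`, the step size, and the toy decision rule are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Optional, Sequence, Tuple

# Any classifier exposed as "features in, class label out" (model-agnostic).
Predictor = Callable[[Sequence[float]], int]


def minimal_flip(
    predict: Predictor,
    instance: Sequence[float],
    max_delta: float = 5.0,
    step: float = 0.1,
) -> Optional[Tuple[int, float]]:
    """Find the smallest single-feature change that alters the predicted class.

    Assumption: only one feature is perturbed at a time, on a fixed grid.
    Returns (feature_index, signed_delta), or None if no flip is found.
    """
    base_class = predict(instance)
    best: Optional[Tuple[int, float]] = None
    n_steps = int(round(max_delta / step))

    for i in range(len(instance)):
        for k in range(1, n_steps + 1):
            delta = k * step
            # A change at least as large as the current best cannot improve it.
            if best is not None and delta >= abs(best[1]):
                break
            flipped = False
            for signed in (delta, -delta):
                candidate = list(instance)
                candidate[i] += signed
                if predict(candidate) != base_class:
                    best = (i, signed)
                    flipped = True
                    break
            if flipped:
                break
    return best


def black_box(x: Sequence[float]) -> int:
    """Toy stand-in for an opaque classifier (hypothetical decision rule)."""
    return int(x[0] + 2.0 * x[1] > 4.05)


instance = [1.0, 1.0]
result = minimal_flip(black_box, instance)
if result is not None:
    feature, delta = result
    print(f"Class would change if feature {feature} moved by {delta:+.1f}")
```

The returned pair reads as a simplified rule of the form "the class would have been different had feature i changed by this amount", which is the kind of statement that can be compared with an expert's expectation. Because the search only calls `predict`, it applies unchanged to a random forest, a neural network, or any other classifier wrapped in the same interface.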
Item ID: | 67064 |
---|---|
DC Identifier: | https://oa.upm.es/67064/ |
OAI Identifier: | oai:oa.upm.es:67064 |
DOI: | 10.1016/j.eswa.2019.07.001 |
Official URL: | https://www.sciencedirect.com/journal/expert-syste... |
Deposited by: | Memoria Investigacion |
Deposited on: | 17 May 2021 09:36 |
Last Modified: | 17 May 2021 09:36 |