Full text
PDF (Portable Document Format)
- A PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader is required
Download (188kB)
ORCID: https://orcid.org/0000-0003-2003-5385 and González Cristóbal, José Carlos
ORCID: https://orcid.org/0000-0002-1461-2695
(2011).
Hybrid Approach Combining Machine Learning and a Rule-Based Expert System for Text Categorization.
In: "Twenty-Fourth International Florida Artificial Intelligence Research Society Conference", 18/05/2011 - 20/05/2011, Palm Beach, Florida, USA. pp. 323-328.
| Title: | Hybrid Approach Combining Machine Learning and a Rule-Based Expert System for Text Categorization |
|---|---|
| Author(s): | |
| Document Type: | Conference or Workshop Paper (Article) |
| Event Title: | Twenty-Fourth International Florida Artificial Intelligence Research Society Conference |
| Event Dates: | 18/05/2011 - 20/05/2011 |
| Event Location: | Palm Beach, Florida, USA |
| Book Title: | Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference |
| Date: | 2011 |
| Subjects: | |
| SDGs: | |
| School: | E.U.I.T. Telecomunicación (UPM) [former name] |
| Department: | Ingeniería y Arquitecturas Telemáticas [until 2014] |
| Creative Commons Licenses: | Attribution - NoDerivatives - NonCommercial |
This paper discusses a novel hybrid approach to text categorization. It combines a machine learning algorithm, which provides a base model trained on a labeled corpus, with a rule-based expert system, which improves the results of that classifier by filtering false positives and dealing with false negatives. The main advantage is that the system can be easily fine-tuned by adding specific rules for noisy or conflicting categories that have not been trained successfully. We also describe an implementation based on k-Nearest Neighbor and a simple rule language for expressing lists of positive, negative and relevant (multiword) terms appearing in the input text. The system is evaluated in several scenarios, including the popular Reuters-21578 news corpus for comparison with other approaches, and categorization using IPTC metadata, the EUROVOC thesaurus and others. Results show that this approach achieves a precision comparable to top-ranked methods, with the added value that it does not require a demanding human expert workload to train.
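The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Rule`, `knn_categories` and `apply_rules`, the cosine-similarity voting, the `threshold` parameter and the toy corpus are all hypothetical. The rule semantics are also an assumption made for illustration: a matching negative term removes a category (filtering a false positive), and a matching positive term adds it (recovering a false negative). "Relevant" terms are left out because the abstract does not define how they are used.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[term] for term, count in a.items())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def knn_categories(text, corpus, k=3, threshold=0.5):
    """Base model: the k most similar labeled documents vote for their
    categories, weighted by similarity (an assumed voting scheme)."""
    query = Counter(tokenize(text))
    scored = sorted(
        ((cosine(query, Counter(tokenize(doc))), labels) for doc, labels in corpus),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    votes = Counter()
    for similarity, labels in scored:
        for label in labels:
            votes[label] += similarity
    total = sum(similarity for similarity, _ in scored)
    return {c for c, v in votes.items() if total and v / total >= threshold}


@dataclass
class Rule:
    """Hypothetical rule: term lists attached to one category."""
    category: str
    positive: set = field(default_factory=set)
    negative: set = field(default_factory=set)


def contains_term(text, term):
    """Match a (possibly multiword) term on whole-word boundaries."""
    normalized = " " + " ".join(tokenize(text)) + " "
    return (" " + " ".join(tokenize(term)) + " ") in normalized


def apply_rules(text, categories, rules):
    """Post-process the base model's output with the expert rules."""
    result = set(categories)
    for rule in rules:
        if any(contains_term(text, t) for t in rule.negative):
            result.discard(rule.category)  # assumed: negative term filters a false positive
        elif any(contains_term(text, t) for t in rule.positive):
            result.add(rule.category)      # assumed: positive term recovers a false negative
    return result


# Toy data, invented for this sketch.
corpus = [
    ("wheat harvest grain exports rise", {"grain"}),
    ("grain prices wheat corn", {"grain"}),
    ("central bank raises interest rates", {"money"}),
]
rules = [
    Rule("grain", negative={"grain of salt"}),
    Rule("money", positive={"interest rates"}),
]
text = "Take the grain forecast with a grain of salt as interest rates climb"
print(apply_rules(text, knn_categories(text, corpus), rules))
```

Keeping the rules as a separate pass over the classifier's output reflects the fine-tuning advantage claimed in the abstract: a problematic category can be adjusted by editing its term lists, without retraining the base model.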
| Record ID: | 13310 |
|---|---|
| DC Identifier: | https://oa.upm.es/13310/ |
| OAI Identifier: | oai:oa.upm.es:13310 |
| Official URL: | http://aaai.org/ocs/index.php/FLAIRS/FLAIRS11 |
| Deposited by: | Memoria Investigacion |
| Deposited on: | 28 Nov 2012 10:01 |
| Last Modified: | 21 Apr 2016 12:37 |