Hybrid Approach Combining Machine Learning and a Rule-Based Expert System for Text Categorization

Villena Román, Julio, Collada Pérez, Sonia, Lana Serrano, Sara and González Cristóbal, José Carlos (2011). Hybrid Approach Combining Machine Learning and a Rule-Based Expert System for Text Categorization. In: "Twenty-Fourth International Florida Artificial Intelligence Research Society Conference", 18/05/2011 - 20/05/2011, Palm Beach, Florida, USA. pp. 323-328.

Description

Title:	Hybrid Approach Combining Machine Learning and a Rule-Based Expert System for Text Categorization
Author(s):	Villena Román, Julio; Collada Pérez, Sonia; Lana Serrano, Sara (https://orcid.org/0000-0003-2003-5385); González Cristóbal, José Carlos (https://orcid.org/0000-0002-1461-2695)
Document Type:	Conference or Workshop Presentation (Article)
Event Title:	Twenty-Fourth International Florida Artificial Intelligence Research Society Conference
Event Dates:	18/05/2011 - 20/05/2011
Event Location:	Palm Beach, Florida, USA
Book Title:	Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference
Date:	2011
Subjects:	Computer Science
SDG:	09. Industry, innovation and infrastructure
School:	E.U.I.T. Telecomunicación (UPM) [former name]
Department:	Ingeniería y Arquitecturas Telemáticas [until 2014]
Creative Commons Licenses:	Attribution - No Derivative Works - Non-Commercial


Abstract

This paper discusses a novel hybrid approach for text categorization that combines a machine learning algorithm, which provides a base model trained with a labeled corpus, with a rule-based expert system, which is used to improve the results provided by the previous classifier, by filtering false positives and dealing with false negatives. The main advantage is that the system can be easily fine-tuned by adding specific rules for those noisy or conflicting categories that have not been successfully trained. We also describe an implementation based on k-Nearest Neighbor and a simple rule language to express lists of positive, negative and relevant (multiword) terms appearing in the input text. The system is evaluated in several scenarios, including the popular Reuters-21578 news corpus for comparison to other approaches, and categorization using IPTC metadata, EUROVOC thesaurus and others. Results show that this approach achieves a precision that is comparable to top ranked methods, with the added value that it does not require a demanding human expert workload to train.
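
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the rule language, so the semantics below are assumptions made for illustration. Here a negative term rejects a category (filtering a false positive), a positive term adds a category even if the base model missed it (recovering a false negative), and a non-empty relevant list requires at least one match before a category proposed by the base model is kept. The toy corpus, the `Rule` class and all function names are hypothetical, and the base model is a plain bag-of-words k-Nearest Neighbor with cosine similarity.

```python
from collections import Counter
from dataclasses import dataclass, field
from math import sqrt


def bag_of_words(text):
    """Term-frequency vector over lowercased whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def knn_categories(text, corpus, k=3):
    """Base model: labels of the k labeled documents most similar to the text."""
    query = bag_of_words(text)
    ranked = sorted(corpus, key=lambda doc: cosine(query, bag_of_words(doc[0])), reverse=True)
    return {label for _, label in ranked[:k]}


@dataclass
class Rule:
    # Assumed semantics (hypothetical, not taken from the paper):
    positive: list = field(default_factory=list)  # any match adds the category
    negative: list = field(default_factory=list)  # any match rejects the category
    relevant: list = field(default_factory=list)  # if non-empty, one match is needed to keep it


def contains(text, terms):
    """True if any (possibly multiword) term occurs as a whole-token sequence."""
    padded = f" {' '.join(text.lower().split())} "
    return any(f" {term.lower()} " in padded for term in terms)


def categorize(text, corpus, rules, k=3):
    """Apply the rule layer on top of the categories proposed by the base model."""
    proposed = knn_categories(text, corpus, k)
    final = set()
    for category in proposed | set(rules):
        rule = rules.get(category, Rule())
        if contains(text, rule.negative):
            continue  # filter a false positive
        if contains(text, rule.positive):
            final.add(category)  # recover a false negative (or confirm a proposal)
        elif category in proposed and (not rule.relevant or contains(text, rule.relevant)):
            final.add(category)
    return final


# Toy labeled corpus and rules, invented for this example.
corpus = [
    ("wheat harvest grain exports rise", "grain"),
    ("grain shipments wheat corn", "grain"),
    ("crude oil prices barrel opec", "crude"),
    ("oil output opec crude production", "crude"),
]
rules = {
    "crude": Rule(positive=["crude oil"], negative=["olive oil"]),
    "grain": Rule(relevant=["wheat", "corn"]),
}

print(categorize("olive oil prices rise", corpus, rules, k=1))
print(categorize("wheat grain corn crude oil", corpus, rules, k=1))
```

In the first call the base model proposes "crude" but the negative term "olive oil" rejects it; in the second the base model proposes only "grain" and the positive term "crude oil" adds "crude". Under these assumed semantics, tuning a noisy category means editing its term lists, with no retraining of the base model.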

More information

Record ID:	13310
DC Identifier:	https://oa.upm.es/13310/
OAI Identifier:	oai:oa.upm.es:13310
Official URL:	http://aaai.org/ocs/index.php/FLAIRS/FLAIRS11
Deposited by:	Memoria Investigacion
Deposited on:	28 Nov 2012 10:01
Last Modified:	21 Apr 2016 12:37
