DAEDALUS at PAN 2014: Guessing tweet author's gender and age

Villena Román, Julio and González Cristóbal, José Carlos (2014). DAEDALUS at PAN 2014: Guessing tweet author's gender and age. In: "5th Conference and Labs of the Evaluation Forum (CLEF 2014) Information Access Evaluation meets Multilinguality, Multimodality, and Interaction", 15/09/2014 - 18/09/2014, Sheffield, UK. pp. 1157-1163.

Description

Title: DAEDALUS at PAN 2014: Guessing tweet author's gender and age
Author(s):
  • Villena Román, Julio
  • González Cristóbal, José Carlos
Document Type: Conference or Workshop Presentation (Article)
Event Title: 5th Conference and Labs of the Evaluation Forum (CLEF 2014) Information Access Evaluation meets Multilinguality, Multimodality, and Interaction
Event Dates: 15/09/2014 - 18/09/2014
Event Location: Sheffield, UK
Book Title: 5th Conference and Labs of the Evaluation Forum (CLEF 2014) Information Access Evaluation meets Multilinguality, Multimodality, and Interaction
Date: 2014
Subjects:
Informal Keywords: PAN, CLEF, author profiling, gender, age, user demographics, machine learning classifier, Naive Bayes Multinomial, term vector model
School: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería de Sistemas Telemáticos [until 2014]
Creative Commons Licenses: Attribution - No Derivative Works - Non-Commercial

Abstract

This paper describes our participation in the PAN 2014 author profiling task. Our idea was to define, develop and evaluate a simple machine learning classifier able to guess the gender and age of a given user from his/her texts, which could become part of the company's solution portfolio. We were not interested in finding the best possible classifier with the highest accuracy, but in finding the optimum balance between performance and throughput using the simplest strategy, with the least dependence on external systems. Results show that our software, which uses Naive Bayes Multinomial with a term vector model representation of the text, ranks quite well among the participants in terms of accuracy.
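
To make the approach named in the abstract more concrete, the following is a minimal sketch of a multinomial Naive Bayes classifier over a term vector (bag-of-words count) representation. It is not the authors' implementation: the whitespace tokenizer, the add-one smoothing, the class name `MultinomialNB`, the age-group labels and the toy training texts are all assumptions made for illustration, since the abstract does not give those details.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption, not the paper's preprocessing)."""
    return text.lower().split()


class MultinomialNB:
    """Multinomial Naive Bayes over raw term counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_docs = Counter(labels)        # number of documents per class
        self.term_counts = defaultdict(Counter)  # class -> term -> count
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.term_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def log_scores(self, text):
        # Term vector of the input: term -> frequency, known vocabulary only.
        vector = Counter(t for t in tokenize(text) if t in self.vocab)
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_docs.items():
            total_terms = sum(self.term_counts[label].values())
            score = math.log(n_docs / self.total_docs)  # log prior
            for term, freq in vector.items():
                # Smoothed likelihood of the term given the class.
                p = (self.term_counts[label][term] + 1) / (total_terms + vocab_size)
                score += freq * math.log(p)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Invented toy data and placeholder labels, for illustration only
# (not taken from the PAN 2014 corpus).
train_texts = [
    "homework exam school tomorrow",
    "school party exam friends",
    "mortgage meeting office kids",
    "office pension meeting taxes",
]
train_labels = ["18-24", "18-24", "35-49", "35-49"]

model = MultinomialNB().fit(train_texts, train_labels)
prediction = model.predict("exam at school")
```

On this toy data, `prediction` is `"18-24"`, because "exam" and "school" occur only in the texts with that label. A gender classifier would follow the same pattern with a different label set. Training and prediction both reduce to counting terms, which fits the abstract's stated preference for a simple strategy with few external dependencies.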

More information

Record ID: 35363
DC Identifier: http://oa.upm.es/35363/
OAI Identifier: oai:oa.upm.es:35363
Deposited by: Memoria Investigacion
Deposited on: 27 May 2015 17:10
Last Modified: 27 May 2015 17:10