Report of MIRACLE team for the Ad-Hoc track in CLEF 2006

Goñi Menoyo, José Miguel and González Cristóbal, José Carlos and Villena Román, Julio (2006). Report of MIRACLE team for the Ad-Hoc track in CLEF 2006. In: "7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006", 20/09/2006-22/09/2006, Alicante, Spain. ISBN 2-912335-23-x.

Description

Title: Report of MIRACLE team for the Ad-Hoc track in CLEF 2006
Author/s:
  • Goñi Menoyo, José Miguel
  • González Cristóbal, José Carlos
  • Villena Román, Julio
Item Type: Presentation at Congress or Conference (Article)
Event Title: 7th Workshop of the Cross-Language Evaluation Forum, CLEF 2006
Event Dates: 20/09/2006-22/09/2006
Event Location: Alicante, Spain
Title of Book: Working Notes for the CLEF 2006 Workshop
Date: 2006
ISBN: 2-912335-23-x
Freetext Keywords: Linguistic Engineering, Information Retrieval, Trie Indexing
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Matemática Aplicada a las Tecnologías de la Información [until 2014]
UPM's Research Group: Grupo de Sistemas Inteligentes
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

This paper presents the MIRACLE team's approach to the 2006 Ad-Hoc Information Retrieval track. The experiments for this campaign continue to test our IR approach. First, a baseline set of runs is obtained using standard components: stemming, transformation, filtering, entity detection and extraction, and others. Then, an extended set of runs is obtained using several types of combinations of these baseline runs. Few improvements were introduced for this campaign: we integrated a prototype entity recognition and indexing tool into our tokenizing scheme, and we ran more combination experiments for the robust multilingual case than in previous campaigns. However, no significant improvements were achieved. For this campaign, runs were submitted for the following languages and tracks:
  • Monolingual: Bulgarian, French, Hungarian, and Portuguese.
  • Bilingual: English to Bulgarian, French, Hungarian, and Portuguese; Spanish to French and Portuguese; and French to Portuguese.
  • Robust monolingual: German, English, Spanish, French, Italian, and Dutch.
  • Robust bilingual: English to German, Italian to Spanish, and French to Dutch.
  • Robust multilingual: English to the robust monolingual languages.
We still need to work harder to improve some aspects of our processing scheme, the most important being, to our knowledge, entity recognition and normalization.

More information

Item ID: 4688
DC Identifier: https://oa.upm.es/4688/
OAI Identifier: oai:oa.upm.es:4688
Official URL: http://ims-sites.dei.unipd.it/documents/71612/8636...
Deposited by: Memoria Investigacion
Deposited on: 22 Oct 2010 10:59
Last Modified: 20 Apr 2016 13:48