ARIES: A Lexical Platform for Engineering Spanish Processing Tools

Goñi Menoyo, José Miguel and González Cristóbal, José Carlos and Moreno Sandoval, Antonio (1997). ARIES: A Lexical Platform for Engineering Spanish Processing Tools. "Natural Language Engineering", v. 3 (n. 4); pp. 317-345. ISSN 1351-3249. https://doi.org/10.1017/S1351324997001812.

Description

Title: ARIES: A Lexical Platform for Engineering Spanish Processing Tools
Author/s:
  • Goñi Menoyo, José Miguel
  • González Cristóbal, José Carlos
  • Moreno Sandoval, Antonio
Item Type: Article
Journal/Publication Title: Natural Language Engineering
Date: December 1997
Volume: 3
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Matemática Aplicada a las Tecnologías de la Información [until 2014]
UPM's Research Group: Grupo de Sistemas Inteligentes
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

We present a lexical platform that has been developed for the Spanish language. It achieves portability between different computer systems and efficiency, in terms of speed and lexical coverage. A model for the full treatment of Spanish inflectional morphology for verbs, nouns and adjectives is presented. This model permits word formation based solely on morpheme concatenation, driven by a feature-based unification grammar. The run-time lexicon is a collection of allomorphs for both stems and endings. Although not tested, it should be suitable also for other Romance and highly inflected languages. A formalism is also described for encoding a lemma-based lexical source, well suited for expressing linguistic generalizations: inheritance classes, lemma encoding, morpho-graphemic allomorphy rules and limited type-checking. From this source base, we can automatically generate an allomorph indexed dictionary adequate for efficient retrieval and processing. A set of software tools has been implemented around this formalism: lexical base augmenting aids, lexical compilers to build run-time dictionaries and access libraries for them, feature manipulation libraries, unification and pseudo-unification modules, morphological processors, a parsing system, etc. Software interfaces among the different modules and tools are cleanly defined to ease software integration and tool combination in a flexible way. Directions for accessing our e-mail and web demonstration prototypes are also provided. Some figures are given, showing the lexical coverage of our platform compared to some popular spelling checkers.
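
The core idea in the abstract, word formation by concatenating a stem allomorph and an ending allomorph under feature unification, can be illustrated with a minimal Python sketch. This is not the ARIES implementation or its formalism. The feature names (`lemma`, `cat`, `conj`, `stress`, `person`, `number`), the toy lexicon entries, and the helper functions `unify` and `analyze` are all assumptions made for illustration, and the feature structures are flat dictionaries rather than the richer structures a real unification grammar would use.

```python
from typing import Optional

# A flat feature structure: attribute name -> atomic value (illustrative only).
Features = dict[str, object]


def unify(a: Features, b: Features) -> Optional[Features]:
    """Unify two flat feature structures; return None if any value clashes."""
    result = dict(a)
    for key, value in b.items():
        if key in result and result[key] != value:
            return None  # conflicting values: unification fails
        result[key] = value
    return result


# Hypothetical run-time lexicon: each allomorph maps to its feature structures.
# The verb "contar" has two stem allomorphs, chosen by where the stress falls.
STEMS: dict[str, list[Features]] = {
    "cont": [{"lemma": "contar", "cat": "verb", "conj": 1, "stress": "ending"}],
    "cuent": [{"lemma": "contar", "cat": "verb", "conj": 1, "stress": "stem"}],
}
ENDINGS: dict[str, list[Features]] = {
    "o": [{"cat": "verb", "conj": 1, "stress": "stem", "person": 1, "number": "sg"}],
    "as": [{"cat": "verb", "conj": 1, "stress": "stem", "person": 2, "number": "sg"}],
    "amos": [{"cat": "verb", "conj": 1, "stress": "ending", "person": 1, "number": "pl"}],
}


def analyze(word: str) -> list[Features]:
    """Try every stem + ending split and keep the splits whose features unify."""
    analyses: list[Features] = []
    for cut in range(1, len(word)):
        stem, ending = word[:cut], word[cut:]
        for stem_features in STEMS.get(stem, []):
            for ending_features in ENDINGS.get(ending, []):
                merged = unify(stem_features, ending_features)
                if merged is not None:
                    analyses.append(merged)
    return analyses


if __name__ == "__main__":
    for form in ("cuento", "contamos", "conto", "cuentamos"):
        print(form, analyze(form))
```

In this sketch the forms "cuento" and "contamos" each receive one analysis, while "conto" and "cuentamos" are rejected because the `stress` values of stem and ending clash. No spelling-change rules are applied at analysis time: the allomorphy is stored in the lexicon and the feature constraints alone decide which concatenations are valid.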

More information

Item ID: 4739
DC Identifier: http://oa.upm.es/4739/
OAI Identifier: oai:oa.upm.es:4739
DOI: 10.1017/S1351324997001812
Official URL: http://journals.cambridge.org/abstract_S1351324997001812
Deposited by: Memoria Investigacion
Deposited on: 27 Oct 2010 08:00
Last Modified: 20 Apr 2016 13:50