ARIES: A Lexical Platform for Engineering Spanish Processing Tools

Goñi Menoyo, José Miguel ORCID: https://orcid.org/0000-0001-8922-5529, González Cristóbal, José Carlos ORCID: https://orcid.org/0000-0002-1461-2695 and Moreno Sandoval, Antonio (1997). ARIES: A Lexical Platform for Engineering Spanish Processing Tools. "Natural Language Engineering", v. 3 (n. 4); pp. 317-345. ISSN 1351-3249. https://doi.org/10.1017/S1351324997001812.

Description

Title: ARIES: A Lexical Platform for Engineering Spanish Processing Tools
Author(s): Goñi Menoyo, José Miguel; González Cristóbal, José Carlos; Moreno Sandoval, Antonio
Document type: Article
Journal/Publication title: Natural Language Engineering
Date: December 1997
ISSN: 1351-3249
Volume: 3
Issue: 4
Subjects:
SDGs (Sustainable Development Goals):
School: E.T.S.I. Telecomunicación (UPM)
Department: Matemática Aplicada a las Tecnologías de la Información [until 2014]
UPM research group: Grupo de Sistemas Inteligentes
Creative Commons license: Attribution - No Derivative Works - Non-Commercial

Full text

1997_NLE.pdf (PDF, 2MB)

Abstract

We present a lexical platform developed for the Spanish language. It is portable across computer systems and efficient in terms of both speed and lexical coverage. We present a model for the full treatment of Spanish inflectional morphology for verbs, nouns and adjectives. In this model, word formation relies solely on morpheme concatenation, driven by a feature-based unification grammar. The run-time lexicon is a collection of allomorphs for both stems and endings. Although the model has not been tested on them, it should also be suitable for other Romance languages and other highly inflected languages. We also describe a formalism for encoding a lemma-based lexical source that is well suited to expressing linguistic generalizations: inheritance classes, lemma encoding, morpho-graphemic allomorphy rules and limited type-checking. From this source, an allomorph-indexed dictionary suitable for efficient retrieval and processing is generated automatically. A set of software tools has been implemented around this formalism: aids for augmenting the lexical base, lexical compilers that build run-time dictionaries together with access libraries for them, feature manipulation libraries, unification and pseudo-unification modules, morphological processors, a parsing system, etc. The software interfaces between modules and tools are cleanly defined, so that they can be integrated and combined flexibly. Directions for accessing our e-mail and web demonstration prototypes are also provided. Finally, we give figures comparing the lexical coverage of our platform with that of some popular spelling checkers.
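The core idea in the abstract (a lemma-based source compiled into an allomorph-indexed lexicon, with words analysed by concatenating a stem allomorph and an ending allomorph whose features unify) can be illustrated with a minimal Python sketch. This is not the ARIES formalism or its code: the feature names (`cat`, `conj`, `stress`, and so on), the tiny lexicon, and the functions `unify`, `compile_stems` and `analyze` are all hypothetical, feature structures are assumed to be flat, and stem allomorphs are listed by hand rather than derived by morpho-graphemic rules.

```python
from collections import defaultdict
from typing import Dict, List, Optional

Features = Dict[str, str]


def unify(a: Features, b: Features) -> Optional[Features]:
    """Unify two flat feature structures; return None on a value clash."""
    result = dict(a)
    for key, value in b.items():
        if key in result and result[key] != value:
            return None
        result[key] = value
    return result


# Hypothetical lemma-based source: each lemma carries shared features and
# lists its stem allomorphs explicitly (an assumption made for brevity).
LEMMA_SOURCE = [
    {"lemma": "contar", "features": {"cat": "verb", "conj": "ar"},
     "stems": {"cont": {"stress": "ending"}, "cuent": {"stress": "stem"}}},
    {"lemma": "niño", "features": {"cat": "noun"}, "stems": {"niñ": {}}},
]


def compile_stems(source) -> Dict[str, List[Features]]:
    """Build an allomorph-indexed stem dictionary from the lemma-based source."""
    index = defaultdict(list)
    for entry in source:
        for stem, extra in entry["stems"].items():
            index[stem].append({"lemma": entry["lemma"], **entry["features"], **extra})
    return dict(index)


STEMS = compile_stems(LEMMA_SOURCE)

# Hypothetical ending allomorphs; each string may carry several readings.
ENDINGS: Dict[str, List[Features]] = {
    "o": [
        {"cat": "verb", "conj": "ar", "stress": "stem",
         "tense": "pres", "person": "1", "number": "sg"},
        {"cat": "noun", "gender": "masc", "number": "sg"},
    ],
    "as": [
        {"cat": "verb", "conj": "ar", "stress": "stem",
         "tense": "pres", "person": "2", "number": "sg"},
        {"cat": "noun", "gender": "fem", "number": "pl"},
    ],
    "amos": [
        {"cat": "verb", "conj": "ar", "stress": "ending",
         "tense": "pres", "person": "1", "number": "pl"},
    ],
}


def analyze(word: str) -> List[Features]:
    """Try every stem + ending split and keep the pairs whose features unify."""
    analyses = []
    for i in range(1, len(word)):
        stem, ending = word[:i], word[i:]
        for stem_features in STEMS.get(stem, []):
            for ending_features in ENDINGS.get(ending, []):
                merged = unify(stem_features, ending_features)
                if merged is not None:
                    analyses.append(merged)
    return analyses
```

In this sketch, `analyze("cuento")` returns a single verb reading for the lemma `contar`, while the ill-formed `analyze("conto")` returns an empty list, because the `stress` feature of the stem allomorph `cont` clashes with that of the ending `o`. Concatenation alone never rejects a form; the unification step does.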

More information

Record ID: 4739
DC identifier: https://oa.upm.es/4739/
OAI identifier: oai:oa.upm.es:4739
DOI: 10.1017/S1351324997001812
Official URL: http://journals.cambridge.org/abstract_S1351324997...
Deposited by: Memoria Investigacion
Deposited on: 27 Oct 2010 08:00
Last modified: 20 Apr 2016 13:50