Detecting acronyms from capital letter sequences in Spanish

San Segundo Hernández, Rubén; Montero Martínez, Juan Manuel; Lopez Ludeña, Veronica and King, Simon (2012). Detecting acronyms from capital letter sequences in Spanish. In: "13th Annual Conference of the International Speech Communication Association (INTERSPEECH 2012)", 09/09/2012 - 13/09/2012, Portland, Oregon. pp. 1-4.

Description

Title: Detecting acronyms from capital letter sequences in Spanish
Author(s):
  • San Segundo Hernández, Rubén
  • Montero Martínez, Juan Manuel
  • Lopez Ludeña, Veronica
  • King, Simon
Document Type: Conference or Workshop Presentation (Article)
Event Title: 13th Annual Conference of the International Speech Communication Association (INTERSPEECH 2012)
Event Dates: 09/09/2012 - 13/09/2012
Event Location: Portland, Oregon
Book Title: Annual Conference of the International Speech Communication Association (INTERSPEECH 2012)
Date: 2012
Subjects:
Uncontrolled Keywords: Capital letter sequence pronunciation, Speech synthesis, Spelling, Spanish, Acronyms, Abbreviations
School: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería Electrónica
Creative Commons Licenses: Attribution - No Derivative Works - Non-Commercial

Abstract

This paper presents an automatic strategy for deciding how to pronounce a Capital Letter Sequence (CLS) in a Text-to-Speech (TTS) system. If the CLS is known to the TTS system, it can be expanded into several words. When the CLS is unknown, however, the system has two alternatives: spelling it out (abbreviation) or pronouncing it as a new word (acronym). In Spanish, there is a close correspondence between letters and phonemes, so a CLS that resembles other Spanish words tends to be pronounced as a standard word. This paper proposes an automatic method for detecting acronyms. Additionally, it analyses the discrimination capability of several features, and several strategies for combining them in order to obtain the best classifier. The best classifier achieves a classification error of 8.45%. In the feature analysis, the most discriminative features were the Letter Sequence Perplexity and the Average N-gram order.
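
The abstract does not say how the Letter Sequence Perplexity is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a letter bigram model with add-one smoothing, a toy lexicon, a boundary symbol `#`, a 28-symbol alphabet, and a single perplexity threshold; all of these names and values are hypothetical. The paper itself combines several features in a classifier rather than thresholding one.

```python
import math
from collections import Counter

BOUNDARY = "#"  # word-boundary padding symbol (assumption of this sketch)

def train_bigram_counts(words):
    """Count letter bigrams and their left contexts over a word list."""
    bigrams, contexts = Counter(), Counter()
    for word in words:
        padded = BOUNDARY + word.lower() + BOUNDARY
        for left, right in zip(padded, padded[1:]):
            bigrams[(left, right)] += 1
            contexts[left] += 1
    return bigrams, contexts

def letter_perplexity(cls, bigrams, contexts, vocab_size=28):
    """Perplexity of a letter sequence under an add-one-smoothed bigram model.

    vocab_size=28 assumes 27 Spanish letters plus the boundary symbol.
    """
    padded = BOUNDARY + cls.lower() + BOUNDARY
    log_prob = 0.0
    for left, right in zip(padded, padded[1:]):
        prob = (bigrams[(left, right)] + 1) / (contexts[left] + vocab_size)
        log_prob += math.log(prob)
    n_transitions = len(padded) - 1
    return math.exp(-log_prob / n_transitions)

def classify_cls(cls, bigrams, contexts, threshold):
    """Label a CLS 'acronym' (read as a word) or 'abbreviation' (spelled out)."""
    ppl = letter_perplexity(cls, bigrams, contexts)
    return "acronym" if ppl < threshold else "abbreviation"

# Toy lexicon for illustration; a real system would need a large Spanish word list.
LEXICON = ["renta", "tren", "fe", "nota", "ovni", "uva", "enfermo"]
BIGRAMS, CONTEXTS = train_bigram_counts(LEXICON)

if __name__ == "__main__":
    for cls in ("RENFE", "DGT"):
        ppl = letter_perplexity(cls, BIGRAMS, CONTEXTS)
        label = classify_cls(cls, BIGRAMS, CONTEXTS, threshold=20.0)
        print(cls, round(ppl, 1), label)
```

A low perplexity means the letter sequence looks like ordinary Spanish spelling, which is the intuition behind treating it as pronounceable. The threshold of 20.0 is arbitrary and only suits this toy lexicon.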

More information

Record ID: 20355
DC Identifier: http://oa.upm.es/20355/
OAI Identifier: oai:oa.upm.es:20355
Deposited by: Memoria Investigacion
Deposited on: 02 Oct 2013 16:38
Last Modified: 21 Apr 2016 23:06