Decoupling Vocal Tract from Glottal Source Estimates in Speaker's Identification

Gómez Vilda, Pedro; Álvarez Marquina, Agustin; Mazaira Fernández, Luis Miguel; Fernández-Baillo Gallego de la Sacristana, Roberto; Nieto Lluis, Victor; Martínez Olalla, Rafael; Muñoz, Cristina and Rodellar Biarge, M. Victoria (2008). Decoupling Vocal Tract from Glottal Source Estimates in Speaker's Identification. "Language Design" (n. Specia); pp. 111-118. ISSN 1139-4218.

Description

Title: Decoupling Vocal Tract from Glottal Source Estimates in Speaker's Identification
Author(s):
  • Gómez Vilda, Pedro
  • Álvarez Marquina, Agustin
  • Mazaira Fernández, Luis Miguel
  • Fernández-Baillo Gallego de la Sacristana, Roberto
  • Nieto Lluis, Victor
  • Martínez Olalla, Rafael
  • Muñoz, Cristina
  • Rodellar Biarge, M. Victoria
Document Type: Article
Journal/Publication Title: Language Design
Date: January 2008
Subjects:
School: E.U.I.T. Telecomunicación (UPM) [former name]
Department: Ingeniería de Circuitos y Sistemas [until 2014]
Creative Commons License: Attribution - No Derivative Works - Non-commercial

Abstract

Classical parameterization techniques in Speaker Identification encode the power spectral density of speech as a whole, without discriminating between the articulatory features due to vocal tract dynamics (acoustic-phonetics) and those contributed by the glottal source. This paper presents a study on separating voiced fragments of speech into vocal and glottal components, dominated respectively by the vocal tract transfer function, estimated adaptively to track the acoustic-phonetic sequence of the message, and by the glottal characteristics of the speaker and the phonation gesture. In this way, the information conveyed by the two components, which depends to different degrees on message and biometry, is estimated and treated separately, then fused at the time of template composition. The separation methodology is based on the hypothesis that vocal and glottal information are decorrelated, and it is carried out using Joint Process Estimation. The methodology is briefly discussed, and its application to vowel-like speech is presented as an example to show the resulting estimates in both the time and frequency domains. The parameterization methodology used to produce representative templates of the glottal and vocal components is also described. Speaker Identification experiments conducted on a large database of 240 speakers are also reported, with comparative scores obtained using different parameterization strategies. The results confirm that decoupled parameterization techniques perform better than approaches based on full speech parameterization.
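
The abstract names Joint Process Estimation as the decorrelation tool but this record gives no implementation details. As a rough illustration only, the sketch below shows a generic textbook lattice-ladder joint process estimator: an adaptive lattice turns a reference signal x into approximately orthogonal backward errors, and a ladder of regression weights removes from a second signal d whatever is correlated with x. The function name, step sizes, filter order, and the synthetic test signals are all assumptions made for this example and are not taken from the paper; how the authors map x and d onto glottal and vocal tract estimates is not stated here.

```python
import random


def joint_process_estimate(x, d, order=3, mu_k=0.01, mu_h=0.02,
                           beta=0.99, init_pow=1.0, eps=1e-6):
    """Generic gradient-adaptive lattice-ladder estimator (illustrative sketch).

    Returns (y, e): y is the part of d explained by x, e = d - y is the
    residual, which ends up approximately decorrelated from x.
    All default parameter values are assumptions chosen for this demo.
    """
    k = [0.0] * order                 # lattice reflection coefficients
    h = [0.0] * (order + 1)           # ladder (regression) weights
    b_prev = [0.0] * (order + 1)      # backward errors at the previous sample
    e_pow = [init_pow] * order        # smoothed f^2 + b^2 at each stage input
    b_pow = [init_pow] * (order + 1)  # smoothed power of each backward error
    y_out, e_out = [], []
    for xn, dn in zip(x, d):
        f = xn
        b = [0.0] * (order + 1)
        b[0] = xn
        # Lattice part: orthogonalize x into backward prediction errors.
        for m in range(order):
            f_in, b_in = f, b_prev[m]
            e_pow[m] = beta * e_pow[m] + (1 - beta) * (f_in * f_in + b_in * b_in)
            f_out = f_in + k[m] * b_in
            b_out = b_in + k[m] * f_in
            # Normalized gradient step on f_out^2 + b_out^2, clamped for stability.
            k[m] -= mu_k * (f_out * b_in + b_out * f_in) / (e_pow[m] + eps)
            k[m] = max(-0.999, min(0.999, k[m]))
            f = f_out
            b[m + 1] = b_out
        # Ladder part: regress d on the backward errors, stage by stage.
        err = dn
        for m in range(order + 1):
            b_pow[m] = beta * b_pow[m] + (1 - beta) * b[m] * b[m]
            err -= h[m] * b[m]
            h[m] += mu_h * err * b[m] / (b_pow[m] + eps)
        y_out.append(dn - err)
        e_out.append(err)
        b_prev = b
    return y_out, e_out


# Synthetic demo (hypothetical data): d is a short FIR-filtered copy of x,
# so almost all of d should be explained by x once the weights have adapted.
random.seed(0)
n = 5000
x = [random.gauss(0.0, 1.0) for _ in range(n)]
d = [0.8 * x[i]
     - 0.5 * (x[i - 1] if i >= 1 else 0.0)
     + 0.3 * (x[i - 2] if i >= 2 else 0.0)
     for i in range(n)]
y, e = joint_process_estimate(x, d)

# Compare average power of d and of the residual over the last 1000 samples.
p_d = sum(v * v for v in d[-1000:]) / 1000
p_e = sum(v * v for v in e[-1000:]) / 1000
print("residual / signal power:", p_e / p_d)
```

In this toy setup the residual power should fall well below the power of d after adaptation, which is the sense in which the two outputs are decoupled. Real voiced speech would call for different orders and step sizes.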

More information

Record ID: 2322
DC Identifier: http://oa.upm.es/2322/
OAI Identifier: oai:oa.upm.es:2322
Official URL: http://elies.rediris.es/Language_Design/editorial_info.html
Deposited by: Memoria Investigacion
Deposited on: 24 May 2010 09:06
Last modified: 30 Sep 2014 15:52