Decoupling Vocal Tract from Glottal Source Estimates in Speaker's Identification

Gómez Vilda, Pedro and Álvarez Marquina, Agustin and Mazaira Fernández, Luis Miguel and Fernández-Baillo Gallego de la Sacristana, Roberto and Nieto Lluis, Victor and Martínez Olalla, Rafael and Muñoz, Cristina and Rodellar Biarge, M. Victoria (2008). Decoupling Vocal Tract from Glottal Source Estimates in Speaker's Identification. "Language Design" (n. Specia); pp. 111-118. ISSN 1139-4218.

Description

Title: Decoupling Vocal Tract from Glottal Source Estimates in Speaker's Identification
Author/s:
  • Gómez Vilda, Pedro
  • Álvarez Marquina, Agustin
  • Mazaira Fernández, Luis Miguel
  • Fernández-Baillo Gallego de la Sacristana, Roberto
  • Nieto Lluis, Victor
  • Martínez Olalla, Rafael
  • Muñoz, Cristina
  • Rodellar Biarge, M. Victoria
Item Type: Article
Journal/Publication Title: Language Design
Date: January 2008
ISSN: 1139-4218
Subjects:
Faculty: E.U.I.T. Telecomunicación (UPM)
Department: Ingeniería de Circuitos y Sistemas [until 2014]
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

Classical parameterization techniques for Speaker Identification encode the power spectral density of speech as a whole, without discriminating between articulatory features due to vocal tract dynamics (acoustic-phonetics) and those contributed by the glottal source. This paper presents a study on separating voiced segments of speech into vocal and glottal components. The vocal component is dominated by the vocal tract transfer function, estimated adaptively to track the acoustic-phonetic sequence of the message; the glottal component is dominated by the glottal characteristics of the speaker and the phonation gesture. In this way, the information conveyed by the two components, which depends to different degrees on message and biometry, is estimated and treated separately, then fused at the time of template composition. The separation methodology is based on the hypothesis that vocal and glottal information are decorrelated, and it is carried out using Joint Process Estimation. The methodology is briefly discussed, and its application to vowel-like speech is presented as an example to show the resulting estimates in both the time and frequency domains. The parameterization methodology used to produce representative templates of the glottal and vocal components is also described. Speaker Identification experiments conducted on a large database of 240 speakers are also reported, with comparative scores obtained using different parameterization strategies. The results confirm that decoupled parameterization techniques perform better than approaches based on full speech parameterization.
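
The decorrelation idea behind Joint Process Estimation can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in a plain normalized-LMS transversal filter (an assumption, chosen for brevity) and uses synthetic signals in place of speech. The function name `nlms_joint_estimate`, the variables, and the parameter values are all hypothetical. The filter estimates a desired signal from a related reference signal, and the residual that remains is approximately uncorrelated with the reference.

```python
import numpy as np


def nlms_joint_estimate(reference, desired, order=8, mu=0.5, eps=1e-8):
    """Estimate `desired` from `reference` with a normalized-LMS FIR filter.

    Returns (estimate, residual, weights). The residual is the part of
    `desired` that could not be predicted from `reference`.
    Hypothetical helper for illustration only.
    """
    reference = np.asarray(reference, dtype=float)
    desired = np.asarray(desired, dtype=float)
    n = len(reference)
    w = np.zeros(order)
    estimate = np.zeros(n)
    residual = np.zeros(n)
    buf = np.zeros(order)  # most recent reference samples, newest first
    for k in range(n):
        buf = np.roll(buf, 1)
        buf[0] = reference[k]
        estimate[k] = w @ buf
        residual[k] = desired[k] - estimate[k]
        # Normalized gradient step: scale by the energy in the buffer.
        w = w + mu * residual[k] * buf / (buf @ buf + eps)
    return estimate, residual, w


# Synthetic stand-ins (assumptions, not speech data): the desired signal is
# a filtered copy of the reference plus an independent component.
rng = np.random.default_rng(0)
n = 20000
reference = rng.standard_normal(n)
true_fir = np.array([0.8, -0.4, 0.2])
independent = 0.1 * rng.standard_normal(n)
desired = np.convolve(reference, true_fir)[:n] + independent

estimate, residual, weights = nlms_joint_estimate(
    reference, desired, order=3, mu=0.1
)

# After convergence, the residual should be nearly uncorrelated with the
# reference, and the weights should approximate the true filter.
tail = slice(n // 2, None)
corr = np.corrcoef(residual[tail], reference[tail])[0, 1]
print("weights:", np.round(weights, 2))
print("residual/reference correlation:", round(float(corr), 3))
```

In this toy setting the residual recovers the independent component, which is the sense in which the estimator decouples two sources under a decorrelation hypothesis. The paper's own estimator and its application to vocal and glottal components may differ substantially from this sketch.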

More information

Item ID: 2322
DC Identifier: http://oa.upm.es/2322/
OAI Identifier: oai:oa.upm.es:2322
Official URL: http://elies.rediris.es/Language_Design/editorial_info.html
Deposited by: Memoria Investigacion
Deposited on: 24 May 2010 09:06
Last Modified: 30 Sep 2014 15:52