Phonotactic language recognition using i-vectors and phoneme posteriogram counts

D'haro Enríquez, Luis Fernando; Glembek, Ondřej; Plchot, Oldřich; Matějka, Pavel; Soufifar, Mehdi; Córdoba Herralde, Ricardo de and Černocký, Jan (2012). Phonotactic language recognition using i-vectors and phoneme posteriogram counts. In: "InterSpeech 2012 - 13th Annual Conference of the International Speech Communication Association", 09/09/2012 - 13/09/2012, Portland, Oregon. pp. 1-4.

Description

Title: Phonotactic language recognition using i-vectors and phoneme posteriogram counts
Author(s):
  • D'haro Enríquez, Luis Fernando
  • Glembek, Ondřej
  • Plchot, Oldřich
  • Matějka, Pavel
  • Soufifar, Mehdi
  • Córdoba Herralde, Ricardo de
  • Černocký, Jan
Document Type: Conference or Workshop Presentation (Article)
Event Title: InterSpeech 2012 - 13th Annual Conference of the International Speech Communication Association
Event Dates: 09/09/2012 - 13/09/2012
Event Location: Portland, Oregon
Book Title: InterSpeech 2012 - 13th Annual Conference of the International Speech Communication Association
Date: 2012
Subjects:
Informal Keywords: subspace modeling, multinomial distributions, LID
School: E.T.S.I. Telecomunicación (UPM)
Department: Electronic Engineering
Creative Commons Licenses: Attribution - NoDerivatives - NonCommercial

Abstract

This paper describes a novel approach to phonotactic language identification (LID) in which n-gram counts are obtained from phoneme posteriograms instead of from soft counts based on phoneme lattices. The high-dimensional count vectors are reduced to low-dimensional representations, for which we adopt the commonly used term i-vectors. The reduction is based on multinomial subspace modeling and is designed to work in the total-variability space. The proposed technique was tested on the NIST 2009 LRE set, where it gave better results than a system based on soft counts (Cavg on 30s: 3.15% vs 3.43%) and very good results when fused with an acoustic i-vector LID system (Cavg on 30s: 2.4% for the acoustic system vs 1.25% after fusion). The proposed technique is also compared with another low-dimensional projection system based on PCA. Compared with the original soft counts, the proposed technique provides better results, reduces the problems caused by sparse counts, and avoids the need for pruning when creating the lattices.
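
To make the idea of posteriogram-based n-gram counts concrete, the following is a minimal sketch in Python. It assumes one simple way of deriving expected bigram counts: summing, over consecutive frame pairs, the product of the phoneme posteriors of the two frames. This is an illustrative assumption and not necessarily the procedure used in the paper; the function names `bigram_soft_counts` and `flatten` and the toy posteriogram are hypothetical.

```python
def bigram_soft_counts(posteriogram):
    """Expected bigram counts from a posteriogram (illustrative assumption).

    posteriogram: list of frames, each a list of phoneme posteriors
    that sums to 1. Entry [i][j] of the result accumulates
    p_t[i] * p_{t+1}[j] over all consecutive frame pairs (t, t+1).
    """
    if not posteriogram:
        return []
    n_phonemes = len(posteriogram[0])
    counts = [[0.0] * n_phonemes for _ in range(n_phonemes)]
    for prev, curr in zip(posteriogram, posteriogram[1:]):
        for i, p_i in enumerate(prev):
            for j, p_j in enumerate(curr):
                counts[i][j] += p_i * p_j
    return counts


def flatten(counts):
    """Stack the count matrix into one high-dimensional count vector."""
    return [c for row in counts for c in row]


# Toy posteriogram: 3 frames, 3 phonemes (made-up values).
toy = [
    [0.9, 0.1, 0.0],
    [0.2, 0.8, 0.0],
    [0.0, 0.3, 0.7],
]
counts = bigram_soft_counts(toy)
vector = flatten(counts)
```

With P phonemes the flattened vector has P^2 entries for bigrams and P^3 for trigrams, which illustrates why a low-dimensional projection of the count vectors is attractive.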

More information

Record ID: 20403
DC Identifier: http://oa.upm.es/20403/
OAI Identifier: oai:oa.upm.es:20403
Deposited by: Memoria Investigacion
Deposited on: 05 Oct 2013 08:17
Last Modified: 21 Apr 2016 23:11