Phonotactic language recognition using i-vectors and phoneme posteriogram counts

D'haro Enríquez, Luis Fernando and Glembek, Ondřej and Plchot, Oldřich and Matějka, Pavel and Soufifar, Mehdi and Córdoba Herralde, Ricardo de and Černocký, Jan (2012). Phonotactic language recognition using i-vectors and phoneme posteriogram counts. In: "InterSpeech 2012 - 13th Annual Conference of the International Speech Communication Association", 09/09/2012 - 13/09/2012, Portland, Oregon. pp. 1-4.

Description

Title: Phonotactic language recognition using i-vectors and phoneme posteriogram counts
Author/s:
  • D'haro Enríquez, Luis Fernando
  • Glembek, Ondřej
  • Plchot, Oldřich
  • Matějka, Pavel
  • Soufifar, Mehdi
  • Córdoba Herralde, Ricardo de
  • Černocký, Jan
Item Type: Presentation at Congress or Conference (Article)
Event Title: InterSpeech 2012 - 13th Annual Conference of the International Speech Communication Association
Event Dates: 09/09/2012 - 13/09/2012
Event Location: Portland, Oregon
Title of Book: InterSpeech 2012 - 13th Annual Conference of the International Speech Communication Association
Date: 2012
Freetext Keywords: subspace modeling, multinomial distributions, LID
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería Electrónica
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

This paper describes a novel approach to phonotactic LID in which, instead of using soft-counts based on phoneme lattices, we use posteriograms to obtain n-gram counts. The high-dimensional vectors of counts are reduced to low-dimensional units, for which we adopted the commonly used term i-vectors. The reduction is based on multinomial subspace modeling and is designed to work in the total-variability space. The proposed technique was tested on the NIST 2009 LRE set, with better results than a system based on soft-counts (Cavg on 30s: 3.15% vs 3.43%), and with very good results when fused with an acoustic i-vector LID system (Cavg on 30s: 2.4% for the acoustic system alone vs 1.25% after fusion). The proposed technique is also compared with another low-dimensional projection system based on PCA. In comparison with the original soft-counts, the proposed technique provides better results, reduces the problems caused by sparse counts, and avoids the need for pruning techniques when creating the lattices.
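
To make the idea of posteriogram-based n-gram counts more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a posteriogram stored as a (T, P) array in which each row is a posterior distribution over P phonemes for one frame or segment, and it assumes consecutive rows can be treated as independent, so that the expected count of bigram (i, j) is the sum over t of p_t[i] * p_{t+1}[j]. The function names `expected_bigram_counts` and `pca_project` are hypothetical. The projection shown is plain PCA, i.e. the baseline the abstract compares against; the multinomial subspace model used for the i-vectors is not reproduced here.

```python
import numpy as np


def expected_bigram_counts(posteriogram):
    """Soft bigram counts from a (T, P) posteriogram.

    Assumption: rows are per-frame (or per-segment) phoneme posteriors and
    adjacent rows are independent, so the expected count of bigram (i, j)
    is sum_t p_t[i] * p_{t+1}[j].
    """
    post = np.asarray(posteriogram, dtype=float)
    if post.ndim != 2 or post.shape[0] < 2:
        raise ValueError("expected a (T, P) array with T >= 2")
    # Sum of outer products of each row with the following row: shape (P, P).
    counts = post[:-1].T @ post[1:]
    # Flatten to a P*P-dimensional count vector (row-major: index i * P + j).
    return counts.ravel()


def pca_project(count_vectors, n_components):
    """Project count vectors (one per row) onto their top principal components."""
    x = np.asarray(count_vectors, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

Under these assumptions, no lattice is built and no pruning is applied: every bigram receives a (possibly very small) fractional count, and when each row sums to one the counts of an utterance with T rows sum to T - 1.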

More information

Item ID: 20403
DC Identifier: http://oa.upm.es/20403/
OAI Identifier: oai:oa.upm.es:20403
Deposited by: Memoria Investigacion
Deposited on: 05 Oct 2013 08:17
Last Modified: 21 Apr 2016 23:11