Extended phone log-likelihood ratio features and acoustic-based I-vectors for language recognition

D'haro Enríquez, Luis Fernando; Cordoba Herralde, Ricardo de; Salamea Palacios, Christian Raúl and Echeverry Correa, Julian David (2014). Extended phone log-likelihood ratio features and acoustic-based I-vectors for language recognition. In: "International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014)", 04/05/2014 - 09/05/2014, Florence, Italy. pp. 5342-5346.

Description

Title: Extended phone log-likelihood ratio features and acoustic-based I-vectors for language recognition
Author(s):
  • D'haro Enríquez, Luis Fernando
  • Cordoba Herralde, Ricardo de
  • Salamea Palacios, Christian Raúl
  • Echeverry Correa, Julian David
Document Type: Conference or Workshop Presentation (Article)
Event Title: International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014)
Event Dates: 04/05/2014 - 09/05/2014
Event Location: Florence, Italy
Book Title: International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2014)
Date: 2014
Subjects:
Uncontrolled Keywords: Phone Log-Likelihood Ratios, SDC, dimensionality reduction
School: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería Electrónica (Electronic Engineering)
Creative Commons Licenses: Attribution - No Derivative Works - Non-Commercial

Abstract

This paper presents new techniques that improve on the primary system our group submitted to the Albayzin 2012 LRE competition, in which the use of any additional corpora for training or optimizing the models was forbidden. We incorporate an additional phonotactic subsystem based on phone log-likelihood ratio (PLLR) features extracted from different phonotactic recognizers, which improves system accuracy by 21.4% in terms of Cavg (we also report results for Fact, the official metric during the evaluation). We show that using these features at the phone state level provides significant improvements when combined with dimensionality reduction techniques, especially PCA. We have also experimented with applying alternative SDC-like configurations to these PLLR features, obtaining additional improvements. Finally, we describe some modifications to the MFCC-based acoustic i-vector system that have also contributed to further improvements. The final fused system outperformed the baseline by 27.4% in Cavg.
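
The feature pipeline named in the abstract (PLLR features, PCA reduction, SDC-like stacking) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract gives no formulas, so the PLLR definition used here, log(p / ((1 - p) / (N - 1))), is an assumption taken from common usage, and the function names (`pllr`, `pca_reduce`, `sdc`) and the default SDC parameters `d`, `p`, `k` are hypothetical, chosen only for illustration.

```python
import numpy as np


def pllr(posteriors, eps=1e-10):
    """Map frame-level phone (or phone-state) posteriors to log-likelihood ratios.

    Assumed definition: log(p_i / ((1 - p_i) / (N - 1))) for N classes.
    posteriors: array of shape (frames, N), each row summing to 1.
    """
    p = np.clip(posteriors, eps, 1.0 - eps)  # avoid log(0)
    n = p.shape[1]
    return np.log(p) - np.log1p(-p) + np.log(n - 1)


def pca_reduce(features, n_components):
    """Project mean-centered features onto their top principal directions."""
    centered = features - features.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T


def sdc(features, d=1, p=3, k=7):
    """Stack k shifted delta blocks per frame (shifted delta coefficients).

    Block i at frame t is c(t + i*p + d) - c(t + i*p - d); requires d >= 1.
    Frames beyond the utterance edges are filled by repeating the edge frame.
    """
    n_frames = features.shape[0]
    padded = np.pad(features, ((d, d + (k - 1) * p), (0, 0)), mode="edge")
    deltas = padded[2 * d:] - padded[:-2 * d]
    blocks = [deltas[i * p:i * p + n_frames] for i in range(k)]
    return np.hstack(blocks)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic posteriors stand in for real phone recognizer output.
    post = rng.dirichlet(np.ones(40), size=200)
    feats = sdc(pca_reduce(pllr(post), n_components=13))
    print(feats.shape)
```

In this sketch the SDC output has `k` times as many dimensions as its input, which is one reason a reduction step such as PCA is convenient before stacking; the actual ordering and configurations used in the paper are not stated in the abstract.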

Associated projects

Type | Code | Acronym | Leader | Title
Comunidad de Madrid | S2009/TIC-1542 | Not specified | Not specified | Not specified
Gobierno de España | DPI2010-21247-C02-02 | Not specified | Not specified | Not specified
Gobierno de España | TIN2011-28169-C05-03 | Not specified | Not specified | Not specified
FP7 | 287678 | SIMPLE4ALL | University of Edinburgh | Speech synthesis that improves through adaptive learning

More information

Record ID: 37538
DC Identifier: http://oa.upm.es/37538/
OAI Identifier: oai:oa.upm.es:37538
Official URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=6854623
Deposited by: Memoria Investigacion
Deposited on: 21 Sep 2015 16:36
Last modified: 21 Sep 2015 16:36