On the use of phone-gram units in recurrent neural networks for language identification

Salamea Palacios, Christian Raúl and D'Haro Enríquez, Luis Fernando and Cordoba Herralde, Ricardo de and San Segundo Hernández, Rubén (2016). On the use of phone-gram units in recurrent neural networks for language identification. In: "Odyssey 2016: The Speaker and Language Recognition Workshop", 21/06/2016 - 24/06/2016, Bilbao, Spain. pp. 117-123. https://doi.org/10.21437/Odyssey.2016-17.

Description

Title: On the use of phone-gram units in recurrent neural networks for language identification
Author/s:
  • Salamea Palacios, Christian Raúl
  • D'Haro Enríquez, Luis Fernando
  • Cordoba Herralde, Ricardo de
  • San Segundo Hernández, Rubén
Item Type: Presentation at Congress or Conference (Article)
Event Title: Odyssey 2016: The Speaker and Language Recognition Workshop
Event Dates: 21/06/2016 - 24/06/2016
Event Location: Bilbao, Spain
Title of Book: Proceedings of The Speaker and Language Recognition Workshop, Odyssey 2016
Date: June 2016
Subjects:
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería Electrónica
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

In this paper we present our results on using RNN-based LM (RNNLM) scores trained on different phone-gram orders and obtained with different phonetic ASR recognizers. To avoid data sparseness problems and to reduce the vocabulary of all possible n-gram combinations, a K-means clustering procedure was applied to phone-vector embeddings as a pre-processing step. Additional experiments to optimize the number of classes, the batch size, the number of hidden neurons, and the state unfolding are also presented. We worked with the KALAKA-3 database under the plenty-closed condition [1]. Thanks to our clustering technique and the combination of high-level phone-grams, our phonotactic system performs ~13% better than the unigram-based RNNLM system. In addition, the RNNLM scores are calibrated and fused with scores from an acoustic-based i-vector system and a traditional PPRLM system. This fusion yields additional improvements, showing that the RNNLM scores provide complementary information to the LID system.
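
The clustering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (phone_grams, kmeans), the toy phone sequence, the embedding dimension, and the number of classes are all assumptions made for illustration, and random vectors stand in for the learned phone-vector embeddings. The idea shown is only that each phone-gram is replaced by the index of its K-means cluster, so the language model sees a small class vocabulary instead of every possible n-gram combination.

```python
import numpy as np

def phone_grams(phones, n):
    """Return the overlapping n-phone units of a phone sequence as joined strings."""
    return ["_".join(phones[i:i + n]) for i in range(len(phones) - n + 1)]

def assign(X, centroids):
    """Index of the nearest centroid (squared Euclidean distance) for each row of X."""
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def kmeans(X, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(X, centroids)
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster becomes empty
                centroids[j] = members.mean(axis=0)
    return centroids, assign(X, centroids)

# Hypothetical recognizer output (a toy phone sequence, not real data).
phones = ["k", "a", "s", "a", "b", "l", "a", "n", "k", "a"]
bigrams = phone_grams(phones, 2)
vocab = sorted(set(bigrams))

# ASSUMPTION: random vectors stand in for learned phone-gram embeddings.
emb = np.random.default_rng(0).normal(size=(len(vocab), 8))

# Cluster the phone-gram vocabulary into k classes (k=3 is an arbitrary choice).
_, labels = kmeans(emb, k=3)
class_of = {gram: f"C{c}" for gram, c in zip(vocab, labels)}

# The class sequence is what a language model would be trained on in this sketch.
class_sequence = [class_of[g] for g in bigrams]
```

In this sketch the class inventory has at most 3 symbols however many distinct bigrams occur, which is the vocabulary reduction the abstract refers to. The number of classes is one of the parameters the abstract reports tuning.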

Funding Projects

Type | Code | Acronym | Leader | Title
Madrid Regional Government | TIN2014-54288-C4-1-R | ASLP-MULÁN | Unspecified | Unspecified
Madrid Regional Government | MICINN DPI2014-53525-C3-2-R | NAVEGABLE | Unspecified | Unspecified
Madrid Regional Government | S2009/TIC-1542 | MA2VICMR | Unspecified | Unspecified

More information

Item ID: 47224
DC Identifier: http://oa.upm.es/47224/
OAI Identifier: oai:oa.upm.es:47224
DOI: 10.21437/Odyssey.2016-17
Official URL: http://www.odyssey2016.org/papers/pdfs_stamped/53.pdf
Deposited by: Memoria Investigacion
Deposited on: 24 Oct 2017 16:20
Last Modified: 24 Oct 2017 16:20