On the use of phone-gram units in recurrent neural networks for language identification

Salamea Palacios, Christian Raúl, D'Haro Enríquez, Luis Fernando ORCID: https://orcid.org/0000-0002-3411-7384, Córdoba Herralde, Ricardo de ORCID: https://orcid.org/0000-0002-7136-9636 and San Segundo Hernández, Rubén ORCID: https://orcid.org/0000-0001-9659-5464 (2016). On the use of phone-gram units in recurrent neural networks for language identification. In: "Odyssey 2016: The Speaker and Language Recognition Workshop", 21/06/2016 - 24/06/2016, Bilbao - España. pp. 117-123. https://doi.org/10.21437/Odyssey.2016-17.

Description

Title: On the use of phone-gram units in recurrent neural networks for language identification
Author/s:
Item Type: Presentation at Congress or Conference (Article)
Event Title: Odyssey 2016: The Speaker and Language Recognition Workshop
Event Dates: 21/06/2016 - 24/06/2016
Event Location: Bilbao - España
Title of Book: Proceedings of The Speaker and Language Recognition Workshop, Odyssey 2016
Date: June 2016
Subjects:
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería Electrónica
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

In this paper we present our results on using RNN-based language model (RNNLM) scores trained on phone-grams of different orders and obtained from different phonetic ASR recognizers. To avoid data sparseness and to reduce the vocabulary of all possible n-gram combinations, a K-means clustering procedure was applied to phone-vector embeddings as a pre-processing step. Additional experiments to optimize the number of classes, the batch size, the number of hidden neurons and the state unfolding are also presented. We have worked with the KALAKA-3 database for the plenty-closed condition [1]. Thanks to our clustering technique and the combination of high-level phone-grams, our phonotactic system performs ~13% better than the unigram-based RNNLM system. The obtained RNNLM scores are also calibrated and fused with scores from an acoustic-based i-vector system and a traditional PPRLM system. This fusion provides additional improvements, showing that the RNNLM scores provide complementary information to the LID system.
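The pre-processing idea in the abstract (build phone-grams, then cluster their embeddings so that the RNNLM sees a smaller vocabulary of class labels) can be illustrated with a minimal sketch. It is not the authors' implementation: the phone sequence, the 2-D random embeddings, the helper names `phone_grams` and `kmeans`, and the choice of 3 clusters are all assumptions made for illustration, and a plain Lloyd's K-means stands in for whatever clustering setup the paper actually used.

```python
import random


def phone_grams(phones, order):
    """Return the overlapping phone-grams of the given order, joined with '_'."""
    return ["_".join(phones[i:i + order]) for i in range(len(phones) - order + 1)]


def kmeans(vectors, k, iterations=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster index per input vector."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    assignment = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for i, v in enumerate(vectors):
            assignment[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return assignment


# Hypothetical phone recognizer output (illustration only).
phones = ["k", "a", "s", "a", "k", "a", "l", "a"]
bigrams = phone_grams(phones, 2)
vocab = sorted(set(bigrams))

# Hypothetical embeddings: random 2-D vectors stand in for trained phone-vector embeddings.
emb_rng = random.Random(1)
embeddings = [[emb_rng.random(), emb_rng.random()] for _ in vocab]

# Map every phone-gram to its cluster label; the label sequence is what a
# class-based language model would be trained on in this sketch.
labels = kmeans(embeddings, k=3)
cluster_of = dict(zip(vocab, labels))
clustered_sequence = [f"C{cluster_of[g]}" for g in bigrams]
print(clustered_sequence)
```

In this sketch the six distinct bigrams collapse to at most three class labels, which is the vocabulary-reduction effect the abstract describes; in practice the embeddings would be learned from data rather than drawn at random.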

Funding Projects

Type                        Code                         Acronym      Leader       Title
Madrid Regional Government  TIN2014-54288-C4-1-R         ASLP-MULÁN   Unspecified  Unspecified
Madrid Regional Government  MICINN DPI2014-53525-C3-2-R  NAVEGABLE    Unspecified  Unspecified
Madrid Regional Government  S2009/TIC-1542               MA2VICMR     Unspecified  Unspecified

More information

Item ID: 47224
DC Identifier: https://oa.upm.es/47224/
OAI Identifier: oai:oa.upm.es:47224
DOI: 10.21437/Odyssey.2016-17
Official URL: http://www.odyssey2016.org/papers/pdfs_stamped/53....
Deposited by: Memoria Investigacion
Deposited on: 24 Oct 2017 16:20
Last Modified: 25 Mar 2023 10:28