Acoustic Emotion Recognition using Dynamic Bayesian Networks and Multi-Space Distributions

Barra Chicote, Roberto and Fernández Martínez, Fernando and Lebai Lutfi, Syaheerah Binti and Lucas Cuesta, Juan Manuel and Macías Guarasa, Javier and Montero Martínez, Juan Manuel and San Segundo Hernández, Rubén and Pardo Muñoz, José Manuel (2009). Acoustic Emotion Recognition using Dynamic Bayesian Networks and Multi-Space Distributions. In: "10th Annual Conference of the International Speech Communication Association, Interspeech 2009", 06/09/2009 - 10/09/2009, Brighton, UK.

Description

Title: Acoustic Emotion Recognition using Dynamic Bayesian Networks and Multi-Space Distributions
Author/s:
  • Barra Chicote, Roberto
  • Fernández Martínez, Fernando
  • Lebai Lutfi, Syaheerah Binti
  • Lucas Cuesta, Juan Manuel
  • Macías Guarasa, Javier
  • Montero Martínez, Juan Manuel
  • San Segundo Hernández, Rubén
  • Pardo Muñoz, José Manuel
Item Type: Presentation at Congress or Conference (Article)
Event Title: 10th Annual Conference of the International Speech Communication Association, Interspeech 2009
Event Dates: 06/09/2009 - 10/09/2009
Event Location: Brighton, UK
Title of Book: Proceedings of 10th Annual Conference of the International Speech Communication Association, Interspeech 2009
Date: 2009
Subjects:
Freetext Keywords: automatic emotion recognition, multi-space probability distribution, dynamic Bayesian networks, emotion challenge
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería Electrónica
Creative Commons Licenses: Attribution - NonCommercial - NoDerivatives

Abstract

In this paper we describe the acoustic emotion recognition system built at the Speech Technology Group of the Universidad Politécnica de Madrid (Spain) to participate in the INTERSPEECH 2009 Emotion Challenge. Our proposal is based on a Dynamic Bayesian Network (DBN) that handles the temporal modelling of emotional speech information. The selected features (MFCC, F0, energy and their variants) are modelled as separate streams, and the F0-related ones are integrated under a Multi-Space Distribution (MSD) framework to properly model their dual nature (voiced/unvoiced). Experimental evaluation on the challenge test set shows an unweighted recall of 67.06% and 38.24% for the 2-class and 5-class tasks, respectively. In the 2-class case, we achieve results similar to the baseline with considerably fewer features. In the 5-class case, we achieve a statistically significant 6.5% relative improvement.
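
The abstract reports its results as unweighted recall. As a minimal sketch of that metric, assuming the usual definition (the mean of the per-class recalls, so each class counts equally regardless of how many samples it has), the computation could look as follows. The function names and the toy labels are hypothetical illustrations and are not taken from the paper or the challenge data.

```python
from collections import Counter


def unweighted_recall(true_labels, predicted_labels):
    """Mean of per-class recalls: every class has the same weight."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    # Number of samples per true class.
    totals = Counter(true_labels)
    # Number of correctly classified samples per true class.
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    recalls = [hits[c] / totals[c] for c in totals]
    return sum(recalls) / len(recalls)


def weighted_recall(true_labels, predicted_labels):
    """Overall accuracy: each class is implicitly weighted by its size."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    if not true_labels:
        raise ValueError("at least one sample is required")
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical, imbalanced toy data: 8 samples of class "A", 2 of class "B".
truth = ["A"] * 8 + ["B"] * 2
# A degenerate classifier that always predicts the majority class.
guess = ["A"] * 10

print(unweighted_recall(truth, guess))  # recall(A) = 1.0, recall(B) = 0.0
print(weighted_recall(truth, guess))    # 8 of 10 samples are correct
```

On imbalanced data the two measures diverge: a classifier that always predicts the majority class scores well on weighted recall but only reaches chance level on unweighted recall, which is why the latter is the more informative figure for tasks with skewed class sizes.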

More information

Item ID: 5575
DC Identifier: http://oa.upm.es/5575/
OAI Identifier: oai:oa.upm.es:5575
Official URL: http://www.isca-speech.org/archive/interspeech_2009/i09_0336.html
Deposited by: Memoria Investigacion
Deposited on: 23 Dec 2010 08:44
Last Modified: 20 Apr 2016 14:21