Emotion transplantation through adaptation in HMM-based speech synthesis

Lorenzo Trueba, Jaime; Barra Chicote, Roberto; San Segundo Hernández, Rubén; Ferreiros López, Javier; Yamagishi, Junichi and Montero Martínez, Juan Manuel (2015). Emotion transplantation through adaptation in HMM-based speech synthesis. "Computer Speech & Language", v. 34 (n. 1); pp. 292-307. ISSN 0885-2308. https://doi.org/10.1016/j.csl.2015.03.008.

Description

Title: Emotion transplantation through adaptation in HMM-based speech synthesis
Author(s):
  • Lorenzo Trueba, Jaime
  • Barra Chicote, Roberto
  • San Segundo Hernández, Rubén
  • Ferreiros López, Javier
  • Yamagishi, Junichi
  • Montero Martínez, Juan Manuel
Document Type: Article
Journal/Publication Title: Computer Speech & Language
Date: November 2015
Volume: 34
Subjects:
Informal Keywords: Statistical parametric speech synthesis; Expressive speech synthesis; Cascade adaptation; Emotion transplantation
School: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería Electrónica
Creative Commons Licenses: Attribution - No Derivatives - Non-Commercial

Abstract

This paper proposes an emotion transplantation method capable of modifying a synthetic speech model through the use of CSMAPLR adaptation in order to incorporate emotional information learned from a different speaker model while maintaining the identity of the original speaker as much as possible. The proposed method relies on learning both emotional and speaker identity information by means of their adaptation functions from an average voice model, and combining them into a single cascade transform capable of imbuing the desired emotion into the target speaker. This method is then applied to the task of transplanting four emotions (anger, happiness, sadness and surprise) into three male and three female speakers and evaluated in a number of perceptual tests. The results of the evaluations show that perceived naturalness for emotional text significantly favors the proposed transplanted emotional speech synthesis over traditional neutral speech synthesis, as evidenced by a large increase in the perceived emotional strength of the synthesized utterances at a slight cost in speech quality. A final evaluation with a robotic laboratory assistant application shows that using emotional speech significantly increases the students' satisfaction with the dialog system, demonstrating that the proposed emotion transplantation system provides benefits in real applications.
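
The idea of combining two adaptation functions into a single cascade transform can be illustrated with a minimal sketch. It assumes each adaptation is a plain affine map on a Gaussian mean vector (mu' = A mu + b), which is a simplification: CSMAPLR also transforms covariances and ties transforms across regression classes, and neither is modeled here. The function names, the toy 2-dimensional values, and the order of composition (emotion first, then speaker) are all assumptions made for illustration, not details taken from the paper.

```python
# Illustrative sketch only: helper names, toy values, and the composition
# order are assumptions, not the authors' implementation.

def matvec(A, x):
    """Multiply matrix A (list of rows) by vector x."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def apply_affine(transform, mean):
    """Apply an affine transform (A, b) to a mean vector: A @ mean + b."""
    A, b = transform
    return [m + bi for m, bi in zip(matvec(A, mean), b)]

def compose(outer, inner):
    """Return the single affine transform equal to outer(inner(x))."""
    A_o, b_o = outer
    A_i, b_i = inner
    A = matmul(A_o, A_i)                                   # A_o @ A_i
    b = [m + bo for m, bo in zip(matvec(A_o, b_i), b_o)]   # A_o @ b_i + b_o
    return A, b

# Hypothetical transforms, each imagined as learned from the average voice.
emotion = ([[1.5, 0.0], [0.0, 1.0]], [0.5, 0.0])    # average -> emotional
speaker = ([[1.0, 0.25], [0.0, 1.0]], [-1.0, 2.0])  # average -> target speaker

cascade = compose(speaker, emotion)

average_mean = [2.0, 4.0]
step_by_step = apply_affine(speaker, apply_affine(emotion, average_mean))
in_one_go = apply_affine(cascade, average_mean)
```

Under these assumptions, applying the composed transform once gives the same adapted mean as applying the two transforms in sequence, which is the sense in which two separately learned adaptations collapse into one.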

Associated projects

Type                 Code                  Acronym   Lead         Title
FP7                  TIN2011-28169-C05-03  TIMPANO   Unspecified  Unspecified
FP7                  MICINN                INAPRA    Unspecified  Unspecified
Comunidad de Madrid  S2009/TIC-1542        MA2VICMR  Unspecified  Unspecified

More information

Record ID: 40458
DC Identifier: http://oa.upm.es/40458/
OAI Identifier: oai:oa.upm.es:40458
DOI: 10.1016/j.csl.2015.03.008
Official URL: http://www.sciencedirect.com/science/article/pii/S0885230815000376
Deposited by: Memoria Investigacion
Deposited on: 23 May 2016 16:50
Last Modified: 01 Dec 2017 23:30