Towards speaking style transplantation in speech synthesis

Lorenzo Trueba, Jaime and Barra Chicote, Roberto and Yamagishi, J. and Watts, Oliver and Montero Martínez, Juan Manuel (2013). Towards speaking style transplantation in speech synthesis. In: "8th ISCA Speech Synthesis Workshop", 31/08/2013 - 02/09/2013, Barcelona, Spain. pp. 159-163.

Description

Title: Towards speaking style transplantation in speech synthesis
Author/s:
  • Lorenzo Trueba, Jaime
  • Barra Chicote, Roberto
  • Yamagishi, J.
  • Watts, Oliver
  • Montero Martínez, Juan Manuel
Item Type: Presentation at Congress or Conference (Article)
Event Title: 8th ISCA Speech Synthesis Workshop
Event Dates: 31/08/2013 - 02/09/2013
Event Location: Barcelona, Spain
Title of Book: 8th ISCA Speech Synthesis Workshop
Date: 2013
Freetext Keywords: Expressive speech synthesis, speaking styles, adaptation, expressiveness transplantation
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería Electrónica
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

One of the biggest challenges in speech synthesis is the production of natural-sounding synthetic voices. The resulting voice must not only be of high enough quality but must also capture the natural expressiveness of human speech. This paper focuses on the expressiveness problem by proposing a set of techniques for extrapolating the expressiveness of proven high-quality speaking style models to neutral speakers in HMM-based synthesis. As an additional advantage, the proposed techniques are based on adaptation approaches, so they can be used with little training data (around 15 minutes per style in this paper). For the final implementation, a set of 4 speaking styles was considered: news broadcasts, live sports commentary, interviews and parliamentary speech. Finally, the implementations of the 5 techniques were tested through a perceptual evaluation, which shows that the deviations between neutral and speaking style average models can be learned and used to imbue expressiveness into target neutral speakers as intended.

More information

Item ID: 30098
DC Identifier: http://oa.upm.es/30098/
OAI Identifier: oai:oa.upm.es:30098
Deposited by: Memoria Investigacion
Deposited on: 27 Jul 2014 08:02
Last Modified: 22 Apr 2016 00:24