Development of a genre-dependent TTS system with cross-speaker speaking-style transplantation

Lorenzo Trueba, Jaime, Echeverry Correa, Julian David, Barra Chicote, Roberto, San Segundo Hernández, Rubén, Ferreiros López, Javier, Gallardo Antolín, Ascensión, Yamagishi, Junichi, King, Simon and Montero Martínez, Juan Manuel (2014). Development of a genre-dependent TTS system with cross-speaker speaking-style transplantation. In: "Proceedings of the 2nd International Workshop on Speech, Language and Audio in Multimedia (SLAM2014)", 11/09/2014 - 12/09/2014, Penang, Malaysia. pp. 39-42.

Description

Title:	Development of a genre-dependent TTS system with cross-speaker speaking-style transplantation
Author(s):	Lorenzo Trueba, Jaime; Echeverry Correa, Julian David; Barra Chicote, Roberto; San Segundo Hernández, Rubén (https://orcid.org/0000-0001-9659-5464); Ferreiros López, Javier (https://orcid.org/0000-0001-8834-3080); Gallardo Antolín, Ascensión; Yamagishi, Junichi; King, Simon; Montero Martínez, Juan Manuel (https://orcid.org/0000-0002-7908-5400)
Document Type:	Conference or Workshop Presentation (Article)
Event Title:	Proceedings of the 2nd International Workshop on Speech, Language and Audio in Multimedia (SLAM2014)
Event Dates:	11/09/2014 - 12/09/2014
Event Location:	Penang, Malaysia
Book Title:	2nd International Workshop on Speech, Language and Audio in Multimedia (SLAM2014)
Date:	2014
Subjects:	Telecommunications
SDG:	09. Industry, innovation and infrastructure
Informal Keywords:	Speech synthesis, speaking style transplantation, automatic genre identification, Latent Semantic Analysis
School:	E.T.S.I. Telecomunicación (UPM)
Department:	Ingeniería Electrónica
Creative Commons Licenses:	Attribution - No Derivatives - Non-Commercial

Abstract

One of the biggest challenges in speech synthesis is the production of contextually appropriate, natural-sounding synthetic voices. This means that a Text-To-Speech system must be able to analyze a text beyond sentence boundaries in order to select, or even modulate, the speaking style according to a broader context. Our current architecture is based on a two-step approach: text genre identification and speaking style synthesis according to the detected discourse genre. For the final implementation, four genres and their corresponding speaking styles were considered: broadcast news, live sports commentaries, interviews and political speeches. In the final TTS evaluation, the four speaking styles were transplanted to the neutral voices of other speakers not included in the training database. When the transplanted styles were compared to the neutral voices, transplantation was significantly preferred and the similarity to the target speaker was as high as 78%.

Associated projects

Government of Spain, code TIN2011-28169-C05-03 (remaining fields unspecified)
Government of Spain, code DPI2010-21247-C02-02 (remaining fields unspecified)

More information

Record ID:	37521
DC Identifier:	https://oa.upm.es/37521/
OAI Identifier:	oai:oa.upm.es:37521
Deposited by:	Memoria Investigacion
Deposited on:	14 Oct 2015 17:05
Last Modified:	06 Jun 2016 17:05
