eprintid: 20409
rev_number: 15
eprint_status: archive
userid: 1903
dir: disk0/00/02/04/09
datestamp: 2013-10-05 08:38:23
lastmod: 2016-04-21 23:12:24
status_changed: 2013-10-05 08:38:23
type: conference_item
metadata_visibility: show
item_issues_count: 0
creators_name: Lorenzo Trueba, Jaime
creators_name: Barra Chicote, Roberto
creators_name: Raitio, Tuomo
creators_name: Obin, Nicolas
creators_name: Alku, Paavo
creators_name: Yamagishi, J.
creators_name: Montero Martínez, Juan Manuel
creators_id: jaime.lorenzo@die.upm.es
creators_id: barra@die.upm.es
title: Towards glottal source controllability in expressive speech synthesis
ispublished: pub
subjects: telecomunicaciones
keywords: Expressive speech synthesis, speaking style, glottal source modeling.
abstract: In order to obtain more human-like sounding human-machine interfaces, we must first be able to give them expressive capabilities in the form of emotional and stylistic features, so as to closely adapt them to the intended task. If we want to replicate those features, it is not enough to merely replicate the prosodic information of fundamental frequency and speaking rhythm. The proposed additional layer is the modification of the glottal model, for which we make use of the GlottHMM parameters. This paper analyzes the viability of such an approach by verifying that the expressive nuances are captured by the aforementioned features, obtaining 95% recognition rates on styled speaking and 82% on emotional speech. Then we evaluate the effect of speaker bias and recording environment on the source modeling in order to quantify possible problems when analyzing multi-speaker databases. Finally, we propose a speaking-style separation for Spanish based on prosodic features and check its perceptual significance.
date: 2012
date_type: published
full_text_status: public
pres_type: paper
pagerange: 1-4
pages: 4
event_title: InterSpeech 2012 - 13th Annual Conference of the International Speech Communication Association
event_location: Portland, Oregon
event_dates: 09/09/2012 - 13/09/2012
event_type: conference
institution: Telecomunicacion
department: Ingenieria_Electronica
refereed: TRUE
book_title: InterSpeech 2012 - 13th Annual Conference of the International Speech Communication Association
rights: by-nc-nd
citation: Lorenzo Trueba, Jaime, Barra Chicote, Roberto, Raitio, Tuomo, Obin, Nicolas, Alku, Paavo, Yamagishi, J. and Montero Martínez, Juan Manuel (2012). Towards glottal source controllability in expressive speech synthesis. In: "InterSpeech 2012 - 13th Annual Conference of the International Speech Communication Association", 09/09/2012 - 13/09/2012, Portland, Oregon. pp. 1-4.
document_url: https://oa.upm.es/20409/1/INVE_MEM_2012_134451.pdf