Full text: PDF, Download (9MB)
Rivas Ruzafa, Elena (2020). Pix2Pitch: generating music from paintings by using conditional GANs. Thesis (Master thesis), E.T.S. de Ingenieros Informáticos (UPM).
Title: | Pix2Pitch: generating music from paintings by using conditional GANs |
---|---|
Author/s: | Rivas Ruzafa, Elena |
Contributor/s: |
|
Item Type: | Thesis (Master thesis) |
Masters title: | Inteligencia Artificial |
Date: | July 2020 |
Subjects: | |
Faculty: | E.T.S. de Ingenieros Informáticos (UPM) |
Department: | Inteligencia Artificial |
Creative Commons Licenses: | Recognition - No derivative works - Non commercial |
Generative adversarial networks (GANs) (Goodfellow et al., 2014) have been used extensively to transform and create images or sounds within their own domains, but translation between different modalities remains far less explored. This work proposes a method to generate sound from images, based on the Pix2Pix architecture (Isola et al., 2017), a conditional GAN designed for general-purpose image-to-image translation. A new implementation that creates music from images has been developed. The main goal is to generate music that describes a specific painting and to answer the question: how does that image sound? This is an answer that blind people could find useful in several settings, such as museums. To this end, the work takes into account theses that posit an interaction between visual art and music, as well as several studies of synesthetic experiences. The process involves three steps: first, labelling and pairing images and sounds from different styles and periods; second, extracting common features from the data by exploring multiple methods for music feature extraction; and third, introducing multimodal layers into the GAN. Finally, a method to create novel pieces of music from the generated sound features has been implemented. As presented in the state-of-the-art section, some advances in crossmodal generation have been achieved, but most focus on creating images from sound or images from text; only a few explore image-to-sound transformations.
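The conditioning step described in the abstract — feeding image information into a generator so it produces sound features — can be sketched minimally as follows. This is an illustrative assumption, not the thesis's actual implementation: the random projection stands in for a learned convolutional encoder, and all names, dimensions, and the noise-concatenation scheme are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def image_embedding(images, dim=64):
    # Hypothetical stand-in for a learned conv encoder:
    # flatten each image and apply a fixed random projection.
    flat = images.reshape(images.shape[0], -1)
    W = rng.standard_normal((flat.shape[1], dim)) / np.sqrt(flat.shape[1])
    return flat @ W

def conditioned_generator_input(images, noise_dim=32):
    # Conditional-GAN-style input: concatenate the image embedding
    # with a noise vector z, so a generator could map
    # (painting, z) -> sound features.
    emb = image_embedding(images)
    z = rng.standard_normal((images.shape[0], noise_dim))
    return np.concatenate([emb, z], axis=1)

# Four toy "paintings": batch of 3-channel 16x16 arrays.
batch = rng.standard_normal((4, 3, 16, 16))
g_in = conditioned_generator_input(batch)
print(g_in.shape)  # (4, 96): 64-dim embedding + 32-dim noise per image
```

In the actual Pix2Pix architecture the conditioning image is passed through an encoder-decoder (U-Net) rather than concatenated as a flat vector, and stochasticity comes from dropout rather than an explicit z; the sketch above only illustrates the general idea of conditioning a generator on an image.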
Item ID: | 63694 |
---|---|
DC Identifier: | https://oa.upm.es/63694/ |
OAI Identifier: | oai:oa.upm.es:63694 |
Deposited by: | Biblioteca Facultad de Informatica |
Deposited on: | 09 Sep 2020 13:26 |
Last Modified: | 09 Sep 2020 13:26 |