Introducing knowledge distillation in a multi-task neural network

Hernández Munuera, Alejandro (2019). Introducing knowledge distillation in a multi-task neural network. Thesis (Master thesis), E.T.S. de Ingenieros Informáticos (UPM).

Description

Title: Introducing knowledge distillation in a multi-task neural network
Author/s:
  • Hernández Munuera, Alejandro
Contributor/s:
  • Obermayer, Klaus
Item Type: Thesis (Master thesis)
Master's title: Data Science
Date: 1 May 2019
Subjects:
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Other
Creative Commons Licenses: Attribution - NonCommercial - NoDerivatives (CC BY-NC-ND)

Full text

PDF (415 kB)

Abstract

Multi-task learning (MTL) consists of training a neural network (NN) to perform more than one task, exploiting the knowledge shared between tasks. One of the main problems in MTL is the lack of large datasets containing ground truth (GT) for several tasks at each data point. As a consequence, the loss and the backpropagation process have to be adapted to this circumstance, undermining the performance of the model. Building on previous work, we use a knowledge distillation (KD) technique to try to overcome this limitation and transfer more general knowledge using the output of another NN. KD is a setup in which a NN does not learn from the GT; instead, it extracts knowledge from the soft output of another NN already trained for the corresponding task. Throughout this project, we analyze whether the KD technique can be used to overcome the data limitation that MTL faces and thereby exploit the advantages of MTL at every training data point. Training a multi-task NN on the segmentation and object detection tasks, using the VOC2012seg dataset, which has GT for both tasks at each data point, we run several experiments substituting the GT of a specific task with the output of the corresponding “tutor” network on different subsets of images. The results show that, with KD and an appropriate “tutor” NN, the MTL training process can lack 100% of the GT for a specific task and even reach better performance than the same NN trained with the complete GT.
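To make the setup described in the abstract concrete, below is a minimal PyTorch sketch of a per-task loss that falls back to distilling a “tutor” network's soft output whenever the GT for that task is missing. It is an illustrative assumption, not the thesis's actual implementation: the function names, the temperature value, the equal task weighting, and the use of plain cross-entropy for the detection head are all simplifications.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Standard temperature-based KD loss: KL divergence between the
    student's and the teacher's softened output distributions."""
    # Soften both distributions with temperature T.
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=1)
    # Scale by T^2 so gradient magnitudes stay comparable to the
    # hard-label cross-entropy loss.
    return F.kl_div(log_p_student, p_teacher,
                    reduction="batchmean") * temperature ** 2

def multitask_loss(seg_out, det_out, det_gt, seg_gt=None, seg_tutor=None):
    """Multi-task loss where missing segmentation GT is replaced by the
    soft output of a pre-trained 'tutor' network.

    seg_out:   student segmentation logits, shape (N, C, H, W)
    det_out:   student detection logits (simplified to classification here)
    seg_tutor: tutor segmentation logits for samples lacking seg_gt
    """
    # Detection head trained on hard labels (real detection losses are
    # more involved; this is a placeholder).
    det_loss = F.cross_entropy(det_out, det_gt)
    if seg_gt is not None:
        # Segmentation GT available: ordinary supervised loss.
        seg_loss = F.cross_entropy(seg_out, seg_gt)
    else:
        # GT missing for this task: distill from the tutor instead.
        seg_loss = distillation_loss(seg_out, seg_tutor)
    return det_loss + seg_loss
```

Under this sketch, the experiments in the abstract correspond to choosing, per image subset, whether the segmentation branch receives seg_gt or only seg_tutor; in the extreme case reported, every image uses the tutor output for one task.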

More information

Item ID: 57395
DC Identifier: http://oa.upm.es/57395/
OAI Identifier: oai:oa.upm.es:57395
Deposited by: Biblioteca Facultad de Informatica
Deposited on: 25 Nov 2019 08:32
Last Modified: 25 Nov 2019 08:32