5G Multimedia QoE Optimization based on Deep Reinforcement Learning algorithms (A2C)

Río Ponce, Alberto del (2020). 5G Multimedia QoE Optimization based on Deep Reinforcement Learning algorithms (A2C). Thesis (Master thesis), E.T.S.I. Telecomunicación (UPM).

Description

Title: 5G Multimedia QoE Optimization based on Deep Reinforcement Learning algorithms (A2C)
Author/s:
  • Río Ponce, Alberto del
Contributor/s:
  • Serrano Romero, Javier
Item Type: Thesis (Master thesis)
Master's title: Teoría de la Señal y Comunicaciones
Date: 2020
Subjects:
Freetext Keywords: Deep Reinforcement Learning, Adaptive Multimedia, A2C, Quality of Experience
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Teoría de la Señal y Comunicaciones
Creative Commons Licenses: Attribution - NonCommercial - ShareAlike

Abstract

Many real-world systems require reasoning about the long-term consequences of a given configuration and of the actions taken on it. Each change in configuration calls for a corresponding response, and this interaction, properly encoded, can be treated as a Reinforcement Learning problem. Deep Learning and Reinforcement Learning are two of the most powerful fields to emerge in recent years. Broadly, the first uses neural networks to train models more efficiently, while the second seeks the actions an agent should take to maximize the reward it receives. Advantage Actor-Critic (A2C) is one of the algorithms that combines both disciplines. The standardization of 5G networks has brought innovations to the telecommunications sector and, with them, new business opportunities, one of the most prominent being the control of multimedia content over a 5G network. These networks make it possible to send content over long distances with low latency; one of the remaining challenges is to guarantee Quality of Experience (QoE) through automated control of intermediate processes. This research project focuses on developing the A2C algorithm, using Deep Reinforcement Learning techniques, to automate the bitrate control of a multimedia transmission with the aim of guaranteeing QoE. A set of streaming components is recreated to obtain the metrics that feed the model's encoded states, and the actions the model predicts in real time are used to configure the maximum transmission bitrate. To assess training performance, several reward functions are developed, the most important of which is based on the Mean Opinion Score (MOS) measured by a quality probe placed at the viewing position of an end user.
The developed models are evaluated through several Test Cases in which the reward functions are modified to emphasize the importance of the available metrics, and which also examine the challenge of real-time training. The thesis concludes with a discussion of potential directions for future research and of possible extensions to the system optimizations.

More information

Item ID: 65602
DC Identifier: http://oa.upm.es/65602/
OAI Identifier: oai:oa.upm.es:65602
Deposited by: Alberto del Río Ponce
Deposited on: 01 Dec 2020 09:59
Last Modified: 01 Dec 2020 09:59