5G Multimedia QoE Optimization based on Deep Reinforcement Learning algorithms (A2C)

Río Ponce, Alberto del (2020). 5G Multimedia QoE Optimization based on Deep Reinforcement Learning algorithms (A2C). Thesis (Master's), E.T.S.I. Telecomunicación (UPM).

Description

Title: 5G Multimedia QoE Optimization based on Deep Reinforcement Learning algorithms (A2C)
Author(s):
  • Río Ponce, Alberto del
Supervisor(s):
Document type: Thesis (Master's)
Master's programme: Teoría de la Señal y Comunicaciones
Date: 2020
Subjects:
SDGs:
Informal keywords: Deep Reinforcement Learning, Adaptive Multimedia, A2C, Quality of Experience
School: E.T.S.I. Telecomunicación (UPM)
Department: Teoría de la Señal y Comunicaciones
Creative Commons licences: Attribution - NonCommercial - ShareAlike

Full text

TFM_ALBERTO_DEL_RIO_PONCE.pdf (PDF, 1MB)

Abstract

Many real-world systems require reasoning about the long-term consequences of a given configuration and of the actions taken on it. Each configuration calls for a corresponding response, and this mapping, properly encoded, can be treated as a Reinforcement Learning problem.

Two of the most powerful fields in recent years are Deep Learning and Reinforcement Learning. In general, the first uses neural networks to train models more effectively, while the second seeks to learn which actions an agent should take to maximize the reward it receives. One of the algorithms that combines both disciplines is Advantage Actor-Critic (A2C).
The standardization of the new 5G networks has driven innovation in the telecommunications sector and opened new business opportunities, one of the most prominent being the control of multimedia content over a 5G network.
These networks make it possible to send content over long distances with low latency. One of the remaining challenges is to guarantee a Quality of Experience (QoE) through automated control of the intermediate processes.
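
As an illustration of the A2C idea mentioned above, the following is a minimal sketch of how the advantage and the actor and critic losses are commonly computed for a short rollout. It is not taken from the thesis: the function names, the discount factor `gamma`, and the use of plain Python lists instead of a deep learning framework are all assumptions made for readability.

```python
def discounted_returns(rewards, bootstrap_value, gamma=0.99):
    """Compute R_t = r_t + gamma * R_{t+1}, starting from the critic's
    value estimate for the state that follows the last step."""
    returns = []
    running = bootstrap_value
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def a2c_losses(log_probs, values, rewards, bootstrap_value, gamma=0.99):
    """Return (actor_loss, critic_loss, advantages) for one rollout.

    log_probs: log pi(a_t | s_t) of the actions actually taken (hypothetical input)
    values:    critic estimates V(s_t) for the visited states (hypothetical input)
    """
    returns = discounted_returns(rewards, bootstrap_value, gamma)
    # Advantage: how much better the observed return was than the critic expected.
    advantages = [R - v for R, v in zip(returns, values)]
    n = len(rewards)
    # Actor loss: raise the log-probability of actions with positive advantage.
    # In a real framework the advantage would be treated as a constant here.
    actor_loss = -sum(lp * adv for lp, adv in zip(log_probs, advantages)) / n
    # Critic loss: mean squared error between value estimates and returns.
    critic_loss = sum(adv ** 2 for adv in advantages) / n
    return actor_loss, critic_loss, advantages
```

In practice both losses would be computed on tensors and minimized by gradient descent, often together with an entropy bonus to encourage exploration; those parts are omitted here.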

This research project focuses on developing an A2C-based Deep Reinforcement Learning model that automates the bitrate control of a multimedia transmission, with the aim of guaranteeing a QoE. A set of streaming components is recreated to collect various metrics, which are encoded as the states fed to the model. The actions the model predicts in real time are used to configure the maximum transmission bitrate. To assess training performance, several reward functions are developed, the most important of which is based on the Mean Opinion Score (MOS) evaluated by a quality probe placed at the end-user's viewing position.
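
The loop described above (metrics encoded as a state, a predicted action setting the maximum bitrate) could be sketched roughly as follows. Everything specific here is hypothetical and chosen only for illustration: the metric names in `StreamMetrics`, the values in `BITRATE_LADDER_KBPS`, and the normalization bounds are not taken from the thesis.

```python
from dataclasses import dataclass

# Hypothetical bitrate ladder in kbit/s; one entry per discrete action.
BITRATE_LADDER_KBPS = [500, 1000, 2500, 5000, 8000]


@dataclass
class StreamMetrics:
    """Hypothetical metrics collected from the streaming components."""
    throughput_kbps: float
    packet_loss: float  # fraction in [0, 1]
    mos: float          # Mean Opinion Score on the usual 1..5 scale


def encode_state(m, max_throughput_kbps=10000.0):
    """Scale each metric to [0, 1] so the model sees comparable ranges."""
    return [
        min(m.throughput_kbps / max_throughput_kbps, 1.0),
        min(max(m.packet_loss, 0.0), 1.0),
        (m.mos - 1.0) / 4.0,
    ]


def action_to_max_bitrate(action):
    """Map a discrete action index to the maximum bitrate to configure."""
    if not 0 <= action < len(BITRATE_LADDER_KBPS):
        raise ValueError(f"unknown action index: {action}")
    return BITRATE_LADDER_KBPS[action]
```

A discrete ladder is only one possible design; the action could equally be a relative step (raise, keep, lower) applied to the current bitrate.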

The developed models were evaluated through several Test Cases in which the reward functions were modified to emphasize different available metrics, including an examination of the challenges of real-time training. The thesis concludes with a discussion of potential directions for future research and of possible extensions for system optimization.
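
One way to picture reward functions that emphasize different metrics is a weighted sum whose weights change from one test case to the next. The sketch below is an assumption for illustration only: the terms, the weights `w_mos`, `w_bitrate` and `w_loss`, and the normalization by `max_bitrate_kbps` are hypothetical and do not reproduce the reward functions defined in the thesis.

```python
def reward(mos, bitrate_kbps, packet_loss,
           w_mos=1.0, w_bitrate=0.0, w_loss=0.0,
           max_bitrate_kbps=8000.0):
    """Weighted reward over hypothetical metrics.

    mos:          Mean Opinion Score on the 1..5 scale
    bitrate_kbps: bitrate currently being transmitted
    packet_loss:  fraction of lost packets in [0, 1]
    """
    mos_term = (mos - 1.0) / 4.0  # rescaled to [0, 1]
    bitrate_term = min(bitrate_kbps / max_bitrate_kbps, 1.0)
    # Higher MOS and bitrate are rewarded; packet loss is penalized.
    return w_mos * mos_term + w_bitrate * bitrate_term - w_loss * packet_loss
```

With the default weights the reward depends on MOS alone; a call such as `reward(3.0, 4000.0, 0.5, w_mos=1.0, w_bitrate=0.5, w_loss=1.0)` would instead trade perceived quality against bitrate and packet loss.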

More information

Record ID: 65602
DC identifier: https://oa.upm.es/65602/
OAI identifier: oai:oa.upm.es:65602
Deposited by: Alberto del Río Ponce
Deposited on: 01 Dec 2020 09:59
Last modified: 01 Dec 2020 09:59