González González, Alejandro (2020). Telemetry data for machine learning based scheduling. Thesis (Master thesis), E.T.S. de Ingenieros Informáticos (UPM).
Title: | Telemetry data for machine learning based scheduling |
---|---|
Author/s: | González González, Alejandro |
Contributor/s: |
|
Item Type: | Thesis (Master thesis) |
Masters title: | Data Science |
Date: | 2020 |
Subjects: | |
Faculty: | E.T.S. de Ingenieros Informáticos (UPM) |
Department: | Inteligencia Artificial |
Creative Commons Licenses: | Attribution - NoDerivatives - NonCommercial |
Computing clusters generate very large amounts of data, including node resource data and application-related data, among others. However, current systems do not exploit the full potential of this data. This thesis attempts to use cluster telemetry data for two purposes: scheduling and workload estimation. Motivated by recent advances in machine learning, a Deep Reinforcement Learning (DRL) based scheduler is proposed. Two scheduling experiments are performed in a simulated cluster environment. The results show that the DRL-based scheduler can be trained on specific cluster architectures to optimize performance metrics such as job completion time, obtaining the best scheduling policy when compared with traditional scheduling heuristics. In addition, Long Short-Term Memory (LSTM) neural networks are proposed to estimate the workload in computing clusters, and an experiment using LSTM to forecast cluster resource usage was implemented. The results of the experiment reveal that past telemetry data can be used successfully to predict the future workload of the system. Furthermore, the results indicate that LSTM neural networks can be used to anticipate system failures. Finally, a combination of DRL-based scheduling and workload estimation is proposed as a future line of research.
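As a rough illustration of the job completion time metric mentioned in the abstract, the sketch below compares two traditional heuristics on a single node. Everything here is an assumption made for illustration: the abstract does not say which heuristics the thesis used as baselines, and the function name `mean_completion_time`, the single-node setup, and the job durations are hypothetical.

```python
from itertools import accumulate


def mean_completion_time(durations):
    """Mean completion time when jobs run back to back on one node.

    Assumes all jobs are submitted at time 0 and run in the given order.
    """
    finish_times = list(accumulate(durations))  # running sum = finish time of each job
    return sum(finish_times) / len(finish_times)


# Hypothetical job durations in seconds (not taken from the thesis).
jobs = [8, 1, 3]

fifo = mean_completion_time(jobs)          # first in, first out: arrival order
sjf = mean_completion_time(sorted(jobs))   # shortest job first

print(f"FIFO mean completion time: {fifo:.2f} s")
print(f"SJF mean completion time:  {sjf:.2f} s")
```

In this toy setting, running the shortest job first lowers the mean completion time because short jobs no longer wait behind a long one. A learned scheduler would be evaluated against baselines of this kind, on whatever metric the experiment optimizes.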
Item ID: | 64997 |
---|---|
DC Identifier: | https://oa.upm.es/64997/ |
OAI Identifier: | oai:oa.upm.es:64997 |
Deposited by: | Biblioteca Facultad de Informatica |
Deposited on: | 26 Oct 2020 10:11 |
Last Modified: | 01 Jun 2022 12:50 |