Cost of fault-tolerance on data stream processing

Vianello, Valerio, Patiño Martínez, Marta ORCID: https://orcid.org/0000-0001-6947-4974, Azqueta Alzúaz, Ainhoa ORCID: https://orcid.org/0000-0002-5451-8900 and Jiménez Peris, Ricardo (2018). Cost of fault-tolerance on data stream processing. In: "Workshop on Autonomic Solutions for Parallel and Distributed Data Stream Processing (Auto-DaSP)", 27-28 Aug 2018, Turin, Italy. ISBN 978-3-030-10548-8. pp. 17-27. https://doi.org/10.1007/978-3-030-10549-5_2.

Description

Title: Cost of fault-tolerance on data stream processing
Author/s: Vianello, Valerio; Patiño Martínez, Marta; Azqueta Alzúaz, Ainhoa; Jiménez Peris, Ricardo
Item Type: Presentation at Congress or Conference (Article)
Event Title: Workshop on Autonomic Solutions for Parallel and Distributed Data Stream Processing (Auto-DaSP)
Event Dates: 27-28 Aug 2018
Event Location: Turin, Italy
Title of Book: Euro-Par 2018: Parallel Processing Workshops
Date: 2018
ISBN: 978-3-030-10548-8
Subjects:
Freetext Keywords: Data streaming; Fault tolerance; Evaluation; HiBench
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Lenguajes y Sistemas Informáticos e Ingeniería del Software
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

Data streaming engines process data on the fly, in contrast to databases, which first store the data and then process it. To handle the increasing amount of data produced every day, data streaming engines run on top of distributed systems, where failures are likely to happen. Current distributed data streaming engines such as Apache Flink provide fault tolerance. In this paper we evaluate the performance impact of Flink's fault tolerance mechanisms on a distributed system, both during regular operation (when there are no failures) and when failures occur. We use Intel HiBench to conduct the evaluation.

Funding Projects

Type: Horizon 2020
Code: 732051
Acronym: CloudDBAppliance
Leader: BULL SAS
Title: European cloud in-memory database appliance with predictable performance for critical applications

Type: Horizon 2020
Code: 727560
Acronym: CrowdHEALTH
Leader: ATOS SPAIN SA
Title: Collective wisdom driving public health policies

Type: Horizon 2020
Code: 779747
Acronym: BigDataStack
Leader: IBM ISRAEL - SCIENCE AND TECHNOLOGY LTD
Title: High-performance data-centric stack for big data applications and operations

Type: Madrid Regional Government
Code: S2013TIC2894
Acronym: Cloud4BigData
Leader: Unspecified
Title: Unspecified

Type: Government of Spain
Code: TIN2016-80350
Acronym: Unspecified
Leader: Universidad Politécnica de Madrid
Title: CloudDB: una base de datos ultraescalable, eficiente y altamente disponible

More information

Item ID: 56629
DC Identifier: https://oa.upm.es/56629/
OAI Identifier: oai:oa.upm.es:56629
DOI: 10.1007/978-3-030-10549-5_2
Official URL: https://link.springer.com/chapter/10.1007/978-3-03...
Deposited by: Memoria Investigacion
Deposited on: 23 Oct 2019 11:12
Last Modified: 23 Oct 2019 11:12