Cost of fault-tolerance on data stream processing

Vianello, Valerio and Patiño Martínez, Marta and Azqueta Alzúaz, Ainhoa and Jiménez Peris, Ricardo (2018). Cost of fault-tolerance on data stream processing. In: "Workshop on Autonomic Solutions for Parallel and Distributed Data Stream Processing (Auto-DaSP)", 27-28 Aug 2018, Turin, Italy. ISBN 978-3-030-10548-8. pp. 17-27. https://doi.org/10.1007/978-3-030-10549-5_2.

Description

Title: Cost of fault-tolerance on data stream processing
Author/s:
  • Vianello, Valerio
  • Patiño Martínez, Marta
  • Azqueta Alzúaz, Ainhoa
  • Jiménez Peris, Ricardo
Item Type: Presentation at Congress or Conference (Article)
Event Title: Workshop on Autonomic Solutions for Parallel and Distributed Data Stream Processing (Auto-DaSP)
Event Dates: 27-28 Aug 2018
Event Location: Turin, Italy
Title of Book: Euro-Par 2018: Parallel Processing Workshops
Date: 2018
ISBN: 978-3-030-10548-8
Freetext Keywords: Data streaming; Fault tolerance; Evaluation; HiBench
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Lenguajes y Sistemas Informáticos e Ingeniería del Software
Creative Commons Licenses: Attribution - NonCommercial - NoDerivatives

Abstract

Data streaming engines process data on the fly, in contrast to databases, which first store the data and then process it. To process the increasing amount of data produced every day, data streaming engines run on top of a distributed system. In this setting, failures are likely to happen. Current distributed data streaming engines such as Apache Flink provide fault tolerance. In this paper we evaluate the performance impact of Flink's fault tolerance mechanisms on a distributed system, both during regular operation (when there are no failures) and when failures occur. We use the Intel HiBench benchmark suite to conduct the evaluation.

Funding Projects

Type | Code | Acronym | Leader | Title
Horizon 2020 | 732051 | CloudDBAppliance | BULL SAS | European cloud in-memory database appliance with predictable performance for critical applications
Horizon 2020 | 727560 | CrowdHEALTH | ATOS SPAIN SA | Collective wisdom driving public health policies
Horizon 2020 | 779747 | BigDataStack | IBM ISRAEL - SCIENCE AND TECHNOLOGY LTD | High-performance data-centric stack for big data applications and operations
Madrid Regional Government | S2013TIC2894 | Cloud4BigData | Unspecified | Unspecified
Government of Spain | TIN2016-80350 | Unspecified | Universidad Politécnica de Madrid | CloudDB: una base de datos ultraescalable, eficiente y altamente disponible

More information

Item ID: 56629
DC Identifier: http://oa.upm.es/56629/
OAI Identifier: oai:oa.upm.es:56629
DOI: 10.1007/978-3-030-10549-5_2
Official URL: https://link.springer.com/chapter/10.1007/978-3-030-10549-5_2
Deposited by: Memoria Investigacion
Deposited on: 23 Oct 2019 11:12
Last Modified: 23 Oct 2019 11:12