Towards a unified ingestion-and-storage architecture for stream processing

Marcu, Ovidiu-Cristian, Costan, Alexandru, Antoniu, Gabriel, Pérez Hernández, María de los Santos ORCID: https://orcid.org/0000-0003-2949-3307, Tudoran, Radu, Bortoli, Stefano and Nicolae, Bogdan (2017). Towards a unified ingestion-and-storage architecture for stream processing. In: "2017 IEEE International Conference on Big Data (BigData)", 11-14 Dec 2017, Boston, United States. ISBN 978-1-5386-2715-0. pp. 2402-2407. https://doi.org/10.1109/BigData.2017.8258196.

Description

Title: Towards a unified ingestion-and-storage architecture for stream processing
Author/s:
Item Type: Presentation at Congress or Conference (Article)
Event Title: 2017 IEEE International Conference on Big Data (BigData)
Event Dates: 11-14 Dec 2017
Event Location: Boston, United States
Title of Book: BigData Conference 2017
Date: 2017
ISBN: 978-1-5386-2715-0
Subjects:
Freetext Keywords: Big Data; Streaming; Storage; Ingestion; Unified architecture
Faculty: E.T.S. de Ingenieros Informáticos (UPM)
Department: Arquitectura y Tecnología de Sistemas Informáticos
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

Big Data applications are rapidly moving from a batch-oriented execution model to a streaming execution model in order to extract value from the data in real time. However, processing live data alone is often not enough: in many cases, such applications need to combine the live data with previously archived data to increase the quality of the extracted insights. Current streaming-oriented runtimes and middleware are not flexible enough to deal with this trend, as they address ingestion (collection and pre-processing of data streams) and persistent storage (archival of intermediate results) using separate services. This separation often leads to I/O redundancy (e.g., writing data twice to disk or transferring data twice over the network) and interference (e.g., I/O bottlenecks when collecting data streams and writing archival data simultaneously). In this position paper, we argue for a unified ingestion and storage architecture for streaming data that addresses the aforementioned challenge. We identify a set of constraints and benefits for such a unified model, while highlighting the important architectural aspects required to implement it in real life. Based on these aspects, we briefly sketch our plan for future work that develops the position defended in this paper.

Funding Projects

Type: Horizon 2020
Code: MSCA-ITN-2014-642963
Acronym: BigStorage
Leader: Unspecified
Title: BigStorage: Storage-based convergence between HPC and Cloud to handle Big Data

More information

Item ID: 50630
DC Identifier: https://oa.upm.es/50630/
OAI Identifier: oai:oa.upm.es:50630
DOI: 10.1109/BigData.2017.8258196
Official URL: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&ar...
Deposited by: Memoria Investigacion
Deposited on: 05 Jun 2019 10:35
Last Modified: 05 Jun 2019 10:35