Parallel query processing in a polystore

Kranas, Pavlos, Kolev, Boyan, Levchenko, Oleksandra, Pacitti, Esther, Valduriez, Patrick, Jiménez Peris, Ricardo and Patiño Martínez, Marta (2021). Parallel query processing in a polystore. "Distributed And Parallel Databases", v. 39; pp. 939-977. ISSN 1573-7578. https://doi.org/10.1007/s10619-021-07322-5.

Description

Title:	Parallel query processing in a polystore
Author(s):	Kranas, Pavlos (https://orcid.org/0009-0006-3110-6977); Kolev, Boyan (https://orcid.org/0000-0003-4871-0434); Levchenko, Oleksandra (https://orcid.org/0000-0002-4230-338X); Pacitti, Esther (https://orcid.org/0000-0003-1370-9943); Valduriez, Patrick (https://orcid.org/0000-0001-6506-7538); Jiménez Peris, Ricardo (https://orcid.org/0000-0002-5130-9927); Patiño Martínez, Marta (https://orcid.org/0000-0003-2997-3722)
Document type:	Article
Journal/Publication title:	Distributed And Parallel Databases
Date:	3 February 2021
ISSN:	1573-7578
Volume:	39
Subjects:	Computer Science
Informal keywords:	Database integration, Distributed and parallel databases, Heterogeneous databases, Polystores, Query languages, Query processing
School:	E.T.S. de Ingenieros Informáticos (UPM)
Department:	Lenguajes y Sistemas Informáticos e Ingeniería del Software
Creative Commons license:	Attribution - No Derivative Works - Non-Commercial

Full text

Parallel query processing.pdf (PDF, 1MB)

Abstract

The blooming of different data stores has made polystores a major topic in the cloud and big data landscape. As the amount of data grows rapidly, it becomes critical to exploit the inherent parallel processing capabilities of underlying data stores and data processing platforms. To fully achieve this, a polystore should: (i) preserve the expressivity of each data store’s native query or scripting language and (ii) leverage a distributed architecture to enable parallel data integration, i.e. joins, on top of parallel retrieval of underlying partitioned datasets. In this paper, we address these points by: (i) using the polyglot approach of the CloudMdsQL query language that allows native queries to be expressed as inline scripts and combined with SQL statements for ad-hoc integration and (ii) incorporating the approach within the LeanXcale distributed query engine, thus allowing for native scripts to be processed in parallel at data store shards. In addition, (iii) efficient optimization techniques, such as bind join, can take place to improve the performance of selective joins. We evaluate the performance benefits of exploiting parallelism in combination with high expressivity and optimization through our experimental validation.
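The bind join mentioned in point (iii) can be illustrated with a minimal sketch. It is not the CloudMdsQL or LeanXcale implementation; it only shows the general idea of the technique: retrieve one side of the join first, collect its distinct join keys, and send those keys to the other data store as a filter so that only matching rows are shipped. The names `bind_join`, `fetch_orders`, `ORDERS`, and `customers` are hypothetical and chosen purely for illustration.

```python
from typing import Callable, Dict, Iterable, List, Set

Row = Dict[str, object]


def bind_join(
    left_rows: Iterable[Row],
    fetch_right: Callable[[Set[object]], Iterable[Row]],
    left_key: str,
    right_key: str,
) -> List[Row]:
    """Join two sources, pushing the left side's join keys to the right side."""
    left = list(left_rows)
    # Step 1: collect the distinct join keys found on the left side.
    keys = {row[left_key] for row in left}
    if not keys:
        return []
    # Step 2: ask the right-hand store only for rows matching those keys
    # and index them by join key.
    index: Dict[object, List[Row]] = {}
    for right in fetch_right(keys):
        index.setdefault(right[right_key], []).append(right)
    # Step 3: combine each left row with its matching right rows.
    return [{**l, **r} for l in left for r in index.get(l[left_key], [])]


# Hypothetical "remote" data store holding orders.
ORDERS: List[Row] = [
    {"order_id": 10, "customer_id": 1, "total": 25.0},
    {"order_id": 11, "customer_id": 2, "total": 40.0},
    {"order_id": 12, "customer_id": 3, "total": 15.5},
    {"order_id": 13, "customer_id": 2, "total": 9.9},
]
shipped: List[int] = []  # number of rows returned by each remote call


def fetch_orders(keys: Set[object]) -> List[Row]:
    # Stands in for a native query with a key filter (e.g. an IN list).
    rows = [o for o in ORDERS if o["customer_id"] in keys]
    shipped.append(len(rows))
    return rows


customers: List[Row] = [{"id": 1, "name": "Ada"}, {"id": 3, "name": "Grace"}]
result = bind_join(customers, fetch_orders, "id", "customer_id")
```

In this toy run the order store returns only the rows for customers 1 and 3 rather than all four orders, which is why the technique pays off when the join is selective.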

Associated projects

Type	Code	Acronym	Leader	Title
Horizon 2020	779747	Not specified	BigDataStack
Horizon 2020	856632	Not specified	INFINITECH
Horizon 2020	870675	Not specified	PolicyCLOUD
Comunidad de Madrid	P2018/TCS-4499	Not specified	EDGEDATA
Comunidad de Madrid	TIN2016-80350-P	Not specified	CLOUDDB

More information

Record ID:	86719
DC identifier:	https://oa.upm.es/86719/
OAI identifier:	oai:oa.upm.es:86719
Portal Científico URL:	https://portalcientifico.upm.es/es/ipublic/item/9123386
DOI:	10.1007/s10619-021-07322-5
Official URL:	https://link.springer.com/article/10.1007/s10619-0...
Deposited by:	iMarina Portal Científico
Deposited on:	24 Jan 2025 12:50
Last modified:	19 Mar 2025 11:50
