Mesh traversal and sorting for efficient memory usage in scientific codes

Barrio López-Cortijo, Pablo and Carreras Vaquer, Carlos (2011). Mesh traversal and sorting for efficient memory usage in scientific codes. In: "IEEE 30th International Performance Computing and Communications Conference (IPCCC)", 17/11/2011 - 19/11/2011, Orlando, USA. pp. 1-8.

Description

Title: Mesh traversal and sorting for efficient memory usage in scientific codes
Author(s):
  • Barrio López-Cortijo, Pablo
  • Carreras Vaquer, Carlos
Document Type: Conference or Workshop Paper (Article)
Event Title: IEEE 30th International Performance Computing and Communications Conference (IPCCC)
Event Dates: 17/11/2011 - 19/11/2011
Event Location: Orlando, USA
Book Title: IEEE 30th International Performance Computing and Communications Conference (IPCCC)
Date: November 2011
Subjects:
School: E.T.S.I. Telecomunicación (UPM)
Department: Electronic Engineering
Creative Commons Licenses: Attribution - No Derivative Works - Non-Commercial

Abstract

Applications that operate on meshes are very popular in High Performance Computing (HPC) environments. In the past, many techniques have been developed in order to optimize the memory accesses for these datasets. Different loop transformations and domain decompositions are commonly used for structured meshes. However, unstructured grids are more challenging. The memory accesses, based on the mesh connectivity, do not map well to the usual linear memory model. This work presents a method to improve the memory performance which is suitable for HPC codes that operate on meshes. We develop a method to adjust the sequence in which the data are used inside the algorithm, by means of traversing and sorting the mesh. This sorted mesh can be transferred sequentially to the lower memory levels and allows for minimum data transfer requirements. The method also reduces the lower memory requirements dramatically: up to 63% of the L1 cache misses are removed in a traditional cache system. We have obtained speedups of up to 2.58 on memory operations as measured in a general-purpose CPU. An improvement is also observed with sequential access memories, where we have observed reductions of up to 99% in the required low-level memory size.
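
The abstract does not spell out the traversal itself, so the following is only a minimal sketch of the general idea of traversing a mesh and renumbering it so that connected nodes end up close together in memory. It uses a plain breadth-first renumbering, which is a swapped-in, well-known technique and not necessarily the authors' method. The function names (`bfs_order`, `renumber`, `max_index_gap`) and the toy mesh are hypothetical, chosen for illustration.

```python
from collections import deque


def bfs_order(adjacency, start=0):
    """Return node ids in breadth-first visiting order.

    adjacency[i] is the list of neighbours of node i. Disconnected
    components are handled by restarting from the lowest unvisited node.
    """
    n = len(adjacency)
    visited = [False] * n
    order = []
    for seed in [start] + list(range(n)):
        if visited[seed]:
            continue
        visited[seed] = True
        queue = deque([seed])
        while queue:
            node = queue.popleft()
            order.append(node)
            for nb in sorted(adjacency[node]):
                if not visited[nb]:
                    visited[nb] = True
                    queue.append(nb)
    return order


def renumber(adjacency, order):
    """Relabel the mesh so that the node visited k-th becomes node k."""
    new_id = {old: new for new, old in enumerate(order)}
    return [sorted(new_id[nb] for nb in adjacency[old]) for old in order]


def max_index_gap(adjacency):
    """Largest |i - j| over all edges: a crude proxy (an assumption of this
    sketch, not a metric from the paper) for how far apart in a linear
    memory layout the two ends of an edge are stored."""
    return max(
        (abs(i - j) for i, nbs in enumerate(adjacency) for j in nbs),
        default=0,
    )


# Toy example: a chain of six nodes whose labels are scrambled, so that
# physical neighbours are stored far apart (edges 0-5, 5-1, 1-4, 4-2, 2-3).
scrambled = [[5], [4, 5], [3, 4], [2], [1, 2], [0, 1]]

order = bfs_order(scrambled, start=0)
sorted_mesh = renumber(scrambled, order)

print(max_index_gap(scrambled), max_index_gap(sorted_mesh))  # 5 1
```

After renumbering, every edge in the toy chain connects consecutive indices, so a sequential sweep over the node array touches each neighbour right next to the current node. Real unstructured meshes will not reach a gap of 1, but the same kind of reordering tends to shrink the working set that must be resident at once.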

More information

Record ID: 21749
DC Identifier: http://oa.upm.es/21749/
OAI Identifier: oai:oa.upm.es:21749
Official URL: http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=6108106&tag=1
Deposited by: Memoria Investigacion
Deposited on: 23 Nov 2013 10:25
Last Modified: 21 Apr 2016 12:29