Hardware Accelerated RISC-V Vector Extension for High Performance Embedded Computing

Corral Margeli, Ane (2026). Hardware Accelerated RISC-V Vector Extension for High Performance Embedded Computing. Thesis (Master's), E.T.S.I. Industriales (UPM).

Description

Title: Hardware Accelerated RISC-V Vector Extension for High Performance Embedded Computing
Author(s):
  • Corral Margeli, Ane
Supervisor(s):
  • Rodríguez Medina, Alfonso https://orcid.org/0000-0001-6326-743X
  • Díez de Ulzurrun Aquerreta, Iñigo
Document type: Thesis (Master's)
Master's programme: Electrónica Industrial
Date: 2 February 2026
Subjects:
SDGs:
School: E.T.S.I. Industriales (UPM)
Department: Automática, Ingeniería Eléctrica y Electrónica e Informática Industrial
Creative Commons licence: Attribution - No Derivative Works - Non-commercial

Full text

TFM_ANE_C_M.pdf, PDF (10 MB)

Abstract

The contemporary computational landscape is increasingly defined by the intensive data-processing requirements of Artificial Intelligence (AI) and Machine Learning (ML). To address these demands, Single Instruction, Multiple Data (SIMD) strategies have emerged as a critical approach for accelerating data-intensive workloads through the parallelisation of operations. Within this context, the open-source RISC-V Instruction Set Architecture (ISA) facilitates efficient SIMD computation through its Vector Extension (RVV). This thesis presents the design and validation of a vector accelerator tailored for high-performance tasks within embedded systems. The architecture is developed with a modular foundation to support future scalability and implements the Zve32x vector sub-extension, providing support for 32-bit integer operations. Integrated as a coprocessor to the CV32E20 core within the eXtendable Heterogeneous Energy-efficient Platform (X-HEEP) ecosystem, the accelerator extends the scalar core’s capabilities via the Core-V eXtension Interface (CV-X-IF) 1.0. Data memory accesses for load/store operations are handled through the Open Bus Interface (OBI) v1.0 protocol to ensure efficient data throughput. A first implementation of the accelerator, constrained to a Vector Register Length (VLEN) of 128 bits, was validated in simulation and on a Xilinx PYNQ-Z2 Field-Programmable Gate Array (FPGA) board. Performance was evaluated using three standard data-parallel kernels: SAXPY and Indexed Arithmetic, both of which perform scalar-vector multiplication, and Matmul, which performs matrix multiplication. Furthermore, this research evaluates the RISC-V GNU Compiler Toolchain, specifically investigating its auto-vectorisation capabilities in C-based applications. A comparative analysis was performed between standard C implementations and those using the “RISC-V Vector C Intrinsics”.
Results from simulation and FPGA execution demonstrate that the proposed accelerator achieves a maximum speed-up of 3.83× for the Indexed Arithmetic algorithm when employing the C Intrinsics library, highlighting the performance advantages of manual vectorisation in specialised embedded hardware.
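
To illustrate the comparison between plain C and intrinsics-based code described in the abstract, the following is a minimal sketch of an AXPY-style kernel (y = a * x + y) written both ways. It is not taken from the thesis. Several points are assumptions: the kernel uses 32-bit integers because Zve32x has no floating-point support (the classic SAXPY is single precision); the function names axpy_scalar and axpy_vector and the constant LANES_E32 are hypothetical; the intrinsic names follow the `__riscv_`-prefixed naming of the RISC-V Vector C intrinsics specification, which older toolchains may not provide; and the `__riscv_vector` macro is assumed to be defined by the compiler when a vector extension is enabled. On a non-RISC-V host, a portable fallback emulates the strip-mining loop with four 32-bit lanes, matching VLEN = 128 bits at LMUL = 1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__riscv_vector)   /* assumption: set by the compiler when RVV is enabled */
#include <riscv_vector.h>
#endif

/* Hypothetical lane count used only by the portable fallback:
 * VLEN = 128 bits divided by 32-bit elements, at LMUL = 1. */
#define LANES_E32 4

/* Plain C reference: y[i] = a * x[i] + y[i].
 * A loop of this shape is a typical auto-vectorisation candidate. */
void axpy_scalar(size_t n, int32_t a, const int32_t *x, int32_t *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

/* Strip-mined version: each iteration handles up to one vector register
 * of elements, and the last iteration handles the remaining tail. */
void axpy_vector(size_t n, int32_t a, const int32_t *x, int32_t *y)
{
    assert(n == 0 || (x != NULL && y != NULL));
#if defined(__riscv_vector)
    while (n > 0) {
        size_t vl = __riscv_vsetvl_e32m1(n);            /* elements this pass */
        vint32m1_t vx = __riscv_vle32_v_i32m1(x, vl);   /* load x chunk */
        vint32m1_t vy = __riscv_vle32_v_i32m1(y, vl);   /* load y chunk */
        vy = __riscv_vmacc_vx_i32m1(vy, a, vx, vl);     /* vy += a * vx */
        __riscv_vse32_v_i32m1(y, vy, vl);               /* store result */
        x += vl;
        y += vl;
        n -= vl;
    }
#else
    /* Portable emulation of the same strip-mining structure. */
    while (n > 0) {
        size_t vl = n < LANES_E32 ? n : LANES_E32;
        for (size_t lane = 0; lane < vl; ++lane)
            y[lane] = a * x[lane] + y[lane];
        x += vl;
        y += vl;
        n -= vl;
    }
#endif
}
```

Both functions are meant to produce identical results for inputs that do not overflow 32-bit signed arithmetic. The sketch says nothing about the speed-up figures reported above, which depend on the accelerator and the kernels used in the thesis.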

More information

Record ID: 94619
DC identifier: https://oa.upm.es/94619/
OAI identifier: oai:oa.upm.es:94619
Deposited by: Ane Corral Margeli
Deposited on: 06 Mar 2026 14:59
Last modified: 06 Mar 2026 14:59