High Efficiency Inference Accelerating Algorithm for NOMA-Based Edge Intelligence

Yuan, Xin ORCID: https://orcid.org/0000-0001-6616-9640, Li, Ning ORCID: https://orcid.org/0000-0002-8567-4025, Zhang, Tuo, Li, Muqing, Chen, Yuwen ORCID: https://orcid.org/0000-0001-6414-9697, Martínez Ortega, José Fernán ORCID: https://orcid.org/0000-0002-7635-4564 and Guo, Song ORCID: https://orcid.org/0000-0001-9831-2202 (2024). High Efficiency Inference Accelerating Algorithm for NOMA-Based Edge Intelligence. "IEEE Transactions on Wireless Communications", v. 23 (n. 11); pp. 17539-17556. ISSN 1536-1276. https://doi.org/10.1109/TWC.2024.3454086.

Description

Title: High Efficiency Inference Accelerating Algorithm for NOMA-Based Edge Intelligence
Authors:
Document Type: Article
Journal/Publication Title: IEEE Transactions on Wireless Communications
Date: November 2024
ISSN: 1536-1276
Volume: 23
Issue: 11
Subjects:
Informal Keywords: Edge intelligence, model split, inference accelerating, NOMA
School: E.T.S.I. y Sistemas de Telecomunicación (UPM)
Department: Ingeniería Telemática y Electrónica
Creative Commons License: Attribution

Abstract

Although artificial intelligence (AI) is widely used and has significantly changed our lives, deploying large AI models directly on resource-limited edge devices is impractical. Model split inference has therefore been proposed to improve the performance of edge intelligence (EI): the AI model is divided into sub-models, and the resource-intensive sub-model is offloaded wirelessly to an edge server to reduce resource requirements and inference latency. Unfortunately, with the sharp increase in the number of edge devices, the shortage of spectrum resources in edge networks has become severe in recent years, which limits the performance improvement of EI. Following the example of NOMA-based edge computing (EC), integrating non-orthogonal multiple access (NOMA) technology with split inference in EI is attractive. However, previous work on model split inference in EI does not properly consider the NOMA-based communication aspect or the influence of intermediate data transmission, and the complexity that the NOMA scheme adds to resource allocation complicates the problem further. This paper therefore proposes the Effective Communication and Computing resource allocation algorithm (ECC) to accelerate split inference in NOMA-based EI. Specifically, ECC takes energy consumption and inference latency into account to find the optimal model split strategy and resource allocation strategy (subchannel, transmission power, and computing resource). Since minimum inference delay and minimum energy consumption cannot be achieved simultaneously, a gradient descent (GD) based algorithm is adopted to find the optimal tradeoff between them. Moreover, the loop iteration GD approach (Li-GD) is developed to reduce the complexity of the GD algorithm caused by parameter discretization. The key idea of Li-GD is that the initial value of the i-th layer's GD procedure is selected from the optimal results of the GD procedures of the preceding (i-1) layers, namely the one whose intermediate data size is closest to that of the i-th layer. Additionally, the properties of the proposed algorithms are investigated, including convergence, complexity, and approximation error. Experimental results demonstrate that ECC performs much better than the approaches of previous studies.
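
The warm-start idea behind Li-GD described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the cost function, the single scalar resource variable p, the step size, and the names gradient_descent, toy_grad, and li_gd are all assumptions made here for illustration. The toy cost f(p) = d/p + p merely stands in for a latency term that shrinks with more resource and an energy term that grows with it, where d is the intermediate data size of a candidate split layer.

```python
from typing import Callable, List, Tuple


def gradient_descent(grad: Callable[[float], float], x0: float,
                     lr: float = 0.05, tol: float = 1e-8,
                     max_iter: int = 10_000) -> Tuple[float, int]:
    """Plain 1-D gradient descent; returns (minimizer, iterations used)."""
    x = x0
    for it in range(1, max_iter + 1):
        step = lr * grad(x)
        x -= step
        if abs(step) < tol:  # stop once the update becomes negligible
            return x, it
    return x, max_iter


def toy_grad(data_size: float) -> Callable[[float], float]:
    """Gradient of the hypothetical cost f(p) = data_size / p + p (p > 0).

    The first term mimics latency, the second mimics energy; this is an
    assumed stand-in, not the objective used in the paper.
    """
    return lambda p: -data_size / (p * p) + 1.0


def li_gd(data_sizes: List[float],
          default_start: float = 1.0) -> Tuple[List[float], List[int]]:
    """Run one GD per candidate split layer, warm-starting each run from the
    earlier layer whose intermediate data size is closest to the current one."""
    solutions: List[float] = []
    iterations: List[int] = []
    for i, d in enumerate(data_sizes):
        if i == 0:
            x0 = default_start  # no earlier result to reuse yet
        else:
            nearest = min(range(i), key=lambda j: abs(data_sizes[j] - d))
            x0 = solutions[nearest]
        x, n = gradient_descent(toy_grad(d), x0)
        solutions.append(x)
        iterations.append(n)
    return solutions, iterations


if __name__ == "__main__":
    sizes = [4.0, 9.0, 4.2, 8.8]  # hypothetical intermediate data sizes
    sols, iters = li_gd(sizes)
    for d, s, n in zip(sizes, sols, iters):
        print(f"data size {d}: p* ~ {s:.4f} after {n} iterations")
```

For this toy cost the minimizer is the square root of d, so layers with similar intermediate data sizes have nearby optima, and starting from a neighbor's solution needs fewer iterations than starting from the default value. That is the intuition the abstract gives for why Li-GD reduces the cost of repeating GD over many discretized candidates.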

Associated projects

Type         Code            Acronym      Lead         Title
Unspecified  62101159        Unspecified  Unspecified  Unspecified
Unspecified  ZR2021MF055     Unspecified  Unspecified  Unspecified
Unspecified  AoE/E-601/22-R  Unspecified  Unspecified  Unspecified

More information

Record ID: 86112
DC Identifier: https://oa.upm.es/86112/
OAI Identifier: oai:oa.upm.es:86112
Portal Científico URL: https://portalcientifico.upm.es/es/ipublic/item/10270967
DOI: 10.1109/TWC.2024.3454086
Official URL: https://ieeexplore.ieee.org/document/10669843
Deposited by: iMarina Portal Científico
Deposited on: 15 Jan 2025 18:59
Last modified: 15 Jan 2025 18:59