Comprehensive comparison between vision transformers and convolutional neural networks for face recognition tasks

Rodrigo Talavera, Marcos de, Cuevas Rodríguez, Carlos and García Santos, Narciso (2024). Comprehensive comparison between vision transformers and convolutional neural networks for face recognition tasks. "Scientific Reports", v. 14; ISSN 2045-2322. https://doi.org/10.1038/s41598-024-72254-w.

Description

Title:	Comprehensive comparison between vision transformers and convolutional neural networks for face recognition tasks
Authors:	Rodrigo Talavera, Marcos de (https://orcid.org/0000-0002-1808-4738); Cuevas Rodríguez, Carlos (https://orcid.org/0000-0001-9873-8502); García Santos, Narciso (https://orcid.org/0000-0002-0397-894X)
Document type:	Article
Journal/Publication title:	Scientific Reports
Date:	2024
ISSN:	2045-2322
Volume:	14
Subjects:	Telecommunications
School:	E.T.S.I. Telecomunicación (UPM)
Department:	Señales, Sistemas y Radiocomunicaciones
Creative Commons licence:	Attribution - NonCommercial - ShareAlike

Abstract

This paper presents a comprehensive comparison between Vision Transformers and Convolutional Neural Networks for face recognition related tasks, including extensive experiments on the tasks of face identification and verification. Our study focuses on six state-of-the-art models: EfficientNet, Inception, MobileNet, ResNet, VGG, and Vision Transformers. Our evaluation of these models is based on five diverse datasets: Labeled Faces in the Wild, Real World Occluded Faces, Surveillance Cameras Face, UPM-GTI-Face, and VGG Face 2. These datasets present unique challenges regarding people diversity, distance from the camera, and face occlusions such as those produced by masks and glasses. Our contribution to the field includes a deep analysis of the experimental results, including a thorough examination of the training and evaluation process, as well as the software and hardware configurations used. Our results show that Vision Transformers outperform Convolutional Neural Networks in terms of accuracy and robustness against distance and occlusions for face recognition related tasks, while also presenting a smaller memory footprint and an impressive inference speed, rivaling even the fastest Convolutional Neural Networks. In conclusion, our study provides valuable insights into the performance of Vision Transformers for face recognition related tasks and highlights the potential of these models as a more efficient solution than Convolutional Neural Networks.

Associated projects

Type	Code	Acronym	Principal investigator	Title
Unspecified	PID2020-115132RB	SARAOS	Unspecified

More information

Record ID:	85303
DC identifier:	https://oa.upm.es/85303/
OAI identifier:	oai:oa.upm.es:85303
Portal Científico URL:	https://portalcientifico.upm.es/es/ipublic/item/10250855
DOI:	10.1038/s41598-024-72254-w
Official URL:	https://www.nature.com/articles/s41598-024-72254-w
Deposited by:	Dr. Carlos Cuevas Rodríguez
Deposited on:	12 Dec 2024 09:43
Last modified:	12 Dec 2024 09:43