Full text
|
PDF (Portable Document Format)
- A PDF viewer such as GSview, Xpdf, or Adobe Acrobat Reader is required
Download (2MB) |
ORCID: https://orcid.org/0000-0002-1808-4738, Cuevas Rodríguez, Carlos
ORCID: https://orcid.org/0000-0001-9873-8502 and García Santos, Narciso
ORCID: https://orcid.org/0000-0002-0397-894X
(2024). Comprehensive comparison between vision transformers and convolutional neural networks for face recognition tasks. "Scientific Reports", v. 14; ISSN 2045-2322. https://doi.org/10.1038/s41598-024-72254-w.
| Title: | Comprehensive comparison between vision transformers and convolutional neural networks for face recognition tasks |
|---|---|
| Author(s): | |
| Document Type: | Article |
| Journal/Publication Title: | Scientific Reports |
| Date: | 2024 |
| ISSN: | 2045-2322 |
| Volume: | 14 |
| Subjects: | |
| School: | E.T.S.I. Telecomunicación (UPM) |
| Department: | Señales, Sistemas y Radiocomunicaciones |
| Creative Commons Licenses: | Attribution - NonCommercial - ShareAlike |
This paper presents a comprehensive comparison between Vision Transformers and Convolutional Neural Networks for face recognition related tasks, including extensive experiments on the tasks of face identification and verification. Our study focuses on six state-of-the-art models: EfficientNet, Inception, MobileNet, ResNet, VGG, and Vision Transformers. Our evaluation of these models is based on five diverse datasets: Labeled Faces in the Wild, Real World Occluded Faces, Surveillance Cameras Face, UPM-GTI-Face, and VGG Face 2. These datasets present unique challenges regarding people diversity, distance from the camera, and face occlusions such as those produced by masks and glasses. Our contribution to the field includes a deep analysis of the experimental results, including a thorough examination of the training and evaluation process, as well as the software and hardware configurations used. Our results show that Vision Transformers outperform Convolutional Neural Networks in terms of accuracy and robustness against distance and occlusions for face recognition related tasks, while also presenting a smaller memory footprint and an impressive inference speed, rivaling even the fastest Convolutional Neural Networks. In conclusion, our study provides valuable insights into the performance of Vision Transformers for face recognition related tasks and highlights the potential of these models as a more efficient solution than Convolutional Neural Networks.
| Record ID: | 85303 |
|---|---|
| DC Identifier: | https://oa.upm.es/85303/ |
| OAI Identifier: | oai:oa.upm.es:85303 |
| Portal Científico URL: | https://portalcientifico.upm.es/es/ipublic/item/10250855 |
| DOI: | 10.1038/s41598-024-72254-w |
| Official URL: | https://www.nature.com/articles/s41598-024-72254-w |
| Deposited by: | Dr. Carlos Cuevas Rodríguez |
| Deposited on: | 12 Dec 2024 09:43 |
| Last Modified: | 12 Dec 2024 09:43 |