La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America

Grandury González, María, Aula Blasco, Javier, Falcão, Júlia, Fourrier, Clémentine, González, Miguel, Martínez Ruiz, Gonzalo ORCID: https://orcid.org/0000-0002-9125-6225, Santamaría, Gonzalo, Agerri, Rodrigo, Aldama, Nuria, Chiruzzo, Luis, Conde Díaz, Javier ORCID: https://orcid.org/0000-0002-5304-0626, Gómez, Helena, Guerrero Nieto, Marta, Ivetta, Guido, López, Natalia, Plaza del Arco, Flor Miriam, Martín Valdivia, María Teresa, Montoro, Helena, Muñoz, Carmen, Reviriego Vasallo, Pedro ORCID: https://orcid.org/0000-0003-2540-5234, Rosado, Leire, Vaca, Alejandro, Vallecillo Rodríguez, María Estrella, Vallego, Jorge and Zubiaga, Irune (2025). La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America. In: "NAACL 2025 Workshop on Language Models for Underserved Communities", May 04, 2025.

Description

Title: La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America
Author(s):
  • Grandury González, María
  • Aula Blasco, Javier
  • Falcão, Júlia
  • Fourrier, Clémentine
  • González, Miguel
  • Martínez Ruiz, Gonzalo https://orcid.org/0000-0002-9125-6225
  • Santamaría, Gonzalo
  • Agerri, Rodrigo
  • Aldama, Nuria
  • Chiruzzo, Luis
  • Conde Díaz, Javier https://orcid.org/0000-0002-5304-0626
  • Gómez, Helena
  • Guerrero Nieto, Marta
  • Ivetta, Guido
  • López, Natalia
  • Plaza del Arco, Flor Miriam
  • Martín Valdivia, María Teresa
  • Montoro, Helena
  • Muñoz, Carmen
  • Reviriego Vasallo, Pedro https://orcid.org/0000-0003-2540-5234
  • Rosado, Leire
  • Vaca, Alejandro
  • Vallecillo Rodríguez, María Estrella
  • Vallego, Jorge
  • Zubiaga, Irune
Document Type: Conference or Workshop Presentation (Poster)
Event Title: NAACL 2025 Workshop on Language Models for Underserved Communities
Event Dates: May 04, 2025
Book Title: NAACL 2025 Workshop on Language Models for Underserved Communities
Date: 2025
Subjects:
SDGs:
School: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería de Sistemas Telemáticos
UPM Research Group: Internet de Nueva Generación
Creative Commons License: Attribution

Full text

PDF (Portable Document Format), 10MB: 19_La_Leaderboard_A_Large_Lang.pdf

Abstract

Leaderboards showcase the current capabilities and limitations of Large Language Models (LLMs). To motivate the development of LLMs that represent the linguistic and cultural diversity of the Spanish-speaking community, we present LA LEADERBOARD, the first open-source leaderboard to evaluate generative LLMs in languages and language varieties of Spain and Latin America. LA LEADERBOARD is a community-driven project that aims to establish an evaluation standard for everyone interested in developing LLMs for the Spanish-speaking community. This initial version combines 66 datasets in Catalan, Basque, Galician, and different Spanish varieties, showcasing the evaluation results of 50 models. To encourage community-driven development of leaderboards in other languages, we explain our methodology, including guidance on selecting the most suitable evaluation setup for each downstream task. In particular, we provide a rationale for using fewer few-shot examples than typically found in the literature, aiming to reduce environmental impact and facilitate access to reproducible results for a broader research community.

Associated projects

Type: Gobierno de España
Code: PID2022-136684OB-C22
Acronym: FUN4DATE
Lead: Unspecified
Title: Unspecified

More information

Record ID: 89095
DC Identifier: https://oa.upm.es/89095/
OAI Identifier: oai:oa.upm.es:89095
Official URL: https://openreview.net/forum?id=Pvg9CMgIES
Deposited by: Javier Conde Díaz
Deposited on: 16 May 2025 07:12
Last Modified: 16 May 2025 07:12