Combining the power of large language models with finetuning based on strategically collected human ratings: A case study about age-of-acquisition estimates of Spanish words

Sendín, Eneko, Conde Díaz, Javier, Reviriego Vasallo, Pedro, Haro Rodríguez, Juan, Ferré Romeu, Pilar, Hinojosa Poveda, José Antonio and Brysbaert, Marc (2025). Combining the power of large language models with finetuning based on strategically collected human ratings: A case study about age-of-acquisition estimates of Spanish words. "Psicologica", v. 46 (n. 2); https://doi.org/10.20350/digitalCSIC/17563.

Description

Title:	Combining the power of large language models with finetuning based on strategically collected human ratings: A case study about age-of-acquisition estimates of Spanish words
Author(s):	Sendín, Eneko; Conde Díaz, Javier (https://orcid.org/0000-0002-5304-0626); Reviriego Vasallo, Pedro (https://orcid.org/0000-0003-2540-5234); Haro Rodríguez, Juan (https://orcid.org/0000-0002-3456-4731); Ferré Romeu, Pilar (https://orcid.org/0000-0002-3192-0040); Hinojosa Poveda, José Antonio (https://orcid.org/0000-0002-7482-9503); Brysbaert, Marc (https://orcid.org/0000-0002-3645-3189)
Document type:	Article
Journal/Publication title:	Psicologica
Date:	2025
Volume:	46
Issue:	2
Subjects:	Psychology; Telecommunications
School:	E.T.S.I. Telecomunicación (UPM)
Department:	Ingeniería de Sistemas Telemáticos
UPM research group:	Internet de Nueva Generación
Creative Commons license:	Attribution

Full text

PDF (Portable Document Format), 807 kB. A PDF viewer such as GSview, Xpdf, or Adobe Acrobat Reader is required.

Abstract

This study examined the ability of a large language model, GPT-4o mini, to predict age of acquisition (AoA) for Spanish words, as compared to human ratings. We found a strong correlation (ρ=.75) between the model's AoA estimates and mean human ratings. This correlation was lower than the level of agreement observed between individual human raters (ρ=.85), but we found that finetuning the model on a relatively small dataset of 2000 human AoA ratings has the potential to enhance the model's performance to a level comparable to human consensus. Consistent with theoretical expectations, our analyses confirmed that AoA estimates are meaningful only for words within an individual's vocabulary. Finally, we present a novel dataset of AoA estimates for 28,453 Spanish words likely known by adult speakers.
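The agreement figures in the abstract are reported as ρ. As a minimal sketch, assuming ρ denotes Spearman's rank correlation, the comparison between model estimates and mean human ratings could be computed as follows. The function names (`average_ranks`, `spearman_rho`) and the six AoA values are hypothetical illustrations, not the authors' code or data.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy values (ages in years), invented for illustration only.
human_aoa = [2.1, 3.4, 5.0, 7.8, 9.5, 11.2]
model_aoa = [2.5, 3.0, 6.1, 5.9, 10.0, 12.4]

rho = spearman_rho(human_aoa, model_aoa)
print(f"rho = {rho:.3f}")
```

Because the statistic depends only on rank order, it is insensitive to the scale on which the model and the raters express their age estimates, which may be why a rank-based measure suits this kind of comparison.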

More information

Record ID:	91144
DC identifier:	https://oa.upm.es/91144/
OAI identifier:	oai:oa.upm.es:91144
DOI:	10.20350/digitalCSIC/17563
Official URL:	https://psicologicajournal.com/combining-the-power...
Deposited by:	Javier Conde Díaz
Deposited on:	28 Sep 2025 07:36
Last modified:	28 Sep 2025 07:36
