Design, Generation and Evaluation of a Synthetic Dialogue Dataset for Contextually Aware Chatbots in Art Museums

Rachidi, Inass, Ezzakri, Anas, Bellver Soler, Jaime ORCID: https://orcid.org/0009-0006-7973-4913 and D'Haro Enríquez, Luis Fernando ORCID: https://orcid.org/0000-0002-3411-7384 (2025). Design, Generation and Evaluation of a Synthetic Dialogue Dataset for Contextually Aware Chatbots in Art Museums. In: "15th International Workshop on Spoken Dialogue Systems Technology (IWSDS 2025)", May 27-30, 2025, Bilbao, Spain.

Description

Title: Design, Generation and Evaluation of a Synthetic Dialogue Dataset for Contextually Aware Chatbots in Art Museums
Author(s): Rachidi, Inass; Ezzakri, Anas; Bellver Soler, Jaime; D'Haro Enríquez, Luis Fernando
Document Type: Conference or Workshop Paper (Article)
Event Title: 15th International Workshop on Spoken Dialogue Systems Technology (IWSDS 2025)
Event Dates: May 27-30, 2025
Event Location: Bilbao, Spain
Book Title: Proceedings of the 15th International Workshop on Spoken Dialogue Systems Technology
Date: May 2025
Subjects:
SDGs:
School: E.T.S.I. Telecomunicación (UPM)
Department: Electronic Engineering
Creative Commons License: Attribution - NonCommercial - ShareAlike

Full text

PDF: 2025iwsds1.pdf (651 kB)

Abstract

This paper presents the design, synthetic generation, and automated evaluation of ArtGenEval-GPT++, an advanced dataset for training and fine-tuning conversational agents with artificial awareness capabilities targeting the art domain. Building on a previously released dataset (ArtGenEval-GPT), the new version introduces enhancements for greater personalization (e.g., gender, ethnicity, age, and knowledge) while addressing prior limitations, including low-quality dialogues and hallucinations. The dataset comprises approximately 12,500 dyadic, multi-turn dialogues generated using state-of-the-art large language models (LLMs). These dialogues span diverse museum scenarios, incorporating varied visitor profiles, emotional states, interruptions, and chatbot behaviors. Objective evaluations confirm the dataset’s quality and contextual coherence. Ethical considerations, including biases and hallucinations, are analyzed, and directions for improving the dataset’s utility are proposed. This work contributes to the development of personalized, context-aware conversational agents capable of navigating complex, real-world environments, such as museums, to enhance visitor engagement and satisfaction.

Associated projects

Type: Horizon Europe
Code: 101071191
Acronym: ASTOUND3
Lead: Luis Fernando D'Haro
Title: ASTOUND

Type: Government of Spain
Code: PID2021-126061OB-C43
Acronym: Not specified
Lead: Not specified
Title: Not specified

Type: Community of Madrid
Code: PHS-2024/PH-HUM52
Acronym: INNOVATRAD-CM
Lead: Not specified
Title: Not specified

More information

Record ID: 90858
DC Identifier: https://oa.upm.es/90858/
OAI Identifier: oai:oa.upm.es:90858
Official URL: https://aclanthology.org/2025.iwsds-1.3/
Deposited by: Jaime Bellver Soler
Deposited on: 16 Sep 2025 07:37
Last modified: 16 Sep 2025 07:38