Automatic evaluation of end-to-end dialog systems with adequacy-fluency metrics

D'Haro Enriquez, Luis Fernando and Banchs, Rafael E. and Hori, Chiori and Li, Haizhou (2019). Automatic evaluation of end-to-end dialog systems with adequacy-fluency metrics. "Computer Speech And Language", v. 55 ; pp. 200-215. ISSN 0885-2308. https://doi.org/10.1016/j.csl.2018.12.004.

Description

Title: Automatic evaluation of end-to-end dialog systems with adequacy-fluency metrics
Author/s:
  • D'Haro Enriquez, Luis Fernando
  • Banchs, Rafael E.
  • Hori, Chiori
  • Li, Haizhou
Item Type: Article
Journal/Publication Title: Computer Speech And Language
Date: May 2019
ISSN: 0885-2308
Volume: 55
Subjects:
Freetext Keywords: Automatic evaluation metrics; dialog systems; DSTC; adequacy and fluency
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Ingeniería Electrónica
Creative Commons Licenses: Attribution - No derivative works - Non commercial

Abstract

End-to-end dialog systems are gaining interest due to the recent advances of deep neural networks and the availability of large human-human dialog corpora. However, in spite of being of fundamental importance to systematically improve the performance of this kind of system, automatic evaluation of the generated dialog utterances is still an unsolved problem. Indeed, most of the proposed objective metrics show low correlation with human evaluations. In this paper, we evaluate a two-dimensional evaluation metric that is designed to operate at sentence level, which considers the syntactic and semantic information carried along the answers generated by an end-to-end dialog system with respect to a set of references. The proposed metric, when applied to outputs generated by the systems participating in track 2 of the DSTC-6 challenge, shows a higher correlation with human evaluations (up to 12.8% relative improvement at the system level) than the best of the alternative state-of-the-art automatic metrics currently available.
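
As a rough illustration of what "relative improvement in correlation at the system level" means, the sketch below correlates per-system metric scores with per-system human ratings and compares two metrics. It assumes Pearson correlation, which the abstract does not specify, and every number in it is a made-up toy value, not a result from the paper. The function names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def relative_improvement(new, baseline):
    """Relative improvement of `new` over `baseline`, in percent."""
    return 100.0 * (new - baseline) / baseline


if __name__ == "__main__":
    # Toy values only: one entry per hypothetical dialog system.
    human_ratings = [3.1, 3.4, 2.8, 3.9, 3.6]
    baseline_metric = [0.21, 0.20, 0.22, 0.27, 0.23]
    candidate_metric = [0.40, 0.45, 0.38, 0.52, 0.47]

    r_base = pearson(baseline_metric, human_ratings)
    r_cand = pearson(candidate_metric, human_ratings)
    print(f"baseline r = {r_base:.3f}, candidate r = {r_cand:.3f}")
    print(f"relative improvement = {relative_improvement(r_cand, r_base):.1f}%")
```

A rank-based coefficient such as Spearman's could be substituted for `pearson` without changing the rest of the comparison.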

More information

Item ID: 64443
DC Identifier: https://oa.upm.es/64443/
OAI Identifier: oai:oa.upm.es:64443
DOI: 10.1016/j.csl.2018.12.004
Official URL: https://www.sciencedirect.com/science/article/pii/S0885230818300858
Deposited by: Memoria Investigacion
Deposited on: 20 Dec 2020 09:34
Last Modified: 01 Jun 2021 22:30