Full text
PDF
- Users in campus UPM only until 1 June 2021
- Requires a PDF viewer, such as GSview, Xpdf or Adobe Acrobat Reader
Download (7MB) |
D'Haro Enriquez, Luis Fernando and Banchs, Rafael E. and Hori, Chiori and Li, Haizhou (2019). Automatic evaluation of end-to-end dialog systems with adequacy-fluency metrics. "Computer Speech And Language", v. 55 ; pp. 200-215. ISSN 0885-2308. https://doi.org/10.1016/j.csl.2018.12.004.
Title: | Automatic evaluation of end-to-end dialog systems with adequacy-fluency metrics |
---|---|
Author/s: | D'Haro Enriquez, Luis Fernando; Banchs, Rafael E.; Hori, Chiori; Li, Haizhou |
Item Type: | Article |
Journal/Publication Title: | Computer Speech And Language |
Date: | May 2019 |
ISSN: | 0885-2308 |
Volume: | 55 |
Subjects: | |
Freetext Keywords: | Automatic evaluation metrics; dialog systems; DSTC; adequacy and fluency |
Faculty: | E.T.S.I. Telecomunicación (UPM) |
Department: | Ingeniería Electrónica |
Creative Commons Licenses: | Attribution - No derivative works - Non commercial |
End-to-end dialog systems are gaining interest due to recent advances in deep neural networks and the availability of large human-human dialog corpora. However, although it is of fundamental importance for systematically improving the performance of this kind of system, automatic evaluation of the generated dialog utterances is still an unsolved problem. Indeed, most of the proposed objective metrics show low correlation with human evaluations. In this paper, we evaluate a two-dimensional evaluation metric that is designed to operate at sentence level, which considers the syntactic and semantic information carried along the answers generated by an end-to-end dialog system with respect to a set of references. The proposed metric, when applied to outputs generated by the systems participating in track 2 of the DSTC-6 challenge, shows a higher correlation with human evaluations (up to 12.8% relative improvement at the system level) than the best of the alternative state-of-the-art automatic metrics currently available.
Item ID: | 64443 |
---|---|
DC Identifier: | http://oa.upm.es/64443/ |
OAI Identifier: | oai:oa.upm.es:64443 |
DOI: | 10.1016/j.csl.2018.12.004 |
Official URL: | https://www.sciencedirect.com/science/article/pii/S0885230818300858 |
Deposited by: | Memoria Investigacion |
Deposited on: | 20 Dec 2020 09:34 |
Last Modified: | 20 Dec 2020 09:34 |