Adversarial recovery of agent rewards from latent spaces of the limit order book

Roa Vicens, Jacobo and Wang, Yuanbo and Mison, Virgile and Gal, Yarin and Silva, Ricardo (2019). Adversarial recovery of agent rewards from latent spaces of the limit order book. In: "NeurIPS 2019 Workshop on Robust AI in Financial Services. 33rd Conference on Neural Information Processing Systems", 8 – 14 December 2019, Vancouver.

Description

Title: Adversarial recovery of agent rewards from latent spaces of the limit order book
Author/s:
  • Roa Vicens, Jacobo
  • Wang, Yuanbo
  • Mison, Virgile
  • Gal, Yarin
  • Silva, Ricardo
Item Type: Presentation at Congress or Conference (Poster)
Event Title: NeurIPS 2019 Workshop on Robust AI in Financial Services. 33rd Conference on Neural Information Processing Systems
Event Dates: 8 – 14 December 2019
Event Location: Vancouver
Title of Book: NeurIPS 2019 Workshop on Robust AI in Financial Services. 33rd Conference on Neural Information Processing Systems
Date: 9 December 2019
Subjects:
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Señales, Sistemas y Radiocomunicaciones
Creative Commons Licenses: Attribution - ShareAlike

Full text

PDF (667 kB)

Abstract

Inverse reinforcement learning has proved its ability to explain state-action trajectories of expert agents by recovering their underlying reward functions in increasingly challenging environments. Recent advances in adversarial learning have allowed inverse RL to be extended to applications with non-stationary environment dynamics unknown to the agents, arbitrary structures of reward functions, and improved handling of the ambiguities inherent to the ill-posed nature of inverse RL. This is particularly relevant in real-time applications on stochastic environments involving risk, such as volatile financial markets. Moreover, recent work on the simulation of complex environments enables learning algorithms to engage with real market data through simulations of its latent space representations, avoiding a costly exploration of the original environment. In this paper, we explore whether adversarial inverse RL algorithms can be adapted and trained within such latent space simulations from real market data, while maintaining their ability to recover agent rewards that are robust to variations in the underlying dynamics, and to transfer them to new regimes of the original environment.
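
For readers unfamiliar with the adversarial inverse RL setup referenced in the abstract, the sketch below is a minimal, illustrative AIRL-style training step (not the authors' code): a reward network enters a discriminator whose log-odds against the policy's log-probabilities serve as the recovered reward, and the discriminator is trained to separate expert transitions from policy transitions. The dimensions, class names (RewardNet), and toy data are assumptions made purely for illustration; in the paper's setting the transitions would come from a latent-space simulator of the limit order book.

import torch
import torch.nn as nn
import torch.nn.functional as F

latent_dim, action_dim = 8, 3  # assumed sizes of the latent LOB state and action spaces


class RewardNet(nn.Module):
    """Approximator f(s, a) used inside the AIRL-style discriminator."""

    def __init__(self, state_dim: int, action_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, action):
        return self.net(torch.cat([state, action], dim=-1)).squeeze(-1)


def discriminator_logits(f_value, log_pi):
    # AIRL discriminator: D = exp(f) / (exp(f) + pi), hence logit(D) = f - log pi.
    return f_value - log_pi


def airl_step(reward_net, optimizer, expert_batch, policy_batch):
    """One adversarial update: expert transitions labelled 1, policy transitions 0."""
    (s_e, a_e, logp_e), (s_p, a_p, logp_p) = expert_batch, policy_batch
    logits_e = discriminator_logits(reward_net(s_e, a_e), logp_e)
    logits_p = discriminator_logits(reward_net(s_p, a_p), logp_p)
    loss = F.binary_cross_entropy_with_logits(logits_e, torch.ones_like(logits_e)) \
         + F.binary_cross_entropy_with_logits(logits_p, torch.zeros_like(logits_p))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Recovered reward on the policy's transitions: log D - log(1 - D), i.e. the logits.
    return loss.item(), logits_p.detach()


if __name__ == "__main__":
    torch.manual_seed(0)
    reward_net = RewardNet(latent_dim, action_dim)
    opt = torch.optim.Adam(reward_net.parameters(), lr=3e-4)
    # Toy stand-ins for latent-space rollouts; real batches would come from the
    # latent LOB simulator (expert data) and the learner's current policy.
    make = lambda n: (torch.randn(n, latent_dim), torch.randn(n, action_dim),
                      torch.randn(n).abs().neg())  # fake log pi(a|s) <= 0
    for step in range(5):
        loss, rewards = airl_step(reward_net, opt, make(32), make(32))
        print(f"step {step}: disc loss {loss:.3f}, mean recovered reward {rewards.mean():.3f}")

In this formulation the recovered reward is a function of the learned f and the policy's own log-probabilities, which is what allows it to remain meaningful when the policy or the underlying dynamics change, the property the paper tests by transferring rewards to new regimes of the original environment.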

More information

Item ID: 67304
DC Identifier: https://oa.upm.es/67304/
OAI Identifier: oai:oa.upm.es:67304
Official URL: https://arxiv.org/abs/1912.04242
Deposited by: Jacobo Roa Vicens
Deposited on: 02 Jun 2021 09:22
Last Modified: 02 Jun 2021 09:22