Cooperative off-policy prediction of Markov decision processes in adaptive networks

Valcarcel Macua, Sergio and Chen, Jianshu and Zazo Bello, Santiago and Sayed, Ali H. (2013). Cooperative off-policy prediction of Markov decision processes in adaptive networks. In: "IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)", 26/05/2013 - 31/05/2013, Vancouver, Canada. pp. 4539-4543. https://doi.org/10.1109/ICASSP.2013.6638519.

Description

Title: Cooperative off-policy prediction of Markov decision processes in adaptive networks
Author/s:
  • Valcarcel Macua, Sergio
  • Chen, Jianshu
  • Zazo Bello, Santiago
  • Sayed, Ali H.
Item Type: Presentation at Congress or Conference (Article)
Event Title: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Event Dates: 26/05/2013 - 31/05/2013
Event Location: Vancouver, Canada
Title of Book: IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP)
Date: 2013
Subjects:
Faculty: E.T.S.I. Telecomunicación (UPM)
Department: Señales, Sistemas y Radiocomunicaciones
Creative Commons Licenses: Attribution - NoDerivatives - NonCommercial

Abstract

We apply diffusion strategies to develop a cooperative reinforcement learning algorithm in which the agents in a network communicate with their neighbors to improve their predictions about the environment. The algorithm is suitable for off-policy learning, even in large state spaces. We provide a mean-square-error performance analysis under constant step-sizes. The gain from cooperation, in the form of greater stability and lower bias and variance in the prediction error, is illustrated in the context of a classical model. We show that the improvement in performance is especially significant when the behavior policy of the agents differs from the target policy under evaluation.

More information

Item ID: 28941
DC Identifier: http://oa.upm.es/28941/
OAI Identifier: oai:oa.upm.es:28941
DOI: 10.1109/ICASSP.2013.6638519
Deposited by: Memoria Investigacion
Deposited on: 29 Jun 2014 11:38
Last Modified: 22 Sep 2014 11:43