Full text
PDF
- Requires a PDF viewer, such as GSview, Xpdf or Adobe Acrobat Reader
Download (384kB) | Preview |
Valcarcel Macua, Sergio, Belanovic, Pavle and Zazo Bello, Santiago ORCID: https://orcid.org/0000-0001-9073-7927
(2012).
Diffusion Gradient Temporal Difference for Cooperative Reinforcement Learning with Linear Function Approximation.
In: "3rd International Workshop on Cognitive Information Processing (CIP)", 28/05/2012 - 30/05/2012, Baiona. ISBN 978-1-4673-1877-8. pp. 1-6.
Title: | Diffusion Gradient Temporal Difference for Cooperative Reinforcement Learning with Linear Function Approximation |
---|---|
Author/s: | Valcarcel Macua, Sergio; Belanovic, Pavle; Zazo Bello, Santiago |
Item Type: | Presentation at Congress or Conference (Article) |
Event Title: | 3rd International Workshop on Cognitive Information Processing (CIP) |
Event Dates: | 28/05/2012 - 30/05/2012 |
Event Location: | Baiona |
Title of Book: | 3rd International Workshop on Cognitive Information Processing (CIP) |
Date: | May 2012 |
ISBN: | 978-1-4673-1877-8 |
Subjects: | |
Freetext Keywords: | TD, distributed reinforcement learning, distributed control, cooperative learning, multi-agent, distributed decision making, distributed temporal difference |
Faculty: | E.T.S.I. Telecomunicación (UPM) |
Department: | Señales, Sistemas y Radiocomunicaciones |
Creative Commons Licenses: | Attribution - No derivative works - Non commercial |
We introduce a diffusion-based algorithm in which multiple agents cooperate to predict a common, global state-value function by sharing local estimates and local gradient information with their neighbors. The algorithm is a fully distributed implementation of gradient temporal difference learning with linear function approximation, which makes it applicable to multi-agent settings. Simulations illustrate the benefit of cooperation in learning that the proposed algorithm makes possible.
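The abstract does not give the update equations, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a GTD2-style local update followed by an adapt-then-combine diffusion step in which neighbors average their parameter estimates; exchange of gradient information is not modeled. The toy three-state chain, the ring of four agents, the step sizes, and all function names (`gtd2_adapt`, `combine`, `run_diffusion_gtd2`) are hypothetical choices made for illustration.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def gtd2_adapt(theta, w, phi, reward, phi_next, gamma, alpha, beta):
    """One local GTD2-style step (assumed form); returns the intermediate
    (theta, w) pair an agent holds before combining with its neighbors."""
    delta = reward + gamma * dot(theta, phi_next) - dot(theta, phi)  # TD error
    phi_w = dot(phi, w)
    new_theta = [t + alpha * (p - gamma * pn) * phi_w
                 for t, p, pn in zip(theta, phi, phi_next)]
    new_w = [wi + beta * (delta - phi_w) * p for wi, p in zip(w, phi)]
    return new_theta, new_w


def combine(vectors, weights):
    """Diffusion step: agent k replaces its vector with a convex combination
    of the vectors of its neighbors; weights[k][l] is the weight that agent k
    gives to agent l, and each row sums to 1."""
    n, dim = len(vectors), len(vectors[0])
    return [[sum(weights[k][l] * vectors[l][i] for l in range(n))
             for i in range(dim)]
            for k in range(n)]


def run_diffusion_gtd2(num_steps=10000, gamma=0.5, alpha=0.02, beta=0.02, seed=0):
    rng = random.Random(seed)
    # Assumed toy problem: 3 states, next state drawn uniformly at random,
    # reward depends on the current state. With gamma = 0.5 the true
    # state values are [1, 2, 3].
    rewards = [0.0, 1.0, 2.0]
    features = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    num_agents = 4
    # Assumed ring network: each agent averages itself and its two neighbors.
    third = 1.0 / 3.0
    weights = [[third if (k - l) % num_agents in (0, 1, num_agents - 1) else 0.0
                for l in range(num_agents)]
               for k in range(num_agents)]
    thetas = [[0.0] * 3 for _ in range(num_agents)]
    ws = [[0.0] * 3 for _ in range(num_agents)]
    for _ in range(num_steps):
        adapted = []
        for k in range(num_agents):
            # Each agent observes its own independent transition.
            s, s_next = rng.randrange(3), rng.randrange(3)
            adapted.append(gtd2_adapt(thetas[k], ws[k], features[s], rewards[s],
                                      features[s_next], gamma, alpha, beta))
        # Combine step: neighbors share and average their intermediate estimates.
        thetas = combine([a[0] for a in adapted], weights)
        ws = combine([a[1] for a in adapted], weights)
    return thetas
```

Calling `run_diffusion_gtd2()` returns one parameter vector per agent. In this toy setting the features are one-hot, so each vector can be read directly as that agent's estimate of the three state values.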
Item ID: | 20234 |
---|---|
DC Identifier: | https://oa.upm.es/20234/ |
OAI Identifier: | oai:oa.upm.es:20234 |
Official URL: | http://ieeexplore.ieee.org/xpl/articleDetails.jsp?... |
Deposited by: | Memoria Investigacion |
Deposited on: | 01 Oct 2013 15:52 |
Last Modified: | 21 Apr 2016 22:58 |