Full text
PDF (Portable Document Format), 384 kB
- A PDF viewer is required, such as GSview, Xpdf, or Adobe Acrobat Reader
ORCID: https://orcid.org/0000-0001-9073-7927
(2012).
Diffusion Gradient Temporal Difference for Cooperative Reinforcement Learning with Linear Function Approximation.
In: "3rd International Workshop on Cognitive Information Processing (CIP)", 28/05/2012 - 30/05/2012, Baiona. ISBN 978-1-4673-1877-8. pp. 1-6.
| Title: | Diffusion Gradient Temporal Difference for Cooperative Reinforcement Learning with Linear Function Approximation |
|---|---|
| Author(s): | |
| Document Type: | Presentation at Congress or Conference (Article) |
| Event Title: | 3rd International Workshop on Cognitive Information Processing (CIP) |
| Event Dates: | 28/05/2012 - 30/05/2012 |
| Event Location: | Baiona |
| Book Title: | 3rd International Workshop on Cognitive Information Processing (CIP) |
| Date: | May 2012 |
| ISBN: | 978-1-4673-1877-8 |
| Subjects: | |
| SDGs: | |
| Informal Keywords: | TD, distributed reinforcement learning, distributed control, cooperative learning, multi-agent, distributed decision making, distributed temporal difference |
| School: | E.T.S.I. Telecomunicación (UPM) |
| Department: | Señales, Sistemas y Radiocomunicaciones |
| Creative Commons Licenses: | Attribution - NoDerivatives - NonCommercial |
We introduce a diffusion-based algorithm in which multiple agents cooperate to predict a common, global state-value function by sharing local estimates and local gradient information with their neighbors. Our algorithm is a fully distributed implementation of gradient temporal difference learning with linear function approximation, which makes it applicable to multi-agent settings. Simulations illustrate the benefit of cooperation in learning, as made possible by the proposed algorithm.
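The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's exact recursion: it assumes the well-known GTD2 update as the local gradient temporal difference step and an adapt-then-combine diffusion scheme in which agents average only their intermediate estimates (the abstract also mentions sharing gradient information, which this sketch omits). The function names, step sizes, and the combination matrix `A` are hypothetical choices made for illustration.

```python
import numpy as np

def gtd2_step(theta, w, phi, reward, phi_next, gamma, alpha, beta):
    """One local GTD2 update for a linear value estimate V(s) = phi(s) @ theta."""
    delta = reward + gamma * (phi_next @ theta) - phi @ theta  # TD error
    theta_new = theta + alpha * (phi - gamma * phi_next) * (phi @ w)
    w_new = w + beta * (delta - phi @ w) * phi  # auxiliary weight vector
    return theta_new, w_new

def diffusion_round(thetas, ws, samples, A, gamma=0.9, alpha=0.05, beta=0.1):
    """One adapt-then-combine round over all agents (assumed scheme).

    thetas, ws: arrays of shape (n_agents, dim).
    samples:    one (phi, reward, phi_next) tuple per agent.
    A:          combination matrix; A[l, k] is the weight agent k gives to
                neighbor l, with each column summing to 1 (zero if not linked).
    """
    # Adapt: every agent takes a local GTD2 step on its own sample.
    adapted = [
        gtd2_step(thetas[k], ws[k], *samples[k], gamma, alpha, beta)
        for k in range(len(samples))
    ]
    psi = np.array([a[0] for a in adapted])
    w_mid = np.array([a[1] for a in adapted])
    # Combine: agent k averages the intermediate estimates of its neighbors.
    return A.T @ psi, A.T @ w_mid
```

With `A` equal to the identity matrix, each agent learns in isolation; with a connected network and nonzero off-diagonal weights, information from every agent's samples gradually spreads through the combination step.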
| Record ID: | 20234 |
|---|---|
| DC Identifier: | https://oa.upm.es/20234/ |
| OAI Identifier: | oai:oa.upm.es:20234 |
| Official URL: | http://ieeexplore.ieee.org/xpl/articleDetails.jsp?... |
| Deposited by: | Memoria Investigacion |
| Deposited on: | 01 Oct 2013 15:52 |
| Last Modified: | 21 Apr 2016 22:58 |