Diffusion Gradient Temporal Difference for Cooperative Reinforcement Learning with Linear Function Approximation

Valcarcel Macua, Sergio; Belanovic, Pavle and Zazo Bello, Santiago

(2012). Diffusion Gradient Temporal Difference for Cooperative Reinforcement Learning with Linear Function Approximation. In: "3rd International Workshop on Cognitive Information Processing (CIP)", 28/05/2012 - 30/05/2012, Baiona. ISBN 978-1-4673-1877-8. pp. 1-6.

Description

Title:	Diffusion Gradient Temporal Difference for Cooperative Reinforcement Learning with Linear Function Approximation
Author(s):	Valcarcel Macua, Sergio; Belanovic, Pavle; Zazo Bello, Santiago https://orcid.org/0000-0001-9073-7927
Document Type:	Conference or Workshop Presentation (Article)
Event Title:	3rd International Workshop on Cognitive Information Processing (CIP)
Event Dates:	28/05/2012 - 30/05/2012
Event Location:	Baiona
Book Title:	3rd International Workshop on Cognitive Information Processing (CIP)
Date:	May 2012
ISBN:	978-1-4673-1877-8
Subjects:	Telecommunications; Robotics and Industrial Informatics
SDG:	09. Industry, innovation and infrastructure
Informal Keywords:	TD, distributed reinforcement learning, distributed control, cooperative learning, multi-agent, distributed decision making, distributed temporal difference
School:	E.T.S.I. Telecomunicación (UPM)
Department:	Señales, Sistemas y Radiocomunicaciones
Creative Commons Licenses:	Attribution - No Derivative Works - Non-Commercial

Abstract

We introduce a diffusion-based algorithm in which multiple agents cooperate to predict a common, global state-value function by sharing local estimates and local gradient information among neighbors. Our algorithm is a fully distributed implementation of gradient temporal difference learning with linear function approximation, which makes it applicable to multi-agent settings. Simulations illustrate the benefit of cooperation in learning, as made possible by the proposed algorithm.
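
The cooperation scheme summarized above can be illustrated with a minimal sketch. It assumes an adapt-then-combine diffusion strategy wrapped around the standard GTD2 update. This record does not state which gradient TD variant or combination rule the paper uses, and the sketch exchanges only parameter estimates between agents, not gradient information, so it should be read as a simplified illustration rather than the authors' algorithm. The names `gtd2_step` and `diffusion_round`, the combination matrix `A`, and all step sizes are hypothetical.

```python
import numpy as np


def gtd2_step(theta, w, phi, reward, phi_next, gamma, alpha, beta):
    """One local GTD2 update for a single transition (phi, reward, phi_next)."""
    # TD error under the linear value estimate V(s) = phi(s) @ theta.
    delta = reward + gamma * (phi_next @ theta) - phi @ theta
    # Primary weights follow the GTD2 gradient-correction direction.
    theta_new = theta + alpha * (phi - gamma * phi_next) * (phi @ w)
    # Auxiliary weights track the expected TD error projected on the features.
    w_new = w + beta * (delta - phi @ w) * phi
    return theta_new, w_new


def diffusion_round(thetas, ws, samples, A, gamma=0.9, alpha=0.05, beta=0.1):
    """Adapt-then-combine round over K agents (assumed scheme, see lead-in).

    thetas, ws: float arrays of shape (K, d), one row per agent.
    samples:    list of K tuples (phi, reward, phi_next), one per agent.
    A:          (K, K) matrix; A[k, l] is the weight agent k gives to agent l.
                Rows are assumed to sum to 1, with zeros for non-neighbors.
    """
    psi = np.empty_like(thetas)
    zeta = np.empty_like(ws)
    # Adapt: every agent updates from its own local transition.
    for k, (phi, reward, phi_next) in enumerate(samples):
        psi[k], zeta[k] = gtd2_step(
            thetas[k], ws[k], phi, reward, phi_next, gamma, alpha, beta
        )
    # Combine: every agent averages the intermediate estimates of its neighbors.
    return A @ psi, A @ zeta


# Usage with synthetic features and rewards (illustrative data only).
rng = np.random.default_rng(0)
K, d = 3, 4
thetas, ws = np.zeros((K, d)), np.zeros((K, d))
A = np.full((K, K), 1.0 / K)  # fully connected network, uniform weights
for _ in range(5):
    samples = [(rng.normal(size=d), rng.normal(), rng.normal(size=d)) for _ in range(K)]
    thetas, ws = diffusion_round(thetas, ws, samples, A)
```

With uniform weights over a fully connected network, all agents hold the same estimate after each combine step. A sparser `A` would let estimates differ across agents while information still spreads through neighbors over successive rounds.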

More information

Record ID:	20234
DC Identifier:	https://oa.upm.es/20234/
OAI Identifier:	oai:oa.upm.es:20234
Official URL:	http://ieeexplore.ieee.org/xpl/articleDetails.jsp?...
Deposited by:	Memoria Investigacion
Deposited on:	01 Oct 2013 15:52
Last Modified:	21 Apr 2016 22:58