A unified framework for linear function approximation of value functions in stochastic control

Sánchez Fernández, Matilde, Valcarcel Macua, Sergio and Zazo Bello, Santiago

(2013). A unified framework for linear function approximation of value functions in stochastic control. In: "21st European Signal Processing Conference (EUSIPCO)", 09/09/2013 - 13/09/2013, Marrakech, Morocco. pp. 1-5.

Description

Title:	A unified framework for linear function approximation of value functions in stochastic control
Author(s):	Sánchez Fernández, Matilde; Valcarcel Macua, Sergio; Zazo Bello, Santiago https://orcid.org/0000-0001-9073-7927
Document Type:	Presentation at Congress or Conference (Article)
Event Title:	21st European Signal Processing Conference (EUSIPCO)
Event Dates:	09/09/2013 - 13/09/2013
Event Location:	Marrakech, Morocco
Book Title:	21st European Signal Processing Conference (EUSIPCO)
Date:	2013
Subjects:	Telecommunications
SDG:	09. Industry, innovation and infrastructure
Freetext Keywords:	Approximate dynamic programming, Linear value function approximation, Mean squared Bellman Error, Mean squared projected Bellman Error, Reinforcement Learning
Faculty:	E.T.S.I. Telecomunicación (UPM)
Department:	Señales, Sistemas y Radiocomunicaciones
Creative Commons Licenses:	Attribution - No Derivative Works - Non-Commercial

Abstract

This paper contributes a unified formulation that merges previous analyses of predicting the performance (value function) of a certain sequence of actions (policy) when an agent operates in a Markov decision process with a large state space. When the states are represented by features and the value function is linearly approximated, our analysis reveals a new relationship between two common cost functions used to obtain the optimal approximation. In addition, this analysis allows us to propose an efficient adaptive algorithm that provides an unbiased linear estimate. The performance of the proposed algorithm is illustrated by simulation, showing competitive results when compared with state-of-the-art solutions.
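
For orientation, the sketch below evaluates the two cost functions named in the record's keywords, the mean squared Bellman error (MSBE) and the mean squared projected Bellman error (MSPBE), on a toy problem. It is a minimal illustration under stated assumptions, not the paper's algorithm: the 3-state chain, rewards, discount factor, single feature, and all function names are invented for the example, and the definitions used are the textbook ones. The Pythagorean split checked at the end is the standard decomposition of the MSBE and is not claimed to be the new relationship the abstract refers to.

```python
# Illustrative only: a tiny 3-state Markov reward process under a fixed policy.
# Every number and name here is an assumption made for the example.

GAMMA = 0.9                      # discount factor (assumed)
P = [[0.5, 0.5, 0.0],            # transition matrix induced by the policy
     [0.0, 0.5, 0.5],
     [0.5, 0.0, 0.5]]
R = [1.0, 0.0, 2.0]              # expected one-step reward per state
D = [1 / 3, 1 / 3, 1 / 3]        # state weighting (uniform is stationary for this P)
PHI = [1.0, 2.0, 3.0]            # one scalar feature per state


def bellman(v):
    """Bellman operator: (T v)(s) = R(s) + GAMMA * sum_t P[s][t] * v[t]."""
    n = len(v)
    return [R[s] + GAMMA * sum(P[s][t] * v[t] for t in range(n)) for s in range(n)]


def project(x):
    """D-weighted least-squares projection of x onto span{PHI}."""
    num = sum(d * p * xi for d, p, xi in zip(D, PHI, x))
    den = sum(d * p * p for d, p in zip(D, PHI))
    return [(num / den) * p for p in PHI]


def sq_norm(x):
    """Squared D-weighted norm of x."""
    return sum(d * xi * xi for d, xi in zip(D, x))


def value(theta):
    """Linear approximation V_theta(s) = theta * PHI[s]."""
    return [theta * p for p in PHI]


def msbe(theta):
    """|| V_theta - T V_theta ||_D^2"""
    v = value(theta)
    return sq_norm([a - b for a, b in zip(v, bellman(v))])


def mspbe(theta):
    """|| V_theta - Pi T V_theta ||_D^2"""
    v = value(theta)
    return sq_norm([a - b for a, b in zip(v, project(bellman(v)))])


def projection_residual(theta):
    """|| T V_theta - Pi T V_theta ||_D^2, the part of T V_theta outside span{PHI}."""
    tv = bellman(value(theta))
    return sq_norm([a - b for a, b in zip(tv, project(tv))])


if __name__ == "__main__":
    for theta in (0.0, 0.5, 2.0):
        # MSBE should equal MSPBE plus the projection residual (Pythagoras in the D-norm).
        print(theta, msbe(theta), mspbe(theta) + projection_residual(theta))
```

Because the projection is orthogonal in the D-weighted norm, the MSPBE never exceeds the MSBE for the same parameter; the two objectives generally have different minimizers.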

More information

Record ID:	28942
DC Identifier:	https://oa.upm.es/28942/
OAI Identifier:	oai:oa.upm.es:28942
Official URL:	http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumb...
Deposited by:	Memoria Investigacion
Deposited on:	30 Jun 2014 16:04
Last Modified:	22 Sep 2014 11:43
