On Howard's policy improvement method
Author:
Ulrich Rieder,
Journal:
Mathematische Operationsforschung und Statistik. Series Optimization
(Available online via Taylor & Francis, 1977)
Volume/Issue:
Volume 8, Issue 2
Pages: 227-236
ISSN:0323-3898
Year: 1977
DOI:10.1080/02331937708842420
Publisher: Akademie-Verlag
Data source: Taylor & Francis
Abstract:
We consider a stationary dynamic program with general state and action spaces and with an unbounded reward function. Taking a martingale approach to the optimization problem, we derive several necessary and sufficient conditions for the validity of Howard's policy improvement method. The conditions hold both in the positive and the negative case. By means of these results we can construct a sequence of stationary policies for which the expected rewards converge to the value function. The construction is a straightforward generalization of the method given by Frid [3].
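For orientation, the following is a minimal sketch of Howard's policy improvement method in its classical finite-state, discounted form; the paper itself treats general state and action spaces with unbounded rewards, which this toy example does not capture. The MDP data (`P`, `r`, discount factor) below are hypothetical and chosen only to illustrate the evaluation/improvement loop.

```python
import numpy as np

def policy_iteration(P, r, gamma=0.9, max_iter=100):
    """Howard's policy iteration for a finite MDP (illustrative sketch).

    P[a][s, s'] : transition probability from s to s' under action a
    r[a][s]     : one-step reward in state s under action a
    """
    n_actions, n_states = len(P), P[0].shape[0]
    policy = np.zeros(n_states, dtype=int)  # start with action 0 everywhere
    for _ in range(max_iter):
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = np.array([P[policy[s]][s] for s in range(n_states)])
        r_pi = np.array([r[policy[s]][s] for s in range(n_states)])
        v = np.linalg.solve(np.eye(n_states) - gamma * P_pi, r_pi)
        # Policy improvement: act greedily w.r.t. the one-step lookahead.
        q = np.array([r[a] + gamma * P[a] @ v for a in range(n_actions)])
        new_policy = q.argmax(axis=0)
        if np.array_equal(new_policy, policy):
            return policy, v  # no strict improvement possible: policy is optimal
        policy = new_policy
    return policy, v

# Hypothetical 2-state, 2-action MDP purely for demonstration.
P = [np.array([[0.9, 0.1], [0.2, 0.8]]),
     np.array([[0.1, 0.9], [0.7, 0.3]])]
r = [np.array([1.0, 0.0]), np.array([0.0, 2.0])]
policy, v = policy_iteration(P, r)
```

Each improvement step produces a policy whose expected reward is at least as large in every state; the paper's contribution is identifying when this monotone-improvement property remains valid in the general (positive and negative) unbounded-reward setting.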