On Howard's policy improvement method

 

Author: Ulrich Rieder

 

Journal: Mathematische Operationsforschung und Statistik. Series Optimization (Taylor, available online 1977)
Volume/Issue: Volume 8, issue 2

Pages: 227-236

 

ISSN:0323-3898

 

Year: 1977

 

DOI:10.1080/02331937708842420

 

Publisher: Akademie-Verlag

 

Data source: Taylor

 

Abstract:

We consider a stationary dynamic program with general state and action spaces and with an unbounded reward function. Taking a martingale approach to the optimization problem, we derive several necessary and sufficient conditions for the validity of Howard's policy improvement method. The conditions hold in both the positive and the negative case. By means of these results we can construct a sequence of stationary policies for which the expected rewards converge to the value function. The construction is a straightforward generalization of the method given by Frid [3].
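
For orientation only, the following is a minimal sketch of Howard's policy improvement in the simplest textbook setting: a finite state and action space with a bounded reward and a discount factor. This is an assumption made for illustration and is much narrower than the general, unbounded setting treated in the paper. The function name `policy_iteration`, the array layout of `P` and `r`, the discount factor `beta`, and the two-state example are all hypothetical and do not come from the article.

```python
import numpy as np


def policy_iteration(P, r, beta=0.9, max_iter=100):
    """Howard's policy improvement for a finite discounted MDP (illustrative).

    P: array of shape (A, S, S); P[a, s, t] is the probability of moving
       from state s to state t under action a.
    r: array of shape (S, A); r[s, a] is the one-step reward.
    """
    n_actions, n_states, _ = P.shape
    states = np.arange(n_states)
    policy = np.zeros(n_states, dtype=int)  # start with action 0 everywhere
    v = np.zeros(n_states)
    for _ in range(max_iter):
        # Policy evaluation: solve (I - beta * P_f) v = r_f for the current policy f.
        P_f = P[policy, states]   # shape (S, S), row s is P[policy[s], s, :]
        r_f = r[states, policy]   # shape (S,)
        v = np.linalg.solve(np.eye(n_states) - beta * P_f, r_f)
        # Policy improvement: one-step lookahead values q[s, a].
        q = r + beta * np.einsum("ast,t->sa", P, v)
        # Switch actions only where the gain is strict, to avoid cycling on ties.
        improved = q.max(axis=1) > q[states, policy] + 1e-12
        if not improved.any():
            break
        policy = np.where(improved, q.argmax(axis=1), policy)
    return policy, v


# Hypothetical two-state example: in state 0, action 0 stays (reward 1) and
# action 1 moves to state 1 (reward 0); state 1 is absorbing with reward 2.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0
    [[0.0, 1.0], [0.0, 1.0]],  # action 1
])
r = np.array([[1.0, 0.0], [2.0, 2.0]])
policy, v = policy_iteration(P, r, beta=0.9)
```

In this toy example the sketch switches state 0 to action 1 after one improvement step and stops with values of about 18 and 20. Whether such improvement steps remain valid without boundedness or discounting is the question the abstract addresses.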

 
