Learning action probabilities from delayed reinforcement

 

Authors: S. I. AHSON, R. SRINIVAS

 

Journal: International Journal of Systems Science (Taylor & Francis, available online 1993)
Volume/Issue: Volume 24, Issue 12

Pages: 2415-2421

 

ISSN:0020-7721

 

Year: 1993

 

DOI:10.1080/00207729308949639

 

Publisher: Taylor & Francis Group

 

Data source: Taylor

 

Abstract:

A reinforcement scheme for learning automata, applicable to real situations where the reinforcement received from the environment is delayed, is presented. The scheme divides the state space into regions following the boxes approach of Michie and Chambers. Each region maintains estimates of the reward characteristics of the environment and contains a local automaton that updates action probabilities whenever the system state enters it. Estimates of reward characteristics are obtained using reinforcement received during the period of eligibility. Results obtained through computer simulation of the inverted pendulum problem are compared with the adaptive critic learning developed by Barto et al. (1983).
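
The abstract does not give the update rules, so the following is only a rough sketch of the general idea it describes: a state space cut into boxes, one local automaton per box, and reward estimates updated from delayed reinforcement in proportion to an eligibility value. Everything specific here is an assumption made for illustration and is not taken from the paper: the names `box_index`, `BoxAutomaton`, and `BoxesLearner`, the parameters `decay`, `est_rate`, and `step`, the exponentially decaying eligibility, and the pursuit-style probability update that shifts probability toward the action with the highest estimated reward.

```python
import bisect
import random


def box_index(state, thresholds):
    """Map a continuous state to the index of the box (region) containing it."""
    idx = 0
    for x, cuts in zip(state, thresholds):
        # Each dimension with k cut points contributes k + 1 intervals.
        idx = idx * (len(cuts) + 1) + bisect.bisect_right(cuts, x)
    return idx


class BoxAutomaton:
    """Hypothetical local automaton attached to one box."""

    def __init__(self, n_actions, step):
        self.probs = [1.0 / n_actions] * n_actions  # action probabilities
        self.reward_est = [0.0] * n_actions         # estimated reward per action
        self.eligibility = [0.0] * n_actions        # recency of each action here
        self.step = step

    def choose(self, rng):
        return rng.choices(range(len(self.probs)), weights=self.probs)[0]

    def pursue_best(self):
        # Assumed pursuit-style rule: move probability mass toward the action
        # with the highest reward estimate; do nothing while estimates tie.
        if len(set(self.reward_est)) > 1:
            best = max(range(len(self.probs)), key=lambda a: self.reward_est[a])
            self.probs = [(1.0 - self.step) * p for p in self.probs]
            self.probs[best] += self.step


class BoxesLearner:
    def __init__(self, thresholds, n_actions=2, decay=0.9,
                 est_rate=0.2, step=0.1, seed=0):
        n_boxes = 1
        for cuts in thresholds:
            n_boxes *= len(cuts) + 1
        self.thresholds = thresholds
        self.boxes = [BoxAutomaton(n_actions, step) for _ in range(n_boxes)]
        self.decay = decay
        self.est_rate = est_rate
        self.rng = random.Random(seed)
        self.last_box = None

    def act(self, state):
        idx = box_index(state, self.thresholds)
        box = self.boxes[idx]
        if idx != self.last_box:
            box.pursue_best()  # update probabilities when the state enters a box
        self.last_box = idx
        action = box.choose(self.rng)
        box.eligibility[action] = 1.0  # mark this (box, action) pair as eligible
        return action

    def reinforce(self, reward):
        # Delayed reinforcement is credited to eligible (box, action) pairs,
        # then every eligibility value decays.
        for box in self.boxes:
            for a, e in enumerate(box.eligibility):
                box.reward_est[a] += self.est_rate * e * (reward - box.reward_est[a])
                box.eligibility[a] = self.decay * e
```

In a pole-balancing loop one would presumably call `act(state)` at every time step and `reinforce(reward)` with a failure signal when the pole falls; the thresholds that define the boxes would have to be chosen for the specific simulation.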

 
