Solution to reinforcement learning problems with artificial potential field

Li-juan Xie , Guang-rong Xie , Huan-wen Chen , Xiao-li Li

Journal of Central South University ›› 2008, Vol. 15 ›› Issue (4): 552-557. DOI: 10.1007/s11771-008-0104-x

Abstract

A novel method was designed to solve reinforcement learning problems with an artificial potential field (APF). First, a reinforcement learning problem was transformed into a path planning problem by using APF, which is a very appropriate way to model a reinforcement learning problem. Second, a new APF algorithm based on a virtual water-flow concept was proposed to overcome the local minimum problem of potential field methods. The performance of the new method was tested on a gridworld problem known as the key-and-door maze. The experimental results show that good and deterministic policies are found in almost all simulations within 45 trials. In comparison with WIERING's HQ-learning system, which needs 20 000 trials to reach a stable solution, the proposed method obtains an optimal and stable policy far more quickly. Therefore, the new method is a simple and effective way to obtain an optimal solution to a reinforcement learning problem.
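The APF idea summarized above — an attractive pull toward the goal, a repulsive push away from obstacles, and an agent that descends the combined field — can be sketched on a small gridworld. The following is a generic illustration under assumed potentials (Euclidean attraction, inverse-Manhattan-distance repulsion), not the authors' algorithm: the virtual water-flow escape from local minima is not implemented, the grid layout is hypothetical (not the key-and-door maze), and it is chosen so that plain greedy descent already reaches the goal.

```python
import math

# Toy gridworld: S = start, G = goal, # = obstacle (assumed layout,
# not the key-and-door maze from the paper).
GRID = ["S....",
        ".....",
        "..#..",
        ".....",
        "....G"]
ROWS, COLS = len(GRID), len(GRID[0])
START, GOAL = (0, 0), (4, 4)
OBSTACLES = {(r, c) for r in range(ROWS) for c in range(COLS)
             if GRID[r][c] == "#"}

def potential(cell):
    """Summed field: Euclidean attraction to the goal plus an
    inverse-Manhattan-distance repulsion from each obstacle."""
    r, c = cell
    attract = math.hypot(r - GOAL[0], c - GOAL[1])
    repulse = sum(1.0 / (1 + abs(r - orow) + abs(c - ocol))
                  for orow, ocol in OBSTACLES)
    return attract + repulse

def greedy_descent(start, max_steps=50):
    """Move to the 4-neighbor with the lowest potential while it
    improves on the current cell; stop at the goal or a local minimum."""
    path = [start]
    cell = start
    for _ in range(max_steps):
        r, c = cell
        neighbors = [(nr, nc)
                     for nr, nc in [(r - 1, c), (r + 1, c),
                                    (r, c - 1), (r, c + 1)]
                     if 0 <= nr < ROWS and 0 <= nc < COLS
                     and (nr, nc) not in OBSTACLES]
        best = min(neighbors, key=potential)
        if potential(best) >= potential(cell):
            # Local minimum: this is the failure mode that the paper's
            # virtual water-flow concept is introduced to overcome.
            break
        cell = best
        path.append(cell)
        if cell == GOAL:
            break
    return path

path = greedy_descent(START)
print(path)  # monotone descent of the field from S to G
```

On this layout, greedy descent alone reaches the goal; in mazes such as key-and-door, the same descent stalls in local minima, which is exactly where an escape mechanism like the virtual water-flow concept becomes necessary.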

Keywords

reinforcement learning / path planning / mobile robot navigation / artificial potential field / virtual water-flow

Cite this article

Li-juan Xie, Guang-rong Xie, Huan-wen Chen, Xiao-li Li. Solution to reinforcement learning problems with artificial potential field. Journal of Central South University, 2008, 15(4): 552-557. DOI: 10.1007/s11771-008-0104-x


References

[1] Kaelbling L P, Littman M L, Moore A W. Reinforcement learning: A survey [J]. Journal of Artificial Intelligence Research, 1996, 4: 237-285.
[2] Sutton R S, Barto A. Reinforcement learning: An introduction [M], 1998, Cambridge, MIT Press.
[3] Banerjee B, Stone P. General game learning using knowledge transfer [C]. Proceedings of the 20th International Joint Conference on Artificial Intelligence, 2007, California, AAAI Press: 672-677.
[4] Asadi M, Huber M. Effective control knowledge transfer through learning skill and representation hierarchies [C]. Proceedings of the 20th International Joint Conference on Artificial Intelligence, 2007, California, AAAI Press: 2054-2059.
[5] Konidaris G, Barto A. Autonomous shaping: Knowledge transfer in reinforcement learning [C]. Proceedings of the 23rd International Conference on Machine Learning, 2006, Pittsburgh, ACM Press: 489-496.
[6] Mehta N, Natarajan S, Tadepalli P, Fern A. Transfer in variable-reward hierarchical reinforcement learning [C]. Workshop on Transfer Learning at Neural Information Processing Systems, 2005, Oregon, ACM Press: 20-23.
[7] Wilson A, Fern A, Ray S, Tadepalli P. Multi-task reinforcement learning: A hierarchical Bayesian approach [C]. Proceedings of the 24th International Conference on Machine Learning, 2007, Oregon, ACM Press: 923-930.
[8] Goel S, Huber M. Subgoal discovery for hierarchical reinforcement learning using learned policies [C]. Proceedings of the 16th International FLAIRS Conference, 2003, Florida, AAAI Press: 346-350.
[9] Taylor M E, Stone P. Behavior transfer for value-function-based reinforcement learning [C]. Proceedings of the 4th International Joint Conference on Autonomous Agents and Multiagent Systems, 2005, New York, ACM Press: 53-59.
[10] Hengst B. Discovering hierarchy in reinforcement learning with HexQ [C]. Proceedings of the 19th International Conference on Machine Learning, 2002, San Francisco, Morgan Kaufmann: 243-250.
[11] Diuk C, Strehl A L, Littman M L. A hierarchical approach to efficient reinforcement learning in deterministic domains [C]. Proceedings of the 5th International Joint Conference on Autonomous Agents and Multiagent Systems, 2006, New York, ACM Press: 313-319.
[12] Zhou W, Coggins R. A biologically inspired hierarchical reinforcement learning system [J]. Cybernetics and Systems, 2005, 36(1): 1-44.
[13] Barto A, Mahadevan S. Recent advances in hierarchical reinforcement learning [J]. Discrete Event Dynamic Systems: Theory and Applications, 2003, 13(1): 41-77.
[14] Kearns M, Koller D. Efficient reinforcement learning in factored MDPs [C]. Proceedings of the 16th International Joint Conference on Artificial Intelligence, 1999, Stockholm, Morgan Kaufmann: 740-747.
[15] Wen Z Q, Cai Z X. Global path planning approach based on ant colony optimization algorithm [J]. Journal of Central South University of Technology, 2006, 13(6): 707-712.
[16] Zhu X C, Dong G H, Cai Z X. Robust simultaneous tracking and stabilization of wheeled mobile robots not satisfying nonholonomic constraint [J]. Journal of Central South University of Technology, 2007, 14(4): 537-545.
[17] Zou X B, Cai Z X, Sun G R. Non-smooth environment modeling and global path planning for mobile robots [J]. Journal of Central South University of Technology, 2003, 10(3): 248-254.
[18] Andrews J R, Hogan N. Impedance control as a framework for implementing obstacle avoidance in a manipulator [C]. Proceedings of Control of Manufacturing Process and Robotic System, 1983, New York, ASME Press: 243-251.
[19] Khatib O. Real-time obstacle avoidance for manipulators and mobile robots [J]. International Journal of Robotics Research, 1986, 5(1): 90-98.
[20] Huang W H, Fajen B R, Fink J R. Visual navigation and obstacle avoidance using a steering potential function [J]. Robotics and Autonomous Systems, 2006, 54(4): 288-299.
[21] Park M G, Lee M C. Artificial potential field based path planning for mobile robots using a virtual obstacle concept [C]. Proceedings of the IEEE/ASME International Conference on Advanced Intelligent Mechatronics, 2003, Victoria, IEEE Press: 735-740.
[22] Liu C Q, Krishnan H, Yong L S. Virtual obstacle concept for local-minimum-recovery in potential-field based navigation [C]. Proceedings of the IEEE International Conference on Robotics and Automation, 2000, San Francisco, IEEE Press: 983-988.
[23] Brock O, Khatib O. High-speed navigation using the global dynamic window approach [C]. Proceedings of the IEEE International Conference on Robotics and Automation, 1999, Detroit, IEEE Press: 341-346.
[24] Konolige K. A gradient method for real time robot control [C]. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems, 2000, Victoria, IEEE Press: 639-646.
[25] Rimon E, Koditschek D. Exact robot navigation using artificial potential functions [J]. IEEE Transactions on Robotics and Automation, 1992, 8(5): 501-518.
[26] Wiering M, Schmidhuber J. HQ-learning [J]. Adaptive Behavior, 1998, 6(2): 219-246.
