Application of hierarchical reinforcement learning in engineering domain

Wei Li; Qingtai Ye; Changming Zhu


Journal of Systems Science and Systems Engineering, 2005, Vol. 14, Issue 2: 207-217. DOI: 10.1007/s11518-006-0190-y

Abstract

The slow convergence rate of reinforcement learning algorithms limits their wider application. For engineering domains, a hierarchical reinforcement learning system is developed that performs temporally extended actions chosen according to prior knowledge. Because the state space is reduced, this system converges quickly. An elevator group control test demonstrates the approach, with two conventional group control algorithms adopted as prior knowledge. The results indicate that hierarchical reinforcement learning can reduce the learning time dramatically.
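
The idea of learning over temporally extended actions can be illustrated with a minimal sketch. It assumes the standard SMDP Q-learning update, which may differ from the update used in the article, and treats each conventional group control algorithm as one high-level action that runs for several time steps. The names `smdp_q_update`, `controller_a`, `controller_b`, and the traffic-pattern states are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def smdp_q_update(q, state, option, rewards, next_state, options,
                  alpha=0.1, gamma=0.9):
    """Apply one SMDP Q-learning update after `option` ran for len(rewards) steps.

    Assumption: this is the textbook SMDP rule, not necessarily the article's.
    """
    # Discounted sum of the rewards collected while the option was running.
    discounted_return = sum((gamma ** i) * r for i, r in enumerate(rewards))
    k = len(rewards)  # duration of the temporally extended action
    # Value of the best high-level action available in the state where it ended.
    best_next = max(q[(next_state, o)] for o in options)
    target = discounted_return + (gamma ** k) * best_next
    q[(state, option)] += alpha * (target - q[(state, option)])
    return q[(state, option)]


# Hypothetical high-level actions: two prior-knowledge controllers.
OPTIONS = ("controller_a", "controller_b")
q = defaultdict(float)  # Q-values default to 0.0

# One update: controller_a ran for two steps with rewards -1 and -2
# (for example, negative waiting times) while traffic changed pattern.
value = smdp_q_update(q, "up_peak", "controller_a", [-1.0, -2.0],
                      "down_peak", OPTIONS, alpha=0.5, gamma=0.5)
```

Because the learner chooses among only a few controllers in a coarse state description, the table it has to fill is far smaller than one over primitive elevator actions, which is the intuition behind the faster convergence claimed above.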

Keywords

Engineering domain knowledge / controller / reinforcement learning / elevator / group control

Cite this article

Wei Li, Qingtai Ye, Changming Zhu. Application of hierarchical reinforcement learning in engineering domain. Journal of Systems Science and Systems Engineering, 2005, 14(2): 207-217. DOI: 10.1007/s11518-006-0190-y
