Stochastic learning and optimization — Ideas vs mathematics?

Stochastic learning and optimization — Ideas vs mathematics?

Xi-Ren CAO

PDF(362 KB)

ISSN 2095-2732 (Print) ISSN 2095-2740 (Online) CN 10-1028/TM (Ceased publications)

PDF(362 KB)

Front. Electr. Electron. Eng. ›› DOI: 10.1007/s11460-011-0161-z

RESEARCH ARTICLE

RESEARCH ARTICLE

Stochastic learning and optimization — Ideas vs mathematics?

Xi-Ren CAO¹^,²

Author information +

History +

Abstract

What are the roles that ideas and mathematics play in research of engineering subjects? This article tries to answer this question with the author’s own research experience. In the past 30 years, the author’s research started from perturbation analysis (PA) of queueing networks, to PA of Markov systems, to Markov decision processes (MDP), and to stochastic control; and based on these research, the author has successfully developed a sensitivity-based optimization approach to the area of learning and optimization of stochastic systems, leading to a simple and unified framework for the area, new research directions, and new results in the area. This paper reviews the above research topics and the history of their development with an emphasis on what the roles that ideas and mathematics play in each of the advances along the path.

Keywords

perturbation analysis (PA) / Markov decision processes (MDP) / sensitivity-based optimization / perturbation realization factors / performance potentials

Cite this article

Download citation ▾

Xi-Ren CAO. Stochastic learning and optimization — Ideas vs mathematics?. Front Elect Electr Eng Chin, https://doi.org/10.1007/s11460-011-0161-z

References

Publishing order | Descend order by publishing year | Descend order by cited within

[1]	Ho Y C, Eyler A, Chien T T. A gradient technique for general buffer-storage design in a serial production line. International Journal of Production Research, 1979, 17(6): 557-580 CrossRef Google scholar

[2]	Ho Y C, Eyler A, Chien T T. A new approach to determine parameter sensitivities on transfer lines. Management Science, 1983, 29(6): 700-714 CrossRef Google scholar

[3]	Bryson A E, Ho Y C. Applied Optimal Control: Optimiza tion, Estimation, and Control. Waltham, MA: Blaisdell, 1969

[4]	Ho Y C, Cao X R. Perturbation analysis and optimization of queueing networks. Journal of Optimization Theory and Applications, 1983, 40(4): 559-582 CrossRef Google scholar

[5]	Cao X R. Stochastic Learning and Optimization — A Sensitivity-Based Approach. New York, NY: Springer, 2007

[6]	Xia L, Cao X R. Performance optimization of queueing systems with perturbation realization. European Journal of Operations Research, invited survey paper, 2011 (submitted)

[7]	Cao X R. First-order perturbation analysis of a single multiclass finite source queue. Performance Evaluation, 1987, 7(1): 31-41 CrossRef Google scholar

[8]	Cao X R. First-order perturbation analysis of a single multiclass finite source queue. Performance Evaluation, 1987, 7(1): 31-41 CrossRef Google scholar

[9]	Heidelberger P, Cao X R, Zazanis M A, Suri R. Convergence properties of infinitesimal rerturbation analysis estimates. Management Science, 1988, 34(11): 1281-1302 CrossRef Google scholar

[10]	Cao X R. A Sample performance function of Jackson queueing networks. Operations Research, 1988, 36(1): 128-136 CrossRef Google scholar

[11]	Cao X R. Realization Probabilities — The Dynamics of Queueing Systems. New York, NY: Springer Verlag, 1994

[12]	Cao X R. Convergence of parameter sensitivity estimates in a stochastic experiment. IEEE Transactions on Automatic Control, 1985, 30(9): 834-843

[13]	Glasserman P. Gradient Estimation via Perturbation Analysis. Boston, MA: Kluwer Academic Publishers, 1991

[14]	Cao X R. Realization probability in closed Jackson queueing networks and its application. Advances in Applied Probability, 1987, 19(3): 708-738 CrossRef Google scholar

[15]	Fu M C, Hu J Q. Conditional Monte Carlo: Gradient Estimation and Optimization Applications. Boston, MA: Kluwer Academic Publishers, 1997

[16]	Gong W B, Ho Y C. Smoothed (conditional) perturbation analysis for discrete event dynamic systems. IEEE Transactions on Automatic Control, 1987, 32(10): 858-866 CrossRef Google scholar

[17]	Ho Y C, Cao X R, Cassandras C G. Infinitesimal and finite perturbation analysis for queueing networks. Automatica, 1983, 19(4): 439-445 CrossRef Google scholar

[18]	Ho Y C, Cao X R. Perturbation Analysis of Discrete Event Systems. Norwell, MA: Kluwer Academic Publishers, 1991 CrossRef Google scholar

[19]	Cassandras C G, Wardi Y, Melamed B, Sun G, Panayiotou C G. Perturbation analysis for online control and optimization of stochastic fluid models. IEEE Transactions on Automatic Control, 2002, 47(8): 1234-1248 CrossRef Google scholar

[20]	Liu Y, Gong W B. Perturbation analysis for stochastic fluid queueing systems. Discrete Event Dynamic Systems: Theory and Applications, 2002, 12(4): 391-416 CrossRef Google scholar

[21]	Wardi Y, Melamed B, Cassandras C G, Panayiotou C G. Online IPA gradient estimators in stochastic continuous fluid models. Journal of Optimization Theory and Applications, 2002, 115(2): 369-405 CrossRef Google scholar

[22]	Cao X R, Chen H F. Perturbation realization, potentials, and sensitivity analysis of Markov processes. IEEE Transactions on Automatic Control, 1997, 42(10): 1382-1393 CrossRef Google scholar

[23]	Cao X R, Yuan X M, Qiu L. A single sample path-based performance sensitivity formula for Markov chains. IEEE Transactions on Automatic Control, 1996, 41(12): 1814-1817 CrossRef Google scholar

[24]	Cao X R. The potential structure of sample paths and performance sensitivities of Markov systems. IEEE Transactions on Automatic Control, 2004, 49(12): 2129-2142 CrossRef Google scholar

[25]	Cao X R. A unified approach to Markov decision problems and performance sensitivity analysis. Automatica, 2000, 36(5): 771-774 CrossRef Google scholar

[26]	Cao X R. From perturbation analysis to Markov decision processes and reinforcement learning. Discrete Event Dynamic Systems: Theory and Applications, 2003, 13(1-2): 9-39 CrossRef Google scholar

[27]	Fang H T, Cao X R. Potential-based on-line policy iteration algorithms for Markov decision processes. IEEE Transactions on Automatic Control, 2004, 49(4): 493-505 CrossRef Google scholar

[28]	Marbach P, Tsitsiklis T N. Simulation-based optimization of Markov reward processes. IEEE Transactions on Automatic Control, 2001, 46(2): 191-209 CrossRef Google scholar

[29]	Zhang J Y, Cao X R. Continuous-time Markov decision processes with nth-bias optimality criteria. Automatica, 2009, 45(7): 1628-1638 CrossRef Google scholar

[30]	Cao X R, Zhang J Y. The nth-order bias optimality for multichain Markov decision processes. IEEE Transactions on Automatic Control, 2008, 53(2): 496-508 CrossRef Google scholar

[31]	Puterman M L. Markov Decision Processes: Discrete Stochastic Dynamic Programming. New York, NY: John Wiley & Sons, 1994

[32]	Veinott A F. Discrete dynamic programming with sensitive discount optimality criteria. The Annals of Mathematical Statistics, 1969, 40(5): 1635-1660 CrossRef Google scholar

[33]	Cao X R. Basic ideas for event-based optimization of Markov systems. Discrete Event Dynamic Systems: Theory and Applications, 2005, 15(2): 169-197 CrossRef Google scholar

[34]	Cao X R, Zhang J Y. Event-based optimization of Markov systems. IEEE Transactions on Automatic Control, 2008, 53(4): 1076-1082 CrossRef Google scholar

[35]	Xu Y K, Cao X R. Optimal control with Lebesgue sampling and time aggregation. IEEE Transactions on Automatic Control (to appear)

[36]	Xia L, Chen X, Cao X R. Policy iteration for customeraverage performance optimization of closed queueing systems. Automatica, 2009, 45(7): 1639-1648 CrossRef Google scholar

[37]	Xia L, Cao X R. Aggregation of perturbation realization factors and service rate-based policy iteration for queueing systems. In: Proceedings of the 45th IEEE Conference on Decision and Control. 2006, 1063-1068 CrossRef Google scholar

[38]	Xia L, Cao X R. Relationship between perturbation realization factors with queueing models and Markov models. IEEE Transactions on Automatic Control, 2006, 51(10): 1699-1704 CrossRef Google scholar

[39]	Cao X R. Stochastic control of continuous-time and continuous-state systems via direct comparison. In: Proceedings of the 48th IEEE Conference on Decision and Control. 2009, 1593-1598

[40]	Cao X R. A new model of continuous-time Markov processes and impulse stochastic control. In: Proceedings of the 48th IEEE Conference on Decision and Control. 2009, 525-530

[41]	Cao X R, Wang D X, Lu T, Xu Y F. Stochastic control via direct comparison. 2011 (submitted)

[42]	Oksendal B, Sulem A. Applied Stochastic Control of Jump Diffusions. 2nd ed. Berlin: Springer, 2007 CrossRef Google scholar

[43]	Ho Y C. What is experimental system engineering? ScienceNet, 2010. http://www.sciencenet.cn/m/user content. aspx?id=382212

[44]	Cao X R. Mathematics, analysis, and engineering. Guest article, Prof. Y. C. Ho’s blog at ScineceNet, 2008. http://www.sciencenet.cn/m/user content.aspx?id=38753

RIGHTS & PERMISSIONS

2014 Higher Education Press and Springer-Verlag Berlin Heidelberg

PDF(362 KB)

Accesses

Citations

Detail

Sections

Recommended

/

〈

〉