Gradient Convergence of Deep Learning-Based Numerical Methods for BSDEs

Zixuan Wang, Shanjian Tang

Chinese Annals of Mathematics, Series B, 2021, 42(2): 199-216. DOI: 10.1007/s11401-021-0253-x

Abstract

The authors prove the gradient convergence of the deep learning-based numerical method for high-dimensional parabolic partial differential equations and backward stochastic differential equations, which is based on the time discretization of stochastic differential equations (SDEs for short) and the stochastic approximation method for nonconvex stochastic programming problems. They take the stochastic gradient descent method, a quadratic loss function, and the sigmoid activation function in the setting of the neural network. Combining classical techniques of randomized stochastic gradients, the Euler scheme for SDEs, and the convergence of neural networks, they obtain an $O(K^{-\frac{1}{4}})$ rate of gradient convergence, with $K$ being the total number of iterative steps.
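In Ghadimi-Lan-type analyses of nonconvex stochastic gradient methods, which the abstract invokes, a rate of this form typically arises by bounding the expected squared gradient norm at a randomly selected iterate and then taking square roots; the notation below ($\ell$ for the quadratic loss, $\theta_k$ for the parameters at step $k$, and $R$ an index drawn from $\{1,\dots,K\}$) is ours, not the paper's:
$$\mathbb{E}\big[\|\nabla \ell(\theta_R)\|^2\big] = O\big(K^{-\frac{1}{2}}\big) \quad\Longrightarrow\quad \mathbb{E}\big[\|\nabla \ell(\theta_R)\|\big] \le \sqrt{\mathbb{E}\big[\|\nabla \ell(\theta_R)\|^2\big]} = O\big(K^{-\frac{1}{4}}\big),$$
where the implication uses Jensen's inequality.

For concreteness, the following is a minimal sketch (not the authors' code) of the deep BSDE scheme the paper analyzes: simulate the forward SDE by an Euler scheme, parameterize $Z_{t_n}$ at each time step by a small sigmoid network, and run stochastic gradient descent on the quadratic terminal loss $\mathbb{E}\,|Y_T - g(X_T)|^2$. All concrete choices below (PyTorch, coefficients, dimensions, learning rate) are illustrative assumptions.

```python
# Minimal deep BSDE sketch (illustrative assumptions throughout, not the
# authors' implementation): Euler scheme for the forward SDE, one small
# sigmoid network per time step for Z_{t_n}, and plain SGD on the
# quadratic terminal loss E|Y_T - g(X_T)|^2.
import torch

d, N, T = 10, 20, 1.0                    # state dimension, time steps, horizon (assumed)
dt = T / N
K, batch = 2000, 64                      # SGD iterations and batch size (assumed)

def b(x): return torch.zeros_like(x)     # forward drift (assumed zero)
def f(y, z): return -y                   # BSDE driver (assumed linear)
def g(x): return x.pow(2).sum(dim=1, keepdim=True)  # terminal condition (assumed)

# One sigmoid network per time step approximating Z_{t_n} = z_n(X_{t_n}).
nets = torch.nn.ModuleList(
    torch.nn.Sequential(torch.nn.Linear(d, d + 10), torch.nn.Sigmoid(),
                        torch.nn.Linear(d + 10, d))
    for _ in range(N))
y0 = torch.nn.Parameter(torch.zeros(1))  # unknown initial value Y_0, learned jointly
opt = torch.optim.SGD(list(nets.parameters()) + [y0], lr=1e-2)

for k in range(K):
    x = torch.zeros(batch, d)            # X_0 = 0 (assumed)
    y = y0.expand(batch, 1)
    for n in range(N):
        dw = torch.randn(batch, d) * dt ** 0.5  # Brownian increments on [t_n, t_{n+1}]
        z = nets[n](x)
        # Euler steps: backward equation for Y, then forward SDE for X
        y = y - f(y, z) * dt + (z * dw).sum(dim=1, keepdim=True)
        x = x + b(x) * dt + dw           # unit diffusion coefficient (assumed)
    loss = (y - g(x)).pow(2).mean()      # quadratic terminal loss
    opt.zero_grad(); loss.backward(); opt.step()
```

In the paper's setting, it is the gradient of this loss with respect to the network parameters that converges at the $O(K^{-\frac{1}{4}})$ rate; the sketch uses plain SGD (rather than an adaptive optimizer) to match the method analyzed in the abstract.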

Keywords

PDEs / BSDEs / Deep learning / Nonconvex stochastic programming / Convergence result

Cite this article

Zixuan Wang, Shanjian Tang. Gradient Convergence of Deep Learning-Based Numerical Methods for BSDEs. Chinese Annals of Mathematics, Series B, 2021, 42(2): 199-216. DOI: 10.1007/s11401-021-0253-x


