An L2 Analysis of Reinforcement Learning in High Dimensions with Kernel and Neural Network Approximation

Jihao Long, Jiequn Han, Weinan E

CSIAM Trans. Appl. Math., 2022, Vol. 3, Issue 2: 191-220. DOI: 10.4208/csiam-am.SO-2021-0026

Abstract

Reinforcement learning (RL) algorithms based on high-dimensional function approximation have achieved tremendous empirical success in large-scale problems with an enormous number of states. However, most analysis of such algorithms gives rise to error bounds that involve either the number of states or the number of features. This paper considers the situation where the function approximation is made either using the kernel method or the two-layer neural network model, in the context of a fitted Q-iteration algorithm with explicit regularization. We establish an $\tilde{O}\left(H^{3}|\mathcal{A}|^{\frac{1}{4}} n^{-\frac{1}{4}}\right)$ bound for the optimal policy with $Hn$ samples, where $H$ is the length of each episode and $|\mathcal{A}|$ is the size of the action space. Our analysis hinges on analyzing the $L^2$ error of the approximated Q-function using $n$ data points. Even though this result still requires a finite-sized action space, the error bound is independent of the dimensionality of the state space.
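To make the setting concrete, the following is a minimal sketch of fitted Q-iteration with kernel ridge regression, where the ridge penalty plays the role of the explicit RKHS-norm regularization described in the abstract. The RBF kernel, the regularization parameter `lam`, the data layout, and all function names are illustrative assumptions, not the authors' code.

```python
# Minimal sketch: fitted Q-iteration with kernel ridge regression.
# Assumptions (not from the paper's code): RBF kernel, ridge parameter lam,
# data[h] = (S, A, R, S_next) with n transitions per step, finite action set.
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    # Gaussian kernel matrix between rows of X and rows of Y.
    d2 = np.sum(X**2, axis=1)[:, None] + np.sum(Y**2, axis=1)[None, :] - 2 * X @ Y.T
    return np.exp(-gamma * d2)


def predict(Q_h, a, S_query):
    # Evaluate the fitted kernel regressor for action a at the query states.
    Sa, alpha = Q_h[a]
    return rbf_kernel(S_query, Sa) @ alpha


def fitted_q_iteration(data, H, actions, lam=1.0):
    """Backward pass over the horizon: for each step h and action a,
    fit Q_h(., a) by kernel ridge regression on Bellman targets."""
    Q = [dict() for _ in range(H)]
    for h in reversed(range(H)):
        S, A, R, S_next = data[h]
        if h + 1 < H:
            # Bellman targets: r + max_a' Q_{h+1}(s', a').
            next_vals = np.stack(
                [predict(Q[h + 1], a, S_next) for a in actions], axis=1
            )
            y = R + next_vals.max(axis=1)
        else:
            y = R  # terminal step: no continuation value
        for a in actions:
            # Assumes every action appears at least once in the data.
            idx = (A == a)
            Sa, ya = S[idx], y[idx]
            K = rbf_kernel(Sa, Sa)
            # Explicit regularization: solve (K + lam * n I) alpha = y.
            alpha = np.linalg.solve(K + lam * len(ya) * np.eye(len(ya)), ya)
            Q[h][a] = (Sa, alpha)
    return Q
```

Given the fitted regressors, the greedy policy at step h simply selects argmax over actions of `predict(Q[h], a, s)`; the paper's analysis bounds the performance of such a policy via the $L^2$ error of the regression at each step.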

Keywords

Reinforcement learning / function approximation / neural networks / reproducing kernel Hilbert space

Cite this article

Jihao Long, Jiequn Han, Weinan E. An L2 Analysis of Reinforcement Learning in High Dimensions with Kernel and Neural Network Approximation. CSIAM Trans. Appl. Math., 2022, 3(2): 191-220. DOI: 10.4208/csiam-am.SO-2021-0026
