Proximal policy optimization with an integral compensator for quadrotor control

Huan HU, Qing-ling WANG

Front. Inform. Technol. Electron. Eng., 2020, 21(5): 777-795. DOI: 10.1631/FITEE.1900641

Original Article



Abstract

We use the proximal policy optimization (PPO) reinforcement learning algorithm to optimize a stochastic control strategy for speed control of a "model-free" quadrotor. The vehicle is controlled by four learned neural networks, which directly map the system states to control commands in an end-to-end style. By introducing an integral compensator into the actor-critic framework, speed-tracking accuracy and robustness are greatly enhanced. In addition, a two-phase learning scheme that combines offline and online learning is developed for practical use. A model with strong generalization ability is learned in the offline phase, and the flight policy is then continuously optimized in the online phase. Finally, the performance of the proposed algorithm is compared with that of the traditional PID algorithm.
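The integral-compensator idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the class name, gain, and anti-windup limit below are illustrative assumptions. The point shown is that the actor-critic observation is augmented with the accumulated speed-tracking error, so the learned policy can cancel steady-state error much as the I term does in a PID controller.

```python
import numpy as np

class IntegralCompensator:
    """Augments a policy's observation with the integral of the
    velocity-tracking error (hypothetical names and gains)."""

    def __init__(self, dim, gain=0.01, limit=1.0):
        self.integral = np.zeros(dim)  # accumulated tracking error
        self.gain = gain               # integration step scale (assumed)
        self.limit = limit             # anti-windup clamp (assumed)

    def reset(self):
        self.integral[:] = 0.0

    def augment(self, velocity, target_velocity):
        error = target_velocity - velocity
        # Accumulate the error, clamped to avoid integral windup.
        self.integral = np.clip(self.integral + self.gain * error,
                                -self.limit, self.limit)
        # Observation fed to the actor/critic: raw error plus its integral.
        return np.concatenate([error, self.integral])

comp = IntegralCompensator(dim=3)
obs = comp.augment(np.zeros(3), np.array([1.0, 0.0, 0.0]))
print(obs.shape)  # error (3 dims) concatenated with its integral (3 dims)
```

At each control step the augmented observation, rather than the raw state alone, would be passed to the policy network; the network can then learn to drive the integral term toward zero, which removes steady-state tracking error.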

Keywords

Reinforcement learning / Proximal policy optimization / Quadrotor control / Neural network

Cite this article

Huan HU, Qing-ling WANG. Proximal policy optimization with an integral compensator for quadrotor control. Front. Inform. Technol. Electron. Eng., 2020, 21(5): 777-795. DOI: 10.1631/FITEE.1900641



RIGHTS & PERMISSIONS

Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature


Supplementary files

FITEE-0777-20010-HH_suppl_1

FITEE-0777-20010-HH_suppl_2
