Implementations of TD3 and DDPG Reinforcement Learning Techniques for Tuning PID Controller of TRMS System

Sevilay Tufenkci, Baris Baykant Alagoz, Gurkan Kavuran, Celaleddin Yeroglu, Norbert Herencsar, Shibendu Mahata

Journal of Systems Science and Systems Engineering: 1–20. DOI: 10.1007/s11518-025-5693-5

Abstract

Reinforcement Learning (RL) is a learning method that utilizes interactions between agents and their environments, providing a valuable tool for controller design through simulations. However, traditional industrial systems such as PID control loops have yet to fully embrace the advantages of RL algorithms for effective controller tuning. This study presents an experimental initiative demonstrating the implementation of an RL-driven method for optimal PID controller tuning to address challenges in rotor control, focusing specifically on the Twin-Rotor Multi-Input Multi-Output System (TRMS). Rotor control is a complex problem involving aerodynamics and external disturbances. The research implements two RL algorithms, the Deep Deterministic Policy Gradient (DDPG) and the Twin Delayed Deep Deterministic Policy Gradient (TD3), in a tailored simulation environment to train RL agents to achieve optimal PID control dynamics. The simulation and experimental results indicate that RL algorithms can be used for PID controller tuning when the simulation environment in which the agents are trained adequately represents the dominant dynamics and control complications of the real-world system; under this condition, the simulation and experimental results are in good agreement.
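
To make the tuning loop concrete, the following sketch (not the authors' implementation; the second-order surrogate plant, reward weights, gain bounds, and names such as PIDTuningEnv are illustrative assumptions) casts PID gain tuning as an RL problem in the way the abstract describes: the agent's action is a candidate (Kp, Ki, Kd) triple, each episode simulates one closed-loop step response, and the reward penalizes tracking error and overshoot. It uses the gymnasium and stable-baselines3 libraries; swapping TD3 for DDPG in the last lines exercises the paper's other algorithm.

    import numpy as np
    import gymnasium as gym
    from gymnasium import spaces
    from stable_baselines3 import TD3  # swap in DDPG to compare the two agents
    from stable_baselines3.common.noise import NormalActionNoise

    class PIDTuningEnv(gym.Env):
        """One episode = one step-response trial of a candidate PID gain set."""

        def __init__(self, dt=0.01, horizon=500):
            super().__init__()
            self.dt, self.horizon = dt, horizon
            # Action: a candidate (Kp, Ki, Kd) triple, bounded to keep trials safe.
            self.action_space = spaces.Box(low=0.0, high=10.0, shape=(3,), dtype=np.float32)
            # Observation: summary of the previous trial (IAE, overshoot).
            self.observation_space = spaces.Box(low=0.0, high=np.inf, shape=(2,), dtype=np.float32)
            self._obs = np.zeros(2, dtype=np.float32)

        def _step_response(self, kp, ki, kd):
            # Hypothetical underdamped second-order plant (wn = 2 rad/s, zeta = 0.2),
            # Euler-integrated; a stand-in for one rotor channel, NOT the TRMS model.
            y = y_dot = integ = prev_err = 0.0
            iae = peak = 0.0
            for _ in range(self.horizon):
                err = 1.0 - y                                # unit-step reference
                integ += err * self.dt
                deriv = (err - prev_err) / self.dt
                prev_err = err
                u = kp * err + ki * integ + kd * deriv       # PID control law
                y_ddot = -0.8 * y_dot - 4.0 * y + 4.0 * u    # y'' = -2*zeta*wn*y' - wn^2*y + wn^2*u
                y_dot += y_ddot * self.dt
                y += y_dot * self.dt
                iae += abs(err) * self.dt
                peak = max(peak, y)
            return iae, max(0.0, peak - 1.0)                 # (integral abs error, overshoot)

        def reset(self, seed=None, options=None):
            super().reset(seed=seed)
            return self._obs, {}

        def step(self, action):
            kp, ki, kd = (float(a) for a in action)
            iae, overshoot = self._step_response(kp, ki, kd)
            reward = -float(np.clip(iae + 5.0 * overshoot, 0.0, 1e3))  # penalize error, overshoot
            self._obs = np.array([min(iae, 1e3), min(overshoot, 1e3)], dtype=np.float32)
            return self._obs, reward, True, False, {}        # one gain trial per episode

    env = PIDTuningEnv()
    noise = NormalActionNoise(mean=np.zeros(3), sigma=0.5 * np.ones(3))  # exploration noise
    agent = TD3("MlpPolicy", env, action_noise=noise, verbose=0)
    agent.learn(total_timesteps=3000)
    gains, _ = agent.predict(env._obs, deterministic=True)
    print("Learned (Kp, Ki, Kd):", gains)

In the study's setting the surrogate plant would be replaced by a TRMS model; as the abstract notes, simulation-to-hardware agreement hinges on that training environment capturing the dominant dynamics of the real rig.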

Keywords

Deep reinforcement learning / Direct Current (DC) motor / PID controller / Deep Deterministic Policy Gradient (DDPG) / Twin-rotor multi-input multi-output system

Cite this article

Sevilay Tufenkci, Baris Baykant Alagoz, Gurkan Kavuran, Celaleddin Yeroglu, Norbert Herencsar, Shibendu Mahata. Implementations of TD3 and DDPG Reinforcement Learning Techniques for Tuning PID Controller of TRMS System. Journal of Systems Science and Systems Engineering: 1–20. DOI: 10.1007/s11518-025-5693-5

RIGHTS & PERMISSIONS

© Systems Engineering Society of China and Springer-Verlag GmbH Germany
