Vol. 18 No. 2 (2022)

Published: December 31, 2022

Pages: 117-126

Original Article

Learning the Quadruped Robot by Reinforcement Learning (RL)

Abstract

In this paper, a simulation is used to design and test the proposed controller and to investigate the capabilities of a quadruped robot built with the SimScape Multibody toolbox, using PID controllers and the deep deterministic policy gradient (DDPG) reinforcement learning (RL) technique. The quadruped robot is simulated in three scenarios based on two methods for controlling its movement, namely PID and DDPG. Instead of two links per leg, the robot is constructed with three links per leg to maximize movement versatility. The robot architecture uses twelve servomotors, three per leg, with one PID controller per servomotor (twelve in total). By using the SimScape Multibody toolbox, the quadruped robot can be built without deriving its mathematical model. The robustness of the developed controller is investigated by varying the load carried by the walking robot. First, the walking robot is designed as an open-loop system; the results show that the robot falls at the start of the simulation. Second, auto-tuning is used to find the optimal PID parameters (KP, KI, and KD), and the results show that the robot can walk in a straight line. Finally, DDPG reinforcement learning is proposed to generate and improve the walking motion of the quadruped robot; the results show that the robot's walking behaviour improves compared with the previous cases, and that employing RL instead of PID controllers yields better performance.
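The per-servomotor control scheme described above can be sketched as a discrete-time PID loop. The following is a minimal Python illustration, not the paper's Simulink implementation; the gain values and the first-order test plant are placeholder assumptions, not the auto-tuned values from the study.

```python
class PID:
    """Minimal discrete-time PID controller (one instance per servomotor).

    The paper uses 12 such controllers: 3 joints per leg x 4 legs.
    Gains kp, ki, kd correspond to the KP, KI, KD parameters found
    by auto-tuning in the paper; the values used here are illustrative.
    """

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0       # running sum of error * dt
        self.prev_error = 0.0     # error from the previous sample

    def step(self, setpoint, measurement):
        """Return the control signal for one sample period."""
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)


# One controller per servomotor, as in the paper's architecture
# (gains here are placeholders, not the auto-tuned results).
controllers = [PID(kp=2.0, ki=0.5, kd=0.1, dt=0.01) for _ in range(12)]
```

In the DDPG case, the RL agent replaces these fixed-gain loops by learning a policy that maps robot states directly to joint actuation, which is what allows the walking behaviour to improve beyond the PID baseline.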
