A novel online gait optimization approach for biped robots with point-feet

Anjidani, Majid; Jahed Motlagh, M.R.; Fathy, M.; Nili Ahmadabadi, M.

doi:10.1051/cocv/2017034

ESAIM: Control, Optimisation and Calculus of Variations, Volume 25 (2019), article no. 81.

Following the journal's change of business model to S2O, the full text of articles from 2019 and 2020 is available only on the publisher's website and is restricted to subscribers.

Abstract

Designing a stable walking gait for biped robots with point-feet is usually stated as a constrained nonlinear optimization problem and solved offline by a numerical optimization method. If an unknown modeling error or a change in the environment then occurs, the designed gait may become ineffective, and it cannot be improved online once the optimization is finished. In this paper, we apply Generalized Path Integral Stochastic Optimal Control to the closed-loop model of planar biped robots with point-feet, which leads to an online Reinforcement Learning algorithm for designing the walking gait. We study the ability of the proposed method to adapt the controller of RABBIT, a planar biped robot with point-feet, for stable walking. The simulation results show that the method, starting from a dynamically unstable initial gait, quickly compensates for the modeling error and reaches, in a new situation, a walking gait with exponential stability and the desired features, which was not possible with previous methods.
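To give a rough feel for the kind of update that path-integral methods rely on, the sketch below shows a simplified, episodic variant: parameter perturbations are sampled, each perturbed parameter vector is scored by a cost, and the perturbations are averaged with weights that decay exponentially in the cost. This is not the time-indexed algorithm of the paper, and it does not model RABBIT. The names `pi2_style_update`, `toy_cost`, and `optimize`, the quadratic cost, and the values of `sigma` and `h` are all assumptions chosen for illustration.

```python
import math
import random


def pi2_style_update(theta, cost_fn, rng, n_rollouts=20, sigma=0.3, h=10.0):
    """One episodic path-integral-style update of the parameter vector theta."""
    # Sample exploration noise and evaluate the cost of each perturbed rollout.
    eps = [[rng.gauss(0.0, sigma) for _ in theta] for _ in range(n_rollouts)]
    costs = [cost_fn([t + e for t, e in zip(theta, ek)]) for ek in eps]

    # Normalize costs to [0, 1] and exponentiate: low cost gives high weight.
    c_min, c_max = min(costs), max(costs)
    span = (c_max - c_min) or 1.0  # avoid division by zero if all costs match
    weights = [math.exp(-h * (c - c_min) / span) for c in costs]
    total = sum(weights)

    # The probability-weighted average of the noise is the parameter step.
    step = [sum(w * ek[i] for w, ek in zip(weights, eps)) / total
            for i in range(len(theta))]
    return [t + s for t, s in zip(theta, step)]


def toy_cost(theta):
    """Hypothetical stand-in for a gait cost: squared distance to a target."""
    target = [1.0, -2.0]
    return sum((t - g) ** 2 for t, g in zip(theta, target))


def optimize(n_updates=200, seed=0):
    """Repeat the update from a poor initial guess (illustrative only)."""
    rng = random.Random(seed)
    theta = [0.0, 0.0]
    for _ in range(n_updates):
        theta = pi2_style_update(theta, toy_cost, rng)
    return theta
```

Calling `optimize()` returns a parameter vector whose toy cost is lower than that of the initial guess `[0.0, 0.0]`. In a gait-design setting the cost would instead come from simulated or measured walking rollouts, which is outside the scope of this sketch.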

Classification: 49J15, 93E35, 68T40
Keywords: Legged locomotion, gait optimization, orbital stability

Author affiliations:

Anjidani, Majid ¹ ; Jahed Motlagh, M.R. ¹ ; Fathy, M. ¹ ; Nili Ahmadabadi, M. ¹

@article{COCV_2019__25__A81_0,
     author = {Anjidani, Majid and Jahed Motlagh, M.R. and Fathy, M. and Nili Ahmadabadi, M.},
     title = {A novel online gait optimization approach for biped robots with point-feet},
     journal = {ESAIM: Control, Optimisation and Calculus of Variations},
     publisher = {EDP-Sciences},
     volume = {25},
     year = {2019},
     doi = {10.1051/cocv/2017034},
     zbl = {1437.49002},
     mrnumber = {4043860},
     language = {en},
     url = {https://www.numdam.org/articles/10.1051/cocv/2017034/}
}

TY  - JOUR
AU  - Anjidani, Majid
AU  - Jahed Motlagh, M.R.
AU  - Fathy, M.
AU  - Nili Ahmadabadi, M.
TI  - A novel online gait optimization approach for biped robots with point-feet
JO  - ESAIM: Control, Optimisation and Calculus of Variations
PY  - 2019
VL  - 25
PB  - EDP-Sciences
UR  - https://www.numdam.org/articles/10.1051/cocv/2017034/
DO  - 10.1051/cocv/2017034
LA  - en
ID  - COCV_2019__25__A81_0
ER  -

%0 Journal Article
%A Anjidani, Majid
%A Jahed Motlagh, M.R.
%A Fathy, M.
%A Nili Ahmadabadi, M.
%T A novel online gait optimization approach for biped robots with point-feet
%J ESAIM: Control, Optimisation and Calculus of Variations
%D 2019
%V 25
%I EDP-Sciences
%U https://www.numdam.org/articles/10.1051/cocv/2017034/
%R 10.1051/cocv/2017034
%G en
%F COCV_2019__25__A81_0

Anjidani, Majid; Jahed Motlagh, M.R.; Fathy, M.; Nili Ahmadabadi, M. A novel online gait optimization approach for biped robots with point-feet. ESAIM: Control, Optimisation and Calculus of Variations, Volume 25 (2019), article no. 81. doi: 10.1051/cocv/2017034. https://www.numdam.org/articles/10.1051/cocv/2017034/
