Sim-to-Real Learning of Robust Compliant Bipedal Locomotion on Torque Sensor-Less Gear-Driven Humanoid
In deep reinforcement learning, sim-to-real transfer is the mainstream approach because training requires a large number of trials; however, transferring a trained policy is challenging because of the reality gap. In particular, actuator characteristics are known to have a considerable influence on the reality gap in legged robots, and this influence is also noticeable in high-reduction-ratio gears. We therefore propose a new simulation model of high-reduction-ratio gears to reduce the reality gap. The instability of bipedal locomotion causes sim-to-real transfer to fail catastrophically, which makes system identification of the simulation's physical parameters difficult. We thus also propose a system identification method that utilizes the failure experience. The realistic simulation obtained through these improvements allows the robot to learn compliant bipedal locomotion by reinforcement learning. The effectiveness of the method is verified on an actual biped robot, ROBOTIS-OP3: the sim-to-real transferred policy stabilized the robot under severe disturbances and walked on uneven terrain without force or torque sensors.