Not Only Rewards But Also Constraints: Applications on Legged Robot Locomotion

  • Published 30 Aug 2023
  • arxiv.org/abs/2308.12517

Comments • 9

  • @TextZip • 8 months ago • +2

    Hi, the video and the paper are really impressive. What simulation platform did you use, and will the code be made public?

    • @railabkaist9016 • 8 months ago • +1

      You can download the simulator at raisim.com/
      The code will be made public after the paper is accepted.
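
      For readers new to RaiSim, a minimal simulation loop in Python looks roughly like the sketch below. It follows the raisimPy examples shipped with raisimLib; the URDF path is a placeholder, and exact method names can differ between versions.

        # Minimal RaiSim (raisimPy) loop -- a sketch modeled on the
        # raisimPy examples; verify names against your installed version.
        import raisimpy as raisim

        # Note: RaiSim requires an activation key; see raisim.com.
        world = raisim.World()               # empty physics world
        world.setTimeStep(0.001)             # 1 kHz simulation step
        world.addGround()                    # flat terrain

        # Placeholder path -- point this at your robot's URDF.
        robot = world.addArticulatedSystem("/path/to/robot.urdf")

        server = raisim.RaisimServer(world)  # bridge to the visualizer
        server.launchServer(8080)

        for _ in range(2000):
            world.integrate()                # advance physics one step

        server.killServer()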

  • @user-le3uf8hw5d • 9 months ago

    Hello, I hope you're well. My question isn't directly related to the paper, but may I ask it here? I'm wondering how significant the reality gap is. Can you consistently expect every learned policy to transfer seamlessly to the actual robot? Would the behavior of the simulation and the real-world robot be nearly identical? Lastly, if one were to forgo the teacher-student structure, would there be a noticeable decrease in performance?

    • @user-pg6ym1th9 • 8 months ago • +1

      Thank you for your interest in our work. The main claim of this work is that both rewards and constraints should be used when designing learning-based controllers for complex robotic systems. We used the teacher-student learning framework, but the approach is not limited to it; other methods such as vanilla learning [1] or asymmetric learning [2] can also be used. In our experience, the sim-to-real gap depends far more on the characteristics of the robotic system you are working with (e.g., actuator mechanism, system software latency, actuation bandwidth) than on the learning algorithm itself. Based on those characteristics, you should select appropriate methods to close the sim-to-real gap (e.g., domain randomization, domain adaptation, actuator networks [1]). In our case (Raibo, Mini-cheetah), domain randomization was enough.
      [1] Hwangbo, Jemin, et al. "Learning agile and dynamic motor skills for legged robots." Science Robotics 4.26 (2019): eaau5872.
      [2] Nahrendra, I. Made Aswin, Byeongho Yu, and Hyun Myung. "Dreamwaq: Learning robust quadrupedal locomotion with implicit terrain imagination via deep reinforcement learning." 2023 IEEE International Conference on Robotics and Automation (ICRA). IEEE, 2023.
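
      The reward-plus-constraint recipe described above can be realized in several ways; one common pattern, shown here as a generic sketch rather than the paper's exact algorithm, is a Lagrangian relaxation, where each constraint gets a multiplier that grows while the measured cost exceeds its limit and shrinks otherwise. The names cost_limit and multiplier_lr are illustrative assumptions.

        # Generic Lagrangian-relaxation sketch for constrained RL.
        # Not the paper's algorithm; names and values are illustrative.

        class LagrangeMultiplier:
            """One multiplier per constraint. Dual gradient ascent:
            the multiplier rises when the measured expected cost
            exceeds its limit and decays back toward zero otherwise."""
            def __init__(self, cost_limit, multiplier_lr=0.05):
                self.cost_limit = cost_limit
                self.lr = multiplier_lr
                self.value = 0.0

            def update(self, mean_episode_cost):
                violation = mean_episode_cost - self.cost_limit
                self.value = max(0.0, self.value + self.lr * violation)

        def penalized_advantage(reward_adv, cost_advs, multipliers):
            """Advantage fed to a standard policy-gradient update
            (e.g., PPO): reward advantage minus multiplier-weighted
            constraint-cost advantages."""
            penalty = sum(m.value * a for m, a in zip(multipliers, cost_advs))
            return reward_adv - penalty

      In practice the multipliers are updated once per training iteration from on-policy rollouts, so constraints that are already satisfied stop influencing the policy update.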

    • @user-le3uf8hw5d • 8 months ago • +1

      @@user-pg6ym1th9 Thanks for the detailed discussion! I'm working on humanoid RL and was wondering whether there's a way to further reduce the reality gap. Thanks for your kind explanation. :)
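
      On reducing the reality gap: the domain randomization mentioned above typically resamples physical parameters at every episode reset so the policy cannot overfit a single simulated dynamics. A minimal sketch follows; the parameter names, ranges, and the env.set_dynamics hook are illustrative assumptions, not taken from the paper.

        # Domain randomization sketch -- resample dynamics each reset.
        # Ranges and the env.set_dynamics hook are hypothetical.
        import numpy as np

        rng = np.random.default_rng()

        def sample_dynamics():
            return {
                "ground_friction": rng.uniform(0.4, 1.2),
                "link_mass_scale": rng.uniform(0.9, 1.1),   # +/-10% mass error
                "motor_latency_s": rng.uniform(0.0, 0.02),  # up to 20 ms delay
                "p_gain_scale":    rng.uniform(0.9, 1.1),   # actuator mismatch
            }

        def reset_episode(env):
            # Adapt to your simulator's API for setting parameters.
            env.set_dynamics(sample_dynamics())
            return env.reset()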

  • @snuffybox • 9 months ago • +1

    You should link the paper

    • @NowayJose14 • 8 months ago

      It's in the description, chief

    • @snuffybox • 8 months ago

      @@NowayJose14 It wasn't when I commented.

  • @marshallmcluhan33 • 9 months ago

    Oh thank god they don't just care about collecting more meaningless tokens.