Reinforcement Learning from scratch

  • Added 29 June 2024
  • How does Reinforcement Learning work? A short cartoon that intuitively explains this amazing machine learning approach, and how it was used in AlphaGo and ChatGPT.
    Part 1 of 3.
    0:00 - intro
    0:13 - pong
    0:28 - the policy
    0:51 - policy as neural network
    1:32 - supervised learning
    2:51 - reinforcement learning using policy gradient
    4:24 - minimizing error using gradient descent
    4:45 - probabilistic policy
    5:01 - pong from pixels
    6:58 - visualizing learned weights
    8:18 - pointer to Karpathy "pong from pixels" blogpost
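
The policy, probabilistic policy, and policy gradient steps in the outline above can be illustrated with a toy sketch. This is not the code from the video or from the Karpathy blog post. It assumes a one-step, Pong-like task with a single hand-made input feature instead of pixels, and a two-parameter logistic policy instead of a neural network. The names `train`, `sigmoid`, `lr`, and the reward values are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    """Squash a real number into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train(episodes=5000, lr=0.1, seed=0):
    """Train a tiny probabilistic policy with a REINFORCE-style update.

    Assumed toy task: x = +1 means the ball is above the paddle,
    x = -1 means it is below. Moving toward the ball earns +1, else -1.
    """
    rng = random.Random(seed)
    w, b = 0.0, 0.0  # policy parameters (a stand-in for network weights)
    for _ in range(episodes):
        x = rng.choice([-1.0, 1.0])
        p_up = sigmoid(w * x + b)                 # policy: P(action = UP | x)
        action = 1 if rng.random() < p_up else 0  # sample instead of argmax
        reward = 1.0 if (action == 1) == (x > 0) else -1.0
        # For a logistic policy, d/dz log P(action | x) = action - p_up.
        grad_logp = action - p_up
        # Step the parameters along reward * gradient of log-probability:
        # rewarded actions become more likely, punished ones less likely.
        w += lr * reward * grad_logp * x
        b += lr * reward * grad_logp
    return w, b


w, b = train()
p_up_when_ball_above = sigmoid(w * 1.0 + b)
p_up_when_ball_below = sigmoid(w * -1.0 + b)
print(f"P(up | ball above) = {p_up_when_ball_above:.3f}")
print(f"P(up | ball below) = {p_up_when_ball_below:.3f}")
```

In this sketch the learning rate `lr` sets how far the weights move on each update, and the reward only decides the direction of the step. A pixel-based agent would replace the single feature `x` with an image and the two parameters with a network, but the update rule has the same shape.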

Comments • 43

  • @darthvader4899 3 months ago +16

    This video is super underrated. In fact the whole channel is underrated.

  • @themathguy3149 8 months ago +6

    Your Channel IS SO GREAT, I share with all my eng friends for you to get more visibility!

  • @tushargupta1999 3 months ago +2

    This video is amazing. You explained everything in such a simple manner. I am feeling really motivated to learn more about reinforcement learning and neural networks after watching this.

  • @ashketchum1244 10 months ago +4

    I don't know how I stumbled upon this video but that was very interesting and intuitive to understand. Thank you.

  • @metaljacket8102 2 months ago +2

    This is really awesome! It's the best video that explains DRL in such an easy-to-understand way!

  • @a.aspden 9 months ago +2

    Your videos are great. Looking forward to more!

  • @marcinstrzesak346 9 months ago +1

    Great video, very helpful, easy to understand.

  • @themax2go 3 months ago +4

    agi: 1. ai develops understanding of win-loss conditions and sets policy params (inputs & actions) accordingly. 2. ai creates (= designs & builds) training env(s). 3. ai iterates, evals & adjusts policy parameters accordingly. 4. done (or validation run(s) w/ human(s))

  • @gmjammin4367 10 months ago +1

    Amazing video as always :)!

  • @codybarton2090 24 days ago +1

    I agree once you see how it all works it seems like 1s and zeros give me some feed back on r/grand unified theory or cosmo knowledge

  • @cloudysh 2 months ago +1

    This was so surprisingly great :3

  • @CptDoge-rn3ou 8 months ago +1

    I really like the way you visualize what you are talking about. Thank you for putting in the effort!

  • @moldo800 5 months ago +1

    Excellent. Congratulations ❤

  • @swannschilling474 3 days ago

    Thanks a lot for this one! 😊

  • @luiseduardocraizer7416 1 month ago +1

    Excellent content!

  • @jameslibby5215 9 months ago +6

    Very very underrated channel

  • @mado.madeleine 10 months ago +1

    Super helpful! Thank you 🙏🏽

  • @nikbivation 10 months ago +1

    thank you for this!

  • @mohajeramir 2 months ago +2

    Excellent

  • @jdlopes06 18 hours ago

    Thank you!

  • @ireoluwaTH 10 months ago +1

    Thank you!!!

  • @solveigberling1662 3 months ago +1

    That was dope

  • @kniv0gaffel 8 months ago +1

    Brilliant

  • @BlueBirdgg 10 months ago +1

    Can you make a playlist for each of your topics plz?
    I wanted to post your video topics on Twitter (X) but could only post a single video at a time.
    Great content by the way. Ty very much.
    Your perspective on some topics helped me a lot to get a more intuitive understanding.

    • @g5min 10 months ago

      Good idea! Here's one on generative AI:
      czcams.com/play/PLWfDJ5nla8UoR8P7AGqVw7ZPjXajUFLMo.html
      Here's one on reinforcement learning
      czcams.com/play/PLWfDJ5nla8UoexEaLqVMw7q3Ft0vRYscL.html
      Here's one on LLMs + text-to-image
      czcams.com/play/PLWfDJ5nla8UoG2mvvHs_OS0asAKC5HJeu.html

    • @BlueBirdgg 10 months ago

      @g5min Ty!

  • @edvinbeqari7551 5 months ago

    What is your reward function for the pong game? I did a similar pong game and I couldn't get it to learn.

  • @maxim_ml 1 month ago +1

    that was good

  • @axe863 7 months ago +2

    Simple Reinforcement learning is extremely dangerous in certain nonstationary environments 😅

  • @mineq4967 3 months ago

    But by what number do you change the weights? You never told us.

  • @bombur9007 2 months ago

    How many layers should such a network have?

  • @mind6861 17 days ago +1

    Can we have the code for this?

  • @nischalyou 10 months ago

    What's the name of this video game?

  • @gaydemaupassant6263 16 days ago

    Pls I want the code plsss

  • @herikaniugu 8 months ago

    Imagine using reinforcement learning in quantitative finance 😊

  • @FRANKONATOR123 10 months ago

    Can you share the source code for this project?

    • @g5min 10 months ago

      You can follow the link to the Karpathy site at the end of the video, repeated here:
      karpathy.github.io/2016/05/31/rl/

  • @macratak 10 months ago

    ah yes, reinforcement learning. a fundamental computer graphics technology

    • @g5min 10 months ago +5

      I think that character/game-AI is pretty central to graphics

    • @pw7225 10 months ago +1

      Why so negative?

    • @revimfadli4666 10 months ago

      @g5min especially AI image generation or processing nowadays