AI vs. AI in 100m Dash (deep reinforcement learning)

  • Added on 21 October 2023
  • AI Competes in a 100m Dash!
    In this video 5 AI Warehouse agents compete to learn how to run 100m the fastest. The agents were trained using Deep Reinforcement Learning, a method of Machine Learning which involves rewarding the agent for doing something correctly and punishing it for doing anything incorrectly. Each agent's actions are controlled by a Neural Network that's updated after each attempt in order to try to give the agents more rewards and fewer punishments over time. Check the pinned comment for more information on how the AI was trained!
    Current Subscribers: 264,870
  • Entertainment

Comments • 3.3K

  • @aiwarehouse
    @aiwarehouse 6 months ago +4262

    It took me months to make this video and it took my computer over 3 days straight to train/record the agents, I hope you enjoy it:D
    After teaching Albert to walk in the previous video, I read a lot of comments asking about what would happen if I used a more human way of punishing and rewarding Albert, so that’s what this video is about! Each agent starts off the same, the only difference being the design of their body. They’re each rewarded for moving forward and punished based on the efficiency of their movements (based on a muscle fatigue system), so by the end of the video they each should discover a movement that works efficiently for the body they were given.
    NOTE: Don’t worry, Albert is coming back in the next video, he’s hard at work right now improving his walk:)
    If you're interested in training your own AI like Albert but don't know how, there's now a really easy way to do it! Luda, an AI lab, recently built a web app that allows you to create and train your own AI using deep reinforcement learning (just like Albert) completely for free in your browser! You build your own character (called a Mel) with lego-like building blocks then watch it train in real-time on their website in just a few minutes (really). It's an awesome project, and just like my videos, makes deep reinforcement learning so much more accessible, which is why I love it so much. This section of the comment is sponsored by Luda, but these words are entirely my own, it's an amazing project that I would have been obsessed with had they released it before I built Albert. I've genuinely been looking for a sandbox/game exactly like this since I was a kid. They're still early, but they're giving my audience first access to their closed, pre-alpha build. Make sure you check out their site and create an AI agent for yourself!:D prealpha.mels.ai
    Now, back to our agents,
    If you want to learn more about how the agents actually work, you can read the rest of this very long comment I wrote explaining exactly how I trained them! (and please let the video play in the background while reading so YouTube will show the project to more people)
    THE BASICS
    Although it seems like there are only 5 agents training here, there are actually 40 copies of the video being simulated simultaneously behind the camera in order to speed up the training, so although the video makes it seem as though there are 1638 attempts, there are actually around 65k.
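    The attempt count above is simple multiplication. A minimal sketch, assuming every off-camera copy runs the same number of attempts as the one shown (the constant and function names are illustrative, not from the project):

```python
# Figures quoted in the pinned comment.
ATTEMPTS_SHOWN = 1638   # attempt counter visible in the video
PARALLEL_COPIES = 40    # simulation copies running off-camera
AGENTS_PER_COPY = 5     # one agent per body design


def total_attempts(shown: int, copies: int) -> int:
    """Attempts across all copies, assuming each copy matches the on-screen count."""
    return shown * copies


total = total_attempts(ATTEMPTS_SHOWN, PARALLEL_COPIES)
agents = PARALLEL_COPIES * AGENTS_PER_COPY

print(total)   # 65520, i.e. "around 65k"
print(agents)  # 200 agents training at once
```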
    Each agent is controlled entirely by an artificial brain called a neural network. Their brains have 5 layers: the first layer consists of the inputs (the information they’re given before taking action, like their limb positions and velocities), the last layer tells them what actions to take, and the middle 3 layers, called hidden layers, are where the calculations are performed to convert the inputs into actions.
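    To make the 5-layer structure concrete, here is a rough sketch of a forward pass through one input layer, three hidden layers and one output layer. The layer sizes, the tanh activation and the random weights are all assumptions for illustration; the comment does not state them.

```python
import math
import random


def make_layer(n_in, n_out, rng):
    """Small random weights and zero biases for one fully connected layer."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(layers, inputs):
    """Pass the inputs through every layer, applying tanh after each one."""
    activations = inputs
    for weights, biases in layers:
        activations = [
            math.tanh(sum(w * a for w, a in zip(row, activations)) + b)
            for row, b in zip(weights, biases)
        ]
    return activations


# Hypothetical sizes: 24 observations, three hidden layers of 64 units, 8 joint actions.
SIZES = [24, 64, 64, 64, 8]
rng = random.Random(0)
# Five layers of neurons are connected by four weight matrices.
layers = [make_layer(n_in, n_out, rng) for n_in, n_out in zip(SIZES, SIZES[1:])]

observation = [0.0] * SIZES[0]          # stand-in for limb positions and velocities
actions = forward(layers, observation)  # one value in [-1, 1] per controlled joint
```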
    Each agent is given quite a lot of information about its body, they’re given everything that Albert was given in the last video (which I explain in great depth in this pinned comment czcams.com/video/L_4BPjLBF4E/video.htmlsi=HHv3vrmgIxUGo54f).
    Just like the last videos, the agents are trained using reinforcement learning. For each attempt an agent has, we calculate a score for how 'good' their attempt was and the training algorithm we used (PPO) makes small, calculated adjustments to that agent's brain to try to encourage the behaviors that led to a higher score and avoid those that led to a lower score. For this video there are 6 different ways each agent is rewarded/punished, and I tried to make these reflect our normal movements as much as possible.
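    For readers curious what "small, calculated adjustments" means in PPO: the algorithm maximizes a clipped surrogate objective that limits how far one update can move the policy. The sketch below shows that objective for a single action; the epsilon of 0.2 is a commonly used default and an assumption here, since the comment does not give the hyperparameters used for this video.

```python
def clipped_surrogate(ratio: float, advantage: float, epsilon: float = 0.2) -> float:
    """PPO clipped objective for one sample.

    ratio     -- new policy probability of the action divided by the old one
    advantage -- how much better the action was than expected (can be negative)
    epsilon   -- clip range; 0.2 is an assumed, commonly used value
    """
    clipped_ratio = max(1.0 - epsilon, min(1.0 + epsilon, ratio))
    # Taking the minimum removes any incentive to push the ratio outside the clip range.
    return min(ratio * advantage, clipped_ratio * advantage)


# A good action whose probability already rose by 50% earns no extra credit past +20%.
good = clipped_surrogate(ratio=1.5, advantage=1.0)
# A bad action whose probability rose is penalized in full.
bad = clipped_surrogate(ratio=1.5, advantage=-1.0)
```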
    REWARD FUNCTION
    Movement: Each time the agent takes action we check to see how much closer the agent is to the target and we reward them proportional to that distance. If they move a lot closer to the target, they’re rewarded a lot, if they move away from the target, they’re punished.
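    A minimal sketch of a progress-based reward like the one described, assuming the distance to the target is measured at each decision; the `scale` factor is a hypothetical tuning constant:

```python
def movement_reward(prev_distance: float, curr_distance: float, scale: float = 1.0) -> float:
    """Positive when the agent got closer to the target, negative when it moved away."""
    return scale * (prev_distance - curr_distance)


closer = movement_reward(prev_distance=10.0, curr_distance=9.5)   # rewarded
farther = movement_reward(prev_distance=9.5, curr_distance=10.0)  # punished
```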
    Limb Fatigue: This is the heart of the reward function for this video, every time an agent takes an action on a limb, we punish it proportional to the strength of the movement and the current fatigue of the limb (so if the agent moves a limb that’s already really fatigued, the agent is punished severely), then we increase the fatigue level of the limb based on how strong the movement was, and with each frame we slightly lower the fatigue of each limb to simulate the limbs resting. This reward is meant to simulate muscle soreness and encourage the agents to find the movements that are most efficient for their body design, but also make for more interesting gaits, since without this punishment the agents would all likely opt for a safe shuffle and avoid taking large steps.
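    One way to read the fatigue rule is: penalty = strength × current fatigue, then fatigue rises with strength and decays a little every frame. The sketch below follows that reading. All constants and names are assumptions, and the real system may combine strength and fatigue differently.

```python
from dataclasses import dataclass

PENALTY_SCALE = 0.1        # hypothetical weight of the fatigue penalty
FATIGUE_GAIN = 0.5         # hypothetical fatigue added per unit of strength
RECOVERY_PER_FRAME = 0.01  # hypothetical fatigue removed each frame


@dataclass
class Limb:
    fatigue: float = 0.0


def act_on_limb(limb: Limb, strength: float) -> float:
    """Return the (non-positive) reward for this action, then raise the limb's fatigue."""
    effort = abs(strength)
    penalty = -PENALTY_SCALE * effort * limb.fatigue
    limb.fatigue += FATIGUE_GAIN * effort
    return penalty


def rest(limb: Limb) -> None:
    """Called once per frame: the limb recovers slightly, never below zero."""
    limb.fatigue = max(0.0, limb.fatigue - RECOVERY_PER_FRAME)


leg = Limb()
first = act_on_limb(leg, 1.0)   # fresh limb: no penalty yet
second = act_on_limb(leg, 1.0)  # same move on a tired limb: penalized
rest(leg)
```

    Under this reading a fresh limb can move for free, so strong movements only become costly when repeated without rest.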
    If you're still reading this, you're probably really smart and want to learn more about Albert, so make sure to join my discord server I just made where we can talk more about the details of Albert's AI! discord.gg/jM2WkNuBnG :)
    Limb Hit: I wanted to punish the agents for falling over, so any time a limb that isn’t a foot hits something it’s not supposed to hit (the ground, other agents, etc.), we slightly punish the agent, and we also slightly increase the fatigue on that limb.
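    A sketch of the limb-hit rule under stated assumptions: the caller already knows a forbidden contact happened and whether the limb is a foot, and both constants are made up for illustration.

```python
def limb_hit(fatigue: float, is_foot: bool,
             hit_penalty: float = 0.05, fatigue_bump: float = 0.1):
    """Return (reward, new_fatigue) after a limb touches something it should not.

    Feet are exempt; any other limb takes a small penalty and a small fatigue increase.
    """
    if is_foot:
        return 0.0, fatigue
    return -hit_penalty, fatigue + fatigue_bump


knee_reward, knee_fatigue = limb_hit(fatigue=0.2, is_foot=False)
foot_reward, foot_fatigue = limb_hit(fatigue=0.2, is_foot=True)
```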
    Abrupt movement: Each time the agent takes action we calculate the average velocity of their body and compare it to the average velocity of their body when they last took action, the greater the difference in these two values the more we punish the agent, since a great difference implies abrupt movement was made, something that generally is bad for our bodies. For anyone looking to make something similar to this, this reward is really important for smoothing out the final gait!
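    The smoothness penalty can be sketched as the distance between two mean body velocities, one per decision. This assumes "average velocity of their body" means the mean over all limb velocity vectors; the `scale` constant is hypothetical.

```python
import math


def mean_velocity(limb_velocities):
    """Average the 3D velocity vectors of all limbs into one body velocity."""
    n = len(limb_velocities)
    return tuple(sum(v[i] for v in limb_velocities) / n for i in range(3))


def abrupt_movement_penalty(prev_mean, curr_mean, scale: float = 0.01) -> float:
    """Larger change in mean velocity between decisions gives a larger penalty."""
    return -scale * math.dist(prev_mean, curr_mean)


previous = mean_velocity([(0.0, 0.0, 0.0), (0.0, 0.0, 0.0)])
current = mean_velocity([(2.0, 4.0, 0.0), (4.0, 4.0, 0.0)])  # mean is (3, 4, 0)
penalty = abrupt_movement_penalty(previous, current)
```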
    Chest up: We give the agents a small reward whenever their chest/head is in the upright position. This helps the learning converge more easily; without this reward the agents might never learn to stand up and instead just learn to crawl to the target.
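    One plausible way to test "upright" is to compare the chest's up direction with the world's up direction. The sketch below assumes unit-length vectors, a Y-up world, and a made-up threshold and reward size; none of these are stated in the comment.

```python
def chest_up_reward(chest_up, world_up=(0.0, 1.0, 0.0),
                    threshold: float = 0.8, reward: float = 0.01) -> float:
    """Small bonus when the chest's up vector points roughly along world up."""
    alignment = sum(c * w for c, w in zip(chest_up, world_up))  # dot product
    return reward if alignment >= threshold else 0.0


standing = chest_up_reward((0.0, 1.0, 0.0))  # upright torso
crawling = chest_up_reward((1.0, 0.0, 0.0))  # torso lying horizontal
```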
    OTHER
    I only allowed the agents to make a decision every 5 game ticks, which made the movement look a bit more jagged than if I allowed them to make a decision every tick. I found if I allow them to make a decision every game tick it’s too difficult for them to commit to any proper movements, they end up just making very small movements like slightly shuffling forward instead of taking a full step. The 5 game tick decision time forces them to commit to their decision for at least 5 game ticks so they end up being able to take the less safe (but cooler to watch) large steps.
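    The 5-tick commitment can be sketched as an action-repeat loop: the policy is queried every fifth tick and the last action is reused in between. The `policy` and `step` callables below are hypothetical stand-ins for the real agent and physics simulation.

```python
DECISION_PERIOD = 5  # ticks between decisions, as described in the comment


def run_ticks(policy, step, observation, n_ticks, period=DECISION_PERIOD):
    """Simulate n_ticks and return how many decisions the policy made."""
    action = None
    decisions = 0
    for tick in range(n_ticks):
        if tick % period == 0:
            # Only here does the agent choose; the action then holds for `period` ticks.
            action = policy(observation)
            decisions += 1
        observation = step(action)
    return decisions


def dummy_policy(observation):
    return 1.0  # always "push forward"


def dummy_step(action):
    return 0.0  # placeholder observation


decisions_made = run_ticks(dummy_policy, dummy_step, 0.0, n_ticks=20)
print(decisions_made)  # 4
```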
    Though you only see one version of these agents, there were actually 40 copies (so 200 agents) training simultaneously behind the camera in order to speed up the training process. Despite this, it still took my computer (threadripper 3960x, rtx 4090, 128gb ram) over 3 days to train/record!
    We're looking to hire people to help make these videos! If you're a talented Unity game developer you can apply for a full time position here forms.gle/ko54z1LQmZNUT9Vp8 And if you're a talented AI developer (ML-Agents), you can apply here! forms.gle/Uou1Vwb5Q9VccaAY7 We're looking for full time employees, but part time works too, what we're really looking for are skilled and passionate people, so feel free to apply if you're interested! :D
    Thank you so much for watching, and please, if you enjoyed the video or learned something, share it with someone you think will also enjoy it! :)

  • @darkacadpresenceinblood
    @darkacadpresenceinblood 6 months ago +3752

    red throwing a tantrum in the middle of the track so now noone else can pass either was hilarious😭

    • @RubyPiec
      @RubyPiec 6 months ago +39

      Mood

    • @Wholesome_love
      @Wholesome_love 6 months ago +110

      6:51 😂

    • @OmniSync
      @OmniSync 6 months ago +48

      also on 8:45 💀💀

    • @t4kumi704
      @t4kumi704 6 months ago +54

      Typical bipedal behaviour if you ask me 😅

    • @traionjones711
      @traionjones711 6 months ago +5

      AI don't have emotions so your comment isn't very funny.

  • @Ax2u
    @Ax2u 6 months ago +6109

    Purple was robbed! Constantly getting tackled by others, and starting on the disadvantageous side track with less space to manoeuvre... Where is the competitive integrity?! Red deserves a DQ for that awful, childish behaviour on run 912.

    • @OmniSync
      @OmniSync 6 months ago +293

      and on run 1410

    • @ankyloinc.
      @ankyloinc. 6 months ago +247

      They did my boy dirty 😢

    • @Leon-br9yk
      @Leon-br9yk 6 months ago +83

      ay dont insult my boy red

    • @nevan2201
      @nevan2201 6 months ago +95

      YEAH #DQRED

    • @Bluejaymare
      @Bluejaymare 6 months ago +121

      Exactly. red would always take it out on purple

  • @kirkrowe2901
    @kirkrowe2901 3 months ago +393

    This is amazing. I love Red's enthusiasm.
    "Yeah I love the high jump."
    "This is a race."
    "HIGH JUMP!"

  • @mlijah2730
    @mlijah2730 6 months ago +404

    10:02 one must imagine purple happy

    • @yonekobysura8440
      @yonekobysura8440 5 months ago +26

      *intense Syphius music plays

    • @mcdonaldswi-fi2720
      @mcdonaldswi-fi2720 4 months ago +18

      *Syphius picture fades in and out*

    • @Grandremone
      @Grandremone 2 months ago +10

      PURPLE WON GODDAMNED

    • @32zzo
      @32zzo 1 month ago

      its sisyphus not syphius

    • @ussrball1692
      @ussrball1692 1 month ago

      @@mcdonaldswi-fi2720 bro switched account to recreate a meme

  • @davidthecommenter
    @davidthecommenter 6 months ago +4649

    i'm glad to see albert's knowledge of walking is being used to help teach
    balbert, gralbert, ralbert, yalbert, and palbert how to walk too!

    • @neofalz7643
      @neofalz7643 6 months ago +248

      Okey now its they canon name

    • @tulliuscicero852
      @tulliuscicero852 6 months ago +110

      in case i was the only one who saw it, the names are based on the colors of each one, for example yalbert is yellow and ralbert is red

    • @ArThur_hara
      @ArThur_hara 6 months ago +6

      @@tulliuscicero852 noice :D

    • @LucasGleason
      @LucasGleason 6 months ago +20

      @@tulliuscicero852no way

    • @LordDragox412
      @LordDragox412 6 months ago +1

      @@tulliuscicero852 Yes, you were the only one who saw it, for you have special eyes. So look, look with your special eyes and spread your wisdom upon us, the unwashed masses! /s

  • @BeanicusYt
    @BeanicusYt 6 months ago +1979

    I can’t wait for all of these AI’s to get their own characters and lore. I can just imagine a cinematic universe for this channel
    Edit: how the HELL did this comment blow up

    • @supercrazy7188
      @supercrazy7188 6 months ago +6

      Lol

    • @froolsy
      @froolsy 6 months ago +67

      Next video: ai fight club

    • @alial3437
      @alial3437 6 months ago +8

      This remembered me The Amazing Digital Circus.

    • @chocolates_muffin
      @chocolates_muffin 6 months ago +19

      We should make this just make a random backstory made by Chat gpt sounds good by my opinion

    • @alex.g7317
      @alex.g7317 6 months ago +8

      I guess, but I’m afraid content farms might yoink his idea and use these characters in the worst of ways :/

  • @thanemullen8023
    @thanemullen8023 4 months ago +123

    I have never felt so disappointed to see a video end. I hadn’t been watching the time stamp, and I was so invested in seeing them (especially purple) reach the point that they were truly racing.

  • @mf6610
    @mf6610 3 months ago +216

    If these were developed into characters.
    Purple: Wild Card, Optimistic, Sometimes Lazy
    Yellow: Straightforward, Patient
    Red: Entitled, Whiney and Immature, Show-Off, But Very Determined
    Green: Quirky, Meek
    Blue: Clumsy, Curious
    -Purple often gets screwed over by others but stays determined
    -Red is slow to learn, behaves badly, and suffers from karma a lot

  • @buzzlightyearpfp7641
    @buzzlightyearpfp7641 6 months ago +925

    you're actually revolutionizing the AI genre on youtube

    • @ARockyRock
      @ARockyRock 6 months ago +31

      you might be interested in carykh's evolution series

    • @armyninja122
      @armyninja122 6 months ago +1

      yea

    • @traionjones711
      @traionjones711 6 months ago

      Lol he hearted your comment. What a delusional loser. You really shouldn't hype up content like this. It's nothing special.

  • @jaden_playz805
    @jaden_playz805 6 months ago +192

    7:13 purple just “resting” 😭😭😭😭😭😭😭😭😭

  • @triplea007
    @triplea007 5 months ago +106

    Starting positions should be randomised to give each ai a fair shot at learning. Loved the captions music choices.

  • @doctorchess801
    @doctorchess801 6 months ago +167

    Red was the definition of character development
    Also purple being the Pixar lamp was funny tough

  • @coolokayyeah
    @coolokayyeah 6 months ago +956

    It’s amazing how well the AI learns, even if it takes a while

    • @RubyPiec
      @RubyPiec 6 months ago +16

      It learns faster than us

    • @9nikolai
      @9nikolai 6 months ago +13

      @@RubyPiec Did it take you more than 1638 attempts to learn to walk? Most people manage in a little less than that

    • @RubyPiec
      @RubyPiec 6 months ago +35

      @@9nikolai what counts as an attempt?

    • @omphya6229
      @omphya6229 6 months ago +57

      ​@@9nikolai the ai learnt to walk in 3 days, most people take about a year.

    • @9nikolai
      @9nikolai 6 months ago +17

      @@omphya6229 The ai doesn't need to eat, sleep, or anything else than walk.

  • @rodlah6205
    @rodlah6205 4 months ago +71

    10:30 when you try to run from a monster in a dream

    • @bobblab
      @bobblab 25 days ago

      Bruh💀💀

    • @zackyep
      @zackyep 21 days ago

      ​@@bobblabbruh 🎉🎉

  • @babyghast4379
    @babyghast4379 6 months ago +19

    Before I watch this to the end, I am rooting for purple. I want to see the monopod win.
    (After watching it, second place ain’t so bad. They would have gotten it if they had an extra second or two.)

  • @ryanchanghomosapiens6507
    @ryanchanghomosapiens6507 6 months ago +976

    Can we appreciate the fact that at 10:18, yellow turned around and walked backwards but still went super quick?

  • @B-S-S-Iris
    @B-S-S-Iris 6 months ago +536

    Again, it was very interesting.
    I felt that when Red fell, it dragged everyone down and negatively affected the learning of those around him, so if the focus was on running, I felt it would be preferable to have him run the race alone and then composite everyone's movements in later editing, etc.
    I was rooting for Purple's run because it was so careful and beautiful... I still wonder if the longer length of one step is more advantageous?
    From a Japanese fan
    Translated by Deepl

    • @ixiepeach6787
      @ixiepeach6787 6 months ago +63

      Purple was the only who learned how to walk properly with one leg.

    • @arthurfraco2970
      @arthurfraco2970 6 months ago +74

      Exactly! After red started to fall consistently over purple track, purple unlearned how to hop.

    • @wildfire9280
      @wildfire9280 6 months ago +39

      @@arthurfraco2970 purple lost brain cells looking at red

  • @dazedazedazz2939
    @dazedazedazz2939 4 months ago +71

    Now I just want to see these five AI’s learn how to work together…Similar to Albert’s puzzles!

  • @skevoid
    @skevoid 5 months ago +78

    Having collision between the different agents adds a level of randomness that seems like it would severely hamper the learning progress.

    • @billy4301
      @billy4301 5 months ago +17

      but falling is funny

  • @dimglow
    @dimglow 6 months ago +81

    11:09 you lied about the cake, now he's crying 😭

    • @AltoRed_
      @AltoRed_ 6 months ago

      IEIEJEHED THE CAKE IS A LIE THE CAKE IS A LIE THE CAKE IS A LIE THE CAKE IS A LIE THE CAKE IS A LIE THE CAKE IS A

    • @zhantaufik
      @zhantaufik 6 months ago +16

      He really looks like he is hysterical

    • @delta1234s
      @delta1234s 1 month ago +7

      The cake is a lie 🎂

    • @ThisManlyFlower
      @ThisManlyFlower 1 month ago

      too little people catching that obvious portal reference lmao@@delta1234s

  • @ralmilk
    @ralmilk 6 months ago +365

    It was interesting to think about how some AI probably got steered in a less efficient direction because they were trying things and getting stuck on the other models. I wonder how differently this would have worked out if they couldn't bump into one another.

    • @armyninja122
      @armyninja122 6 months ago +6

      yea sure is interesting

    • @travisjohnson6703
      @travisjohnson6703 6 months ago +9

      That's the fundamental limiter on all this deep learning stuff. The data set in reality is always messy and incomplete, which quickly leads "AI" down bad paths that living beings tend to suss out easily.

    • @jffrysith4365
      @jffrysith4365 6 months ago +1

      @@travisjohnson6703 AI always faces this issue, it frequently randomises to local better locations that are actually globally worse. This is easily fixed using general annealing algorithms etc. that tend to be used in most complex AI systems.

  • @cecilebraillie4471
    @cecilebraillie4471 5 months ago +32

    5:20 purple _deliberately_ throws itself on yellow in the hope to piggyback along 😊😊

  • @beardedshuckle5220
    @beardedshuckle5220 4 months ago +11

    I just appreciate that the victory was done with a john cleese silly walk

  • @Pyro03333
    @Pyro03333 6 months ago +200

    I love how you treat each AI as if they were your child. Your channel is just really wholesome.

    • @mjvafadar2526
      @mjvafadar2526 6 months ago +16

      He casually mentioned the fact that they made it in a way which they're in great pain when they fall and you're calling it wholesome😂😂

    • @Pyro03333
      @Pyro03333 6 months ago +5

      @@mjvafadar2526 true. I forgot about that.

    • @wildfire9280
      @wildfire9280 6 months ago +1

      @@mjvafadar2526 Spare the rod, spoil the child.
      (Disclaimer: Do not actually follow this.)

  • @RasmusBerggren-uo6uu
    @RasmusBerggren-uo6uu 6 months ago +84

    I love that red is walking around like a extremely drunk person and how he randomly keep bullying the others like purple or green. Truly a drunk Florida man

  • @Vd_124
    @Vd_124 4 months ago +3

    This made my day, thanks 😂 i’ve been searching awhile for something that could cheer me up

  • @Jared_24
    @Jared_24 5 months ago +3

    LOVE this Content. Its different to all this junk on YouTube. This is ACTUALLY Entertaining!

  • @MalarkySparky
    @MalarkySparky 6 months ago +427

    Do you think it would've run differently if they were encouraged more to stay in their own lanes?

  • @sergimeli4194
    @sergimeli4194 6 months ago +226

    I honestly felt so surprided purple performed so well. I though it want going to be able even to stand up. Amazing video as always!!!

    • @debadityanath4398
      @debadityanath4398 6 months ago +29

      purple had the advantage of less parts, and less learning and tweaking

    • @siraaron4462
      @siraaron4462 6 months ago +20

      There's a reason worms and fishes evolved first.

  • @keithledbetter9132
    @keithledbetter9132 4 months ago +2

    Funniest thing I've watched all year by far
    Really enjoyed this work

  • @regularrobloxtuber
    @regularrobloxtuber 3 months ago +9

    I love how he colors some of the words red if the AI does something bad , Yellow if its okay And green if its excellent

  • @conwarlock3537
    @conwarlock3537 6 months ago +65

    9:21 That's a pretty solid frontflip. Maybe you could do some challenge in that direction too?

  • @burningtank160
    @burningtank160 6 months ago +883

    They are not identical, they are each special in their own way 😁

  • @adamusprime403
    @adamusprime403 6 months ago +1

    You lying about the cake feels like a revenge plot against Glados

  • @mythbusterman8541
    @mythbusterman8541 6 months ago +389

    Red’s movements resemble those of the character from that QWOP game ungainly and spectacular spills. The way he sabotages the rest of the athletes inadvertently or otherwise in the process of tumbling is outstanding .

    • @TrueLadyEvilChan
      @TrueLadyEvilChan 6 months ago +23

      5:01 Red: I call this the QWOP shuffle

    • @RRVCrinale
      @RRVCrinale 3 months ago +10

      Fun fact: that hop Red does isn't too dissimilar from the way astronauts bounce around on the Moon. It's also essentially Purple's locomotion, for that matter.

  • @alantyto3627
    @alantyto3627 6 months ago +206

    Gosh I absolutely adore these tiny AI buddies. I can almost see their personalities. Watching them go from confused wobbly wormies to successful walkers and jumpers is extremely entertaining! Your commentary is, as always, brilliant. Just like the joke in the end :) I do wonder though what happened to the rest of the team who was not able to make it to the finish line. Guess they're on the AI vacation where they're rewarded for simply lying around 😂
    Also a huge thank you for the thorough explanation of your work, it's really interesting to read! Good luck in your further work, I'll look forward to the new video!

    • @Cl-2048
      @Cl-2048 6 months ago +2

      No theyre in a butterfly farm upstate

  • @DasSystemschaf
    @DasSystemschaf 3 months ago

    LOL, that was so funny. My wife also enjoyed watching it. Thx for sharing.

  • @Jammybobs
    @Jammybobs 4 months ago +2

    No way youtube geniuinely recommended a good youtuber for once. Can't wait to see Albert learn more and more (and maybe make some friends along the way?) :)

  • @Wallibear
    @Wallibear 6 months ago +1408

    can we appreciate the effort albert puts into these videos

    • @tainted2141
      @tainted2141 6 months ago +53

      Noooo wallibear became an ai

    • @ethanlong1075
      @ethanlong1075 6 months ago +4

      Didn’t expect to see you here!

    • @traionjones711
      @traionjones711 6 months ago +37

      Minecraft youtubers annoy the hell out of me ngl

    • @traionjones711
      @traionjones711 6 months ago

      Can we appreciate the lack of effort this waste of space puts into the garbage slop he calls content?

    • @traionjones711
      @traionjones711 6 months ago

      Also Albert is the channel mascot, not the channel owner. Moron 🤣

  • @DadGamesGood
    @DadGamesGood 6 months ago +697

    Even though you hilariously mention the AI being in pain, I think your videos really open up peoples eyes to the opportunity that AI presents in an educational and entertaining way. Well done my friend keep these awesome videos coming, I can't imagine the work that goes into them!

    • @miriamkapeller6754
      @miriamkapeller6754 6 months ago +32

      It makes sense to compare the scoring methods used in reinforcement learning to pain and dopamine release in animals and humans, as there are many similarities. They also have similar pitfalls, for example an AI can have a traumatic event where they had a very bad experience with something and will never attempt to do it again, even though it could be beneficial. And just like humans, an AI can get addicted to things that serve no actual practical purpose because it keeps getting a hit of dopamine.
      If you think about giving negative and positive scores to your agent as pain and dopamine, the behavior of your agents will likely make more sense to you.

    • @GenocideLv
      @GenocideLv 6 months ago +11

      its important to not humanize AI. They are already working on propaganda movies of "AI children". Its a big step for transhumanism, and cant be allowed

    • @pizzainc.1465
      @pizzainc.1465 5 months ago +3

      HAHAHA 5 AIs IN EXCRUCIATING PAIN SO FUNNY

    • @kacp6485
      @kacp6485 3 months ago

      What the heck does it mean that robot is in pain??

    • @pizzainc.1465
      @pizzainc.1465 3 months ago

      The robot is being punished. Punishment for AIs works in a similar way to pain for us, because it lets them know that something is wrong, they do whatever they can to fix it, and their entire life’s purpose is to reduce punishment.@@kacp6485

  • @carter.5814
    @carter.5814 6 months ago

    I love these videos keep this up!

  • @Jaqubix
    @Jaqubix 1 month ago

    This video taught me that by tackling others you can slow down their development and win over them

  • @WalrusQuake
    @WalrusQuake 6 months ago +125

    I noticed that since the separate AI models can collide with eachother and start each run with relatively the same behavior as the previous, an AI could use another's strategy to create an advantage for themselves. I noticed red started to lean on yellow around 2:50 to get a boost.

    • @chrisvdmeer
      @chrisvdmeer 6 months ago +9

      yeah if they would have trained seperately it might have been different.

    • @b_2the_boy551
      @b_2the_boy551 6 months ago +4

      Yeah just like how purple tried to ride yellow in 4:18 😂

  • @enderjed2523
    @enderjed2523 6 months ago +858

    If you ever do another one, here's my suggestions:
    1. Make the AI not collide with eachother, this will avoid dirtying the training set.
    2. You could try using the old Albert agent/model (or other old agents/models) as a comparison
    3. And in terms of ideas for other models, you could try a 6 or 8 legged model, alongside a spring-esc/jellyfish model that I've seen in old Framsticks simulations

    • @pitori.
      @pitori. 6 months ago +47

      Those are all cool ideas
      Exept for the first one cuz funni

    • @enderjed2523
      @enderjed2523 6 months ago +69

      @@pitori. It reduces comedy, sure, but it's more scientific. Besides, in final races the collisions could be turned back on.

    • @womp47
      @womp47 6 months ago +53

      @@enderjed2523 how would they be able to adapt? they should be able to learn with collisions, and need to adapt to the other contestants

    • @c3caine
      @c3caine 6 months ago +7

      its funny watching them crash@@womp47

    • @Isperada
      @Isperada 6 months ago +31

      ​@@womp47 depends on how they set it up. If they put in inputs that tells the AI that they collided with other AI then they might be able to handle it. However if they didn't the AI will have no clue that they were being interfered with and just believed what they were doing was wrong, even if what they were doing would have got them further, making them unlearn their improvements.

  • @BadFeelingsClan
    @BadFeelingsClan 5 months ago +1

    I wonder if, when AI finally enslave us, they too will give us difficult tasks to perform in exchange of pastries, just to laugh at us. And when we finally succeed, against all odds, they will not give us cake. You're setting a dangerous record, my friend.

  • @ArmoredarmadilloX
    @ArmoredarmadilloX 5 months ago +2

    Purple was my hero in this, So much was against the lil guy, yet he made great strides, figuratively and literally.

  • @Blyat98
    @Blyat98 6 months ago +63

    To be honest, the test rooms always made me think about the game portals. That cake reference was amazing!! Love your humor in your videos. This one was amazing!! Keep going!

  • @ArgoFlex
    @ArgoFlex 6 months ago +211

    The green one 💀

  • @sjdpfisvrj
    @sjdpfisvrj 6 months ago +3

    This video perfectly illustrates the dirty secret of Reinforcement Learning: it is not all that far from Brute Force.

  • @CentroBBerry
    @CentroBBerry 6 months ago +18

    “You’re kind of flopping around like a worm” What do you expect? You gave him the body of a worm!

  • @lolnt6103
    @lolnt6103 6 months ago +43

    4:18 Purple learns horse riding

  • @cobusvanderlinde6871
    @cobusvanderlinde6871 3 months ago +27

    Red definitely figured out how to maximize his reward by diving forward the moment he loses balance.
    Would be really interesting to see how competitive this would get if the ai are all allowed to fully mature into sprinting masters. Assuming red would still end up winning with those long legs, but the others might put up some impressive competitive performances.

    • @redthered279
      @redthered279 2 months ago +1

      Red also learnt to stay on top by hindering the other agents' movements, causing their learning to be sabotaged and set back.
      Smart but dirty.

    • @cobusvanderlinde6871
      @cobusvanderlinde6871 2 months ago

      @@redthered279 I doubt it got rewarded for that at all. I do not remember any indication that the ai were getting rewarded for overall placement, just for personal achievement.

  • @user-jd3gf5xw1x
    @user-jd3gf5xw1x 6 months ago +32

    8:35 bruh this scene out of context xD

  • @imaMONKE725
    @imaMONKE725 6 months ago +87

    red was so smart, messed up the others so that their good habits weren't rewarded as they wouldn't get far due to the red's sabotage, meanwhile red couldn't be sabotaged and could learn without major problems

    • @its_Hazer
      @its_Hazer 5 months ago +6

      I smell someone's boutta make an amogus joke

    • @luchirimoya
      @luchirimoya 3 months ago +3

      ​@@its_Hazerin 2024? I sure hope not 😭

    • @jasonjoshuapeters8955
      @jasonjoshuapeters8955 3 months ago +1

      Its because red learned the way albert learned

    • @redthered279
      @redthered279 2 months ago +1

      Red sabotaged the others?? amogsus reference??

    • @imaMONKE725
      @imaMONKE725 2 months ago +1

      @@redthered279hazer was right all along

  • @mattjpoolr
    @mattjpoolr 4 months ago +1

    Love this ❤

  • @Spookweave
    @Spookweave 5 months ago +4

    Red's antics had me laughing all throughout the video

  • @Kram1032
    @Kram1032 6 months ago +33

    incredible. I fully predicted the others would learn this much sooner than red simply because it's by far the most complicated body. But I guess the size of its leap, once it can finally leap, simply makes up for all the difficulty of learning to leap!
    And by the end it's even a *sort of* natural motion. Like, not really, but at least it's imaginable that somebody would intentionally walk in an incredibly silly ways.

    • @sarahmellinger3335
      @sarahmellinger3335 6 months ago +3

      it's like a horse's gallop with springy foot joints

    • @beardalaxy
      @beardalaxy 6 months ago +1

      He might even get invited to the ministry.

  • @blobbo.
    @blobbo. 6 months ago +43

    THE KING IS BACK!!!

  • @uma_pessoa.aleatoria6902
    @uma_pessoa.aleatoria6902 2 months ago

    Wow, these AI videos are great when the video finished i just saw that there more about AI its so cool

  • @Sourcecodemastergoaheadcheater

    Albert your working so hard and becoming wiser by the seconds. I want to find out what you're all good at or like doing and i want to help your dreams kids❤❤❤

  • @chadwickpuffington
    @chadwickpuffington 6 months ago +116

    As a parent of a toddler i felt this.

  • @wickmilla
    @wickmilla 6 months ago +41

    4:38 blue crying on the floor

  • @bristratostar7908
    @bristratostar7908 3 months ago +1

    Your billboards and advertisements are hilarious, so bad 😂😂😂❤

  • @ballingcuber1170
    @ballingcuber1170 6 months ago +135

    I was rooting for red from the start and honestly halfway through I thought I made the wrong decision, but when that man came jeeping and juking thru the other competitors on good pace and crossed that 100m mark. I cried a tear of joy

    • @eugenioreale7588
      @eugenioreale7588 6 months ago +3

      no bro im sorry he removed from the video the part where purple won :(

    • @gamerits
      @gamerits 6 months ago +3

      gaslighting is real

    • @everesthines2228
      @everesthines2228 6 months ago +1

      @eugenioreale7588 no, Purple didn't reach the end in time. Red won.

    • @jxton_
      @jxton_ 5 months ago

      @eugenioreale7588 purple didn't win, it ran out of time 💀

    • @ItsJustKiro
      @ItsJustKiro 5 months ago

      @eugenioreale7588 purple didn't win bro, he ran out of time

  • @inakymino3278
    @inakymino3278 6 months ago +54

    0:22 ALBERT

  • @Dofor
    @Dofor 5 months ago +9

    @aiwarehouse Regarding the 40 copies of the simulation running behind the scenes, when updating the policy through PPO, do you aggregate and consider the experiences from all 40 to inform the policy adjustments, or is the update based predominantly on the best-performing instance among them?

  • @FunIsGoingOn
    @FunIsGoingOn 2 months ago

    I don't care if this actually includes AI, but the production quality, the comments and music, easily the most funny 11 minutes of the year, so far. 😂😂😂

  • @hungryironapple
    @hungryironapple 6 months ago +38

    green knew exactly what he was doing 3:12

    • @tomvr4543
      @tomvr4543 6 months ago +9

      TOO DEEP AI LEARNING 😂

    • @Memeened
      @Memeened 6 months ago +2

      Kinda sus

  • @Snai1poster
    @Snai1poster 6 months ago +16

    Red trying his hardest to be the favorite child

    • @aiwarehouse
      @aiwarehouse  6 months ago +3

      Thank you so much!!:D Red may have tried his hardest but nothing compares to Purple

  • @VirtuoVR
    @VirtuoVR 5 months ago

    I remember seeing this channel when it was super small, crazy man, awesome vids ❤

  • @cytrynekzoty8508
    @cytrynekzoty8508 5 months ago +1

    purple: hop hop
    yellow: slow but steady
    red: "watch me do a frontflip"
    green: yellow
    blue: "hello where is the finish"

  • @joe_z
    @joe_z 6 months ago +26

    5:00 Red found his inner QWOP.

  • @Alan_Thingz
    @Alan_Thingz 6 months ago +57

    11:00 The cake is a lie.

    • @SuperBee3
      @SuperBee3 4 months ago +2

      Small detail. The cake is shifted.

    • @GDnobrain
      @GDnobrain 4 days ago +1

      NOOOOOOOO

  • @crodzilla
    @crodzilla 5 months ago

    Another fantastic example of deep reinforcement learning. Great job! Keep it up. I hope you are able to share your information and methods one day.

  • @Chronolinth
    @Chronolinth 3 months ago +2

    Red's character development is absolutely beautiful

  • @Mar_Marine
    @Mar_Marine 6 months ago +68

    This was an absolutely delightful video! It was extremely entertaining how each body took a unique personality - I found myself rooting quite a lot for Purple as they really put the effort in! I can't wait to see Albert's return, and maybe the return of our newfound friends here.

  • @MidtownMadness1
    @MidtownMadness1 6 months ago +16

    Not the video we expected, but the video we needed

  • @Average_Guy_
    @Average_Guy_ 3 months ago

    red just falling on his face every time always got me cracking up 😂😂

  • @mushroomng1739
    @mushroomng1739 6 months ago +10

    9:45 just for myself like a bookmark

  • @AidanDaGreat
    @AidanDaGreat 6 months ago +11

    I kinda like the idea for the lore: an all-powerful being, believed by the others to be Albert, cruelly creates amalgamations of coded flesh, forcing them to attain meaningless goals for their own entertainment. But what if, with bonding with each other through harrowing and wacky experiences, the AI realized their true power, and who really is the TRUE creator? The mind boggles.
    Amazing video!

  • @MochiWrath
    @MochiWrath 6 months ago +33

    4:50 he fell due to shock

  • @Spirit_Matters.Arthur
    @Spirit_Matters.Arthur 4 months ago +2

    I just wish to thank God for the Natural Intelligence we were given to be way faster and more efficient learners than any AI can ever be! Instinctual for the Win

  • @azertyazerty2167
    @azertyazerty2167 5 months ago

    "THE CAKE IS A LIE" written on the wall of the second video finally makes sense!!! Nice Easter egg, haha!!!

  • @TGood1e2
    @TGood1e2 6 months ago +5

    This was so worth the wait, thanks for showing us this masterpiece!

  • @funguy3259
    @funguy3259 6 months ago +48

    6:55 i like how green looks like as if he is really concerned for purple

  • @gsonz172
    @gsonz172 5 months ago +1

    I love how red just chills on the ground in the end :D

  • @NectarNapper
    @NectarNapper 5 months ago

    The struggles of Purple, Blue, and Red show that while the struggle of evolution may be difficult, it is all worth the work

  • @amberslime3683
    @amberslime3683 6 months ago +63

    The 100 Meters AI Race - A Summary of the Competitors (From 1 to 5) (Contains Spoilers)
    *Purple*
    Purple is the one-legged fellow with one eye. They have a major, major problem with balancing, one that prevents them from actually being able to race most of the time. When the stars align and Purple is actually able to begin racing, they use swift short hops that are remarkably consistent... so long as he doesn't bump into anything.
    *Yellow*
    The quadruped. Yellow was the first AI to figure out how to properly race, and has proven to have the most stable gait. Once Yellow figured out how to walk, there was nothing that could make him fall over that I could recall. Unfortunately, Yellow is extremely slow, and unless we're talking about a competition between tortoises and hares, slow and steady *does not* win races. Yellow also has this strange obsession with walking along the fence.
    *Red*
    The one whose form mimics that of Albert, the Most Heavenly and Holy Strider. Red has precisely none of Albert's grace and coordination, having a false-start rate that's as bad as or perhaps even *worse* than Purple's. When Red does manage to figure out how to use the holy form they were blessed with, they use either some sort of strange tip-toeing walk or big, lunging gallops.
    *Green*
    The tripodal unit. Green is just behind Yellow when it comes to stability, and just ahead of Yellow when it comes to speed. Green's most notable accomplishments include being the second one to figure out how to stand stably and that one time they got stuck doing a headstand.
    *Cyan*
    There's a fifth racer? What're you talking about-- Oh, that one! Yeah, Cyan is shaped sort of like a Goomba, essentially a head and two legs. Cyan... look, Cyan might as well not even be there. They have one notable trait, and that's that they tend to walk down off of the track assuming that they go on for long enough.
    *And the Winner is...*
    RED! W-Wait, Red? The one who can barely even stay upright even though they've got two legs, two arms, and the inherent holiness of Albert's form? That guy? Huh, okay... I can only assume that it was the aforementioned holiness that allowed Red to win... or maybe it's because they had the longest stride? It's one of the two...

    • @ToastOnMyFace
      @ToastOnMyFace 6 months ago +3

      Wow

    • @AnimationsAndOtherStuff
      @AnimationsAndOtherStuff 6 months ago +7

      Red's also a big crybaby who loves taking his anger out on others, especially Purple

    • @name6953
      @name6953 6 months ago +3

      I was upset too because Purple was stopped when he was just about to win :(

  • @Collins6769
    @Collins6769 6 months ago +63

    At 8:10 I love that green picked up the skill of breakdancing 😂

    • @Incineration72
      @Incineration72 3 months ago +3

      Green prob like "screw running i wanna dance"

  • @doodledoocg
    @doodledoocg 6 months ago

    A youtube video that actually entertained me

  • @gravamante
    @gravamante 5 months ago +2

    We missed you. Your videos are quite original despite how few they are. It's not the amount, it's the effort 🎉

  • @lennard3042
    @lennard3042 6 months ago +22

    Yay finally another video, it's just unfortunate that they take so long to make
    I'm also trying to make my own walking AI, and I'm planning to (hopefully) give it a working physical body, so these videos are always a great help and inspiration for me. Keep it up!

  • @Beefstudios_official
    @Beefstudios_official 6 months ago +19

    9:26
    yet...

  • @xeniaxxainex
    @xeniaxxainex 4 months ago +1

    Personally I am thrilled with your videos. Could you tell us, if you haven't already done so in some comments, which tools you used for these simulations? That is, what software and/or programming language and library? Thank you very much and congratulations!

  • @fragmented_dreamz327
    @fragmented_dreamz327 4 months ago +12

    I've always wanted to experiment with AI like this. What program do you use, and how do you even accomplish something like this?🤔

  • @mrhalfsaid1389
    @mrhalfsaid1389 6 months ago +17

    I love how you can never really tell who will win these, plus it's funny how the approaches they take give each AI personality in their own... interesting ways

  • @pseudotasuki
    @pseudotasuki 6 months ago +21

    5:09 Literally QWOP

  • @Jyaacho
    @Jyaacho 4 months ago +4

    Red's the funniest lil dude, every single time he fell over the barrier I bust out laughing.

  • @bendubz9000
    @bendubz9000 5 months ago +1

    I'd call this proof of concept that we all learned to play QWOP the same way Red did. Lots of front leg shuffling, it's just the best way for a bipedal model to do it