The average win count is only 4.7, so wouldn’t this mean it’s not better than the average person if the benchmark is 5 round wins? Or did you skip averaging to exclude outliers in such a small sample size of ten games?
The benchmark was based on deep runs rather than average wins - sometimes your strategy just does not work. But what matters is how many high-win runs you can make (see: NorthernLion and his "believer" bets in Twitch Chat)
Also, I'd be surprised if the average human player gets even 4.7 wins per game on average. Maybe so, but I doubt they do significantly better from an average perspective. 😊
@@Graceclaw I think you're probably right that the average human player doesn't get that much but that's because many little kids and new players play the game. If you took the average of people above the age of say 15 with more than 5 hours in the game it would probably be closer to an average of 7 trophies per game. Personally, I think it's a very odd idea to have the AI aim for 5 trophies in a game considering a game isn't won unless you get 10. It would be like having a chess AI that is only good at openings, it wouldn't be beating many people. I'm sure it's probably due to a limit of the AI that I don't understand because I'm not knowledgeable in the subject whatsoever.
@@frogs3613 Like he mentioned in the video any added complexity increases the time it takes to train exponentially. He probably didn't want to spend years on this.
Didn’t go super deep into it because the video was already getting long. I used a greedy search strategy to order the pets. Essentially I went through and manually gave “points” to desirable qualities, e.g. a mammoth at the front is good because it buffs all the pets behind it; that could be 5 points. Then the pets were placed one by one, each in the position that maximized the sum of all the points!
@@super_7710 It's a term used to describe algorithms in computer science! What it boils down to is that the algorithm looks for the result with the highest immediate benefit, without taking the future into account.
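A minimal sketch of that greedy ordering, with made-up point values and pet names (not the actual weights from the video):

```python
# Sketch of the greedy placement described above: each pet is scored for each
# open slot with hand-tuned "points", then placed wherever it scores highest.
# The point values here are illustrative, not the video's actual numbers.

def position_points(pet, slot, team):
    """Hypothetical scoring: reward desirable qualities per position."""
    points = 0
    if pet == "mammoth" and slot == 0:   # a front mammoth buffs pets behind it
        points += 5
    if pet == "cricket" and slot == 0:   # a summoner up front soaks a hit
        points += 2
    return points

def greedy_order(pets):
    team = [None] * 5
    for pet in pets:                     # place pets one at a time...
        open_slots = [i for i, s in enumerate(team) if s is None]
        best = max(open_slots, key=lambda i: position_points(pet, i, team))
        team[best] = pet                 # ...into the currently best-scoring slot
    return team

print(greedy_order(["cricket", "mammoth", "ant"]))
```

True to the "greedy" label, each placement maximizes immediate points only; it never reconsiders an earlier placement in light of a later pet.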
I think the reason the AI favored using cupcakes is that cupcakes are a great way to win early rounds. Since the “win” condition was only 5 round wins, it reaches that condition before the cupcake's value starts falling off significantly. I predict it would favor cupcakes less if its win condition were 10 trophies, since cupcakes could not quickly propel it to victory
I think the "Win 5 rounds" idea limited the possible strength of the AI
Cupcakes aren't the best, but on the other hand it's a common real-person problem to prioritise long-term gain too much when you only need to win 10, and the faster you win, generally the easier it is. What I'm saying is the AI might be onto something: if you bought cupcakes in the right situations you could get a leg up. It's just not information a person could reasonably process without spreadsheets and a lot of time.
Also remember it is playing against itself, and its opponent is also only trying for a 55% win rate. I would wager that most real players are pretty much always going for the 10 wins. When it plays against itself, it never really gets punished for using the cupcake, because its opponent is also not going for a long-term strategy, which only further reinforces the behavior.
Dunno. I disagree.
According to the Q function used in the video, the AIs were optimized to maximize the number of wins, not to get exactly 5 wins. The problem here (assuming there is one) is local convergence, i.e. a local maximum.
I'm guessing one of the AIs discovered that using the cupcake very liberally is an effective strategy against other AI players that were not competent enough to deal with it at the time.
This probably happened because using the cupcake is a single action, which the AI can easily come across randomly. On the other hand, countering the cupcake is far more difficult, as it requires long-term thinking with chains of actions across turns.
After the cupcake discovery, all the other AIs converged towards that behavior because the cupcake AI kept beating all the others, who could not find an equally easy counter. Eventually, the cupcake strategy got stuck there after millions of self-play iterations.
This is one of the biggest and most common limitations of self-play. A possible solution would be to match it against actual human players in the training process, who do play with long-term strategy in mind.
But that is too resource intensive, and could be unpleasant to the devs + players.
We're doomed if he feeds NL content to the AI 💀
people ain't ready for the high iq Hedgehog tactics
Watch the AI start using Bacta Tank strats
Lmao, not surprised the top comment is about NL. The Venn diagram of the NL community and SAP is almost a circle
What is NL?
@@Isand-l-manI Northern Lion
Great video! As a competitive Super Auto Pets player, I can't help but wonder how an AI like this would do in a 1v1. There's a lot more to think about, since you can always see your opponent's team from the previous turn. The AI would have to learn things like repositioning its units, countering specific teams, and sometimes intentionally throwing rounds that it can't win to gain a long-term advantage. I'm not sure how well it would do at that, and it'd be really interesting to see.
A well-trained AI would do better than any human, obviously.
@@Ohrami in any other game I'd agree with you, but with a game like this I'd have no idea how to even start creating an ai that matches the creativity of top players
you might still be right though, and if you are then we're all doomed lol
@@equi31 This very video just demonstrated to you the methodology of doing so. There are artificial intelligence go and chess engines that are far superior to any human player in a game that's also far more complex.
@@equi31 Finding unique and unusual ways of winning is kind of an AI hallmark. Given enough training time, AI will always win
@@Ohrami you're probably right actually, assuming that the rolls are the same on both sides. Sadly though there's not currently a way to do that in the game, and any amount of randomness in a game means that a human can always beat an ai (just look at a game like scrabble)
Not being able to freeze pets is a huge hindrance in the early game, possibly losing level-ups that can snowball the power level. Even in the late game, imagine having the possibility of leveling up into an early monkey, but forgoing that line because you would not be able to afford it that round without selling other pets.
The fact that it performed so well without so many key mechanics makes your model even more amazing!
I feel like increasing rewards for the latter rounds could guide the model to do better at late game.
What about freeze food, freeze pets, rearrange for optimal positioning? So many variables to consider!
This is what I was wondering! Would love to see a deeper experiment with the full options. Wondering if it will ever go for a 50/50 shop chicken strategy
@@alifibrahim5064 Considering it went for the remarkable 50/50 shop-can-without-cat strategy...
He did let it rearrange the pets, but he did indeed miss out on the freeze part! It would probably have made a big difference, since it is a big aspect of the game, but it would also make the experiment a lot more brutal. So many more variables; considering this already took him 4 months, I am not surprised he didn't take it into account
Huge respect for your work! The video was informative, well structured, funny and in the end there is also quite a bit of tension!!! Instantly subbed and looking forward to where this channel is headed!
Most videos about making an AI don't go into much depth, let alone do it so well. I absolutely loved your video; I finally feel like I somewhat know how deep Q-learning works. Good work!!!
Incredibly well made video! Hope to see your channel grow in the future.
What a sick project. Studying this at uni rn, thanks for an unintentional study video
Great video, kinda love how it feels like a school project, but with video games
Usually when doing self-play, the AI competes against ghosts of *all* best performers of the past, instead of only the most recent one. This might have helped to avoid catastrophic forgetting.
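The "ghost pool" idea above can be sketched roughly like this (class and field names are my own, not from the video):

```python
import random

# Sketch of self-play against a pool of past checkpoints rather than only the
# latest agent. The structure (dict params, snapshot cadence) is illustrative.

class OpponentPool:
    def __init__(self, max_size=20):
        self.checkpoints = []
        self.max_size = max_size

    def add(self, agent_params):
        # Keep a frozen copy of the current best performer.
        self.checkpoints.append(dict(agent_params))
        if len(self.checkpoints) > self.max_size:
            self.checkpoints.pop(0)      # drop the oldest ghost

    def sample(self):
        # Train against any past ghost, not just the most recent one, so
        # strategies that beat older versions aren't forgotten.
        return random.choice(self.checkpoints)

pool = OpponentPool()
for step in range(5):
    pool.add({"step": step})
opponent = pool.sample()
print(opponent)
```

Sampling uniformly over the whole pool is the simplest scheme; systems like AlphaStar's league weight opponents by how often they beat the current agent.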
Incredible video dude, it’s honestly crazy that it has so few views. Hopefully the algorithm picks it up soon. It might be worth looking into changing the thumbnail up, because the content itself is genuinely incredible. Maybe change it to center ai more than SAP, I’d check out code bullet’s style on his machine learning vids.
Regarding the video itself, it's interesting to see how the neural network decided to focus so hard on winning the next round. It seems like that might be a result of training with reward based only on the next round? I'm surprised that it opted for cans under that assumption, but it also makes sense given the AI wasn't trained under the restriction of 5 losses.
Regardless tho again great video, wishing you luck in the algorithm:)
Thanks for the love!
I definitely agree about the thumbnail. I've brought it up with some friends and I think we all agree it doesn't show a lot about the actual meat of the video.
And for the algorithm, I didn't talk about it in the video as much, but how much the algorithm values immediate rewards versus potential future rewards is a number we can change. In the deep Q-learning literature it's the discount factor, the "gamma" hyperparameter. In this project I used a high value (0.9875), as I found it did the best job at making decisions that consider the future. It actually loves slower-scaling units like the giraffe and the monkey. The cupcake is a pretty interesting little quirk the model has.
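For reference, the future-reward weighting discussed above is the discount factor (usually written gamma in the DQN literature, distinct from the epsilon used for exploration), and it enters the one-step target like this; the numbers are illustrative, not the video's network values:

```python
# Toy illustration of how the discount factor trades immediate vs. future
# reward in the Q-learning target. All values are made up.

GAMMA = 0.9875  # high discount: future rewards retain most of their value

def q_target(reward, max_next_q, done):
    """One-step TD target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + GAMMA * max_next_q

# With gamma this high, a reward 10 turns away keeps ~88% of its face value:
print(GAMMA ** 10)                      # ~0.882
print(q_target(1.0, 5.0, done=False))   # 1.0 + 0.9875 * 5.0 = 5.9375
```

With a low gamma (say 0.5) the same 10-turn-away reward would be worth under 0.1% of its face value, which is the "shortsighted" regime other commenters suspected.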
Eyyy this was super cool and very well made!
Definitely thought this video would've had way more views. Wishing you all the success on your YouTube journey 😅 :).
Very entertaining vid man, keep it up!!!
I’m upset that this wasn’t recommended to me sooner. Dope video 😊
what the hell, 74 subs?? this video is waaaaaaay too well produced and has such an original idea. I will be sharing this video
Such a high quality video, keep it up!
cool shit. you deserve more subs man. keep it up!
That was great! Depending on what you are aiming towards simplifying the explanations could help make your content more accessible to everyone but that depends on what you want your videos to be like
Amazing video! Both educational and fun, kudos! :)
Well written and explained. Nice work.
Really well made video!
I had an idea for this but zero skill thanks for letting me see what could have been
Making a series on the AI playing ranked mode would be so fun
I thought u had 800k subs! Def. Deserve more subscribers 🎉
It's so funny that NL has the space dominated that he shows up even in the quick scroll
great video, super interesting idea!!
At 20:30 the deer should have been in position 1 according to the AI, and it would have resulted in a win.
I've been daydreaming about doing this for a while. Thank you for this video.
im a music major who occasionally touches this game and has 0 interest in coding but this was an extremely engaging video even with 0 knowledge of most of the things you were talking about. Great vid!
Did the ai not have the option to freeze? I know that adds even more complexity though! Amazing work and wonderful video, even if I am a bit thick for some of it 😋
Hey great question! I had the ai originally freezing and unfreezing pets and food in the shop. I took it out for mostly 2 reasons.
1. Adding freezing to the action space meant there were so many more actions the AI needed to try out and learn from.
2. The AI could not properly use it very well. It would mostly just randomly freeze and unfreeze. Not very useful.
Glad you’re liking the content and hope you were able to learn something new!
@@braintankdeeplearning1540 Perhaps you could have made it so unfreezing cannot be done immediately after freezing, requiring the bot to complete another action first.
@@braintankdeeplearning1540 Instead of giving the AI the option to end its turn when it has less than 3 gold, what if you made it think it can still buy pets/food? If it would rather buy a pet or food than roll, freeze that item, then roll, and repeat until 0 gold is left.
@@notster7114 But then it wouldn't be much of a trained network playing (it would be using some human strategies). I'd say it's a great start to train without the freeze and add it later on
Nice Video! very underrated
Thanks man! Glad you enjoyed!
So how did you decide on targeting 5 wins over the more elusive 10 win game that most human players target?
In particular the cupcake strategy you noticed is buffed in a model targeting 5 trophies since each trophy is more valuable and the model is less reliant on scaling and winning later rounds. Which is why humans don't target the cupcake in most arena battles
Very well written and presented video! You earned 1 more subscriber, good luck on your youtube career. You mentioned the code available on your website but I can't find that anywhere, as a suggestion, you could put it up in the description, I'd love to look more into how you did this :D
Absolutely fantastic video! I have wanted to do this for ages but never found the time. Your implementation is impressive. What hardware did you use for training? Have you considered using human teams in the rollout or learning phases? I’m wondering how much of the training is spent just getting it from incomprehensible moves to justifiable moves.
Why is the blog and description gone.. I was curious as for why a transformer was used rather than fully connected layers.
Hey could you please add your blog to the description? I'm really interested in learning more about this but I don't see a link to your blog. Thanks!
Great video🎉
This is awesome. I wonder if this would be worth using for pet nerfs/buffs
Great video!
Criminally underrated
Considering how many games of SAP the AI played, it probably rolled past more sloths than any human player.
Did you not consider freezing a pet or food in the shop as a possible action at 6:38 because that was too difficult to code?
Also, for the game state I would add more columns, like a column for boost_per_to_the_right = 1 or 0, so that it can know more than just the level of the pet
I would love to see an AI like this take on inscription kaycee's mod
It hasn't taken over the internet. It has taken over your personalized video recommendation feed.
Great video
I don't know if I missed it but did you incorporate freezing items in the shop as an action?
would love to see this in the new ranked mode
Do the other packs! Well done!
What kind of hardware did you use for training?
this is incredible
Next iteration needs to incorporate Freezing!
can you put this ai on ranked ladder and see what rank it hits after a week
Commenting for the algorithm. Sweet video
In turn this also showcased that the AI wasn't fully trained yet: the graph you showed had stagnated, and that's because it's still leaning on certain RNG strats. As evidenced by it winning 50%, which is roughly what you trained it for, I suppose, when you set the win-rate requirement at 55%; in that sense it's successful.
With a more strategic goal of winning 80%+, I imagine that if you let it run for months, and additionally let it look at abilities alongside all the rest, it would easily start playing as well as, if not better than, most humans. I imagine that would be seen as cheating, though. So once again, looking at what you wanted to achieve and how long it took you, you did an amazing job, and I'm definitely looking forward to seeing more from you. Keep up the work.
A major difference between the priorities of this AI and of human players is its shortsightedness. I'd love to see what a model could achieve if trained against sampled human teams from all rounds, unfortunately I can't seem to find any databases for that.
Super interesting!!
5:12 I like how the rarest pet in the game (the sloth) has a lower value than the mosquito
Hello @braintank, can you turn off the RNG? In the python version of the game, can you find the variable for RNG and just make it a constant seed so that your model perfects 1 game first before it moves on to other games and becomes more generalized?
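Fixing the seed is usually simple if the Python game code exposes its RNG; here is a sketch with a hypothetical shop-roll function (the real variable names in the author's codebase will differ):

```python
import random

# Sketch of the fixed-seed idea: with the same seed, the "random" shop
# sequence is identical every game. The pet list and function are hypothetical.

def shop_rolls(seed, n=5):
    rng = random.Random(seed)          # seeded RNG instead of global randomness
    pets = ["ant", "cricket", "mosquito", "fish", "horse", "beaver"]
    return [rng.choice(pets) for _ in range(n)]

# Same seed -> identical rolls, so the agent can replay one exact game:
assert shop_rolls(42) == shop_rolls(42)
# Different seeds -> (usually) different games for later generalization:
print(shop_rolls(42))
print(shop_rolls(7))
```

One caveat with the "perfect one game first" plan: an agent trained on a single seed tends to memorize that exact shop sequence, and the memorized policy may transfer poorly once the seed is randomized again.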
could you add some links to the description?
While it’s cool that your AI can perform above average in half of the ten games, it’s still notable that the average score in the end was 4.7 wins, which is below the target goal of 5
Could muffins actually be good, if they shift losses to when you have a monkey or penguin?
So I've been thinking about this video for a few weeks now, and I was kinda wondering about how, when the AI lost a pokemon, it lost a significant amount of points, causing it to kinda go into shock, right? Well, it got me thinking: what happens if that goes the other direction? What if you changed the point system a bit? When a pokemon gains exp, the AI gains that many points too; gain zeny, same points; catch a pokemon, level x10 = points; lose zeny after losing a battle, lose points. The game already tells you how rewarding something should be, for the most part; why not just implement that?
when I go to the blog post it's just placeholders
i like the Brad Owen music
Very cool stuff, but freeze pet/food needs to be an action for the AI for it to have any hope of reaching 10 wins
I believe the inability to freeze pets is likley the reason the ai was unable to fully win a game as freezing pets is a valuable resource
More likely it was the fact it wasn't trying to fully win lol
That cursor has some pretty human movements....
I think you took on an impossible task (I think in total ignorance). This game requires forward thinking and planning, something missing as evidenced by a lack of freezing. It doesn't appear positioning is taken into account which is of monumental importance. Still an awesome video, and beyond impressive what you were able to accomplish.
A big issue with doing things like this is not having good "bots" to train on. If the AI only fights itself, it develops ways to counter itself, not to counter the average player...
Still nice video tho :)
I wonder how well GPT4 would do with a good prompt
Preference for cupcakes makes sense, because the AI is striving for 6 wins and the game is simpler in earlier rounds
cool stuff
How generous of you to spend 4 months on 30 minutes of our entertainment!
I think it would be better if the buy actions were "buy shop slot x to team slot y" instead of "buy pet x to team slot y" (assuming fewer possible actions makes it learn faster). I also think health is a very important input to pass; it could enable more risk-management strategies.
Wonder if it can win achievements.
So effectively it's just a repeated brute force of all outcomes, from which it builds data and then draws basic, quantified conclusions (e.g. camel good, camel is rated 95/100, I should pick camel)? If so, this is the type of thing that would be very interesting to apply to other turn-based games. Obviously it would be much slower/harder with more complex games, but I think it would be interesting to see in online card games.
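The mental model in this comment is close to Monte Carlo action-value estimation: play many noisy games, average the return per choice, and the averages become the "ratings." A toy version of that idea, where the pets and their hidden values are invented for illustration:

```python
import random
from collections import defaultdict

random.seed(0)

# Hidden "true" win contribution of each pick; the learner never sees these.
# These pets and numbers are made up purely to illustrate the averaging idea.
TRUE_VALUE = {"camel": 0.95, "fish": 0.60, "sloth": 0.10}

totals = defaultdict(float)
counts = defaultdict(int)

for _ in range(5000):  # play many noisy "games"
    pick = random.choice(list(TRUE_VALUE))
    reward = TRUE_VALUE[pick] + random.uniform(-0.3, 0.3)  # noisy outcome
    totals[pick] += reward
    counts[pick] += 1

# The quantified conclusion: average return per pick ("camel good, 95/100")
ratings = {pet: totals[pet] / counts[pet] for pet in TRUE_VALUE}
best = max(ratings, key=ratings.get)
```

With enough samples the noisy averages converge to the hidden values, which is why the conclusions look like simple per-pet scores even though no single game is conclusive. (The video's agent actually learns a value function over states rather than a literal table, but the averaging intuition is the same.)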
Here’s the major ethical question, did you teach it that it has to take sloth no matter what?
I'd say make winning 10 rounds per game the goal. It would HAVE to figure out the best late-game strats. Idk, maybe I missed where he said he did that, but to me it seems this one is gunning for 5 wins, which is why cupcakes look good.
What about freeze?
what is your website url?
How did you reward the bot? Did you reward it for 6 wins? Because humans play for full wins. That might explain the cupcakes.
Across every 10 games it only averages 4.7 wins. In my opinion that is a slightly bad ratio, but it is still very good for an AI.
Ya, I was waiting for AI to come mess up the game lol
Seems like an easy game for a computer to play
Nice job, hope it doesn't get too popular tho
I feel like the AI could have done even better if it weighted pets based on general competitive usage/ratings, starting with some initial values hand-set by humans and tweaking them as it learns.
Not sure that would help, since the game is more about combos than individual pets for the most part.
@@rehorizontbf Early deer, monkey, or other pets are almost always good takes even if there's no real synergy yet.
This is not the type of channel I will subscribe to, but it is so well done I can at least give you a like and comment.
Freezing?
Can the bot freeze?
Also, you forgot about killing an animal with a pill and freezing animals/food.
Fuck that white background lol. It basically blinded me so I had to turn on flux.
The average win rate is only 4.7 rounds per game; wouldn't this mean it's not better than the average person if the benchmark is 5 round wins? Or did you avoid averaging to exclude outliers in such a small sample size of ten games?
Great video tho, really enjoyed it
The benchmark was based on deep runs rather than average wins - sometimes your strategy just does not work. But what matters is how many high-win runs you can make (see: NorthernLion and his "believer" bets in Twitch Chat)
Also, I'd be surprised if the average human player gets even 4.7 wins per game on average. Maybe so, but I doubt they do significantly better from an average perspective. 😊
@@Graceclaw I think you're probably right that the average human player doesn't get that much but that's because many little kids and new players play the game. If you took the average of people above the age of say 15 with more than 5 hours in the game it would probably be closer to an average of 7 trophies per game. Personally, I think it's a very odd idea to have the AI aim for 5 trophies in a game considering a game isn't won unless you get 10. It would be like having a chess AI that is only good at openings, it wouldn't be beating many people. I'm sure it's probably due to a limit of the AI that I don't understand because I'm not knowledgeable in the subject whatsoever.
@@frogs3613 Like he mentioned in the video, any added complexity increases the training time exponentially. He probably didn't want to spend years on this.
5:28 I'd rather sell the fish and put the camel behind the elephant
0:04 LOOK AT ANDRE THERE LOL
What about positioning?
Didn't go super deep into it because the video was already getting long. I used a greedy search strategy to order the pets: I went through and manually gave "points" to desirable qualities. E.g. a mammoth at the front is good because it buffs all the pets behind it; that could be 5 points. Then the pets were placed one by one, each in the position that maximized the sum of all the points!
@@braintankdeeplearning1540 what do you mean by greedy search strategy?
@@super_7710 It's a term from computer science! What it boils down to is that the algorithm picks the option with the highest immediate benefit, without taking the future into account.
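The greedy placement described above can be sketched in a few lines. The specific pets and point values below are hypothetical stand-ins for the hand-set heuristics mentioned in the reply, not the video's actual numbers:

```python
# Hypothetical hand-set heuristics (slot 0 = front of the team).
def placement_points(pet, slot, team):
    """Points for putting `pet` in `slot`, given pets already placed in `team`."""
    pts = 0
    if pet == "mammoth" and slot == 0:
        pts += 5  # faints first and buffs everyone behind it
    if pet == "camel" and team.get(slot - 1) == "elephant":
        pts += 4  # elephant ahead pokes the camel, which scales when hurt
    if pet == "turtle" and slot == 0:
        pts += 3  # shields the pets behind it on faint
    return pts

def greedy_order(pets, n_slots=5):
    """Place pets one by one, each in the currently highest-scoring free slot."""
    team = {}
    for pet in pets:
        free = [s for s in range(n_slots) if s not in team]
        best_slot = max(free, key=lambda s: placement_points(pet, s, team))
        team[best_slot] = pet
    return [team.get(s) for s in range(n_slots)]
```

Note the greedy trade-off: because each pet commits to its best slot immediately, the final layout depends on placement order, and an early pick can block a later pet's ideal position. A full search over orderings would avoid that but costs far more.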
Comment for the algorithm
Now add freezing
How about freezing