AI Plays Trackmania - Map5 2:04:91

  • Premiered Jun 8, 2023
  • The AI is trained via reinforcement learning.
    Game: Trackmania Nations Forever (TMNF)
    Map: tmnf.exchange/trackshow/10460245
    Replay (.gbx file): drive.google.com/file/d/1hp1M...
  • Gaming

Comments • 66

  • @lordnoom4919 • 1 year ago +41

    Nice that it even figured out a Rammstein hit to start a drift. Good work right here

  • @wazthatme • 1 year ago +4

    This makes me so happy to see. I could watch AI learning to play games all day

  • @m.i.c.h.o • 1 year ago +7

    It could, in fact, neo slide.
    Writual.

  • @gugus8081 • 1 year ago +4

    This is impressive, I'm not even sure I can beat that RTA... Keep it up !

  • @exlpt2234 • 1 year ago +4

    This is insane, great work!

  • @Metcoler • 1 year ago

    Nice work! It must have taken a lot of effort. It is very impressive that the car can initiate a drift, drive very close to walls, and hit apexes like nothing. Big respect for this piece of work. Keep it up!

    • @linesight-rl • 1 year ago

      Thanks a lot! It has indeed been a lot of work, and we're still working on it! Next steps include training on more varied and less boring maps. We'll post progress videos along the way 🙂

  • @corbanizer7376 • 1 year ago

    Keep on going dude. This is sick

  • @eddyreising6567 • 1 year ago

    very impressive work!

  • @Arcsinx • 1 year ago +1

    Crazy ! Yosh should see this

  • @okty8372 • 1 year ago +4

    Is the AI able to generalize its "driving skills" to other maps? Amazing work btw! (I'm really interested in AI and love TM, so it's perfect content for me :) )

  • @heavysaur149 • 1 year ago +6

    I wonder what the inputs are. Is it based on field of vision (what it sees via the camera) or on coordinates (it already knows the whole map and can see its position on it)?
    And do you feed in what it output the frame before? (like, to know whether it continues its drift or not)
    And speed? Rotation of the car + where it goes (to know if it drifts)?
    I have so many questions

    • @linesight-rl • 1 year ago +11

      Inputs contain a screenshot of what is displayed by the game, the relative position of a few checkpoints on the centerline of the circuit in front of the car, the agent's previous action, the car's speed, and the direction of the gravity vector.
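The observation described in this reply can be sketched as a simple data structure. This is a hypothetical illustration only; the field names, shapes, and the number of checkpoints are assumptions, not taken from the Linesight code base.

```python
# Hypothetical sketch of the agent's observation; sizes are made up.
from dataclasses import dataclass
import numpy as np

@dataclass
class Observation:
    frame: np.ndarray        # screenshot of what the game displays
    checkpoints: np.ndarray  # (N, 3) relative positions of centerline points ahead
    prev_action: int         # index of the action taken on the previous step
    speed: float             # current car speed
    gravity: np.ndarray      # (3,) direction of the gravity vector

def flatten_scalars(obs: Observation) -> np.ndarray:
    """Concatenate the non-image inputs into one feature vector."""
    return np.concatenate([
        obs.checkpoints.ravel(),
        [obs.prev_action, obs.speed],
        obs.gravity,
    ])

obs = Observation(
    frame=np.zeros((128, 160), dtype=np.float32),
    checkpoints=np.zeros((16, 3), dtype=np.float32),
    prev_action=3,
    speed=412.0,
    gravity=np.array([0.0, -1.0, 0.0]),
)
features = flatten_scalars(obs)  # 16*3 + 2 + 3 = 53 scalar features
```

In a typical setup the image would go through a convolutional encoder while the flattened scalars are concatenated into a later layer.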

  • @livingroom5899 • 1 year ago

    Better than I will ever be.

  • @curcodes • 1 year ago

    really good work

    • @curcodes • 1 year ago

      I've got a challenge: next time, test your future AI on this and see if it can reach 2:04

  • @Gryffins90 • 1 year ago +1

    Excellent project that I always wanted to try myself. I've seen your response to the other comment asking about contributing. I'm also interested in helping (data scientist myself), so get in touch if you're willing to extend the team. I have a 2080 Ti available at home.
    One suggestion for a future video is to also show the keyboard inputs (just the 4 keys) in addition to the input tree, as that is more similar to how humans display their inputs.

  • @PassiveIZ • 1 year ago

    That's crazy after just 2700 runs and 30 hrs

  • @Sagosmurfen • 1 year ago

    Neo slide god!! 😮

  • @OPEK. • 1 year ago +1

    I’m interested to see how it handles random ramsteins and landing bugs tbh

    • @pinipilla • 1 year ago

      Those are not random; Trackmania physics are deterministic. The outcome just changes a lot with a small input change, which is not a problem for a machine

  • @Linck192 • 1 year ago +1

    Why did you make the AI output these groups of inputs instead of 4 values, one for each direction?

    • @linesight-rl • 1 year ago +1

      This is a requirement of DQN-like methods: each action is associated with a single value, and you pick the action with the highest value. DQN does not handle picking multiple actions at the same time.
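The constraint in this reply is why simultaneous key presses have to be enumerated up front as one discrete action set. A minimal sketch, assuming a hypothetical 12-action set (the real Linesight action set may differ):

```python
# DQN emits one Q-value per discrete action and picks the argmax, so
# key combinations must be enumerated as joint actions in advance.
from itertools import product
import numpy as np

# accelerate? x brake? x steer in {-1, 0, +1}  ->  2 * 2 * 3 = 12 joint actions
ACTIONS = [
    {"accelerate": a, "brake": b, "steer": s}
    for a, b, s in product([False, True], [False, True], [-1, 0, 1])
]

def select_action(q_values: np.ndarray) -> dict:
    """Greedy DQN selection: the key combination with the highest Q-value."""
    return ACTIONS[int(np.argmax(q_values))]

q = np.random.default_rng(0).normal(size=len(ACTIONS))
chosen = select_action(q)
```

With four independent binary outputs instead, the network could not rank whole combinations against each other, which is what value-based selection requires.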

  • @gaiekkurvanov1841 • 1 year ago +6

    Which algorithm is used ?

    • @linesight-rl • 1 year ago +5

      This is value-based reinforcement learning.
      We use a mixture of Implicit Quantile Networks with N-step returns and dueling networks. We also implemented Prioritized Experience Replay, Persistent Advantage Learning, noisy layers for exploration, and quantile options (QUOTA), but those bricks are currently not used.
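Of the building blocks listed in this reply, the N-step return is the simplest to sketch: rewards are accumulated over n steps before bootstrapping with the discounted value estimate. Purely illustrative; Linesight's actual implementation will differ.

```python
# Minimal N-step return, as used alongside IQN and dueling networks.
def n_step_target(rewards, bootstrap_value, gamma=0.99):
    """Return r_0 + gamma*r_1 + ... + gamma^(n-1)*r_{n-1} + gamma^n * V(s_{t+n})."""
    target = bootstrap_value * gamma ** len(rewards)
    for k, r in enumerate(rewards):
        target += (gamma ** k) * r
    return target

# 3-step target with per-step reward 1 and a bootstrap value of 10:
# 1 + 0.5 + 0.25 + 10 * 0.125 = 3.0
t = n_step_target([1.0, 1.0, 1.0], 10.0, gamma=0.5)
```

Larger n propagates reward information faster at the cost of higher-variance targets.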

  • @rFey • 1 year ago +1

    Idk if this would be possible, but I would love to see another angle to take ML/AI with Trackmania. Feed it thousands of TASes or WRs on a bunch of maps with lots of different turns, block combinations, drifts, whatnot, and then see if it can get good times on real maps. My layman brain sees this as way more complicated, so it probably is, but y'know, a man can dream

    • @linesight-rl • 1 year ago +4

      What you are describing is called "supervised learning", where an AI is fed expert information and tries to reproduce the behavior of that expert.
      In this video, we use another technique called "reinforcement learning", where the AI does not need to receive good runs; it is able to learn alone.
      Supervised learning is generally easier, but has the drawbacks that it requires huge amounts of replays and that it will never become better than the expert it tries to mimic.
      Reinforcement learning may be more difficult, but it can theoretically find strategies that were never shown to it.
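The contrast in this reply comes down to the training signal. A toy sketch of both, with made-up numbers and simplified losses (this is not either project's code):

```python
# Supervised (behavior cloning) fits the expert's action directly;
# value-based RL fits a reward-derived target instead.
import numpy as np

def behavior_cloning_loss(predicted_action_probs, expert_action):
    """Supervised: negative log-likelihood of the expert's chosen action."""
    return -np.log(predicted_action_probs[expert_action])

def q_learning_target(reward, next_q_values, gamma=0.99):
    """Reinforcement: reward plus discounted best next-state value."""
    return reward + gamma * np.max(next_q_values)

bc = behavior_cloning_loss(np.array([0.1, 0.7, 0.2]), expert_action=1)
tq = q_learning_target(reward=1.0, next_q_values=np.array([2.0, 5.0]), gamma=0.9)
```

Nothing in the RL target references an expert, which is why it can in principle exceed any demonstrator, while the cloning loss is minimized exactly by copying one.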

    • @rFey • 1 year ago

      @@linesight-rl My idea was to use the information from supervised learning on random maps the AI hasn't "seen", but then I realized that wouldn't work unless you could also feed it block information or make some wild machine-vision solution 🤔

    • @RadiantDarkBlaze • 1 year ago

      @@linesight-rl Is it possible to do something like starting a training run for a map as supervised learning, then switching the same training run to reinforcement once it reaches a certain fitness on the supervised part; so that it can surpass the player who provided the replays for that map for the supervised part as it goes about the reinforcement part?

    • @ryans3979 • 1 year ago +1

      @@RadiantDarkBlaze The idea you have does exist; it's typically called pre-training or sometimes bootstrapping. It's where you train a model with one method (so supervised learning could work), and then it has somewhat of a baseline behavior. In the case of supervised learning it might learn how to imitate some of the various tech that TASes use. Then, you can further train it using a different method to allow it to refine itself and improve past its current level.
      The issue with that strategy is that, like linesight mentioned, you'd have to feed it a massive amount of replays. It's likely you don't have thousands upon thousands of TAS runs for a single map, so you'd need to feed it random TAS runs of other maps. If you do that, you have to deal with negative transfer, where the tech and skills it learns from other maps might interfere; you don't want it trying to use glitches that are impossible or useless on a simple map like this. It's harder to make a generalized AI than a specific AI, and that's what you'd be doing with the supervised learning: a broader task than this AI, which is just running on a very simple map. It could work in theory, though; it's just more time-consuming and more computationally expensive to implement.
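The pre-train-then-fine-tune recipe described here can be sketched in miniature: fit a policy on expert (state, action) pairs first, then keep training the same weights with another method. Everything below is hypothetical; it is a tiny linear "policy" trained with one supervised cross-entropy step, not either project's code.

```python
# Supervised pre-training step on a toy linear policy (4 features -> 3 actions).
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 3))

def pretrain_step(weights, state, expert_action, lr=0.1):
    """One cross-entropy step that nudges the expert action's logit upward."""
    logits = state @ weights
    probs = np.exp(logits) / np.exp(logits).sum()
    grad = np.outer(state, probs)      # d(loss)/d(weights) for softmax CE...
    grad[:, expert_action] -= state    # ...minus the one-hot expert target
    return weights - lr * grad

state = rng.normal(size=4)
before = (state @ weights)[2]
for _ in range(20):
    weights = pretrain_step(weights, state, expert_action=2)
after = (state @ weights)[2]  # the expert action's logit has grown
```

After this phase, the same `weights` would be handed to an RL loop (e.g. a DQN-style update) to improve past the demonstrator, which is the "switch" the comment asks about.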

    • @RadiantDarkBlaze • 1 year ago

      @@ryans3979 Would something like taking a single good human replay, and putting it through the brute-forcer tool while saving every single tiny improvement to eventually gather 10k+ technically-unique replays work for generating a supervised learning set for a map? Or is there express reason 10k+ (human or TAS) replays of a specific map are needed? I do think it's necessary to only train a specific net on a single specific track, I was never thinking the idea could be used for making a generalized all-rounder net.

  • @pixelmalfunction1772 • 9 months ago

    Is your AI on the leaderboard the one with the sub-1:50? Because that would be impressive if it found the cut. If it didn't, then I beat the AI by 2 sec, but it probably did

  • @vjproject • 10 months ago

    Why aren't the tire marks fully visible? Modded or low quality 😅

  • @Stunde0Null0 • 1 year ago +4

    Wirtual taking a L. kekw

  • @jorishenger1240 • 1 year ago +5

    What if you let this AI loose on E02

    • @linesight-rl • 1 year ago +1

      I guess we'll have to try :)

    • @jorishenger1240 • 1 year ago +1

      @@linesight-rl would be amazing to see, would the small jumps be a problem?

    • @linesight-rl • 1 year ago +2

      @@jorishenger1240 We're currently testing on more complex maps. Neither jumps, slopes, nor borderless roads seem to be a problem.

    • @jorishenger1240 • 1 year ago +2

      @@linesight-rl amazing to see that tech has come so far that this is done by a person, not even a company or smth. So cool

  • @11DowningStreet • 10 months ago

    how does this work? it looks really cool

  • @barakeel • 1 year ago +1

    What was the reward when it was not able to finish the track yet?

    • @linesight-rl • 1 year ago +2

      Simple question, simple answer: nothing. Neither a reward nor a punishment.
      This will likely trigger the question "what's the reward then?". It's mostly progress along the track.
      I think we'll start to add voice-overs or some explanations in the next videos, look out for them :)
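A "progress along the track" reward can be sketched with a toy centerline: the per-step reward is how much arc-length distance the car gained. This is an assumption-laden illustration (coarse nearest-vertex projection, made-up coordinates), not the Linesight reward function.

```python
# Toy progress reward: distance advanced along a polyline centerline.
import numpy as np

# A centerline of 3 straight 10-unit segments (made-up map geometry).
CENTERLINE = np.array([[0.0, 0.0], [10.0, 0.0], [20.0, 0.0], [20.0, 10.0]])

def progress(position):
    """Arc length covered, using the nearest centerline vertex as a coarse proxy."""
    i = int(np.argmin(np.linalg.norm(CENTERLINE - position, axis=1)))
    seg_lengths = np.linalg.norm(np.diff(CENTERLINE, axis=0), axis=1)
    return seg_lengths[:i].sum()

def reward(prev_pos, pos):
    """Reward for one step: progress gained between two positions."""
    return progress(pos) - progress(prev_pos)

# Moving from near the start to past the first vertex yields positive reward.
r = reward(np.array([1.0, 0.0]), np.array([11.0, 0.0]))
```

With this shaping, a car that never reaches the finish still gets a dense learning signal, which is consistent with the reply above saying the finish itself carries no special reward or punishment.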

  • @ibozz9187 • 1 year ago +2

    Are those neoslides or normal drifts?

    • @lordnoom4919 • 1 year ago

      looks to me like most are release drifts

    • @fontur5119 • 1 year ago

      most of them are neoslides

    • @pekatour • 1 year ago

      ​@@lordnoom4919 Aka neo drift

    • @lordnoom4919 • 1 year ago

      @@pekatour Nope, you don't need to release during a neo, since neo = steering --> stop steering --> start braking --> steer again, all while holding down acceleration

    • @pekatour • 1 year ago

      @@lordnoom4919 mb

  • @xtraz9814 • 1 year ago

    Hello people from Wirtual videos

  • @lucacu3587 • 1 year ago +2

    any wirtual vid watchers here???

  • @ArrakisMusicOfficial • 1 year ago

    What GPU? :)

    • @linesight-rl • 1 year ago +1

      Nvidia 3060

    • @ArrakisMusicOfficial • 1 year ago +1

      @@linesight-rl How did you manage to get it to learn so quickly? 2900 runs is a ridiculously low amount for how good it got. You must have used very good priors; how did you do it? Careful reward modelling? Or a really good initial policy? Or a really good exploration policy? What RL method did you use? :)

  • @201pulse • 1 year ago

    Hi linesight, I'm an experienced data scientist and I would be interested in helping and contributing to this project. At some point I actually wanted to do the same, so I might have some cool ideas. Are you interested?

    • @linesight-rl • 1 year ago

      Hi, thank you for your interest. While it is always helpful to have another person's perspective, this is a rapidly evolving 2-person project. At least in the short term, we prefer to keep it small.
      We will probably have a more open approach in the future and welcome contributions. You're welcome to ask again in a few videos' time!

    • @linesight-rl • 1 year ago

      How should we contact you when we are more open to contributions?

  • @zillion8954 • 1 year ago

    now train it on an actual map

  • @Queen_Elizabeth249 • 1 year ago +1

    I wonder if KarjeN could defeat this AI

    • @user-go5ee4cs3c • 1 year ago +1

      At first I thought a human would be faster, but the length of the map...

    • @Queen_Elizabeth249 • 1 year ago

      @@user-go5ee4cs3c true

    • @mk-ej3cz • 1 year ago +1

      For sure he could

    • @hayabusa10055 • 1 year ago

      @@mk-ej3cz As of now, yes, easily, but there's no telling how far it can be pushed; maybe even to the point that the AI makes TAS runs itself without your help

  • @ozzehh • 1 year ago

    wirtual sucks compared