Monte Carlo in Reinforcement Learning

  • Published 27. 08. 2024
  • Let's talk about how Monte Carlo methods can be used in reinforcement learning
    RESOURCES
    [1] Other Monte Carlo Video: • Running Simulations as...
    PLAYLISTS FROM MY CHANNEL
    ⭕ Reinforcement Learning: • Reinforcement Learning...
    Natural Language Processing: • Natural Language Proce...
    ⭕ Transformers from Scratch: • Natural Language Proce...
    ⭕ ChatGPT Playlist: • ChatGPT
    ⭕ Convolutional Neural Networks: • Convolution Neural Net...
    ⭕ The Math You Should Know: • The Math You Should Know
    ⭕ Probability Theory for Machine Learning: • Probability Theory for...
    ⭕ Coding Machine Learning: • Code Machine Learning
    MATH COURSES (7 day free trial)
    📕 Mathematics for Machine Learning: imp.i384100.ne...
    📕 Calculus: imp.i384100.ne...
    📕 Statistics for Data Science: imp.i384100.ne...
    📕 Bayesian Statistics: imp.i384100.ne...
    📕 Linear Algebra: imp.i384100.ne...
    📕 Probability: imp.i384100.ne...
    OTHER RELATED COURSES (7 day free trial)
    📕 ⭐ Deep Learning Specialization: imp.i384100.ne...
    📕 Python for Everybody: imp.i384100.ne...
    📕 MLOps Course: imp.i384100.ne...
    📕 Natural Language Processing (NLP): imp.i384100.ne...
    📕 Machine Learning in Production: imp.i384100.ne...
    📕 Data Science Specialization: imp.i384100.ne...
    📕 Tensorflow: imp.i384100.ne...

Comments • 24

  • @reginakim9265
    @reginakim9265 29 days ago

    Thanks for your intuitive explanation of Monte Carlo! It was so helpful for me to get the concept.

  • @Akshaylive
    @Akshaylive 9 months ago +3

    One important reason to use MC methods is in cases where we do not have access to the Markov decision process (MDP). The example in this video does have a known MDP, so it can be solved using the Bellman equations as well.
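
    A minimal sketch of the contrast described above, on a hypothetical two-state MDP with a fixed policy (all numbers are illustrative, not from the video): with a known transition matrix the Bellman equation v = R + γPv can be solved exactly, while Monte Carlo only needs the ability to sample episodes.

    ```python
    import numpy as np

    # Hypothetical 2-state MDP under a fixed policy (illustrative numbers).
    P = np.array([[0.9, 0.1],    # P[s, s']: transition probabilities
                  [0.2, 0.8]])
    R = np.array([1.0, -1.0])    # expected reward in each state
    gamma = 0.9

    # Known MDP: solve the Bellman equation v = R + gamma * P @ v exactly.
    v_bellman = np.linalg.solve(np.eye(2) - gamma * P, R)

    # Unknown MDP: Monte Carlo estimate of v(s=0) from sampled episodes.
    rng = np.random.default_rng(0)
    returns = []
    for _ in range(2000):
        s, g, discount = 0, 0.0, 1.0
        for _ in range(200):               # truncate; gamma**200 is negligible
            g += discount * R[s]
            discount *= gamma
            s = rng.choice(2, p=P[s])
        returns.append(g)

    print(v_bellman[0], np.mean(returns))  # the two values should roughly agree
    ```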

  • @devinbrown9925
    @devinbrown9925 4 months ago +3

    For Quiz Time 1 at 3:47, shouldn't the answer be B: 0.5 sq units?
    I think the entire premise is that you know the area of one region, you know the ratio of balls dropped in the two regions, and the ratio of balls dropped equals the ratio of areas. Therefore you can use this information to determine the unknown area.

  • @syeshwanth6790
    @syeshwanth6790 8 months ago +10

    0.5 sq units.
    The area of the square = 1 * 1 = 1 sq unit.
    Half of the balls dropped fell into the diamond, which means the diamond occupies half the area of the square (area of diamond = (1/2) * 1 sq unit = 0.5 sq units). A runnable sketch of this estimate follows this thread.

    • @CodeEmporium
      @CodeEmporium 8 months ago +1

      Ding ding ding. That’s correct :)

    • @syeshwanth6790
      @syeshwanth6790 8 months ago +1

      @@CodeEmporium
      Question 2)
      B. Frank was updating Q-values based on observed rewards from simulation.
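
    For anyone who wants to check the Quiz 1 answer numerically, here is a minimal sketch of the ball-dropping estimate in Python (the exact diamond geometry, with vertices at the midpoints of the square's sides, is my assumption about the video's figure):

    ```python
    import random

    # Drop "balls" uniformly in the unit square and count how many land in
    # the diamond |x - 0.5| + |y - 0.5| <= 0.5 (vertices at edge midpoints).
    n, inside = 100_000, 0
    for _ in range(n):
        x, y = random.random(), random.random()
        if abs(x - 0.5) + abs(y - 0.5) <= 0.5:
            inside += 1

    # The fraction inside, times the square's area (1 sq unit), estimates
    # the diamond's area.
    print(inside / n)  # ~0.5
    ```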

  • @syeshwanth6790
    @syeshwanth6790 8 months ago

    Loved the way the decision-making of a robot using a Q-table was explained in this video.

    • @CodeEmporium
      @CodeEmporium 8 months ago

      Glad the explanation was good. Thanks for the comment :)

  • @AakashKumarDhal
    @AakashKumarDhal 4 months ago +2

    Answer for Quiz 2: Option B, Frank was updating Q-values based on observed rewards from simulated episodes.
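
    A rough sketch of what such an update can look like (my own illustrative code using every-visit Monte Carlo with an incremental mean, not code from the video):

    ```python
    from collections import defaultdict

    gamma = 0.9
    Q = defaultdict(float)   # Q[(state, action)] -> running mean of returns
    N = defaultdict(int)     # visit counts

    def update_from_episode(episode):
        """episode: a list of (state, action, reward) from one simulated run."""
        g = 0.0
        # Walk backwards so g accumulates the discounted return from each step.
        for state, action, reward in reversed(episode):
            g = reward + gamma * g
            N[(state, action)] += 1
            # Incremental mean: Q <- Q + (G - Q) / N
            Q[(state, action)] += (g - Q[(state, action)]) / N[(state, action)]

    # Hypothetical episode: two steps ending in a reward of +1.
    update_from_episode([("s1", "right", 0.0), ("s2", "down", 1.0)])
    print(dict(Q))  # {('s2', 'down'): 1.0, ('s1', 'right'): 0.9}
    ```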

  • @NG-ec2th
    @NG-ec2th 6 months ago +6

    In S1 (8:08) the greedy action is to go up, actually...

    • @AshmaBhagad
      @AshmaBhagad 3 months ago

      There is no grid cell to go up to from s1. Where it starts, there are only two options: right and down.

    • @NG-ec2th
      @NG-ec2th 3 months ago +1

      @@AshmaBhagad Then why is there a payoff value for up?

  • @0xabaki
    @0xabaki 6 months ago +2

    I would use Monte Carlo to predict whether there will be food at the office tomorrow, because it's so unpredictable when I have to bring in food lol

  • @florianneugebauer3042
    @florianneugebauer3042 5 months ago +1

    Where does the number of states come from? Where is state 17?

  • @WeeHooTM
    @WeeHooTM 4 months ago +3

    8:09 I stopped watching when he said 1.5 is greater than 2.1 lmao

    • @AshmaBhagad
      @AshmaBhagad 3 months ago +2

      In state s1 the agent didn't actually have the option to go up. Maybe that's why the 2.1 doesn't matter: the agent can only select the best action among those available in the state it is in (see the sketch below).
      At the start he clearly said that the environment has 9 grid cells (states).
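
    A small sketch of the point made in this reply (the Q-values are hypothetical): greedy selection only compares actions that are legal in the current state, so a larger value listed for an unavailable action like "up" is simply ignored.

    ```python
    # Hypothetical Q-values for state s1; "up" is blocked by the grid boundary.
    q_s1 = {"up": 2.1, "right": 1.5, "down": 0.3}
    legal_actions = ["right", "down"]   # "up" is not available in s1

    # Greedy choice restricted to legal actions ignores the 2.1 for "up".
    greedy = max(legal_actions, key=lambda a: q_s1[a])
    print(greedy)  # right (Q = 1.5)
    ```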

  • @BizillionAtoms
    @BizillionAtoms 9 months ago +3

    I think you should include the answers to the quizzes in the video at some point. Also, at 8:00 you said the highest is 1.5, but it is 2.1.
    Most importantly, I think these moments with Frank were cringe, and they distracted me from focusing. The target audience is most likely not kids (at least I think so), so they would consider it cringe too. No offense

    • @refraindrerondo
      @refraindrerondo 8 months ago +3

      Found it funny; I'm not a kid, but it helped me concentrate more 😂

    • @Falcon8856
      @Falcon8856 8 months ago +5

      Didn't find it funny, am a kid, but appreciate the light humor and effort put into these videos. Didn't really distract me.

    • @servicer6969
      @servicer6969 3 months ago +3

      Stop being such a hater. The reason 1.5 is the highest is that the action with assumed reward 2.1 is illegal in state 1 (you can't move up because of the wall).
      P.S. Using the word cringe is cringe.

  • @chinmaibarai1750
    @chinmaibarai1750 9 months ago

    Are you from Bharat? 🇮🇳

  • @ayoubelmhamdi7920
    @ayoubelmhamdi7920 9 months ago +1

    this is the difficult way to teach Monte Carlo 😂

    • @swphsil3675
      @swphsil3675 9 months ago +1

      Difficult for absolute beginners, I think; otherwise the video was easy for me to follow.

    • @ayoubelmhamdi7920
      @ayoubelmhamdi7920 9 months ago

      @@swphsil3675 Do you think Monte Carlo should be taught starting from how randomness is described by probability: a coin has two possible outcomes, no one knows whether they will win a single toss, but play many times and each outcome comes up about 50% of the time?