Large Language Models from scratch
- Uploaded 24 Jul 2024
- How do language models like ChatGPT and PaLM work? A short cartoon that explains transformers and the tech behind LLMs.
Part 2: • Large Language Models:...
0:05 - autocomplete
0:44 - search query completion
1:03 - language modeling with probabilities
1:59 - time series and graphs
2:34 - text generation
3:43 - conditional probabilities
3:52 - trigrams
4:49 - universal function approximation
5:19 - neural networks
6:33 - gradient descent
7:03 - backpropagation
7:24 - network capacity
This is seriously *really* good, I've not seen someone introduce high level concepts by-example so clearly (and nonchalantly!)
What have they done? amazing stuff
Agree. I have it on one of my playlists now.
This is so good. I can't believe it has so few views.
Same, brilliant explanation of NNs
Was just about to write the same.
if you really think so, post the link to this video on your social media.
So few views... If a Kardashian posts a brain fart it gets more views from the unwashed masses. That is the sad reality.
Very few people study it
This is an excellent articulation. We need part 3, 4, and 5
Finally! Someone who knows how to explain complexity with simplicity.
Being able to visualize this so simply is legendary. You're doing amazing work. Subbed
You have made it so easy to see and understand - it puts into place all the complicated explanations that exist out there on the net.
this is excellently done, I'm very grateful for you putting this together.
Wow. This is so well presented. And a different take that gets to the real intuition.
These visuals were SO HELPFUL in introducing and understanding some foundational ML concepts.
Great video. "energy function" instead of error function, but a great explanation of gradient descent and backprop in a super short time. Excellent job!
I really liked your explanation of how "training a network" is performed. Made it a lot easier to understand
This was awesome. I don't think I could adequately explain how this all works yet, but it fills in so many gaps. Thank you for this video!
Awesome video! I really appreciated your explanation and representation of neural networks and how the number of nodes and weights affect the accuracy.
That might be the best, most concise and impactful neural network introduction I have seen to date
i generally don't subscribe to any channels but this one deserves one. This takes a lot of understanding and love for the subject to do these kind of videos. thank you very much
possibly the best explanation of LLM i've ever seen. accurate, pointed and concise
This is the best explanation of Large Language Models. I hope your channel gets more subscribers!
Brilliant! A true example of intelligence and simplicity in an explanation! Thanks a lot.
Stunning video of absolutely high and underrated quality !!!!
Thanks so much, for this !
Great video. The example of the network with too few curve functions to recreate the graph really helped me understand how more or fewer nodes affects the accuracy of the result.
I've been watching a lot of videos on LLMs and the underlying mathematics. This explanation is PHENOMENAL. Not dumbed down, not too long, and uses concepts of existing maths and graphing that cement the concept perfectly.
Straight away subscribed .... i would really love these videos in my feed daily.❤
Very clear and concise explanation! Excellent work!
Best and simplest explanation I have ever come across. Thank you sir
you sir, deserve my subscription. This was so good.
Holy shit. This is one of the best YouTube videos I've seen all year so far. Bravo 👏👏👏
if there is an Oscar for best tutorial on the internet, this video deserves it !
Really great explanation of LLM! Just earned a subscriber and I'm looking forward to more of your videos :)
This is an insanely good explanation. Subscribed.
You are so good at explaining it! Please keep doing it.
such clean and lucid explanation. amazing
Amazingly insightful. Fantastically well explained. Thanks !
Thanks Steve, this explanation is just... Brilliant! 😊
Very nice illustration and fantastic explanation. Thanks
nice concise video explaining what is a large language model
Excellent. Some of the best work I've seen. Thanks.
The best content I have ever seen on the subject. Super dense and easy.
Fascinating and such wonderful explanation. Thank you very much!
This is insanely good. I've understood things in 8 minutes that I could not understand after entire classes
Agree with the other comments, so clear and easy to understand. I wish all teaching material was this good...
Unbelievably good video. Great work.
I had not considered exactly how words relate to each other in automated texts, and this video explained that concept in a really clear and concise way.
This is awesome. Very good Illustrations.
I have been working on ways to explain LLMs to people in the humanities for the past year. You've done it in 5 brilliant minutes. From now on, I'm just going to hand out this URL.
I loved this. Clarity = real understanding= respect for the curiosity and intelligence of the audience.
Requests: Would like more depth about "back propagation", and on to why so many "layers" and so on...!!!!
Probably one of the best explanations I've come across. :)
What an amazing lecture on LLMs! Loved the example Markov chain model with the Bob Dylan lyrics; that was actually a fun homework exercise in one of my grad school courses. This really helped me understand neural networks, which are so much more complex.
Incredibly well explained! Thanks a lot!
This is a very good video! Excellent explanation on Large Language Models!
Very well explained. Thank you for the video!
Wait a minute, all day I've been trying to understand what neural networks are, and you've explained every part so easily, wow 😮 it obviously 🙄 means I've struggled with all of these terms until now, but I've finally found a good explanation of backpropagation, gradient descent, error functions and such 🎉🎉🎉🎉
Thanks, what a video! In 8 minutes I have learnt so much, and it's very well explained with graphics indeed.
What a fantastic tutorial! Thank you! Liked and subscribed!
Wow, this video was really informative and fascinating! It's incredible to think about how much goes into building and training a language model. I never realized that language modeling involved so much more than just counting frequencies of words and sentences. The explanation of how neural networks can be used as universal approximators was particularly interesting, and it's amazing to think about the potential applications of such models, like generating poetry or even writing computer code. I can't wait for part two of this video!
Thank you so much! Very well and simply explained!
Really great description 👌
Even though I knew all this stuff, it is still nice to watch and listen to a good explanation of these fundamental ML concepts.
You are a genius, thank you for this amazing video!
Clean and clear explanation
Clearly explained! I will use it.
The content is gem. Thank you for this.
how is it possible that i’ve watched a ton of videos trying to understand LLMs from the likes of universities and big tech companies yet this simple video in comic sans explains everything in the most direct and concise manner possible !?
Great way to explain a complex idea ⚡️
Incredible. Thank you
Wow, what a fantastic explanation!
Simply amazing, so intuitive..omg subscribed
This is the best explanation of LLMs I've seen
Great explanation of an advanced topic
Easy to understand explanation of large language models 👍
Brilliantly explained !
Absolutely brilliant..great examples
This is so good
I’m inspired to go back and learn Fourier and Taylor series
Wow .. what an explanation sir ❤
Thank you 🙏
fantastic video, thank you!!!
Very nicely explained!!! It would be great if you could do a video on creating GPT (especially after witnessing the human-like speech of chatGPT)
Wow. This is incredible!!
Simple and clear, kudos!
This was incredibly good!
Amazing Video!
This is fantastic. Thank you for sharing.
wonderful, thank you so much for sharing
Seems really really cool
Great explanation. Thank you very much
This was actually amazing
Really well explained!!
Very nicely done
Simply superb explanation
This video is a must watch
very well explained!
Great ! Onto part 2 😃
This is awesome! I think this is one of the best videos I've found so far on machine learning!
This is really good.
This is uncut gold.
Great video 👍!
This reminds me of Markov chains (MC). I read in some probability book a long time ago that MCs had been used to calculate the probability of the next letter in a (randomly chosen) word.
It is exactly a Markov matrix, also known as a stochastic matrix: all the rows are probability vectors (nonnegative real numbers that sum to 1), like the one used to define a Markov chain. If the next word depends only on the previous word, it's a 1-step chain (the usual kind); if it depends on the previous k words, it's a k-step Markov chain, which can be re-coded as a 1-step chain by replacing the alphabet of symbols (words) with k-tuples of symbols. In fact, Markov himself used this model to analyze text, applying it first to single letters and then to letter pairs in Eugene Onegin (a 2-step version). I found that out from this great talk by a UC Berkeley CS prof, author of the basic textbook in the field, and also of the "pause" letter:
czcams.com/video/T0kALyOOZu0/video.html&ab_channel=GoodTimesBadTimes
ChatGPT is, loosely speaking, a 32,000-step version, but it has to be trained stochastically; using actual counted frequencies at that scale would be far too much computation...
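The word-level chain described in this thread can be sketched in a few lines of Python. This is a toy illustration, not code from the video; the corpus is a made-up stand-in for the lyrics used there:

```python
import random
from collections import defaultdict

# Hypothetical toy corpus standing in for real lyrics.
corpus = ("the cat sat on the mat and the cat saw "
          "the dog and the dog sat on the log").split()

# Build a 2-step (trigram) chain: each pair of consecutive words maps
# to the list of words observed to follow that pair. Relative counts
# in each list play the role of transition probabilities.
chain = defaultdict(list)
for a, b, c in zip(corpus, corpus[1:], corpus[2:]):
    chain[(a, b)].append(c)

def generate(seed, n=10):
    """Walk the chain from a seed pair, sampling each next word."""
    rng = random.Random(0)  # fixed seed so runs are reproducible
    a, b = seed
    out = [a, b]
    for _ in range(n):
        followers = chain.get((a, b))
        if not followers:   # dead end: this pair never recurs in the corpus
            break
        a, b = b, rng.choice(followers)
        out.append(b)
    return " ".join(out)

print(generate(("the", "cat")))
```

Re-coding the 2-step chain as a 1-step one is exactly what the `(a, b)` tuple keys do: the "state" is a pair of words rather than a single word.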
this is gold, thanksss sir
Fantastic. Please teach more
You are a legend.
Great video! A lot covered super concisely. There is one minor issue I noticed though at 5:40 which you're probably aware of. Usually a bias term b is added inside the activation functions to get S(wx+b). Without this bias you severely limit the capacity/expressiveness of the network. For example, if we take S to be the ReLU function (0 if x < 0, x otherwise), then without a bias every unit's breakpoint is pinned at x = 0, so the network cannot place its kinks anywhere else.
YES -- you are very observant 🙂 The video glosses over the bias parameter.
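The effect of the bias term is easy to see numerically. A minimal sketch in pure Python (not from the video): with no bias, a ReLU unit's kink sits at x = 0; with bias b, it moves to x = -b/w.

```python
def relu(x):
    # ReLU activation: 0 for negative inputs, identity otherwise.
    return max(0.0, x)

xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
w, b = 1.0, 1.0

no_bias   = [relu(w * x) for x in xs]      # kink pinned at x = 0
with_bias = [relu(w * x + b) for x in xs]  # kink shifted to x = -b/w = -1

print(no_bias)    # [0.0, 0.0, 0.0, 1.0, 2.0]
print(with_bias)  # [0.0, 0.0, 1.0, 2.0, 3.0]
```

Summing many such shifted units is what lets the network place bumps anywhere along the input axis, which is the heart of the universal-approximation picture in the video.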
Outstanding!
Super clear - need to circulate this around my teams adjacent to the scientists at work
Thanks for showing what a neural network function looks like