Large Language Models from scratch

  • Published 24 Jul 2024
  • How do language models like ChatGPT and PaLM work? A short cartoon that explains transformers and the tech behind LLMs.
    Part 2: • Large Language Models:...
    0:05 - autocomplete
    0:44 - search query completion
    1:03 - language modeling with probabilities
    1:59 - time series and graphs
    2:34 - text generation
    3:43 - conditional probabilities
    3:52 - trigrams
    4:49 - universal function approximation
    5:19 - neural networks
    6:33 - gradient descent
    7:03 - back propagation
    7:24 - network capacity
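
The trigram idea at 3:52 can be sketched in a few lines of Python (a toy illustration, not the video's code; the tiny corpus below is made up): count how often each word follows each two-word context, then sample continuations from those counts.

```python
import random
from collections import defaultdict, Counter

def train_trigrams(text):
    """Count how often each word follows a two-word context."""
    words = text.split()
    counts = defaultdict(Counter)
    for a, b, c in zip(words, words[1:], words[2:]):
        counts[(a, b)][c] += 1
    return counts

def generate(counts, a, b, n=10):
    """Extend (a, b) word by word, sampling from the trigram counts."""
    out = [a, b]
    for _ in range(n):
        nxt = counts.get((a, b))
        if not nxt:
            break  # unseen context: stop generating
        # sample proportionally to observed frequency
        word = random.choices(list(nxt), weights=list(nxt.values()))[0]
        out.append(word)
        a, b = b, word
    return " ".join(out)

corpus = "the answer is blowin in the wind the answer my friend is blowin in the wind"
model = train_trigrams(corpus)
print(generate(model, "the", "answer"))
```

Scaling the same counting idea to long contexts is exactly what breaks down (most long word sequences are never seen), which is where the video's neural-network material picks up.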

Comments • 221

  • @triton62674 · 1 year ago +246

    This is seriously *really* good, I've not seen someone introduce high level concepts by-example so clearly (and nonchalantly!)

  • @somethingness · 1 year ago +313

    This is so good. I can't believe it has so few views.

    • @adarshraj1467 · 1 year ago +6

      Same, brilliant explanation of NNs

    • @facqns · 1 year ago +1

      Was just about to write the same.

    • @webgpu · 1 year ago +1

      if you really think so, post the link to this video on your social media.

    • @itsm3th3b33 · 1 year ago

      So few views... If a Kardashian posts a brain fart it gets more views from the unwashed masses. That is the sad reality.

    • @phil5053 · 1 year ago +2

      Very few study about it

  • @ivocamilleri8913 · 1 year ago +70

    This is an excellent articulation. We need part 3, 4, and 5

  • @solotron7390 · 3 months ago +5

    Finally! Someone who knows how to explain complexity with simplicity.

  • @JessieJussMessy · 1 year ago +9

    Being able to visualize this so simply is legendary. You're doing amazing work. Subbed

  • @marktahu2932 · 1 year ago +1

    You have made it so easy to see and understand - it puts into place all the complicated explanations that exist out there on the net.

  • @saqhorov · 1 year ago +26

    this is excellently done, I'm very grateful for you putting this together.

  • @user-wr4yl7tx3w · 1 year ago +6

    Wow. This is so well presented. And a different take that gets to the real intuition.

  • @SethWieder · 1 year ago +7

    These visuals were SO HELPFUL in introducing and understanding some foundational ML concepts.

  • @noahnazareth8248 · 1 year ago +16

    Great video. "energy function" instead of error function, but a great explanation of gradient descent and backprop in a super short time. Excellent job!

  • @ericsun1990 · 1 year ago +5

    I really liked your explanation of how "training a network" is performed. Made it a lot easier to understand

  • @AncientSlugThrower · 1 year ago +1

    This was awesome. I don't think I could adequately explain how this all works yet, but it fills in so many gaps. Thank you for this video!

  • @BenKordick · 1 year ago +1

    Awesome video! I really appreciated your explanation and representation of neural networks and how the number of nodes and weights affect the accuracy.

  • @GregMatoga · 1 year ago +23

    That might be the best, most concise and impactful neural network introduction I have seen to date

  • @narasimhasriharshakanduri3325

    I generally don't subscribe to any channels, but this one deserves it. It takes a lot of understanding and love for the subject to make videos like these. Thank you very much

  • @liamcawley6440 · 1 year ago +1

    possibly the best explanation of LLM i've ever seen. accurate, pointed and concise

  • @SaidakbarP · 1 year ago +3

    This is the best explanation of Large Language Models. I hope your channel gets more subscribers!

  • @lfmtube · 1 year ago +9

    Brilliant! A true example of intelligence and simplicity in explanation! Thanks a lot.

  • @laStar972chuck · 1 year ago +1

    Stunning video of absolutely high and underrated quality !!!!
    Thanks so much, for this !

  • @darinmcbride4003 · 1 year ago

    Great video. The example of the network with too few curve functions to recreate the graph really helped me understand how more or fewer nodes affects the accuracy of the result.

  • @stratfanstl · 7 months ago +2

    I've been watching a lot of videos on LLMs and the underlying mathematics. This explanation is PHENOMENAL. Not dumbed down, not too long, and uses concepts of existing maths and graphing that cement the concept perfectly.

  • @saurabhmaurya6380 · 10 months ago +1

    Straight away subscribed... I would really love these videos in my feed daily.❤

  • @gigabytechanz9646 · 1 year ago +1

    Very clear and concise explanation! Excellent work!

  • @coderversion1 · 1 year ago +3

    Best and simplest explanation I have ever come across. Thank you sir

  • @funnycompilations8314 · 1 year ago +7

    you sir, deserve my subscription. This was so good.

  • @KalebPeters99 · 1 year ago +1

    Holy shit. This is one of the best YouTube videos I've seen all year so far. Bravo 👏👏👏

  • @TheAkiller101 · 1 year ago +5

    if there is an Oscar for best tutorial on the internet, this video deserves it !

  • @BeSlaying · 1 year ago +1

    Really great explanation of LLM! Just earned a subscriber and I'm looking forward to more of your videos :)

  • @looppp · 1 year ago +2

    This is an insanely good explanation. Subscribed.

  • @govinddwivedi582 · 1 year ago +1

    You are so good at explaining it! Please keep doing it.

  • @wokeclub1844 · 1 year ago +1

    such clean and lucid explanation. amazing

  • @DimitriMissentos · 1 year ago +1

    Amazingly insightful. Fantastically well explained. Thanks !

  • @pierrickguillard-prevert4213

    Thanks Steve, this explanation is just... Brillant! 😊

  • @ravinatarajan4894 · 1 year ago +2

    Very nice illustration and fantastic explanation. Thanks

  • @jankshtt · 1 year ago +2

    nice concise video explaining what is a large language model

  • @marlinhowley9858 · 1 year ago +1

    Excellent. Some of the best work I've seen. Thanks.

  • @lipefml7200 · 1 year ago +1

    The best content I've ever seen on the subject. Super dense and easy.

  • @sudhindrakopalle7071 · 1 year ago +1

    Fascinating and such wonderful explanation. Thank you very much!

  • @AVV-A · 1 year ago +4

    This is insanely good. I've understood things in 8 minutes that I could not understand after entire classes

  • @erethir · 1 year ago +1

    Agree with the other comments, so clear and easy to understand. I wish all teaching material was this good...

  • @mirabirhossain1842 · 1 year ago +1

    Unbelievably good video. Great work.

  • @samwynn3183 · 1 year ago

    I had not considered exactly how words relate to each other in automated texts, and this video explained that concept in a really clear and concise way.

  • @zhenli8674 · 1 year ago +1

    This is awesome. Very good Illustrations.

  • @JohnLaudun · 2 months ago +1

    I have been working on ways to explain LLMs to people in the humanities for the past year. You've done it in 5 brilliant minutes. From now on, I'm just going to hand out this URL.

  • @lopezb · 1 year ago +1

    I loved this. Clarity = real understanding= respect for the curiosity and intelligence of the audience.
    Requests: Would like more depth about "back propagation", and on to why so many "layers" and so on...!!!!

  • @WoonCherkLam · 1 year ago +1

    Probably one of the best explanations I've come across. :)

  • @zeeg404 · 1 year ago +8

    What an amazing lecture on LLMs! Loved the example Markov chain model with the Bob Dylan lyrics; that was actually a fun homework exercise in one of my grad school courses. This really helped me understand neural networks, which are so much more complex.

  • @A338800 · 1 year ago +1

    Incredibly well explained! Thanks a lot!

  • @stanleytrevino6735 · 1 year ago

    This is a very good video! Excellent explanation on Large Language Models!

  • @LittleShiro0 · 1 year ago +1

    Very well explained. Thank you for the video!

  • @Luxcium · 1 year ago +3

    Wait a minute... all day I've tried to understand what neural networks are, and you've explained all the parts so easily, wow 😮. It obviously 🙄 implies I've struggled to learn all these terms so far, but I've finally found a good explanation of back-propagation, gradient descent, error functions and such 🎉🎉🎉🎉

  • @junglemandude · 3 months ago +2

    Thanks, what a video! In 8 minutes I have learnt so much, and it's very well explained with graphics indeed.

  • @Bhuvana2020 · 1 year ago +1

    What a fantastic tutorial! Thank you! Liked and subscribed!

  • @dylan_curious · 1 year ago +2

    Wow, this video was really informative and fascinating! It's incredible to think about how much goes into building and training a language model. I never realized that language modeling involved so much more than just counting frequencies of words and sentences. The explanation of how neural networks can be used as universal approximators was particularly interesting, and it's amazing to think about the potential applications of such models, like generating poetry or even writing computer code. I can't wait for part two of this video!

  • @go_better · 11 months ago

    Thank you so much! Very well and simply explained!

  • @gregortidholm · 1 year ago +1

    Really great description 👌

  • @peregudovoleg · 1 year ago

    Even though I knew all this stuff, it is still nice to watch and listen to a good explanation of these fundamental ML concepts.

  • @ashleygesty7671 · 1 year ago +3

    You are a genius, thank you for this amazing video!

  • @smithwill9952 · 1 year ago +2

    Clean and clear explanation

  • @marcusdrost8452 · 7 months ago +1

    Clearly explained! I will use it.

  • @hrishabhchoudhary2700 · 1 year ago +1

    The content is a gem. Thank you for this.

  • @mmarsbarr · 1 year ago

    How is it possible that I've watched a ton of videos trying to understand LLMs from the likes of universities and big tech companies, yet this simple video in Comic Sans explains everything in the most direct and concise manner possible!?

  • @stan.corston · 2 months ago +1

    Great way to explain a complex idea ⚡️

  • @kahoku451 · 1 year ago +2

    Incredible. Thank you

  • @darrin.jahnel · 1 year ago +1

    Wow, what a fantastic explanation!

  • @quantphobia2944 · 1 year ago +1

    Simply amazing, so intuitive..omg subscribed

  • @DanielTateNZ · 1 year ago +3

    This is the best explanation of LLMs I've seen

  • @jacobwilliams4182 · 1 year ago +1

    Great explanation of an advanced topic

  • @angelatatum8642 · 1 year ago

    Easy to understand explanation of large language models 👍

  • @shreeramshankarpattanayak7409

    Brilliantly explained !

  • @ammarlahori · 10 months ago +1

    Absolutely brilliant..great examples

  • @_sudipidus_ · 5 months ago +1

    This is so good
    I’m inspired to go back and learn Fourier and Taylor series

  • @sumitpawar000 · 4 months ago +1

    Wow .. what an explanation sir ❤
    Thank you 🙏

  • @eriksteen84 · 1 year ago +1

    fantastic video, thank you!!!

  • @kazimafzal · 1 year ago +3

    Very nicely explained!!! It would be great if you could do a video on creating GPT (especially after witnessing the human-like speech of ChatGPT)

  • @mbrochh82 · 1 year ago +1

    Wow. This is incredible!!

  • @lucamatteobarbieri2493 · 1 year ago +1

    Simple and clear, kudos!

  • @Larock-wu1uu · 1 year ago

    This was incredibly good!

  • @CUDA602 · 1 year ago +2

    Amazing Video!

  • @ryusei323 · 11 months ago +1

    This is fantastic. Thank you for sharing.

  • @DavideVarvello · 1 year ago +2

    wonderful, thank you so much for sharing

  • @davhung8004 · 8 months ago +1

    Seems really really cool

  • @KayYesYouTuber · 1 year ago +1

    Great explanation. Thank you very much

  • @perseusgeorgiadis7821 · 1 year ago +1

    This was actually amazing

  • @magunciero · 1 year ago +1

    Really well explained!!

  • @LeviNotik · 1 year ago +2

    Very nicely done

  • @sumitarora5861 · 1 year ago +1

    Simply superb explanation

  • @berbudy · 1 year ago +1

    This video is a must watch

  • @YusanTRusli · 10 months ago +1

    very well explained!

  • @danjsy · 1 year ago

    Great ! Onto part 2 😃

  • @DanishKhan-sh1fe · 1 year ago +1

    This is awesome! I think this is one of the best videos I've found so far on machine learning!

  • @theunknown2090 · 1 year ago +2

    This is really good.

  • @BenyThePooh2 · 1 year ago +1

    This is uncut gold.

  • @user-vc6uk1eu8l · 1 year ago +2

    Great video 👍!
    This reminds me of Markov chains (MC). I read in a probability book a long time ago that MCs had been used to calculate the probability of the next letter in a (randomly chosen) word.

    • @lopezb · 1 year ago +2

      It is exactly a Markov matrix, also known as a stochastic or probability matrix: all the rows are probability vectors
      (nonnegative real numbers that sum to 1), like the matrix that defines a Markov chain. If the next word depends only on
      the previous word, it's a 1-step chain (the usual kind); if it depends on the previous k words, it's a k-step Markov
      chain, which can be re-coded as a 1-step chain by replacing the alphabet of symbols (words) with k-tuples of symbols.
      In fact, Markov himself used this model back in 1913: he analyzed first single letters, then letter pairs, in Eugene
      Onegin (a 2-step version). I found that out from this great talk by a UC Berkeley CS prof, author of the basic
      textbook in the field, and also of the "pause" letter:
      czcams.com/video/T0kALyOOZu0/video.html&ab_channel=GoodTimesBadTimes
      ChatGPT is something like a 32,000-step version, but it has to be trained stochastically, or it would be far too much
      computation to use actual frequencies...
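
The "rows are probability vectors" point in the reply can be checked directly. A minimal sketch (the vocabulary and bigram counts below are made up for illustration) that normalizes next-word counts into a Markov transition matrix:

```python
import numpy as np

# toy bigram counts: rows = current word, columns = next word
vocab = ["the", "answer", "is", "wind"]
counts = np.array([
    [0, 2, 0, 1],   # after "the": "answer" twice, "wind" once
    [0, 0, 2, 0],   # after "answer": "is" twice
    [1, 0, 0, 0],   # after "is": "the" once
    [1, 0, 0, 0],   # after "wind": "the" once
], dtype=float)

# normalize each row into a probability vector -> a stochastic (Markov) matrix
P = counts / counts.sum(axis=1, keepdims=True)

assert np.allclose(P.sum(axis=1), 1.0)  # every row sums to 1
print(P[0])  # distribution over the next word after "the"
```

The k-step re-coding the reply mentions just means the row index becomes a k-tuple of words instead of a single word, so the matrix grows exponentially in k, which is why counting breaks down for long contexts.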

  • @pointblank3409 · 1 year ago +3

    This is gold, thanks sir

  • @RK-fr4qf · 11 months ago +1

    Fantastic. Please teach more
    You are a legend.

  • @cbandle5050 · 1 year ago +4

    Great video! A lot covered super concisely. There is one minor issue I noticed though at 5:40 which you're probably aware of. Usually a bias term b is added inside the activation functions to get S(wx+b). Without this bias you severely limit the capacity/expressiveness of the network. For example, if we take S to be the ReLU function (0 if x

    • @g5min · 1 year ago

      YES -- you are very observant 🙂 The video glosses over the bias parameter.
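
The role of the bias b in S(wx + b) from the thread above can be seen with a toy ReLU neuron (illustrative code, not from the video): without b the kink is pinned at x = 0, so shifted features can never be represented.

```python
import numpy as np

def relu(x):
    # ReLU activation: 0 for negative inputs, identity otherwise
    return np.maximum(0.0, x)

def neuron(x, w, b=0.0):
    # one unit of a sum-of-activations network: S(w*x + b)
    return relu(w * x + b)

x = np.linspace(-2, 2, 5)          # [-2, -1, 0, 1, 2]
print(neuron(x, w=1.0))            # kink fixed at x = 0
print(neuron(x, w=1.0, b=1.0))     # bias shifts the kink to x = -1
```

With b = 0 the kink of every unit sits at the origin, no matter what w is; adding b lets each unit place its kink anywhere, which is what restores the network's expressiveness.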

  • @jwilliams8210 · 1 year ago +1

    Outstanding!

  • @gregpavlik6474 · 1 year ago +3

    Super clear - I need to circulate this around my teams adjacent to the scientists at work

  • @ApocalypsoTron · 1 year ago +1

    Thanks for showing what a neural network function looks like