Essential Matrix Algebra for Neural Networks, Clearly Explained!!!

  • Added 27 May 2024
  • Although you don't need to know matrix algebra to understand the ideas behind neural networks, if you want to code them or read the latest manuscripts about the field, then you'll need to understand matrix algebra. This video teaches the essential topics in matrix algebra and shows how a neural network can be written as a matrix equation, and then shows how to understand PyTorch documentation, error messages and the equations for Attention, which is the fundamental concept behind ChatGPT.
    Note: If you want to learn more about neural networks...
    • The Essential Main Ide...
    ...backpropagation...
    • Neural Networks Pt. 2:...
    ...the ReLU activation function...
    • Neural Networks Pt. 3:...
    ...tensors...
    • Tensors for Neural Net...
    ...SoftMax...
    • Neural Networks Part 5...
    ...Transformers and Attention...
    • Transformer Neural Net...
    If you'd like to support StatQuest, please consider...
    Patreon: / statquest
    ...or...
    CZcams Membership: / @statquest
    ...buying my book, a study guide, a t-shirt or hoodie, or a song from the StatQuest store...
    statquest.org/statquest-store/
    ...or just donating to StatQuest!
    paypal: www.paypal.me/statquest
    venmo: @JoshStarmer
    Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
    / joshuastarmer
    0:00 Awesome song and introduction
    2:35 Introduction to linear transformations
    5:57 Linear transformations in matrix notation
    7:34 Matrix multiplication
    11:03 Matrix multiplication consolidates a sequence of linear transformations
    13:46 Order matters for matrix multiplication
    15:18 Transposing a matrix
    16:37 Matrix notation and equations
    18:51 Using matrix equations to describe a neural network
    24:26 nn.Linear() documentation explained
    26:38 1-D vs 2-D error messages explained
    27:17 The matrix equation for Attention explained
    #StatQuest #neuralnetworks #matrixalgebra

Comments • 130

  • @statquest
    @statquest  5 months ago +8

    To learn more about Lightning: lightning.ai/
    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

    • @bigbacktor
      @bigbacktor 4 months ago

      Hi, I would like to buy the book in color. Does anyone know if that is possible? It seems to me that on Amazon it is in black and white and on Lulu it is in color. Is that right? I am from Spain.

    • @statquest
      @statquest  4 months ago

      @@bigbacktor They are all in color. And there is a version that is translated into Spanish if you are interested.

    • @bigbacktor
      @bigbacktor 4 months ago +1

      @@statquest Thanks a lot!! BAM you just sold another book :)

    • @statquest
      @statquest  4 months ago +1

      @@bigbacktor Hooray!!! Thank you very much for supporting StatQuest! BAM! :)

  • @user-do2pr3ue9x
    @user-do2pr3ue9x 5 months ago +10

    What you do is nothing short of a miracle! Immense gratitude

  • @user-gx9hk8gt3k
    @user-gx9hk8gt3k 5 months ago +11

    You have been staying the course for a long, long time. It's not so easy! Keep up the good work!

  • @bin4ry_d3struct0r
    @bin4ry_d3struct0r 5 months ago +10

    All visual learners are blessed by the great Josh Starmer!

  • @Why_I_am_a_theist
    @Why_I_am_a_theist 5 months ago +5

    World is a better place with Josh🎉

  • @kitkitmessi
    @kitkitmessi 1 month ago +2

    this is freaking amazing! Would love to see more math lessons like this

  • @free_thinker4958
    @free_thinker4958 4 months ago +5

    Could you please make a video about QLORA? ❤ You're our savior when it comes to understanding complex concepts, thank you man

    • @statquest
      @statquest  4 months ago +1

      I'll keep that in mind.

    • @TheTruthOfAI
      @TheTruthOfAI 3 months ago

      That would be lovely. Perhaps LoRA itself holds strong glue potential across neural networks. I'll be looking forward to such an amazing video.

  • @user-cf8rq6wk8w
    @user-cf8rq6wk8w 4 months ago +3

    Hi Josh... would you please make a video and explain the differences between different statistical tests like t, z, chi... I want to know the differences and when to use each.

  • @varshinibalaji3535
    @varshinibalaji3535 4 months ago +1

    Wow, just the perfect video I was looking for! Loved all the Taylor references, and music puns.

    • @statquest
      @statquest  4 months ago

      Hooray!!! You're the first person to mention the Taylor references in a comment. BAM!!! :)

    • @varshinibalaji3535
      @varshinibalaji3535 4 months ago +1

      @@statquest I was looking for it, since you mentioned something with Taylor was coming soon in one of my LinkedIn posts :) Plus, I've been reading a lot of academic papers lately, so I needed better context on matrix transformations to interpret the math better! So, Double BAM, indeed!

  • @ShadArfMohammed
    @ShadArfMohammed 5 months ago +3

    Baaam this is good :D I have been waiting for this, to be honest, I had the feeling that one day you would make such a tutorial. Your content is great.

  • @ageofkz
    @ageofkz 3 months ago +1

    Really amazing work! This set of videos (neural network playlist) has really helped me in my uni coursework and project! My groupmates and I are planning to get a statquest triple bam hoodie each haha!

    • @statquest
      @statquest  3 months ago

      That's awesome!!! TRIPLE BAM! :)

  • @ritshpatidar
    @ritshpatidar 5 months ago +6

    I was thinking about taking a course to learn matrix algebra yesterday. Thanks for posting this video. It is really helpful and it is like a wish came true.

  • @arenashawn772
    @arenashawn772 5 months ago +2

    I found this channel when searching for a clear explanation of the central limit theorem on Google (after doing some simulations in R using sample sizes much less than 30 and being intrigued by the results I got) and I just want to say I love the content so much! (And the ukulele episode ❤) I’ve recently started some machine learning classes on Coursera and edX, and I must say the explanations you have here in these episodes are SO MUCH BETTER AND MORE TO THE POINT/BETTER DEFINED than the multi-thousand-dollar classes (I’m surely glad I chose to audit them first!) taught by professors from Harvard or engineers working for Google/IBM. So much better!… ❤❤❤
    Just want to say thank you and Merry Christmas! I know I will be going through these videos one by one in the coming months…

    • @statquest
      @statquest  5 months ago +1

      Thank you very much!!! I'm so happy you enjoy my videos. BAM! :)

    • @arenashawn772
      @arenashawn772 5 months ago +1

      @@statquest I really did and binge watched a bunch… But I must say I now enjoy your songs even more 😂 Just bought all your albums on Bandcamp - they are awesome! That going back to Cali song just had me rolling off my chair at the end of it… I relocated from the San Francisco Bay Area to the Florida panhandle not long ago so that song really struck a chord with me 😂😂😂

    • @statquest
      @statquest  4 months ago

      @@arenashawn772 Thank you very much! I'm glad you enjoy the tunes and the videos. I hope the move went well! :)

  • @techproductowner
    @techproductowner 2 months ago +1

    You will be known and remembered for the next 1000 years ..

  • @AyumiFortunex
    @AyumiFortunex 3 months ago +1

    Absolutely fantastic explanation again

  • @ilirhajrullahu4083
    @ilirhajrullahu4083 5 months ago +5

    Very nice video! Thank you for uploading such helpful material :). It would be great if you made a video on vector and matrix calculus. These are important topics in NNs too :).

    • @statquest
      @statquest  5 months ago +1

      I'll keep that in mind.

    • @gocomputing8529
      @gocomputing8529 5 months ago +1

      Thanks for the great video! Also the topic proposed here would also be super interesting, so I hope you could do it someday

  • @randr10
    @randr10 5 months ago +2

    Thank you for this video. I think I understand what a transformer is now.

  • @TheTruthOfAI
    @TheTruthOfAI 3 months ago +1

    I love your videos, they helped me so much.. learned a lot.. I was able to make UNA thanks to your teachings :)

    • @statquest
      @statquest  3 months ago +1

      Triple bam! Congratulations!

  • @BrianPondiGeoGeek
    @BrianPondiGeoGeek 5 months ago +2

    Triple Bam for sure. Amazing explanation.

  • @anmolarora2599
    @anmolarora2599 3 months ago +1

    Thank you for explaining it so simply even a novice like me can understand it.

  • @kartikchaturvedi7868
    @kartikchaturvedi7868 5 months ago +1

    Superrrb Awesome Fantastic video

  • @haitematik5832
    @haitematik5832 3 months ago +1

    Man u deserve a thousand times more subscribers

  • @NJCLM
    @NJCLM 5 months ago +3

    Very good video! You should remake one of the transformer videos with the matrix notation, as you did at the end of this video.

    • @statquest
      @statquest  5 months ago +7

      I'm working on it right now. Hopefully it will be ready soon.

    • @NJCLM
      @NJCLM 5 months ago

      @@statquest Take your time, and thank you very much. Your content is so valuable!

  • @siddhanthbhattacharyya4206
    @siddhanthbhattacharyya4206 1 month ago +1

    Quadruple bam! (One bam for me finally understanding)

  • @louisnemzer6801
    @louisnemzer6801 5 months ago +5

    Squatch: So it's all just matrix multiplication?
    Josh: Always has been

  • @amjadiqbal478
    @amjadiqbal478 2 months ago +1

    Quite good. ❤

  • @kujohjotaro3017
    @kujohjotaro3017 4 months ago +1

    Your video is just a lifesaver for me and my essay! Could you make a video on the GloVe model in NLP?

  • @slash_29
    @slash_29 5 months ago +2

    Please part 2 with more details, and new terms

  • @PavanKumar-pt2sh
    @PavanKumar-pt2sh 5 months ago +2

    Can you please create a video on multi-modal transformer architecture?

    • @statquest
      @statquest  5 months ago +1

      I'll keep that in mind.

    • @free_thinker4958
      @free_thinker4958 4 months ago +1

      I hope he does it, he's our savior when it comes to understanding complex concepts

  • @deveshbhatt4063
    @deveshbhatt4063 5 months ago +4

    Triple Bam🎉❤

  • @pran441
    @pran441 5 months ago +1

    Joshua your teaching was fantastic, but I couldn't quite grasp the concept.

    • @statquest
      @statquest  4 months ago

      What time point (minutes and seconds) was confusing?

  • @anonymousgirl1463
    @anonymousgirl1463 4 months ago

    Hey Josh! I love your channel and I was thinking about buying a study guide. What is the difference between watching one of your playlists and buying a study guide? Do you cover exactly the same in both and buying the study guide is for support/like a donation or is there any difference?

    • @statquest
      @statquest  4 months ago

      They are the same. The difference is that some people like to have the study guides for offline use or adding their own notes to. In some ways, the study guides are like "cheat sheets" - everything in a video is condensed to about 3 to 5 pages.

  • @user-ut3sy6hy4p
    @user-ut3sy6hy4p 2 months ago +1

    thanks for ur effort, ur videos helped me so much, but could u plz tell us how lghm works

    • @statquest
      @statquest  2 months ago

      Do you mean Light Gradient Boost? LightGBM?

    • @user-ut3sy6hy4p
      @user-ut3sy6hy4p 2 months ago +1

      I mean LightGBM
      @@statquest

  • @Ahmed_Issaoui
    @Ahmed_Issaoui 4 months ago +1

    hello statquest, what software do you use to create your videos ?
    (your answer is really useful to me)

    • @statquest
      @statquest  4 months ago

      I give away all of my secrets in this video: czcams.com/video/crLXJG-EAhk/video.html

  • @ps3301
    @ps3301 5 months ago +2

    Could you explain the math behind a basic liquid neuron and show how it differs from other neurons?

  • @magtazeum4071
    @magtazeum4071 4 months ago

    Hi Josh, could you do a video on time series clustering and time series analysis, please?

  • @nivcohen961
    @nivcohen961 3 months ago +1

    You are awesome

  • @vigneshvicky6720
    @vigneshvicky6720 5 months ago +10

    We want yolo series mainly yolov8 from scratch

    • @statquest
      @statquest  5 months ago +2

      I'll keep that in mind.

    • @anickkhan
      @anickkhan 4 months ago +2

      Please Professor, it’s an earnest request. Lots of Love from Bangladesh ❤❤

  • @ivant_true
    @ivant_true 5 months ago +1

    great

  • @jarsal_firahel
    @jarsal_firahel 1 month ago

    What about a video on MAMBA architecture ? That would be really BAAAM

  • @user-ob4eh9li9e
    @user-ob4eh9li9e 3 months ago

    Can u please discuss about stochastic gradient boosting for classification?. I'm having trouble understanding that 😢

    • @statquest
      @statquest  3 months ago

      I have a whole series of videos on Gradient Boosting. You can find them here: statquest.org/video-index/

  • @shouvikdey7078
    @shouvikdey7078 5 months ago

    Do more videos related to GAN etc.

  • @davidmurphy563
    @davidmurphy563 5 months ago +2

    Ok, it always annoyed me that when you're doing matrix-vector (col) multiplication they always write the matrix first, then the vector. It never occurred to me until you said so just now that the cols and rows aren't valid tensor operations if you write them the other way round... Doh! It doesn't look nice though.
    Btw, why did you use a row vector and a transposed matrix? I would always use a col vector. Col space transforms are the default for me and you can picture the latent space.
    The only time I'd use rows is if I have a system of linear equations.

    • @statquest
      @statquest  5 months ago +1

      I agree that the matrix * column looks bad. And I chose to do row * matrix because that is what they used in the PyTorch documentation.

    • @davidmurphy563
      @davidmurphy563 5 months ago +1

      @@statquest Glad it's not just me that thinks it looks backwards! :)) But you're of course right; 2x2 * 2x1 is a valid operation whereas 2x1 * 2x2 is, strictly speaking, undefined.
      Oh, a tip you may (or may not!) find a useful teaching tool:
      I always look at matrix multiplication in terms of a series of dot product operations. Once the student understands that the dot product outputs a scalar expressing the likeness of two vectors (e.g. whether two normalised vectors point the same way), then rather than just mechanically running an algorithm, the student can see that it's plotting the vector in the new space by comparing its likeness to the space's basis vectors one axis at a time. That's why I think it's always handy to see a square matrix as a series of basis vectors.
      So, if you're going from an orthonormal basis to one where, say, y is mirrored - {{1, 0}, {0, -1}} - then it's quite apparent why taking the dot product for each spatial dimension will plot the vector upside-down. You could show an image flipping to drive the point home.
      I just think that's intuitive, and it's why we're multiplying and adding across columns and rows.
      At least that's how I like to see it.
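
The dot-product view described in this thread can be sketched in a few lines of plain Python. This is only an illustration, not code from the video: the helper names `dot` and `mat_vec` and the sample point are assumptions, and the column-vector convention (matrix first, vector second) is used because that is the one the commenter describes.

```python
# Sketch: matrix-vector multiplication as a series of dot products.
# dot, mat_vec and the sample point are illustrative names, not from the video.

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mat_vec(matrix, vector):
    """Multiply a matrix (a list of rows) by a column vector.

    Each output coordinate is the dot product of one row with the vector.
    """
    return [dot(row, vector) for row in matrix]

# The matrix from the comment: x is kept, y is mirrored.
mirror_y = [[1, 0],
            [0, -1]]

point = [3, 2]
flipped = mat_vec(mirror_y, point)
print(flipped)  # [3, -2]
```

Each output coordinate comes from one dot product, so the second row `[0, -1]` is what turns `y = 2` into `y = -2`.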

  • @aga5979
    @aga5979 8 days ago

    Could you do a series on "attention is all you need " paper ? Thank you Sir.

    • @statquest
      @statquest  7 days ago +1

      This video walks you through the concepts in that paper: czcams.com/video/zxQyTK8quyY/video.html
      And this video goes through the math: czcams.com/video/KphmOJnLAdI/video.html

    • @aga5979
      @aga5979 7 days ago +1

      @@statquest thank you so much!!

  • @midhileshmomidi3120
    @midhileshmomidi3120 4 months ago +1

    Can we get a book on these concepts as well?

    • @statquest
      @statquest  4 months ago

      I'm writing it right now.

  • @Nono-de3zi
    @Nono-de3zi 5 months ago

    Thanks Josh. But naughty, naughty, the stage is not just rotating, it is flipping. Which you can also encode in matrices of course ;-)

    • @statquest
      @statquest  5 months ago

      I'm not sure I understand what you mean by flipping in addition to rotating, as stage left and stage right are maintained throughout each change.

    • @Nono-de3zi
      @Nono-de3zi 5 months ago

      @@statquest The drawing of the stage is asymmetrical (one edge is slightly erased). When you did the slides you flipped it instead of rotating it. As a result, Statsquatch is sometimes on one side, sometimes on the other. I know it was not on purpose 🙂 Thanks for the excellent vid as usual.

    • @statquest
      @statquest  5 months ago

      @@Nono-de3zi I'm still confused because Statsquatch is always on stage left.

    • @Nono-de3zi
      @Nono-de3zi 5 months ago

      But the *stage* is flipped :-)

    • @statquest
      @statquest  5 months ago

      @@Nono-de3zi If it was flipped, then wouldn't stage left stay on top and stage right stay on the bottom?

  • @lifeisbeautifu1
    @lifeisbeautifu1 2 months ago +1

    BAM!

  • @harryliu1005
    @harryliu1005 2 months ago +1

    after 10000000 years, scientists found fossil record of statquest, then he said " BAM!"

  • @bossgd100
    @bossgd100 5 months ago +1

    Woaw

  • @nutzeeer
    @nutzeeer 5 months ago

    14:05 Matrix multiplication can't be rearranged, as matrix multiplication is a sequence of calculations. Is this indicated by using X as a multiplication symbol and not •? Because in school we used • to indicate multiplication.

    • @nutzeeer
      @nutzeeer 5 months ago

      ah no the x is not signifying order. but I would like that to be visible from writing alone, without the helpful explanation.

    • @nutzeeer
      @nutzeeer 5 months ago

      i wonder why matrices are turned sideways like that. it would feel easier for me to multiply rows with rows.

    • @statquest
      @statquest  5 months ago +1

      This is explained, although I'm guessing not to your satisfaction, at 10:58. It has to do with the ability to combine transformations. For more details, see: math.stackexchange.com/questions/271927/why-historically-do-we-multiply-matrices-as-we-do
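
To make the "order matters" point from this thread concrete, here is a minimal sketch in plain Python. The matrices `rotate_90` and `stretch_x` and the helper `mat_mul` are assumptions chosen for illustration, not the transformations used in the video, and the column-vector convention is assumed (the matrix written on the right is applied first).

```python
# Sketch: multiplying the same two matrices in a different order gives a
# different combined transformation. All names here are illustrative.

def mat_mul(a, b):
    """Multiply matrices given as lists of rows: (n x m) times (m x p)."""
    assert all(len(row) == len(b) for row in a), "inner dimensions must match"
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

rotate_90 = [[0, -1],
             [1, 0]]   # rotate 90 degrees counter-clockwise
stretch_x = [[2, 0],
             [0, 1]]   # double the x coordinate

# With column vectors, the right-hand matrix acts first.
rotate_then_stretch = mat_mul(stretch_x, rotate_90)
stretch_then_rotate = mat_mul(rotate_90, stretch_x)

print(rotate_then_stretch)  # [[0, -2], [1, 0]]
print(stretch_then_rotate)  # [[0, -1], [2, 0]]
```

The two products differ, which is why the order of the factors cannot be swapped in general, whichever multiplication symbol is written between them.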

  • @liuze9280
    @liuze9280 5 months ago

    Great video, but i don't quite understand 25:25...

    • @statquest
      @statquest  5 months ago

      It just means that PyTorch stores the weights differently than we used in the earlier examples and in order to get the same math, we have to transpose the PyTorch weight matrix.
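
The transpose mentioned in this reply can be illustrated without PyTorch. The sketch below mimics the formula given in the `nn.Linear` documentation, y = xAᵀ + b, where the weight matrix is stored with shape (out_features, in_features). The function names and the numbers are made up for illustration; this is not PyTorch's implementation.

```python
# Sketch: why a weight matrix stored as (out_features, in_features) has to be
# transposed before a row vector of inputs can be multiplied by it.
# transpose, linear and all values below are illustrative assumptions.

def transpose(m):
    """Swap rows and columns of a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]

def linear(x, weight, bias):
    """Compute y = x @ weight^T + bias for a single row vector x."""
    w_t = transpose(weight)  # now (in_features, out_features)
    return [sum(x[k] * w_t[k][j] for k in range(len(x))) + bias[j]
            for j in range(len(bias))]

# 3 inputs, 2 outputs: stored as 2 rows of 3 weights each.
weight = [[1.0, 2.0, 3.0],
          [4.0, 5.0, 6.0]]
bias = [0.5, -0.5]
x = [1.0, 0.0, -1.0]

y = linear(x, weight, bias)
print(y)  # [-1.5, -2.5]
```

A row vector with 3 entries cannot be multiplied by a 2 x 3 matrix directly; after the transpose the shapes are (1 x 3) times (3 x 2), which gives the 2 outputs.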

  • @werewolfprogrammer
    @werewolfprogrammer 5 months ago

    Hi, I am trying to start a youtube channel to make tutorial videos about data science related topics. I want to make the videos about things that are less popular but still important, since I found that it can be quite difficult to start off with these things since most information is in difficult to comprehend papers. My starting point will be social network analysis and natural language processing as that is my main interest and expertise. However, I am interested in finding more topics so I am starting by doing research on different channels that make tutorials for data science, AI, machine learning, statistics, natural language processing, graph theory or network analysis.
    So for anybody in the comments who reads this message, could you help me out by replying with any YouTube creators that do something related to these topics, or any other digital platform like Brilliant? If you know a topic that is similar to the ones I mentioned, that would also be a great thing to share. Or if you know of better places to share this message. Or any other helpful tips.
    Thanks everybody for the help. If this message is regarded as spam also please say so and I will remove it.
    The topics again:
    -data science
    -AI
    -machine learning
    -statistics
    -natural language processing
    -graph theory
    -network analysis

  • @tsunningwah3471
    @tsunningwah3471 5 months ago

    zhina!

  • @samiotmani9092
    @samiotmani9092 2 months ago +1

    97 videos finished … small bam 🥲