Neural Network from Scratch | Mathematics & Python Code

  • Published on 26. 07. 2024
  • In this video we'll see how to create our own machine learning library, like Keras, from scratch in Python. The goal is to be able to create various neural network architectures in a Lego-like fashion. We'll see how to architect the code so that we can create one class per layer. We will go through the mathematics of every layer that we implement, namely the Dense (or Fully Connected) layer and the Activation layer.
    😺 GitHub: github.com/TheIndependentCode...
    🐦 Twitter: / omar_aflak
    Same content in an article: towardsdatascience.com/math-n...
    Chapters:
    00:00 Intro
    01:09 The plan
    01:56 ML Reminder
    02:51 Implementation Design
    06:40 Base Layer Code
    07:55 Dense Layer Forward
    10:42 Dense Layer Backward Plan
    11:23 Dense Layer Weights Gradient
    14:59 Dense Layer Bias Gradient
    16:28 Dense Layer Input Gradient
    18:22 Dense Layer Code
    19:43 Activation Layer Forward
    20:46 Activation Layer Input Gradient
    22:30 Hyperbolic Tangent
    23:24 Mean Squared Error
    26:05 XOR Intro
    27:04 Linear Separability
    27:45 XOR Code
    30:32 XOR Decision Boundary
    ====
    Corrections:
    17:46 Bottom row of W^t should be w1i, w2i, ..., wji
    18:58 dE/dX should be computed before updating weights and biases
    ====
    Animation framework from @3Blue1Brown : github.com/3b1b/manim

Comments • 272

  • @G83X
    @G83X 3 years ago +71

    In the backward function of the Dense class you're returning a matrix that uses the weight parameter of the class after updating it; surely you'd calculate this dE/dX value before updating the weights (and thus dY/dX)?

    • @independentcode
      @independentcode  3 years ago +15

      Wow, you are totally right, my mistake! Thank you for noticing (well caught!). I just updated the code and I'll add a comment on the video :)

    • @independentcode
      @independentcode  3 years ago +14

      I can't add text or some kind of cards on top of the video, so I pinned this comment in the hope that people will notice it!

    • @trevorthieme5157
      @trevorthieme5157 2 years ago +3

      @independentcode Why can't you?
      Did the YouTube developers remove that awesome function too?
      No wonder I've felt things have been off for so long!

    • @jonathanrigby1186
      @jonathanrigby1186 1 year ago

      Can you please help me with this? I want a chess AI to teach me what it learnt
      czcams.com/video/O_NglYqPu4c/video.html

    • @blasttrash
      @blasttrash 1 year ago

      Just curious: what happens if we propagate the updated weights backward like in the video? Will it not work, or will it just converge more slowly?
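      In practice, using the just-updated weights sends a slightly stale gradient to the earlier layers; with a small learning rate, training often still converges, it just no longer follows exact backpropagation. For reference, a minimal sketch of the corrected ordering from the pinned fix, assuming the video's attribute names (self.input, self.weights, self.bias):

      import numpy as np

      # sketch of Dense.backward with the corrected ordering
      def backward(self, output_gradient, learning_rate):
          weights_gradient = np.dot(output_gradient, self.input.T)
          # dE/dX must use the weights from the forward pass, so compute it first
          input_gradient = np.dot(self.weights.T, output_gradient)
          self.weights -= learning_rate * weights_gradient
          self.bias -= learning_rate * output_gradient
          return input_gradient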

  • @robinferizi9073
    @robinferizi9073 3 years ago +38

    I like how he said he wouldn't explain how a neural network works, then proceeded to explain it

  • @ldx8492
    @ldx8492 7 months ago +9

    This video, unlike the plethora of other videos on "hOw tO bUiLd A NeUrAl NeTwOrK fRoM sCraTcH", is simply the best. It deserves 84M views, not 84k. It is straight to the point: no 10-minute explanation of pretty curves with zero math, no 20-minute introduction on how DL can change the world.
    I truly mean it, it is a refreshing video.

    • @independentcode
      @independentcode  7 months ago +1

      I appreciate the comment :)

    • @ldx8492
      @ldx8492 7 months ago +2

      @independentcode Thank you for the reply! I am a researcher, and I wanted to create my own DL library, using yours as a base but expanding it with different optimization algorithms, initializations, regularizations, losses, etc. (I am just developing it on my own privately for now), but one day I'd love to post it on my GitHub. How can I appropriately cite you?

    • @independentcode
      @independentcode  7 months ago +2

      That's a great project! You can mention my name and my GitHub profile: "Omar Aflak, github.com/omaraflak". Thank you!

  • @orilio3311
    @orilio3311 2 months ago +1

    I love the 3b1b style of animation and also the consistency with his notation; this allows people to learn the material through multiple explanations while not losing track of the core ideas. Awesome work, man

  • @generosonunezarias369
    @generosonunezarias369 3 years ago +5

    This might be the most intuitive explanation of the backpropagation algorithm on the Internet. Amazing!

  • @faida.6665
    @faida.6665 3 years ago +44

    This is basically ASMR for programmers

    • @nikozdev
      @nikozdev 1 year ago +1

      I almost agree, the only difference is that I can’t sleep thinking about it

    • @tanker7757
      @tanker7757 7 months ago +1

      @nikozdev Bruh, I fall asleep and allow myself to hallucinate in math lol

    • @nalcow
      @nalcow 4 months ago

      I definitely felt relaxed :D

  • @rubenfalvert5540
    @rubenfalvert5540 3 years ago +16

    Probably the best explanation of neural networks on CZcams! The voice and the background music are really soothing!

  • @wagsman9999
    @wagsman9999 1 year ago +3

    Not only was the math presentation very clear, but the Python class abstraction was elegant.

  • @rogeliogarcia8730
    @rogeliogarcia8730 2 years ago +13

    Thanks for making such great-quality videos. I'm working on my Ph.D., and I'm writing a lot of math regarding neural networks. Your nomenclature makes a lot of sense and has served me well. I'd love to read some of your publications if you have any.

  • @ardumaniak
    @ardumaniak 1 year ago +12

    The best tutorial on neural networks I've ever seen! Thanks, you have my subscription!

  • @aflakmada6311
    @aflakmada6311 3 years ago +14

    Very clean and pedagogical explanation. Thanks a lot!

  • @mohammadrezabanakermani2924

    It is the best one I've seen among the explanation videos available on CZcams!
    Well done!

  • @neuralworknet
    @neuralworknet 1 year ago +5

    Best tutorial video about neural networks i've ever watched. You are doing such a great job 👏

  • @darshangowda309
    @darshangowda309 3 years ago +61

    This could be 3Blue1Brown for programmers! You got yourself a subscriber! Great video!

  • @samirdaniels
    @samirdaniels 1 year ago +1

    This was the best mathematical explanation on CZcams. By far.

  • @adealtas
    @adealtas 2 years ago +5

    THANK YOU!
    This is exactly the video I was looking for.
    I always struggled with making a neural network, but following your video, I made a model that I can generalize, and it made me understand exactly the mistakes I made in my previous attempts.
    It's easy to find videos on YouTube of people explaining single neurons and backpropagation, but then quickly glossing over the hard part: how you compute the error in an actual network, the structural implementation, and how it all ties together.
    This approach of separating the Dense layer from the Activation layer also makes things 100x clearer, where many people end up carelessly lumping both into the same class.
    The visuals also make the numpy intuition much, much easier. It's something I always struggled with, and this explained why we do every operation perfectly.
    Even though I was only looking for one video, after seeing such quality, I HAVE to explore the rest of your channel! Great job.

    • @independentcode
      @independentcode  2 years ago +3

      Thank you so much for taking the time to write this message! I went through the same struggle when I wanted to make my own neural networks, which is exactly why I ended up doing a video about it! I'm really happy to see that it serves as I intended :)

  • @rishikeshkanabar4650
    @rishikeshkanabar4650 2 years ago +1

    This is such an elegant and dynamic solution. Subbed!

  • @MichaelChin1994
    @MichaelChin1994 2 years ago +4

    Thank you so very, very, very much for this video. I have been wanting to do machine learning, but without "magic". It drives me nuts when all the tutorials say "from scratch" and then proceed to open TensorFlow. Seriously, THANK you!!!

    • @independentcode
      @independentcode  2 years ago +3

      I feel you :) Thank you for the comment, it makes me genuinely happy.

  • @bernardcrnkovic3769
    @bernardcrnkovic3769 2 years ago +3

    Absolutely astonishing quality sir. Literally on the 3b1b level. I hope this will help me pass the uni course. SUB!

  • @marvinmartin1373
    @marvinmartin1373 3 years ago +5

    Amazing approach ! Very well explained. Thanks!

  • @SleepeJobs
    @SleepeJobs 11 months ago

    This video really saved me. From matrix representation to chain rule and visualisation, everything is clear now.

  • @lowerbound4803
    @lowerbound4803 2 years ago +3

    Very well done. I appreciate the effort you put into this video. Thank you.

  • @_skeptik
    @_skeptik 2 years ago +1

    This is such high-quality content. I have only a basic knowledge of linear algebra and, being a non-native speaker, I could still fully understand this

  • @lucasmercier5813
    @lucasmercier5813 3 years ago +5

    Impressive; lots of information, but it remains very clear! Good job on this one ;)

  • @user-nk8ry3xs5u
    @user-nk8ry3xs5u 8 months ago

    Thank you very much for your videos explaining how to build an ANN and a CNN from scratch in Python: your explanations of the detailed calculations for forward and backward propagation, and of the calculations in the kernel layers of the CNN, are very clear, and seeing how you have managed to implement them in only a few lines of code is very helpful in 1. understanding the calculations and processes, and 2. demystifying what is otherwise a black box in TensorFlow/Keras.

  • @swapnilmasurekar5431
    @swapnilmasurekar5431 2 years ago

    This video is the best on CZcams for Neural Networks Implementation!

  • @samuelmcdonagh1590
    @samuelmcdonagh1590 9 months ago

    Jesus Christ, this is a good video that shows clear understanding. No "I've been using neural networks for ten years, so pay attention as I ramble aimlessly for an hour" involved

  • @imgajeed
    @imgajeed 2 years ago +1

    Thank you, that's the best video I have ever seen about neural networks!!!!! 😀

  • @ANANT9699
    @ANANT9699 1 year ago +1

    Wonderful, informative, and excellent work. Thanks a zillion!!

  • @shafinmahmud2925
    @shafinmahmud2925 2 years ago +1

    There are many solutions on the internet... but I must say this one is undoubtedly the best 👍 Cheers man... please keep posting more.

  • @erron7682
    @erron7682 3 years ago +1

    This is the best channel for learning deep learning!

  • @rumyhumy
    @rumyhumy 1 year ago

    Man, I love you. So many times I tried to do a multilayer NN on my own, but I always faced a thousand problems. But this video explained everything. Thank you

  • @aashishrana9356
    @aashishrana9356 2 years ago +1

    One of the best videos I have ever seen.
    I struggled a lot to understand this and you have explained it so beautifully.
    You made me fall in love with neural networks, which I used to find intimidating.
    Thank you so much.

    • @independentcode
      @independentcode  2 years ago

      Thank you for your message, it genuinely makes me happy to know this :)

  • @e.i.l.9584
    @e.i.l.9584 8 months ago

    Thank you so much, my assignment was so unclear, this definitely helps!

  • @vtrandal
    @vtrandal 2 years ago

    Thank you! Well done! Absolutely wonderful video.

  • @yiqiangjizhang
    @yiqiangjizhang 3 years ago

    This is so ASMR and well explained!

  • @cankoban
    @cankoban 2 years ago

    I loved the background music, it gives peace of mind. I hope you will continue to make videos. Very clear explanation

  • @omegaui
    @omegaui 1 day ago

    Such a great video. Really helped me to understand the basics.

  • @marisakirisame659
    @marisakirisame659 1 year ago

    This is a very good approach to building neural nets from scratch.

  • @black-sci
    @black-sci 4 months ago

    Best video, very clear-cut. I finally got backpropagation and the derivatives.

  • @salaheddinelachkar5683
    @salaheddinelachkar5683 3 years ago +2

    That was helpful, thank you so much.

  • @_sarps
    @_sarps 2 years ago

    This is really dope. The best by far. Subscribed right away

  • @sythatsokmontrey8879
    @sythatsokmontrey8879 2 years ago

    Thank you so much for your contribution to this field.

  • @huberhans7198
    @huberhans7198 3 years ago

    Very nice and clean video, keep it up

  • @anhtuanmai537
    @anhtuanmai537 1 year ago +1

    I think the last row's indices of the W^T matrix at 17:55 should be (w1i, w2i, ..., wji).
    Still the best explanation I have ever seen btw, thank you so much. I don't know why this channel is still so underrated; looking forward to seeing your new videos in the future

    • @independentcode
      @independentcode  1 year ago +1

      Yeah I know, I messed it up. I've been too lazy to add a caption on that, but I really should. Thank you for the kind words :)
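      For reference, with the video's convention that w_{ab} connects input b to output a (so W has shape j × i), the corrected transpose from 17:55 reads:

      W =
      \begin{pmatrix}
      w_{11} & w_{12} & \cdots & w_{1i} \\
      w_{21} & w_{22} & \cdots & w_{2i} \\
      \vdots & \vdots & \ddots & \vdots \\
      w_{j1} & w_{j2} & \cdots & w_{ji}
      \end{pmatrix}
      \qquad
      W^T =
      \begin{pmatrix}
      w_{11} & w_{21} & \cdots & w_{j1} \\
      w_{12} & w_{22} & \cdots & w_{j2} \\
      \vdots & \vdots & \ddots & \vdots \\
      w_{1i} & w_{2i} & \cdots & w_{ji}
      \end{pmatrix}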

  • @spritstorm9037
    @spritstorm9037 1 year ago

    Actually, you saved my life, thanks for doing these

  • @RAHULKUMAR-sx8ui
    @RAHULKUMAR-sx8ui 1 year ago

    You are the best 🥺❤️... wow... I was finally able to understand the basics, thanks

  • @macsiaproduction7823
    @macsiaproduction7823 3 months ago

    Thank you for the really great explanation!
    I wish you would make even more 😉

  • @naheedray
    @naheedray 2 months ago

    This is the best video i have seen so far ❤

  • @ti4680
    @ti4680 2 years ago

    Finally found the treasure. Please do more videos, bro. SUBSCRIBED

  • @shivangitomar5557
    @shivangitomar5557 1 year ago

    Amazing explanation!!

  • @erikasgrim2871
    @erikasgrim2871 3 years ago

    Amazing tutorial!

  • @chrisogonas
    @chrisogonas 1 year ago

    That was incredibly well explained and illustrated. Thanks

  • @TheAstralftw
    @TheAstralftw 2 years ago

    Dude this is amazing

  • @prem7676
    @prem7676 1 year ago

    Awesome man!!

  • @lakshman587
    @lakshman587 1 year ago

    Thank you so much for the video!!!

  • @nudelsuppe3dsemmelknodel990

    You are the only YouTuber I sincerely want to come back. We miss you!

  • @mr.anderson5077
    @mr.anderson5077 2 years ago

    Keep it up. Please make a deep learning and ML series in the future.

  • @vilmospalik1480
    @vilmospalik1480 5 months ago

    This is a great video, thank you so much

  • @snapo1750
    @snapo1750 1 year ago

    Thank you very very much for this video....

  • @princewillinyang5993
    @princewillinyang5993 2 years ago

    Content at its peak

  • @areegfahad5968
    @areegfahad5968 1 year ago

    Amazing!!

  • @aiforchange1801
    @aiforchange1801 1 year ago

    Big fan of yours from today!

  • @ionutbosie6017
    @ionutbosie6017 2 years ago

    After 1000 videos watched, I think I get it now, thanks

  • @arvindh4327
    @arvindh4327 2 years ago

    Only 4 videos and you have above 1k subs.
    Please continue your work 🙏🏼

  • @Gabriel-V
    @Gabriel-V 2 years ago

    Clear, to the point. Thank you. Liked (because there are just 722 likes, and there should be a lot more)

  • @rehelm3114
    @rehelm3114 6 months ago

    Well done

  • @zozodejante8350
    @zozodejante8350 3 years ago

    I love you, best ML video ever

  • @yakubumshelia1668
    @yakubumshelia1668 1 year ago

    Very educational

  • @Xphy
    @Xphy 2 years ago

    Whyyyy don't you have 3 million subscribers? You deserve it ♥️♥️

  • @quantumsoul3495
    @quantumsoul3495 1 year ago

    That is so satisfying

  • @ShadabAlam-jz4vl
    @ShadabAlam-jz4vl 1 year ago

    Best tutorial💯💯💯💯

  • @cicADA001
    @cicADA001 3 years ago +2

    Your voice is calming and relaxing, sorry if that is weird

    • @independentcode
      @independentcode  3 years ago +2

      Haha, thank you for sharing that :) Maybe I should have called the channel JazzMath... :)

  • @819rajiv
    @819rajiv 2 years ago +1

    Great video,
    thank you

  • @willlowtree
    @willlowtree 2 years ago

    great stuff

  • @IzUrBoiKK
    @IzUrBoiKK 1 year ago

    I would like it a lot if you continued your channel, bro

  • @pkurian1211
    @pkurian1211 3 years ago

    Thanks a ton for this amazing video on neural networks; this is the best I have seen so far 😊. Can you please also give a hint on how to update your code to make it a binary neural network?

  • @zakariax2966
    @zakariax2966 2 years ago

    Awesome video

  • @AynurErmis-vp9lq
    @AynurErmis-vp9lq 3 months ago

    BEST OF THE BEST, THANK YOU

  • @nikozdev
    @nikozdev 1 year ago

    I developed my first neural network in one night yesterday. It could not learn because of backward propagation; it was only going through std::vectors of std::vectors to get the output. I was setting weights to random values and tried to guess how to apply backward propagation from what I had heard about it.
    But it failed to do anything; it kept guessing just as I did, giving wrong answers anyway.
    This video has a clean, comprehensive explanation of the flow and architecture. I am really excited by how simple and clean it is.
    I am gonna try again.
    Thank you.

    • @nikozdev
      @nikozdev 1 year ago +1

      I did it! Just now my creature learnt XOR =D

  • @blasttrash
    @blasttrash 1 year ago

    Amazing video. One thing we could do is have layers infer their input size automatically where possible. Like if I give Dense(2, 8), then for the next layer I shouldn't need to give 8 as the input size, since it's obviously 8, similar to how Keras does this.
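    One way to get that behavior with the video's design is to defer weight creation to the first forward call, roughly how Keras builds layers. A minimal sketch, assuming the video's layer interface (LazyDense and its fields are hypothetical names):

    import numpy as np

    class LazyDense:
        # hypothetical variant of the video's Dense layer: the input size is
        # read off the first input it sees instead of being passed in
        def __init__(self, output_size):
            self.output_size = output_size
            self.weights = None
            self.bias = None

        def forward(self, input):
            if self.weights is None:
                input_size = input.shape[0]
                self.weights = np.random.randn(self.output_size, input_size)
                self.bias = np.random.randn(self.output_size, 1)
            self.input = input
            return np.dot(self.weights, input) + self.bias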

  • @tangomuzi
    @tangomuzi 3 years ago

    I think most ML PhDs aren't aware of this abstraction. Simply the best.

    • @independentcode
      @independentcode  3 years ago +2

      I don't know about PhDs since I am not a PhD myself, but indeed I never found any simple explanation of how to make such an implementation, so I decided to make that video :)

    • @tangomuzi
      @tangomuzi 3 years ago +1

      @independentcode I think you should keep the video series going and show how capable this type of abstraction is, easily implementing almost every type of neural net.

    • @independentcode
      @independentcode  3 years ago +1

      Thank you for the kind words. I did actually take this a step further; it's all on my GitHub here: github.com/OmarAflak/python-neural-networks
      I managed to make CNNs and even GANs from scratch! It supports any optimization method, but since it's all on CPU you very quickly get restricted by computation time. I really want to make a series about it, but I'll have to figure out a nice way to explain it without being boring, since it involves a lot of code.

    • @edilgin622
      @edilgin622 2 years ago

      @independentcode GANs would be great; you could also try RNNs, and maybe even some reinforcement learning stuff :D

  • @nathanlove4449
    @nathanlove4449 1 year ago

    Yeah, this is awesome

  • @onurkrmz9206
    @onurkrmz9206 2 years ago

    This is an amazing video which explains so perfectly how neural networks work. I appreciate and thank you for all the effort and energy you put into this video; it is a shame that your work did not receive the views it deserves. I believe you use Manim to make animations like 3b1b, don't you?

    • @independentcode
      @independentcode  2 years ago

      Thanks a lot for the kind comment 😌 I'm glad if the video helped you in any way :) Yes it is indeed Manim!

    • @onurkrmz9206
      @onurkrmz9206 2 years ago

      Sir, please keep up with your videos, I learn a lot

  • @filatnicolae2883
    @filatnicolae2883 1 year ago +2

    In your code you compute the gradient step for each sample and update immediately; I think this is called stochastic gradient descent.
    To implement full gradient descent, where I update only after all samples, I added a counter to the Dense layer class to count the samples.
    When the counter reached the training-set size, I would average all the stored nudges for the bias and the weights.
    Unfortunately, when I plot the error over epochs as a graph there are still a lot of spikes (fewer than with your method, but still some).
    My training data has (x, y) and tries to learn (x + y).

    • @gregynardudarbe7009
      @gregynardudarbe7009 11 months ago

      Would you be able to share the code? This is the part where I'm confused.
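      A minimal sketch of the counter idea described above, assuming forward() stores self.input as in the video (BatchDense and the accumulator names are made up for illustration):

      import numpy as np

      class BatchDense:
          def __init__(self, input_size, output_size, batch_size):
              self.weights = np.random.randn(output_size, input_size)
              self.bias = np.random.randn(output_size, 1)
              self.batch_size = batch_size
              self.w_acc = np.zeros_like(self.weights)  # accumulated dE/dW
              self.b_acc = np.zeros_like(self.bias)     # accumulated dE/dB
              self.count = 0

          def forward(self, input):
              self.input = input
              return np.dot(self.weights, input) + self.bias

          def backward(self, output_gradient, learning_rate):
              # accumulate per-sample gradients instead of applying them
              self.w_acc += np.dot(output_gradient, self.input.T)
              self.b_acc += output_gradient
              input_gradient = np.dot(self.weights.T, output_gradient)
              self.count += 1
              if self.count == self.batch_size:
                  # one averaged update per full pass over the batch
                  self.weights -= learning_rate * self.w_acc / self.batch_size
                  self.bias -= learning_rate * self.b_acc / self.batch_size
                  self.w_acc = np.zeros_like(self.weights)
                  self.b_acc = np.zeros_like(self.bias)
                  self.count = 0
              return input_gradient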

  • @vanshajchadha7612
    @vanshajchadha7612 5 months ago

    This is one of the best videos for really understanding the vectorized form of neural networks! I really appreciate the effort you've put into this.
    Just as a clarification: the video considers only one data point at a time, thereby performing SGD, so during the MSE calculation Y and Y* are in a way depicting the multiple outputs for that one data point only, right? So for MSE it should not actually be using np.mean to sum them up?
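    With a single sample, the n in the video's E = (1/n) Σᵢ (yᵢ* − yᵢ)² is the number of output components, so np.mean averages over output neurons, not over data points. A sketch consistent with that definition (variable names assumed):

    import numpy as np

    def mse(y_true, y_pred):
        # mean over the n output components of one sample
        return np.mean(np.power(y_true - y_pred, 2))

    def mse_prime(y_true, y_pred):
        # dE/dy_i = 2 * (y_i - y_i*) / n
        return 2 * (y_pred - y_true) / np.size(y_true)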

  • @ramincybran
    @ramincybran 3 months ago

    Without any doubt the best explanation of NNs I've ever seen. Why did you stop producing, my friend?

  • @filippeczek9099
    @filippeczek9099 3 years ago +3

    Great stuff! I find it even better than the one from 3b1b. Can you think of any way the code can be checked with inputs outside the training set?

    • @independentcode
      @independentcode  3 years ago +1

      Thank you!
      If you mean to use the network once it has trained to predict values on other inputs, then yes of course. Simply run the forward loop with your input. You could actually make a predict() function that encapsulates that loop since it will be the same for any network.
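      For instance, a sketch following the video's list-of-layers convention:

      def predict(network, input):
          # chain the forward passes through every layer
          output = input
          for layer in network:
              output = layer.forward(output)
          return output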

  • @SleepeJobs
    @SleepeJobs 11 months ago

    Great tutorial. Btw, what is your editor font?

  • @tnield9727
    @tnield9727 2 years ago

    This is absolutely amazing, thank you. Is there any chance you could open-source the manim animations too?
    I'm interested in that almost as much as in the neural network library design!

  • @blasttrash
    @blasttrash 1 year ago

    How can we update this to include mini-batch gradient descent? Especially, how do the equations change?
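    A sketch of the standard batched extension (not from the video): stack the m samples as columns of a matrix X with shape i × m, so that

    Y = W X + b\,\mathbf{1}_m^{\top}, \qquad
    \frac{\partial E}{\partial W} = \frac{\partial E}{\partial Y} X^{\top}, \qquad
    \frac{\partial E}{\partial b} = \frac{\partial E}{\partial Y}\,\mathbf{1}_m, \qquad
    \frac{\partial E}{\partial X} = W^{\top}\,\frac{\partial E}{\partial Y}

    The bias gradient sums dE/dY over the batch axis, and if E is defined as the mean loss over the m samples, the 1/m factor arrives automatically through ∂E/∂Y, so in the layer code only the shapes change.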

  • @leandrofrutuoso985
    @leandrofrutuoso985 3 years ago

    This is indeed the best explanation of the math behind neural networks I've found on the internet. Could I please use your code from GitHub in my final work for college?

    • @independentcode
      @independentcode  3 years ago +1

      Thank you for the kind words! Other videos are coming up ;)
      Yes of course, it is completely open source.

  • @Zero_Hour
    @Zero_Hour 2 years ago

    Any chance of a video modifying the training to use SGD/minibatch?

  • @_juaem
    @_juaem 1 year ago

    Hi, I love this video. Only one question: in DenseLayer.backward(), why do you use -= instead of = for the bias and the weights? Why do we subtract that value?
    The rest is all clear :) Ty
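    That subtraction is the gradient descent update itself: the gradient points in the direction of steepest increase of the error, so each parameter steps the opposite way, scaled by the learning rate α:

    w \leftarrow w - \alpha\,\frac{\partial E}{\partial w}, \qquad
    b \leftarrow b - \alpha\,\frac{\partial E}{\partial b}

    With = instead of -=, the layer would overwrite its parameters with raw gradient values rather than nudging them downhill.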

  • @black-sci
    @black-sci 2 months ago

    In TensorFlow they use a weight matrix W with dimensions i × j and then take the transpose in the calculation.

  • @bassmit2304
    @bassmit2304 1 year ago

    When looking at the error and its derivative w.r.t. some y[i], intuitively I would expect that if I increased y[i] by 1, the error would increase by dE/dy[i], but if I do the calculations the change in the error is 1/n off from the derivative; does this make sense?
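    It does: the 1/n gap is the second-order term, because a step of a full unit is too large for the linear approximation to be exact. With E = (1/n) Σₖ (yₖ − yₖ*)², increasing y_i by 1 changes the error by exactly

    \Delta E = \frac{1}{n}\Big[(y_i + 1 - y_i^*)^2 - (y_i - y_i^*)^2\Big]
             = \underbrace{\tfrac{2}{n}(y_i - y_i^*)}_{\partial E/\partial y_i} + \frac{1}{n}

    so the exact change exceeds the derivative by precisely 1/n; the two agree only in the limit of small steps.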

  • @AcceleratedVelocity
    @AcceleratedVelocity 1 year ago +4

    I noticed that you are using a batch size of one. Make a separate gradient variable and an applyGradients function for batch sizes > 1.
    Note 1: also change "+ bias" to "np.add(stuff, bias)" or "+ bias[:, None]".
    Note 2: in backpropagation, sum the bias gradient along axis 0 (I'm pretty sure the axis is 0) and divide both the weight and bias gradients by the batch size.

    • @Tapsthequant
      @Tapsthequant 1 year ago

      Thanks for the tip on the biases.

    • @guilhermealvessilveira8938
      @guilhermealvessilveira8938 9 months ago

      Thanks for the tip on the biases.

    • @hossamel2006
      @hossamel2006 8 months ago

      Can you (or someone else) please explain what note 1 means?
      Edit: As for note 2, I successfully implemented it (by summing along axis 1), so thanks for the tip.

    • @nahianshabab724
      @nahianshabab724 6 months ago

      In the case of mini-batch / batch gradient descent, would the input to the first layer be a matrix of (number_of_features × data_points)? In that case, do I need to compute the average of the gradients during backpropagation in each layer?

    • @hossamel2006
      @hossamel2006 6 months ago

      @nahianshabab724 I guess yes; I saw that in multiple videos. Just add a 1/m to the MSE formula.
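      On note 1: the point seems to be numpy broadcasting once the input becomes a (features × batch) matrix. A column-shaped bias of shape (j, 1), as in the video, broadcasts across the batch by itself; a flat (j,) bias does not, hence bias[:, None]. A small demo (all names made up):

      import numpy as np

      j, m = 3, 4                      # output size, batch size
      z = np.zeros((j, m))             # stands in for W @ X over a batch
      bias_col = np.random.randn(j, 1)
      bias_flat = np.random.randn(j)

      z + bias_col                     # OK: (j, 1) broadcasts across m columns
      z + bias_flat[:, None]           # OK: same, after restoring the column axis
      # z + bias_flat                  # ValueError: (j, m) + (j,) cannot broadcast

      # with samples as columns, the batched bias gradient sums along axis 1:
      output_gradient = np.ones((j, m))
      bias_gradient = output_gradient.sum(axis=1, keepdims=True) / m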

  • @jigneshagrawal7956
    @jigneshagrawal7956 1 year ago

    What is the use of the base Layer class? And a great video; hope you keep posting stuff on your channel

  • @SoftwareDeveloper2217
    @SoftwareDeveloper2217 3 months ago

    It is the best, it is a beauty, because the explanation is great

  • @OmkarKulkarni-wf7ug
    @OmkarKulkarni-wf7ug 3 months ago +1

    How is the output gradient calculated and passed into the backward function?
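    The chain starts at the loss: the derivative of the error with respect to the network's output is the first output gradient, and each layer's backward returns dE/dX, which becomes the output gradient of the layer before it. A sketch in the shape of the video's XOR training loop (names assumed):

    # forward pass
    output = x
    for layer in network:
        output = layer.forward(output)

    # dE/dY of the last layer comes from the loss derivative
    grad = mse_prime(y, output)

    # each backward consumes dE/dY and hands dE/dX to the previous layer
    for layer in reversed(network):
        grad = layer.backward(grad, learning_rate)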