Backpropagation explained | Part 2 - The mathematical notation

  • Published 8 July 2024
  • We covered the intuition behind what backpropagation's role is during the training of an artificial neural network.
    Now, we're going to focus on the math that's underlying backprop. The math is pretty involved, and so we're going to break it up into bite-sized chunks across a few videos.
    We're going to start out in this video by first quickly recapping how backpropagation is used during the training process. Then, we'll jump over to the math side of things and open our discussion up by going over the notation and definitions that we'll be using for our backprop calculations going forward.
    These definitions and the notation will be the focus of this video. The math underlying backprop relies heavily on what we'll be introduced to here, so it's crucial that these things are understood before moving forward.
    Lastly, we'll narrow our focus to discuss the several indices that the notation depends on.
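    As a quick reference, here is a sketch of the core notation the episode builds toward (the precise definitions are given in the video; this summary assumes the standard conventions used in this series, with bias terms omitted for simplicity):
    w_{jk}^{(l)} : the weight connecting node k in layer l-1 to node j in layer l
    z_{j}^{(l)} = \sum_{k} w_{jk}^{(l)} a_{k}^{(l-1)} : the input to node j in layer l
    a_{j}^{(l)} = g\left( z_{j}^{(l)} \right) : the output of node j in layer l, where g is the activation function
    C_{0} = \sum_{j} \left( a_{j}^{(L)} - y_{j} \right)^{2} : the loss for a single training sample, where L denotes the output layer and y the desired output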
    🕒🦎 VIDEO SECTIONS 🦎🕒
    00:00 Welcome to DEEPLIZARD - Go to deeplizard.com for learning resources
    00:40 Outline
    01:26 Backpropagation Recap
    02:50 Definitions and Notation
    07:20 Review
    10:00 Summary
    10:33 Collective Intelligence and the DEEPLIZARD HIVEMIND
    💥🦎 DEEPLIZARD COMMUNITY RESOURCES 🦎💥
    👋 Hey, we're Chris and Mandy, the creators of deeplizard!
    👉 Check out the website for more learning material:
    🔗 deeplizard.com
    💻 ENROLL TO GET DOWNLOAD ACCESS TO CODE FILES
    🔗 deeplizard.com/resources
    🧠 Support collective intelligence, join the deeplizard hivemind:
    🔗 deeplizard.com/hivemind
    🧠 Use code DEEPLIZARD at checkout to receive 15% off your first Neurohacker order
    👉 Use your receipt from Neurohacker to get a discount on deeplizard courses
    🔗 neurohacker.com/shop?rfsn=648...
    👀 CHECK OUT OUR VLOG:
    🔗 / deeplizardvlog
    ❤️🦎 Special thanks to the following polymaths of the deeplizard hivemind:
    Tammy
    Mano Prime
    Ling Li
    🚀 Boost collective intelligence by sharing this video on social media!
    👀 Follow deeplizard:
    Our vlog: / deeplizardvlog
    Facebook: / deeplizard
    Instagram: / deeplizard
    Twitter: / deeplizard
    Patreon: / deeplizard
    YouTube: / deeplizard
    🎓 Deep Learning with deeplizard:
    Deep Learning Dictionary - deeplizard.com/course/ddcpailzrd
    Deep Learning Fundamentals - deeplizard.com/course/dlcpailzrd
    Learn TensorFlow - deeplizard.com/course/tfcpailzrd
    Learn PyTorch - deeplizard.com/course/ptcpailzrd
    Natural Language Processing - deeplizard.com/course/txtcpai...
    Reinforcement Learning - deeplizard.com/course/rlcpailzrd
    Generative Adversarial Networks - deeplizard.com/course/gacpailzrd
    🎓 Other Courses:
    DL Fundamentals Classic - deeplizard.com/learn/video/gZ...
    Deep Learning Deployment - deeplizard.com/learn/video/SI...
    Data Science - deeplizard.com/learn/video/d1...
    Trading - deeplizard.com/learn/video/Zp...
    🛒 Check out products deeplizard recommends on Amazon:
    🔗 amazon.com/shop/deeplizard
    🎵 deeplizard uses music by Kevin MacLeod
    🔗 / @incompetech_kmac
    ❤️ Please use the knowledge gained from deeplizard content for good, not evil.

Comments • 65

  • @JimmyCheng
    @JimmyCheng 5 years ago +53

    best channel on ML ever, clean, crisp, and a beautiful voice!!!

  • @trentonpaul6376
    @trentonpaul6376 5 years ago +26

    I was trying to find an entry point into machine learning somewhere on the internet, and this course is just that. I wish I had found it sooner.

  • @mophez
    @mophez a year ago +3

    Your lectures are a hidden treasure. I'm currently working in DevOps and studying for my AWS machine learning certification, and this is great for filling my gaps in this space, along with Andrew Ng's content. Thanks!

  • @GilangD21
    @GilangD21 6 years ago +42

    this channel is gold

    • @deeplizard
      @deeplizard 6 years ago +1

      Thanks, Gilang! Glad you think so!

  • @sachyadon
    @sachyadon 5 years ago +6

    Simply awesome. Not sure why this channel isn't mentioned more often. Intuition, deep math, programming. So many hard concepts explained with such ease. Respect!!

  • @moosen8249
    @moosen8249 5 years ago +26

    I'm amazed at how amazing this channel is; I watch one video and somehow the next video is even more gold. I'm 28 videos into the playlist, and every video is so extremely well edited, put together, narrated, and visualized. The content you make turns the most confusing things to visualize into simple, very clearly laid out ideas. Even taking an entire video to explain the definitions and notation is so incredibly valuable; instead of spewing jargon that might as well be another language, you let the viewer get on the same page before starting. I love this channel, I am so happy I found it, and I can't stop sharing it with the people in my class/lab.
    Please continue making videos, because I honestly haven't found another channel that even comes close to the level of clear, concise explanations that this channel is producing!

    • @deeplizard
      @deeplizard 5 years ago +2

      Wow, Moose! Thank you x 1000 for such a thoughtful and genuine comment! We're really glad that you found our channel and you're here now and part of the community :) Thanks so much for sharing the content with your class and lab!

    • @Artaxerxes.
      @Artaxerxes. 2 years ago

      It's only jargon if you've not studied high school math

  • @justchill99902
    @justchill99902 5 years ago +7

    I love this teacher. This channel needs more subs man!

  • @Electra_Lion
    @Electra_Lion 3 years ago +1

    I love your videos, I love your voice. Recommended your channel many times to friends!

  • @deepakec7
    @deepakec7 5 years ago +2

    You are a savior! This series is so amazing!

  • @UmbralOfNight
    @UmbralOfNight 5 years ago +3

    This video is so awesome!! Thank you for the clear explanations on backpropagation. You, lady, are a great teacher!

  • @DM-py7pj
    @DM-py7pj 9 months ago +1

    Love it, especially that you zoom in. Entirely personal, I know, but I prefer minimal use of bright white backgrounds, and the spinning red rectangles used to focus attention on certain areas don't need the rotating effect. I find the rotation distracting and a stressor.

  • @robingerster1253
    @robingerster1253 2 years ago +1

    I have never subscribed to anything on YouTube before ... but I was just so happy with these cleanly executed and correct tutorials :). One suggestion would be to add certificates to your courses and give people something to work towards.

  • @tymothylim6550
    @tymothylim6550 3 years ago +1

    Thank you very much for this video! It was great that you used visualizations to explain the notation! I usually don't like looking at mathematical notations but the visualizations made it fun and enjoyable to understand!

  • @nerkulec
    @nerkulec 6 years ago +3

    Thank you very much! This is very important

  • @circuithead94
    @circuithead94 3 years ago +1

    Very detailed and carefully done. Thank you so much for your work.

  • @dcrespin
    @dcrespin a year ago

    It may be worth noting that instead of partial derivatives one can work with derivatives as the linear transformations they really are.
    Also, looking at the networks in a more structured manner makes clear that the basic ideas of BPP apply to very general types of neural networks. Several steps are involved.
    1.- More general processing units.
    Any continuously differentiable function of inputs and weights will do; these inputs and weights can belong, beyond Euclidean spaces, to any Hilbert space. Derivatives are linear transformations and the derivative of a neural processing unit is the direct sum of its partial derivatives with respect to the inputs and with respect to the weights. This is a linear transformation expressed as the sum of its restrictions to a pair of complementary linear subspaces.
    2.- More general layers (any number of units).
    Single unit layers can create a bottleneck that renders the whole network useless. Putting together several units in a single layer is equivalent to taking their product (as functions, in the sense of set theory). The layers are functions of the inputs and of the weights of the totality of the units. The derivative of a layer is then the product of the derivatives of the units; this is a product of linear transformations.
    3.- Networks with any number of layers.
    A network is the composition (as functions, and in the set theoretical sense) of its layers. By the chain rule the derivative of the network is the composition of the derivatives of the layers; this is a composition of linear transformations.
    4.- Quadratic error of a function.
    ...
    ---
    With the additional text down below this is going to be excessively long, so I will stop the itemized comments here.
    The point is that a sufficiently general, precise and manageable foundation for NNs clarifies many aspects of BPP.
    If you are interested in the full story and have some familiarity with Hilbert spaces please google for our paper dealing with Backpropagation in Hilbert spaces. A related article with matrix formulas for backpropagation on semilinear networks is also available.
    We have developed a completely new deep learning algorithm called Neural Network Builder (NNB) which is orders of magnitude more efficient, controllable, precise and faster than BPP.
    The NNB algorithm assumes the following guiding principle:
    The neural networks that recognize given data, that is, the “solution networks”, should depend only on the training data vectors.
    Optionally the solution network may also depend on parameters that specify the distances of the training vectors to the decision boundaries, as chosen by the user and up to the theoretically possible maximum. The parameters specify the width of chosen strips that enclose decision boundaries, from which strips the data vectors must stay away.
    When using the traditional BPP the solution network depends, besides the training vectors, on guessing a more or less arbitrary initial network architecture and initial weights. Such is not the case with the NNB algorithm.
    With the NNB algorithm the network architecture and the initial (same as the final) weights of the solution network depend only on the data vectors and on the decision parameters. No modification of weights, whether incremental or otherwise, needs to be done.
    For a glimpse into the NNB algorithm, search on this platform for our video:
    NNB Deep Learning Without Backpropagation.
    Links to free demo software can be found in the description of that video.
    The new algorithm is based on the following very general and powerful result (google it): Polyhedrons and Perceptrons Are Functionally Equivalent.
    For the conceptual basis of general NNs, see our article Neural Network Formalism.
    Regards,
    Daniel Crespin
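    To put item 3 above into a worked equation (a sketch in generic notation, not taken from the commenter's papers): a network N built from layers L_1, ..., L_m is the composition

    N = L_m \circ L_{m-1} \circ \cdots \circ L_1

    and the chain rule gives its derivative at an input x as a composition of linear transformations,

    DN(x) = DL_m(x_{m-1}) \circ \cdots \circ DL_2(x_1) \circ DL_1(x_0), \qquad x_0 = x, \quad x_i = L_i(x_{i-1}).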

  • @pravindurgani1276
    @pravindurgani1276 3 years ago +1

    You are 100 times better than the Professor of ML at my uni.

  • @lancelotdsouza4705
    @lancelotdsouza4705 2 years ago +1

    The videos are so nicely explained, especially the math. Thanks so much!

  • @vaishnavbharadwaj669
    @vaishnavbharadwaj669 3 years ago +2

    Amazing work!! Thank you for making ML so easy to understand.

  • @RobertsMrtn
    @RobertsMrtn 5 years ago +2

    Firstly, I would like to congratulate you on such a well presented video. This has clarified the existing technique of backpropagation. It seems to me, however, that the technique would work best if the loss function were linearly proportional to the weights on each layer. The technique works quite well and is used in many current AI applications. It does, however, require large training data sets and is not as efficient as the brain, which does not use backpropagation. Thanks again, this will help me in my research. I will definitely be subscribing.

  • @ramiro6322
    @ramiro6322 3 years ago +4

    0:00 Introduction
    0:40 Outline
    1:26 Backpropagation Recap
    2:50 Definitions and Notation
    7:20 Review
    10:00 About the next video
    I hope this is useful!

    • @deeplizard
      @deeplizard 3 years ago +1

      Perfect, thank you! Added to the description :)

  • @rangarajubharath1095
    @rangarajubharath1095 4 years ago +1

    love ur knowledge

  • @deeplizard
    @deeplizard 6 years ago

    Backpropagation explained | Part 1 - The intuition
    czcams.com/video/XE3krf3CQls/video.html
    Backpropagation explained | Part 2 - The mathematical notation
    czcams.com/video/2mSysRx-1c0/video.html
    Backpropagation explained | Part 3 - Mathematical observations
    czcams.com/video/G5b4jRBKNxw/video.html
    Backpropagation explained | Part 4 - Calculating the gradient
    czcams.com/video/Zr5viAZGndE/video.html
    Backpropagation explained | Part 5 - What puts the “back” in backprop?
    czcams.com/video/xClK__CqZnQ/video.html
    Machine Learning / Deep Learning Fundamentals playlist: czcams.com/play/PLZbbT5o_s2xq7LwI2y8_QtvuXZedL6tQU.html
    Keras Machine Learning / Deep Learning Tutorial playlist: czcams.com/play/PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL.html

  • @Otonium
    @Otonium 5 years ago +1

    Notation is often omitted in videos. Good work

  • @richarda1630
    @richarda1630 3 years ago +1

    Whew! Now onto the next ... :)

    • @deeplizard
      @deeplizard 3 years ago

      Glad to see you progressing through the content, Richard!

  • @davidmusoke
    @davidmusoke 4 years ago +1

    Hey, first-time listener here! I'm amazed at the quality of your videos: easy to listen to, a calm, beautiful voice, and the right pacing for the concepts you are teaching. I like the way you simplify complex concepts for newbies in ML like me (a Ph.D. student in Biomedical Engineering, btw). I was wondering about the Scientific Notebook you use in these videos... who makes this software? Also, I'm interested in de-noising images. Is it readily possible to train a network to recognize noise within an image, develop a noise template, and use it to remove the noise without softening the image? The current ML algorithms I've seen, which mostly involve reducing MSE, blur sharp image details. I'm looking at noisy images from low-dose CT scanners, as that's the rage now, to try to reduce the deadly x-ray dosage to the patient.
    Your channel is gold, as others have said. I watch one video after another without stopping, and they are just about the right length before one loses focus. If it wasn't for the fact that we are on Spring break this week, I'd be skipping classes by mistake, as I'm currently engrossed in your videos. Video #24 so far since last night(!)... 14 more to go. Hope to finish this evening. Not sure what to do next :) !
    Thanks again for this great channel, deeplizard ... Cool and ingenious name, btw.

    • @deeplizard
      @deeplizard 4 years ago

      Hey David - Haha that's great! Glad you're enjoying content :D
      Regarding the notebook, it was created using MacKichan Software's Scientific Notebook 5.5.
      Regarding denoising images, autoencoders have been used to accomplish this. I've not worked much with them, but I cover them a bit in the episode on unsupervised learning and show a brief denoising example.
      deeplizard.com/learn/video/lEfrr0Yr684
      Regarding where to go next, I'd recommend the Keras series :)
      deeplizard.com/learn/playlist/PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL

    • @davidmusoke
      @davidmusoke 4 years ago

      Thanks for the quick response and helpful links, deeplizard. One more question, and that's regarding the back-propagation (BP) video. Does one BP to the preceding inner layer L-1 and have its weights recalculated, or do you BP all the way to the very first inner layer and then have new weights recalculated for all layers throughout the network? Thanks again!

  • @ThatIndanBoy
    @ThatIndanBoy 4 years ago +1

    Amazing

  • @pawansj7881
    @pawansj7881 6 years ago +4

    I have observed one topic missing in this playlist: "BIAS". If you add one video explaining the importance of biases, that would make this playlist perfect.

    • @deeplizard
      @deeplizard 6 years ago

      Thanks for the suggestion, Pawan! I have bias on my list to cover in a future vid!

    • @deeplizard
      @deeplizard 6 years ago +5

      Now, there is a video on bias 😎
      czcams.com/video/HetFihsXSys/video.html

    • @pawansj7881
      @pawansj7881 6 years ago +1

      That's great!!! Hitting the point!!! I'm glad that "deeplizard" has considered my suggestion.

  • @SachaD88
    @SachaD88 4 years ago

    Why, in theory, is the argument of the derivative the weighted sum, sum(w*x), going into the activation function, while in all code implementations the argument of the derivative is the output of the activation function?
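    (A sketch of the likely resolution, assuming a sigmoid activation: the derivative can be evaluated from the activation output itself, so implementations cache the forward-pass output a = \sigma(z) and never need z again. As a worked equation,

    \sigma(z) = \frac{1}{1 + e^{-z}}, \qquad \sigma'(z) = \sigma(z)\,(1 - \sigma(z)) = a\,(1 - a).

    Both forms are the same derivative; the code version simply computes it from the stored output.)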

  • @sonixdream6792
    @sonixdream6792 3 years ago

    Great tutorials, very well done. Thank you for them.
    A small fix for the code on the site, specifically the j++; without it you get an infinite loop :).
    int sum = 0;
    int j = 0;
    while (j < a.length) {
        sum = sum + a[j++]; // post-increment advances j each iteration
    }

  • @John-wx3zn
    @John-wx3zn 2 months ago

    Hi Mandy, where does the weight of the connection come from?

  • @elgs1980
    @elgs1980 3 years ago

    Where are the weights from layer l to the output?

  • @dufferinmall2250
    @dufferinmall2250 6 years ago +3

    Thank you soooooo much! PLEASE do a video on LSTMs

    • @deeplizard
      @deeplizard 6 years ago +3

      You're welcome! Also, I have LSTMs on my list to cover in a future video!

    • @WesternJiveStudio
      @WesternJiveStudio 5 years ago

      Great to hear that!

  • @s25412
    @s25412 3 years ago

    3:44 you've defined that layers l and l-1 are indexed as j=0,1,...,n-1 and k=0,1,...,n-1, respectively. But what if layers l and l-1 have different numbers of nodes?
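    (A sketch of how the notation generalizes, assuming each layer is allowed its own node count, written here as n_l: the indices run j = 0, 1, ..., n_l - 1 for layer l and k = 0, 1, ..., n_{l-1} - 1 for layer l-1, so the weighted sum becomes

    z_{j}^{(l)} = \sum_{k=0}^{n_{l-1}-1} w_{jk}^{(l)} a_{k}^{(l-1)}

    and nothing forces the two layers to have the same number of nodes.)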

  • @Waleed-qv8eg
    @Waleed-qv8eg 6 years ago +1

    Thank you so much for the explanation! Is knowing the math operations in detail mandatory if I plan to use deep learning for image processing, or is it enough to know how it works without the math details?

    • @deeplizard
      @deeplizard 6 years ago +2

      You're welcome! It isn't required to understand the math in order to build or use a neural network for image processing. For example, you'll see in the Keras playlist (czcams.com/play/PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL.html) that we don't use much math when building and coding our networks.
      Developing an understanding for the math does, however, give you a deeper understanding for what otherwise looks like magic. Also, if you do understand the math, then it may help you with designing your architecture, tuning your model, and even troubleshooting when the model is not performing in a way that you'd expect.

    • @Waleed-qv8eg
      @Waleed-qv8eg 6 years ago +1

      deeplizard yes, I've watched 3 videos so far, but I stopped a little bit to finish this great playlist! Honestly, I'm so excited and don't want this playlist to end, so please add more and more videos to it!

    • @deeplizard
      @deeplizard 6 years ago

      I'm so happy to hear how much you're enjoying the content! Thanks for letting me know :)
      More videos to come soon!

    • @deepaksingh9318
      @deepaksingh9318 6 years ago +1

      الانترنت لحياة أسهل Net4Easy exactly, I also want more and more videos like these

  • @asdfasdfuhf
    @asdfasdfuhf 5 years ago +1

    Can you share that "mathematic scientific notebook" that you are reading from in the video?

    • @deeplizard
      @deeplizard 5 years ago

      Hey Sebastian - Download access to code files and notebooks are available as a perk for the deeplizard hivemind. Check out the details regarding deeplizard perks and rewards at: deeplizard.com/hivemind
      If you choose to join, you will gain download access to the math notebook from the backprop series here:
      www.patreon.com/posts/22080906
      Note, the notebook was created using MacKichan Software's Scientific Notebook 5.5. The notebook is a .tex file. To open this file, you can download Scientific Viewer for free from the link below, which will allow you to view the notebook but not edit it.
      www.mackichan.com/index.html?products/sv.html~mainFrame
      You may also want to check out Scientific Notebook for purchase or a free trial from the link below if you want to create, edit, and save your own notebooks.
      www.mackichan.com/index.html?products/dnloadreq.html~mainFrame

  • @thespam8385
    @thespam8385 4 years ago +1

    {
      "question": "Indices are important to understand the interaction between layers and nodes in the backpropagation algorithm. Where l is the layer index:",
      "choices": [
        "j is the node index for l and k is the node index for l-1.",
        "k is the node index for l and j is the node index for l-1.",
        "j is the node index for l and k is the node index for l+1.",
        "k is the node index for l and j is the node index for l+1."
      ],
      "answer": "j is the node index for l and k is the node index for l-1.",
      "creator": "Chris",
      "creationDate": "2020-04-17T17:36:43.623Z"
    }

    • @deeplizard
      @deeplizard 4 years ago +1

      More great questions, thanks Chris!
      Just added your question to deeplizard.com/learn/video/2mSysRx-1c0 :)

  • @s25412
    @s25412 3 years ago

    Do you not consider bias? I assume it's excluded for simplicity?

  • @ep9017
    @ep9017 4 years ago

    The node indexing implies that every layer has the same number of nodes, which shouldn't be a restriction

  • @rewangtm
    @rewangtm 4 years ago

    Much better explanation than Andrew Ng!

  • @ismailelabbassi7150
    @ismailelabbassi7150 2 years ago +1

    I was loving you, but now I love you so much. Thank you so much!

  • @DanielMarrable
    @DanielMarrable 6 years ago +1

    You said, "use the math for backprop moving forward"?

    • @deeplizard
      @deeplizard 6 years ago +3

      Hey Daniel - By "moving forward," I meant this as in "moving forward in our process of learning backprop." I wasn't meaning it in a way that suggests backprop is used in forward manner. Hope this helps clarify! Let me know if it doesn't.

  • @Actanonverba01
    @Actanonverba01 4 years ago +1

    From 1:24 to 2:50 you are talking and describing things, BUT there is nothing on the screen. You need more pictures that illustrate your words. Also, the next slide, "Definitions & Notation", has a list of items with no arrows pointing to what each item is in the diagram on the right. I was lost... This is a visual medium!

    • @deeplizard
      @deeplizard 4 years ago

      During that time in the episode, I am summarizing the process that was taught in the previous episode. The corresponding visuals for that process, along with a more detailed explanation, can be found in the previous episode. Additionally, there are corresponding written posts for most videos that you can find on deeplizard.com that may be helpful for you. The one for this video is here:
      deeplizard.com/learn/video/2mSysRx-1c0
      The previous episode I mentioned is here:
      deeplizard.com/learn/video/XE3krf3CQls

  • @fupopanda
    @fupopanda 5 years ago

    So I assume n is the number of nodes in a layer. That's the only logical answer that works here.