Layer Normalization in Transformers | Layer Norm Vs Batch Norm

  • Added on 27 Jun 2024
  • Layer Normalization is a technique used to stabilize and accelerate the training of transformers by normalizing the inputs across the features of each token. It re-centers and re-scales the activations, keeping output distributions consistent, which reduces training time and improves model performance, making it a key component of transformer architectures. (A short illustrative sketch follows the timestamps below.)
    Share your thoughts, experiences, or questions in the comments below. I love hearing from you!
    ============================
    Did you like my teaching style?
    Check my affordable mentorship program at : learnwith.campusx.in
    DSMP FAQ: docs.google.com/document/d/1O...
    ============================
    📱 Grow with us:
    CampusX' LinkedIn: / campusx-official
    CampusX on Instagram for daily tips: / campusx.official
    My LinkedIn: / nitish-singh-03412789
    Discord: / discord
    E-mail us at support@campusx.in
    ✨ Hashtags✨
    #deeplearning #campusx #transformers #transformerarchitechture
    ⌚Time Stamps⌚
    00:00 - Intro
    02:20 - What is Normalization
    03:50 - What do we normalize?
    05:30 - Benefits of Normalization in DL
    07:10 - Internal Covariate Shift
    12:49 - Batch Normalization Revision
    22:56 - Why don't we use Batch Norm in Transformers?
    38:25 - How does Layer Normalization work?
    43:00 - Layer Normalization in Transformer
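
    To make this concrete, here is a minimal, illustrative PyTorch sketch (an editorial addition, not code from the video): it normalizes each token's activations across the feature dimension by hand and checks the result against torch.nn.LayerNorm.

        # Minimal sketch; assumes PyTorch is installed. Illustrative only.
        import torch
        import torch.nn as nn

        torch.manual_seed(0)
        batch, seq_len, d_model = 2, 4, 8            # (batch, tokens, features)
        x = torch.randn(batch, seq_len, d_model)

        # Manual layer norm: statistics are computed per token, across the features.
        mu = x.mean(dim=-1, keepdim=True)
        var = x.var(dim=-1, unbiased=False, keepdim=True)
        x_hat = (x - mu) / torch.sqrt(var + 1e-5)    # normalized activations
        gamma = torch.ones(d_model)                  # learnable scale (init 1)
        beta = torch.zeros(d_model)                  # learnable shift (init 0)
        manual = x_hat * gamma + beta

        # Built-in layer norm over the last (feature) dimension.
        ln = nn.LayerNorm(d_model)
        print(torch.allclose(manual, ln(x), atol=1e-5))  # True

    Each token gets its own mean and variance, which is why layer norm does not depend on the batch size or on the other tokens in the sequence.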

Comments • 85

  • @abhisheksaurav
    @abhisheksaurav 21 days ago +31

    This playlist is like a time machine. I’ve watched you grow your hair from black to white, and I’ve seen the content quality continuously improve video by video. Great work!

    • @animatrix1631
      @animatrix1631 21 days ago

      I feel the same, but I guess he's not that old.

  • @RamandeepSingh_04
    @RamandeepSingh_04 5 days ago +3

    Another student added to the waiting list, asking for the next video. Thank you, sir.

  • @muhammadsheraz177
    @muhammadsheraz177 21 days ago +15

    Please finish this playlist as early as possible.

  • @ayushrathore2570
    @ayushrathore2570 14 days ago +1

    This whole playlist is the best thing I have discovered on YouTube! Thank you so much, sir.

  • @yashshekhar538
    @yashshekhar538 14 days ago +4

    Respected Sir,
    Your playlist is the best. Kindly increase the frequency of videos.

  • @ESHAANMISHRA-pr7dh
    @ESHAANMISHRA-pr7dh 2 days ago

    Respected sir,
    I request you to please complete the playlist. I am really thankful to you for the amazing videos in this playlist. I have recommended it to a lot of my friends and they loved it too. Thanks for providing such content for free 🙏🙏

  • @akeshagarwal794
    @akeshagarwal794 18 days ago +2

    Congratulations on building a 200k family; you deserve even more reach 🎉❤
    We love you, sir ❤

  • @GanitSikho-xo2yx
    @GanitSikho-xo2yx 2 days ago

    Well, I am waiting for your next video. It's a gem of learning!

  • @shreeyagupta5720
    @shreeyagupta5720 21 days ago +2

    Congratulations on 200k, sir 👏 🎉🍺

  • @rajnishadhikari9280
    @rajnishadhikari9280 21 days ago

    Thanks for this amazing series.

  • @AmitBiswas-hd3js
    @AmitBiswas-hd3js 3 days ago

    Please cover the entire Transformer architecture as soon as possible.

  • @sahil5124
    @sahil5124 20 days ago +1

    This is a really important topic. Thank you so much.
    Please cover everything about the Transformer architecture.

  • @user-nc8nc3lj1c
    @user-nc8nc3lj1c 13 days ago

    Sir, try to complete this playlist as early as possible. You are the best teacher and we want to learn the deep learning concepts from you.

  • @ai_pie1000
    @ai_pie1000 19 days ago +1

    Congratulations, brother, on the 200k family... 👏👏👏

  • @rb4754
    @rb4754 20 days ago

    Congratulations on 200k subscribers!!!

  • @arpitpathak7276
    @arpitpathak7276 21 days ago

    Thank you, sir, I was waiting for this video ❤

  • @mayyutyagi
    @mayyutyagi 19 days ago

    Amazing series, full of knowledge...

  • @znyd.
    @znyd. 19 days ago

    Congrats on the 200k subs, love from Bangladesh ❤.

  • @advaitdanade7538
    @advaitdanade7538 21 days ago +3

    Sir, please finish this playlist fast; placement season is nearby 😢

  • @hassan_sid
    @hassan_sid 21 days ago

    It would be great if you made a video on RoPE.

  • @dharmendra_397
    @dharmendra_397 21 days ago

    Very nice video

  • @shibrajdeb5177
    @shibrajdeb5177 14 days ago

    Sir, please upload videos regularly. These videos help me a lot.

  • @1111Shahad
    @1111Shahad 13 days ago

    Thank you, Nitish. Waiting for your next upload.

  • @muhammadsheraz177
    @muhammadsheraz177 21 days ago +1

    Sir, could you kindly tell us when this playlist will be completed?

  • @physicskiduniya8054
    @physicskiduniya8054 4 days ago

    Bhaiya! Awaiting your upcoming course videos. Please try to complete this playlist ASAP, bhaiya.

  • @taseer12
    @taseer12 21 days ago +1

    Sir, I can't describe your efforts. Love from Pakistan.

  • @SulemanZeb.
    @SulemanZeb. 21 days ago +1

    Please start the MLOps playlist; we are desperately waiting for it.

  • @rose9466
    @rose9466 21 days ago

    Can you give an estimate of when this playlist will be completed?

  • @WIN_1306
    @WIN_1306 3 days ago

    I am the 300th person to like this video.
    Sir, please upload the next videos.
    We are eagerly waiting.

  • @saurabhbadole821
    @saurabhbadole821 15 days ago

    I am glad that I found this channel! Can't thank you enough, Nitish Sir!
    One more request: it would be great if you could create one-shot revision videos for machine learning, deep learning, and natural language processing (NLP) 🤌

  • @gurvgupta5515
    @gurvgupta5515 16 days ago

    Thanks for this video, sir. Can you also make a video on Rotary Positional Embeddings (RoPE), which is used in Llama as well as other LLMs for enhanced attention?

  • @29_chothaniharsh62
    @29_chothaniharsh62 21 days ago

    Sir, can you please continue the 100 interview questions on ML playlist?

  • @shubharuidas2624
    @shubharuidas2624 12 days ago

    Please also continue with the vision transformer.

  • @not_amanullah
    @not_amanullah 12 days ago

    Thanks ❤

  • @not_amanullah
    @not_amanullah 12 days ago

    This is helpful 🖤

  • @intcvn
    @intcvn 4 days ago

    Please complete it quickly, sir; waiting eagerly.

  • @sagarbhagwani7193
    @sagarbhagwani7193 20 days ago

    Thanks, sir. Please complete this playlist ASAP.

  • @anonymousman3014
    @anonymousman3014 9 days ago

    Sir, is the transformer architecture part complete? I want to cover it ASAP; I have covered the topics up to the attention mechanism.
    I want to cover the topic in one go, so please tell me, sir. Also, I request you to upload all the videos ASAP. I want to learn a lot, and thanks for the amazing course at zero cost. God bless you.

  • @MrSat001
    @MrSat001 21 days ago

    Great 👍

  • @teksinghayer5469
    @teksinghayer5469 21 days ago

    When will you code a transformer from scratch in PyTorch?

  • @Amanullah-wy3ur
    @Amanullah-wy3ur 3 days ago

    Thanks ❤

  • @technicalhouse9820
    @technicalhouse9820 20 days ago

    Sir, love you so much from Pakistan.

  • @virajkaralay8844
    @virajkaralay8844 20 days ago

    Absolute banger video again. I appreciate the effort you're putting into transformers. Cannot wait for when you explain the entire transformer architecture.

    • @virajkaralay8844
      @virajkaralay8844 20 days ago

      Also, congratulations on 200k subscribers. May you reach many more milestones.

  • @ishika7585
    @ishika7585 21 days ago +1

    Kindly make a video on regex as well.

  • @WIN_1306
    @WIN_1306 3 days ago

    At 46:10, why is it zero?
    Since beta is added, won't that prevent it from becoming zero?

  • @manojprasad6781
    @manojprasad6781 16 days ago

    Waiting for the next video 💌

  • @barryallen5243
    @barryallen5243 16 days ago

    Just ignoring the padded rows while computing batch normalization statistics should also work; I feel that the padded zeros are not the only reason we use layer normalization instead of batch normalization.

    • @WIN_1306
      @WIN_1306 3 days ago

      How would you ignore the padded columns in batch normalization? (A rough sketch of one possibility follows below.)
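
      One possibility, sketched here as a hedged editorial illustration (not from the video, and not a standard PyTorch layer): compute the per-feature batch statistics only over real tokens by using the padding mask, and compare with the naive statistics that count the padded zeros.

          # Hypothetical masked batch statistics; illustrative sketch only.
          import torch

          torch.manual_seed(0)
          batch, seq_len, d_model = 2, 4, 3
          x = torch.randn(batch, seq_len, d_model)
          mask = torch.tensor([[1., 1., 1., 0.],        # 1 = real token, 0 = padding
                               [1., 1., 0., 0.]])
          x = x * mask.unsqueeze(-1)                    # zero out padded positions

          # Naive per-feature mean: padded zeros are counted and pull it toward 0.
          naive_mean = x.reshape(-1, d_model).mean(dim=0)

          # Masked per-feature mean: average each feature only over real tokens.
          masked_mean = x.sum(dim=(0, 1)) / mask.sum()

          print(naive_mean)
          print(masked_mean)

      In practice, layer norm avoids this bookkeeping entirely, since each token is normalized on its own.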

  • @peace-it4rg
    @peace-it4rg 6 days ago +1

    Sir, my doubt is: if I use batch norm in the transformer architecture, every value in the matrix has its own learnable scale and bias factor,
    so because of that bias it won't just go to zero, then why layer norm? Since we compute ((x - mean)/std) * gamma + beta anyway, the bias term by itself won't let it become zero. Please help, sir.

    • @RamandeepSingh_04
      @RamandeepSingh_04 5 days ago

      Still, it will be a very small number; it will affect the result and not represent the true picture of the feature in batch normalization.

    • @WIN_1306
      @WIN_1306 3 days ago

      @RamandeepSingh_04 Compared to the tokens without padding it will be small, but sir still wrote zero,
      and it won't actually be exactly zero. (A small numeric sketch of this follows below.)
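
      An editorial illustration of the exchange above (my own sketch, not from the video): after batch norm, identical padded rows all map to the same, generally nonzero value, and the single per-feature gamma and beta are shared by real and padded tokens alike, so they cannot correct the padded rows separately.

          # Illustrative sketch; assumes PyTorch. Shows shared per-feature gamma/beta.
          import torch
          import torch.nn as nn

          torch.manual_seed(0)
          bn = nn.BatchNorm1d(num_features=4)     # 4 features -> 4 gammas, 4 betas
          tokens = torch.randn(6, 4)              # 6 tokens stacked from a batch
          tokens[4:] = 0.0                        # last two "tokens" are padding

          out = bn(tokens)                        # training mode: batch statistics
          print(bn.weight.shape, bn.bias.shape)   # torch.Size([4]) torch.Size([4])
          print(out[4:])                          # padded rows become -mean/std: not
                                                  # exactly 0, and identical for every
                                                  # padded row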

  • @space_ace7710
    @space_ace7710 21 days ago

    Yeah!!

  • @aksholic2797
    @aksholic2797 21 days ago

    200k🎉

  • @SANJAYTYAGI-bk6tx
    @SANJAYTYAGI-bk6tx 12 days ago

    Sir,
    In batch normalization, in your example we have three means and three variances, along with the same number of betas and gammas, i.e. 3.
    But in layer normalization, we have eight means and eight variances along with 3 betas and 3 gammas.
    That means the number of betas and gammas is the same in both batch and layer normalization.
    Is that correct? Please elaborate on it.

    • @campusx-official
      @campusx-official 12 days ago

      Yes

    • @WIN_1306
      @WIN_1306 3 days ago

      The mean and variance are used for normalization; beta and gamma are used for scaling and shifting. (A quick shape check follows below.)
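
      To make the counts above concrete, a small editorial sketch (assuming PyTorch): both layers learn one gamma and one beta per feature, while the number of means/variances differs: per feature for batch norm, per token for layer norm.

          # Sketch of parameter counts vs. statistic counts; assumes PyTorch.
          import torch
          import torch.nn as nn

          d_model, batch, seq_len = 3, 2, 4       # 3 features, 2 x 4 = 8 tokens
          bn = nn.BatchNorm1d(d_model)
          ln = nn.LayerNorm(d_model)
          print(bn.weight.shape, bn.bias.shape)   # torch.Size([3]) torch.Size([3])
          print(ln.weight.shape, ln.bias.shape)   # torch.Size([3]) torch.Size([3])

          x = torch.randn(batch, seq_len, d_model)
          # Batch norm: one mean/variance per feature -> 3 of each.
          print(x.reshape(-1, d_model).mean(dim=0).shape)   # torch.Size([3])
          # Layer norm: one mean/variance per token -> 2*4 = 8 of each.
          print(x.mean(dim=-1).shape)                       # torch.Size([2, 4])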

  • @vikassengupta8427
    @vikassengupta8427 4 days ago

    Sir, next video ❤❤

  • @user-mw9ny7wc6l
    @user-mw9ny7wc6l 12 days ago

    Please upload the next video soon, sir.

  • @adarshsagar9817
    @adarshsagar9817 15 days ago

    Sir, please complete the NLP playlist.

    • @WIN_1306
      @WIN_1306 3 days ago

      Which one?
      How many videos does it have?

  • @titaniumgopal
    @titaniumgopal 5 days ago

    Sir, please update the PDF.

  • @gauravbhasin2625
    @gauravbhasin2625 20 days ago

    Nitish, please take another look at your covariate shift fundamentals... yes, you are partially correct, but the way you explained covariate shift is actually incorrect. (Example: imagine training a model to predict whether someone will buy a house based on features like income and credit score. If the model is trained on data from a specific city with a certain average income level, it might not perform well when used in a different city with a much higher average income. The distribution of "income" (a covariate) has shifted, and the model's understanding of its relationship to house buying needs to be adjusted.)

    • @WIN_1306
      @WIN_1306 3 days ago

      I guess the explanation sir gave and yours are the same, just with different examples of covariate shift.

  • @DarkShadow00972
    @DarkShadow00972 21 days ago

    Bring some coding examples, bro.

  • @ghousepasha4172
    @ghousepasha4172 10 days ago

    Sir, please complete the playlist; I will pay 5000 for that.

  • @bmp-zz9pu
    @bmp-zz9pu 19 days ago

    A video after 2 weeks in this playlist... don't be so cruel... work a bit faster, sir ji.

  • @ashutoshpatidar3288
    @ashutoshpatidar3288 16 days ago

    Please be a little faster!

  • @Amanullah-wy3ur
    @Amanullah-wy3ur 3 days ago

    This is helpful 🖤
