Swin Transformer - Paper Explained

  • Added 29. 08. 2024

Comments • 25

  • @VedantJoshi-mr2us 2 months ago +2

    By far one of the best and most complete Swin Transformer explanations on the entire Internet.

  • @yehanwasura 1 year ago +2

    Really informative, helped me a lot to understand many of the concepts here. Keep up the good work.

  • @SizzleSan 1 year ago +1

    Thorough! Very comprehensible, thank you.

  • @omarabubakr6408 1 year ago

    That's the most illustrative video on Swin Transformers on the Internet!

    • @soroushmehraban 1 year ago

      Glad you enjoyed it 😃

    • @omarabubakr6408 1 year ago

      @soroushmehraban Yes, absolutely, thanks so much. I have a quick question, more related to PyTorch actually, about line 239 of the code at 12:49. First, what does -1 mean here, and what exactly does it do to the tensor? Second, where does [4, 16] come from? The 4 is not mentioned anywhere in the reshaping. Thanks in advance.
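
A minimal sketch of the usual reshape rule, assuming the line in question is a reshape call such as `view(-1, 16)` (the code itself is not shown in this thread, so that is an assumption). In PyTorch's `view` and `reshape`, a `-1` asks for that dimension to be inferred from the total number of elements. The helper `infer_shape` below is hypothetical and written in plain Python only to make the arithmetic visible; the 8x8 input and the 4x4 window are assumed values chosen so that the result is [4, 16].

```python
from math import prod


def infer_shape(numel, shape):
    """Resolve at most one -1 in `shape` so the element count stays `numel`.

    Hypothetical helper that mimics how a reshape call fills in a -1.
    """
    if shape.count(-1) > 1:
        raise ValueError("only one dimension may be -1")
    # Product of the explicitly given dimensions (1 if there are none).
    known = prod(d for d in shape if d != -1)
    if known == 0 or numel % known != 0:
        raise ValueError(f"cannot reshape {numel} elements into {shape}")
    if -1 not in shape and known != numel:
        raise ValueError(f"cannot reshape {numel} elements into {shape}")
    # The inferred dimension is whatever is left over after the known ones.
    return tuple(numel // known if d == -1 else d for d in shape)


# Assumed example: an 8x8 map has 64 elements; each 4x4 window holds 16.
print(infer_shape(8 * 8, (-1, 4 * 4)))  # (4, 16)
```

Under these assumptions the 4 is never written down: it is 64 / 16, the number of windows, filled in by the `-1`.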

  • @rohollahhosseyni8564 1 year ago

    Very well explained, thank you Soroush.

  • @kundankumarmandal6804 8 months ago

    You deserve more likes and subscribers.

  • @antonioperezvelasco3297 10 months ago

    Thanks for the good explanation!

  • @user-sw4hm4hh6h 1 year ago

    Perfect description.

  • @proteus333 10 months ago

    Amazing video!

  • @siarez 1 year ago

    Great video! Thanks

  • @dslkgjsdlkfjd 1 month ago

    2:43 C would be equal to the number of filters, not the number of kernels. In the torch.nn.Conv2d operation being performed, each filter has 3 kernels, one per input channel, and there are C filters. So it is C filters of 3 kernels each, not C kernels.
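
The filter/kernel distinction above can be read off the weight shape of a standard convolution. This is a small sketch in plain Python rather than a call into PyTorch: for `torch.nn.Conv2d` with `groups=1`, the weight has shape (out_channels, in_channels, kernel height, kernel width). The function `conv2d_weight_shape` is a hypothetical helper, and the values C = 96 and a 4x4 kernel are assumptions for illustration, not numbers taken from the video.

```python
def conv2d_weight_shape(in_channels, out_channels, kernel_size):
    """Weight shape of a standard (groups=1) 2D convolution with a square kernel.

    Layout: (filters, kernels per filter, kernel height, kernel width).
    """
    return (out_channels, in_channels, kernel_size, kernel_size)


C = 96          # assumed embedding dimension (number of output channels)
patch_size = 4  # assumed kernel size of the patch-embedding convolution

shape = conv2d_weight_shape(in_channels=3, out_channels=C, kernel_size=patch_size)
num_filters = shape[0]         # one filter per output channel: C
kernels_per_filter = shape[1]  # one 2D kernel per input (RGB) channel: 3
total_kernels = num_filters * kernels_per_filter

print(shape)          # (96, 3, 4, 4)
print(total_kernels)  # 288
```

With these assumed values there are C = 96 filters but 3 x 96 = 288 two-dimensional kernels, which is the point the comment makes.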

  • @akbarmehraban5007 1 year ago

    I enjoyed it very much.

  • @EngineerXYZ. 7 months ago

    Why does the channel count increase from C to 4C after merging?

    • @soroushmehraban 7 months ago +1

      Because we downsample the width by 2 and the height by 2. That is a 4x reduction in spatial resolution, and those values are moved into the channel dimension, so the total number of elements stays the same. It's just a simple tensor reshaping.
      For example 10x10x2 = 200.
      After merging it's 5x5x8 = 200.
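
The 10x10x2 to 5x5x8 example can be checked with a short sketch. It uses plain nested lists instead of tensors so that it runs without any libraries; `merge_patches` is a hypothetical helper, and the order in which the four neighbors are concatenated is an assumption that does not affect the resulting shape.

```python
def merge_patches(x):
    """Fold every 2x2 spatial neighborhood of an H x W x C grid into the channels.

    `x` is a nested list indexed as x[row][col][channel]; the result has
    shape (H/2) x (W/2) x (4*C).
    """
    h, w = len(x), len(x[0])
    if h % 2 or w % 2:
        raise ValueError("height and width must both be even")
    merged = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            # Concatenate the channel vectors of the four neighboring positions.
            row.append(x[i][j] + x[i + 1][j] + x[i][j + 1] + x[i + 1][j + 1])
        merged.append(row)
    return merged


# A 10 x 10 grid with 2 channels, as in the example above.
x = [[[0.0, 0.0] for _ in range(10)] for _ in range(10)]
y = merge_patches(x)

print(len(x), len(x[0]), len(x[0][0]))  # 10 10 2
print(len(y), len(y[0]), len(y[0][0]))  # 5 5 8
```

Both grids hold 200 values; only the split between spatial positions and channels changes.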

  • @Karthik-kt24 1 month ago

    Very nicely explained, thank you! Likes are at 314, so I didn't hit like 😁 Subscribed instead.