C4W2L07 Inception Network

  • Published 29 Aug 2024
  • Take the Deep Learning Specialization: bit.ly/39u2Aa3
    Check out all our courses: www.deeplearni...
    Subscribe to The Batch, our weekly newsletter: www.deeplearni...
    Follow us:
    Twitter: / deeplearningai_
    Facebook: / deeplearninghq
    Linkedin: / deeplearningai

Comments • 45

  • @user-tj4ut8ox9r • 4 years ago • +47

    Didn't know one could cite memes in science papers 😂 gonna do it in mine

  • @navid2368 • 5 years ago • +44

    Andrew teaching me about memes - epic!
    Thanks again for the amazing lecture.

  • @sebbecht • 5 years ago • +22

    Yep I am totally going to find a way to cite a meme in one of my papers!

  • @user-tj4ut8ox9r • 4 years ago • +7

    I'm so excited about this channel that I'm actually a little paranoid that it will shut down before I can finish watching all of the videos

    • @ericklestrange6255 • 4 years ago • +1

      It's a very new topic, so there isn't much material on it yet; but as time progresses, even if this video were removed, more and better videos on the subject would probably appear.

  • @ureunz • 4 years ago • +1

    Thanks so much, Andrew. 🙏 Through your video I totally understood the Inception network (including CNNs)!!
    A systematic and easy explanation that is hard to find in other Korean videos - I strongly recommend this video. 👍

  • @shwethasubbu3385 • 5 years ago • +5

    How do you know how small the bottleneck layer should be? And why would one want to shrink the number of channels, as done at 8:46 - what is the benefit of doing this?
    Also, instead of using a 1x1 as a bottleneck layer and then using 5x5 or 3x3 filters to reduce computational complexity, why can't we just use 1x1 filters throughout to get the required output dimensions?

    • @kishorekumarsingam • 5 years ago • +1

      Please let me know if anyone finds the answer.

    • @OneAndOnlyBigMax • 5 years ago • +2

      From what I've understood from this video, the idea is just to reduce the amount of computation needed.
      Think of it as densely packing and combining all the features from the previous layer and then working with these "dense/tight" features - this is simply computationally cheaper.
      You can't just throw away the 3x3 and 5x5 filters, because they are useful for finding "patterns" in images, like lines, curves, etc. A 1x1 convolution only looks at one pixel at a time - such a network would just be an MLP with extra steps.
      This is just my intuition - feel free to add or correct anything.

    • @pushyashah1377 • 5 years ago • +1

      Filters like 3x3 and 5x5 are used to learn edge detection and many more details, whereas 1x1 filters are a really useful tool for changing the number of channels without altering the spatial dimensions.
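
A quick way to check the computation-saving claim discussed in this thread, using the numbers from the lecture (a 28x28x192 input, a 5x5 convolution to 32 channels, and a bottleneck through 16 channels). `conv_mults` is an illustrative helper, not library code, and additions/biases are ignored:

```python
# Multiplication counts for the lecture's bottleneck example.

def conv_mults(h, w, c_out, k, c_in):
    """Multiplications for a same-padded k x k convolution:
    one k*k*c_in dot product per output position per output channel."""
    return h * w * c_out * k * k * c_in

# Direct 5x5 convolution, 192 -> 32 channels:
direct = conv_mults(28, 28, 32, 5, 192)

# Bottleneck: 1x1 conv 192 -> 16, then 5x5 conv 16 -> 32:
bottleneck = conv_mults(28, 28, 16, 1, 192) + conv_mults(28, 28, 32, 5, 16)

print(direct)      # 120422400, roughly 120M
print(bottleneck)  # 12443648, roughly 12.4M - about a 10x saving
```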

  • @piankk • 3 years ago • +3

    Great lecture! Thank you.
    I have a question.
    2:18
    Can I move the 1x1 conv layer to before the max-pooling layer to reduce channels, like the ones before the 3x3 and 5x5 convs?
    What’s the difference?

    • @Matttsight • a year ago

      That's the same question I have. Did you get any clarity on that?

  • @sandipansarkar9211 • 3 years ago

    Very good explanation. Need to watch it again.

  • @EranM • 5 years ago • +6

    Know your meme!

  • @scchouhansanjay • 3 years ago • +1

    I was thinking of the movie as well, but I also thought I was stupid to think that such an important paper would have anything to do with a movie.

  • @sinaro93 • a year ago

    Thanks a lot for sharing this.

  • @beagle989 • 2 years ago

    I love you and your videos.

  • @marcocaceres4867 • a year ago

    How small does the number of filters have to be in the 1x1 convolution? Why 192 -> 16 -> 32 instead of 192 -> 1 -> 32?

  • @conorsmyth12358 • 3 years ago

    This movie The Inception sounds great.

  • @shaelanderchauhan1963 • 2 years ago

    What is the purpose of using max pooling if we are not reducing the height and width? In previous lectures Andrew Ng said that max pooling reduces the dimensions, e.g. from 28*28*8 to 14*14*8. What is the purpose of applying it here - is it only to keep the most important information by taking the max?

    • @aidynabirov7728 • a year ago

      Well, the purpose is not actually reducing the height and width; it's more about taking the features (max values) from the activations and leaving out the unnecessary ones. So that's why, I guess, they apply max pooling - to reduce some features - while applying padding at the same time so as not to reduce the dimensions. It's just an assumption; please let me know if I am wrong :)))
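
A small sketch of the point raised in this thread: a 3x3 max pool with stride 1 and "same" padding keeps the 28x28 spatial size, which is what lets its branch be concatenated with the conv branches. `pool_same` is an illustrative helper, not library code:

```python
import numpy as np

def pool_same(x, k=3):
    """3x3 max pool, stride 1, zero-centred 'same' padding on a 2-D map."""
    h, w = x.shape
    p = k // 2
    # Pad with -inf so the border maxima come only from real values.
    padded = np.pad(x, p, mode="constant", constant_values=-np.inf)
    out = np.empty_like(x)
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].max()
    return out

x = np.random.rand(28, 28)
assert pool_same(x).shape == x.shape  # height and width are unchanged
```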

  • @amineabid7220 • 4 years ago • +1

    How do we manage to train the intermediate forks (to make predictions from hidden layers)? Do we stop the backpropagation at the fork?

    • @amineabid7220 • 4 years ago • +1

      Besides, how do we define a labeled sample? 1. Is it the same training sample (x, y)? In that case, what would be the impact of running two backpropagation processes on the same layers (all layers before the fork)?
      2. Or is it (x', y), where x' is the activation of the layer just before the fork? Here I see a problem concerning the fork itself: for the very first training samples, the activation just before the fork depends heavily on the weights of all the previous layers, which are not well trained yet (still a random guess). This means that this activation is not, to some extent, a good representation of the input data in the first place. Using this "bad/distorted" data as input to the forked network would make us train it to discriminate something completely different from our real input data!

    • @mashmesh • 4 years ago

      I am also curious about this. I do not understand how the outputs from the forks are put together in the end.
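
On the question in this thread: in the original GoogLeNet paper, the auxiliary classifiers use the same labels y as the main output; their losses are added to the main loss with weight 0.3, one backward pass covers everything, and the side branches are discarded at test time. A minimal sketch of the combined objective (`total_loss` is an illustrative helper, and the loss values below are made up):

```python
AUX_WEIGHT = 0.3  # weight used for each side branch in the GoogLeNet paper

def total_loss(main_loss, aux_losses):
    """One training objective: main loss plus down-weighted side-branch losses."""
    return main_loss + AUX_WEIGHT * sum(aux_losses)

# Backprop on this single scalar sends gradients from the side branches
# into the shared early layers - no separate dataset is needed.
loss = total_loss(1.20, [1.50, 1.40])  # approximately 2.07
```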

  • @strongsyedaa7378 • 2 years ago

    I haven't understood 😕
    Why are we using different filters?

  • @TheSalto66 • 4 years ago

    The film Inception is about putting an external thought (a virus) inside another person's mind; from that thought, a new way of seeing the world begins.

  • @andrewwilliam2209 • 4 years ago

    What are the max pools for if they don't reduce the size ("same" max pooling)?

  • @dcastudios7185 • 5 years ago • +9

    And I came here from - Inception movie explained 💤

  • @ericklestrange6255 • 4 years ago

    Thanks, very useful.

  • @iammakimadog • 3 years ago

    2:52 What's the purpose of "maxpool -> 1x1 conv"? It seems like applying the 1x1 conv directly would be strictly better, because some information will be lost in the max pool...

    • @akashkewar • 3 years ago

      Information is always lost in pooling; we use the max pool to keep the important information (it has "max" in it, so we keep the pixel that carries the most information). We use the 1x1 conv to change the depth (channels) of the max-pooling output so that we can perform concatenation (which requires the same dimensions). Additionally, not every piece of information is important; using too much information could result in overfitting, and you can compensate for it by using the max pool (which could, theoretically, be treated as a weak regularizer), a kernel regularizer, dropout, and so on.
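
The "1x1 conv after the pool" step discussed above can be sketched directly: a 1x1 convolution is just a matrix multiply over the channel axis at each pixel, so after the same-padded max pool (which keeps all 192 channels) it can shrink the depth before concatenation. Shapes follow the lecture's 28x28x192 example; the random values are placeholders:

```python
import numpy as np

pooled = np.random.rand(28, 28, 192)  # output of the same-padded max pool
w = np.random.rand(192, 32)           # 32 filters, each of size 1x1x192

# Per-pixel channel mixing: numpy matmul applies w at every (i, j) position.
projected = pooled @ w
assert projected.shape == (28, 28, 32)  # same 28x28 grid, fewer channels
```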

  • @aione7448 • 4 years ago • +1

    I didn't get the Inception module. How can we stack the convolved images after passing them through different filters without padding, since the dimensions of the convolved image change with the filter size?

    • @MuhammadTalha-fz1it • 4 years ago

      Hey, see in the lecture: the same output shape (28*28) comes from every filter, and each output is stacked onto the others.

    • @user-tj4ut8ox9r • 4 years ago

      They are padded.
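
The stacking the replies describe can be sketched in a few lines: because every branch uses "same" padding, all outputs share the 28x28 spatial size and differ only in depth, so they concatenate along the channel axis. The branch depths (64, 128, 32, 32) follow the first inception module in the GoogLeNet paper; the random values are placeholders:

```python
import numpy as np

branches = [
    np.random.rand(28, 28, 64),   # 1x1 conv branch
    np.random.rand(28, 28, 128),  # 3x3 conv branch (same padding)
    np.random.rand(28, 28, 32),   # 5x5 conv branch (same padding)
    np.random.rand(28, 28, 32),   # max pool + 1x1 conv branch
]

# Channel-wise concatenation: spatial dims must match, depths add up.
stacked = np.concatenate(branches, axis=-1)
assert stacked.shape == (28, 28, 256)  # 64 + 128 + 32 + 32
```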

  • @vytasmatas • 5 years ago

    Thank you, very useful.

  • @itohamza • 3 years ago

    How do you concatenate 4 different data sizes?

  • @alex_316 • a year ago

    This looks like AutoML, but for conv nets.

  • @DrN007 • 4 years ago • +3

    A better name would have been: Deep Modular Network.

  • @DevilErnest • 3 years ago

    Padding is applied, isn’t it?

  • @trexmidnite • 3 years ago

    VampireNet would also be good.

  • @thovinh5386 • 4 years ago • +1

    From now on, I'll expect memes while reading papers.

  • @anjithaanju2584 • 3 years ago

    Can you make a video about Inception v4?