C4W2L06 Inception Network Motivation

  • Added 29 Aug 2024

Comments • 78

  • @letsplaionline
    @letsplaionline 7 months ago +1

    you ease our minds that are complicated by other professors and we are thankful for that!! 🙏

  • @arunyadav8773
    @arunyadav8773 6 years ago +20

    best explanation available online 👍

  • @harshniteprasad5301
    @harshniteprasad5301 a year ago +1

    7:59 the value of the convolved output should be 24*24*32, since 28*28 convolved with a 5*5 filter will return (28-5+1) = 24.

    • @aangulog
      @aangulog a year ago +1

      Not necessarily, padding can lead to a matrix of the same dimensions.

    • @harshniteprasad5301
      @harshniteprasad5301 a year ago

      @@aangulog but it wasn't mentioned that we are using padding; yes, you are correct though, we can get that output using padding

    • @aangulog
      @aangulog a year ago

      @@harshniteprasad5301 Maybe it's implied, because you could say the same about the stride.
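
For what it's worth, the dimensions debated in this thread follow the standard formula ⌊(n + 2p − f)/s⌋ + 1; a small sketch in plain Python (the function name is ours, for illustration):

```python
def conv_output_size(n, f, p=0, s=1):
    """Spatial output size of a convolution: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1

# 28x28 input, 5x5 filter, no padding, stride 1 -> 24
print(conv_output_size(28, 5))        # 24
# "same" padding for a 5x5 filter is p = (f - 1) / 2 = 2 -> output stays 28
print(conv_output_size(28, 5, p=2))   # 28
```

So both commenters are right: without padding the output shrinks to 24, and with "same" padding it stays 28.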

  • @rafibasha4145
    @rafibasha4145 2 years ago +1

    @8:84, why do we need to multiply by the output 28*28*16 instead of 1*1*192*16?
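
For reference, the multiply count used in the lecture is (number of output values) × (multiplies per output value), so both quantities in the question appear; illustrative arithmetic for the 1x1 step (assuming a 28x28x16 output over a 192-channel input):

```python
# Each of the 28*28*16 output values needs one 1x1x192 dot product.
out_h, out_w, out_c = 28, 28, 16
multiplies_per_value = 1 * 1 * 192      # one weight per input channel
total = out_h * out_w * out_c * multiplies_per_value
print(total)  # -> 2408448 (~2.4M multiplies)
```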

  • @jimmylee2197
    @jimmylee2197 6 years ago +23

    at 2:05, how does a pooling operator change the channel size from 192 to 32? Does pooling over channels make sense?

    • @hantong3108
      @hantong3108 6 years ago +1

      I think NIN was applied after the pooling to make the number of channels match, but I'm not sure why its height and width are still 28*28 after pooling

    • @zuozhou8329
      @zuozhou8329 6 years ago +3

      Pooling and CONV are actually similar; the output shape for each filter can be calculated by n(l) = (n(l-1) + 2p(l) - f(l))/s(l) + 1. Sometimes we want to keep the output the same shape as the input, i.e. n(l) = n(l-1). Usually we set s = 1, which simplifies to n(l-1) = n(l-1) + 2p(l) - f(l) + 1, meaning we can keep the shape by setting the padding to p = (f(l) - 1)/2. For example, if we have a pooling filter shaped 5 by 5, our padding should be 2.

    • @zuozhou8329
      @zuozhou8329 6 years ago

      The pooling would be applied over the 28 * 28 * 192 volume, and the number of filters would be 32.
      But I don't have an answer to your second question, sorry.

    • @parnianshahkar7797
      @parnianshahkar7797 6 years ago +1

      that's because we are using same padding! And by the way, what is a NIN?

    • @xinye8585
      @xinye8585 5 years ago

      I am also confused about it.

  • @fumihio
    @fumihio 4 years ago +8

    How to choose the number of filters? 3:50
    Why 1x1 uses 64, 3x3 uses 128, and so on?

    • @akashkewar
      @akashkewar 4 years ago +9

      well, there is no hard rule for that. The rule of thumb is: the more filters, the more features you are extracting. Also, keep in mind that not all features are important (some people think the more features, the better the model will be; this is not true at all), and this could lead to overfitting and computational overhead. And it is totally problem-specific (choosing the filter size); it is a hyperparameter in itself. Lastly, you could do a hyperparameter search to get the best filter size (that would be insane, because you have multiple layers and each layer has its own filters).

    • @tanhoang1022
      @tanhoang1022 3 years ago

      you can change P (padding) to keep the same 28x28

    • @gourabmukhopadhyay7211
      @gourabmukhopadhyay7211 2 years ago

      @@akashkewar Hey, could you explain how 28*28 remains 28*28 after 3*3 filters, and likewise for the others? I get that for 1*1 it stays 28*28, since (28-1+1) = 28. In the same manner, isn't it (28-3+1) = 26 for 3*3?

    • @RohanPaul-AI
      @RohanPaul-AI 2 years ago +2

      @@gourabmukhopadhyay7211 Good point indeed.
      And this (28*28 remaining 28*28 after 3*3 filters) is done by setting padding='same'.
      So every time the output shape will be 28 * 28.
      Check out the code below to see the result.
      ```py
      from keras.layers import Conv2D
      from keras.models import Sequential

      models = Sequential()
      models.add(Conv2D(32, kernel_size=(3, 3), activation='relu', input_shape=(28, 28, 192), padding='same'))
      models.add(Conv2D(64, kernel_size=(3, 3), activation='relu', padding='same'))
      models.add(Conv2D(128, kernel_size=(3, 3), activation='relu', padding='same'))
      models.summary()
      ```
      OUTPUT
      ```
      Model: "sequential_1"
      _________________________________________________________________
      Layer (type)                 Output Shape              Param #
      =================================================================
      conv2d_3 (Conv2D)            (None, 28, 28, 32)        55328
      conv2d_4 (Conv2D)            (None, 28, 28, 64)        18496
      conv2d_5 (Conv2D)            (None, 28, 28, 128)       73856
      =================================================================
      Total params: 147,680
      Trainable params: 147,680
      Non-trainable params: 0
      _________________________________________________________________
      ```

    • @gourabmukhopadhyay7211
      @gourabmukhopadhyay7211 2 years ago +1

      @@RohanPaul-AI Yes, I had also figured out that the padding was 'same'. But still, thank you for taking the time to comment here, as it helped me confirm.

  • @marimbanation4118
    @marimbanation4118 5 years ago +24

    give this guy a good mic

  • @SuilujChannel
    @SuilujChannel 3 years ago +3

    question regarding the 1x1 conv strategy at 6:00
    i understand that this trick reduces the number of parameters. but what i don't understand is how it is comparable to the original 5x5 conv.
    from my understanding this would create completely different features, because it does not use the original input of the layer but the output of the 1x1 conv. So what's the point?
    Update: Ah okay, he mentions this thought at the end of the video. It seems there is no big impact on performance "if you choose the reduction right".

  • @CppExpedition
    @CppExpedition 2 years ago

    Andrew is HUGEEE!☺

  • @gorgolyt
    @gorgolyt 3 years ago +2

    Absolutely lucid, as ever. 👏

  • @rehabnafea5058
    @rehabnafea5058 2 years ago

    That was very useful for me, thank you so much

  • @user-ie9tb4pr1c
    @user-ie9tb4pr1c 5 months ago

    thank you very very much🥲🥲🥲🥲🥲🥲

  • @suryanarayanan5158
    @suryanarayanan5158 4 years ago +4

    what does "same" mean? Does it mean have the same height and width as the previous layer?

    • @alanamonteiro5381
      @alanamonteiro5381 4 years ago +5

      Exactly, as he mentions, you will need to add padding for that

    • @Joshua-dl3ns
      @Joshua-dl3ns 4 years ago

      it essentially just applies the filter, then pads such that the output image has the same width and height as the input

  • @anandtewari8014
    @anandtewari8014 3 years ago

    GREAT SIR

  • @ahmeddrief3103
    @ahmeddrief3103 a year ago

    why did you use max pooling with same padding? I thought the point of max pooling is to divide the dimensions ??

  • @muhammadharris4470
    @muhammadharris4470 5 years ago +11

    3:12 rap right there :D

  • @mike19558
    @mike19558 2 years ago

    Really helpful!

  • @shvprkatta
    @shvprkatta 3 years ago

    amazing sir..thank you

  • @sandipansarkar9211
    @sandipansarkar9211 3 years ago

    nice explanation, need to watch again

  • @oktayvosoughi6199
    @oktayvosoughi6199 10 months ago

    what I cannot understand is how, after applying a 5x5 or 3x3 filter, we still have a 28x28 output; as we saw in an earlier lecture, we can find it by (n_H + 2p - f)/s + 1.

  • @Ghumnewali
    @Ghumnewali 2 years ago

    Now I gotta watch Inception again.. 🤔

  • @YigitMesci
    @YigitMesci 5 years ago +1

    What i don't understand is how an input image could have 192 channels..? Is there a common type of usage where inputs don't consist only of R, G and B channels?

    • @kartikmadhira
      @kartikmadhira 5 years ago +4

      I think the input layer he is talking about is the inception module that resides somewhat deeper in the inception network. If you look closely at the overall inception model, there are a lot of hidden layers before this module kicks in. So it's actually the 'general' inception module that he is talking about rather than the overall architecture itself.

    • @kushalmahindrakar8580
      @kushalmahindrakar8580 5 years ago

      If you are familiar with the idea of edge detectors, then these 192 channels are used to detect many different features from the image, or in other words to extract features. I guess you are watching the videos from the middle; I suggest you go through the whole playlist and watch the videos one by one.

    • @angelachikaebirim8894
      @angelachikaebirim8894 5 years ago +1

      Also don't forget that this 28x28x192 input could be the concatenated output from the previous inception module and probably occurs quite deep in the model so that's why the number of channels is high

    • @codderrrr606
      @codderrrr606 8 months ago

      I was having the same doubt, but here 192 represents the concatenation of results from different kernels passed over the image

  • @kirandeepsingh9144
    @kirandeepsingh9144 4 years ago

    I have a question. If at the first convolution layer we apply 32 filters on a grayscale image, then the output of the first layer would be 32 matrices, or say 32 filtered images. Then at the second layer, if we are applying 64 filters, does it mean that we are applying 64 different filters over each of the 32 filtered images???? And would the output of the second layer be 64*32=2048 filtered images??? Please clarify if anyone can

    • @manu1983manoj
      @manu1983manoj 3 years ago

      you apply filters for feature extraction, not to recreate filtered images

    • @kirandeepsingh9144
      @kirandeepsingh9144 3 years ago

      @@manu1983manoj then what would it be?

    • @manu1983manoj
      @manu1983manoj 3 years ago

      @@kirandeepsingh9144 based on the filters it will extract features, which will give you scaled-down matrices. The dimensions will depend on the filter dimensions.

    • @sahajpareek6352
      @sahajpareek6352 2 years ago

      The 64 filters must have a lower dimensionality than the 32 activation maps... A simple rule is that when you decrease the dimensionality of a filter, the number of activation maps (outputs) from that filter increases, assuming a constant stride. Basically, to extract more precise features out of the input activation maps, you increase the number of filters and reduce their dimensionality.
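
To the question of whether 64 filters over 32 maps give 64*32 = 2048 outputs: in a standard convolution each filter spans all input channels at once, so the output has one map per filter. A minimal NumPy sketch (a naive valid convolution; the shapes and names are illustrative, not from the video):

```python
import numpy as np

def conv2d_valid(x, w):
    """Naive valid convolution: x is (H, W, C_in), w is (k, k, C_in, C_out)."""
    H, W, _ = x.shape
    k, _, _, c_out = w.shape
    out = np.zeros((H - k + 1, W - k + 1, c_out))
    for i in range(H - k + 1):
        for j in range(W - k + 1):
            # one dot product per filter over a k x k x C_in patch
            out[i, j] = np.tensordot(x[i:i + k, j:j + k, :], w, axes=3)
    return out

x = np.random.rand(8, 8, 32)      # the 32 "filtered images" from layer 1
w = np.random.rand(3, 3, 32, 64)  # 64 filters, each of depth 32
print(conv2d_valid(x, w).shape)   # (6, 6, 64) -- 64 maps, not 2048
```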

  • @strongsyedaa7378
    @strongsyedaa7378 2 years ago

    Why does he use so many filters? Can anyone explain?

  • @kivique519
    @kivique519 5 years ago +4

    Why is the output dimension still 28*28?

    • @justforfun4680
      @justforfun4680 5 years ago +3

      Same padding. You add exactly the amount of padding needed so that your output dimension is the same as your input

    • @valentinfontanger4962
      @valentinfontanger4962 3 years ago +1

      @@justforfun4680 I also think so

  • @manuel783
    @manuel783 3 years ago +1

    Inception Network Motivation *CORRECTION*
    At 3:00, Andrew should have said 28 x 28 x 192 instead of 28 x 28 x 129. The subtitles have been corrected.

  • @rushiagrawal9667
    @rushiagrawal9667 5 years ago +1

    Would have been nice if a comparison of computations required for 1x1 and 3x3 convolutions were provided
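
Using the volumes from the video (28x28x192 input, 5x5 "same" conv to 28x28x32, versus a 1x1 bottleneck to 16 channels first), the comparison works out roughly as follows; a sketch of the arithmetic, counting multiplies only:

```python
def conv_multiplies(out_h, out_w, out_c, f, in_c):
    """Multiplies for a conv layer: one f*f*in_c dot product per output value."""
    return out_h * out_w * out_c * (f * f * in_c)

# Direct 5x5 "same" conv: 28x28x192 -> 28x28x32
direct = conv_multiplies(28, 28, 32, 5, 192)
# Bottleneck: 1x1 conv down to 16 channels, then 5x5 conv up to 32
bottleneck = conv_multiplies(28, 28, 16, 1, 192) + conv_multiplies(28, 28, 32, 5, 16)

print(direct)      # -> 120422400 (~120M, as in the lecture)
print(bottleneck)  # -> 12443648  (~12.4M, roughly a 10x reduction)
```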

  • @user-qh1rb6kd8h
    @user-qh1rb6kd8h 5 years ago +1

    Is there a way to reduce 28 * 28 * 16 to the maximum?
    Is it possible to reduce it to 28 * 28 * 1?

    • @dom23rd
      @dom23rd 5 years ago

      Yeah, I'm wondering too.. Does it hurt the data to reduce the 3rd dimension so low at the bottleneck layer?

  • @agneljohn6093
    @agneljohn6093 4 years ago

    When I try to work on the Coursera "Artificial Intelligence using TensorFlow" assignments and run assignment number 3, it says the kernel died and will restart automatically

  • @essamaly5233
    @essamaly5233 2 years ago

    There are some nasty and offensive commercials that come up while viewing this video; I think Andrew *should* do something about it.

  • @pallawirajendra
    @pallawirajendra 5 years ago +1

    He keeps skipping most of the topics.

    • @kushalmahindrakar8580
      @kushalmahindrakar8580 5 years ago +1

      No, he does not skip any topics. These videos are from Coursera and have questions in between the videos, so that is why there are cuts in the video.