Encoder Decoder Network - Computerphile

  • Added 12 June 2018
  • Deep Learning continued - the Encoder-Decoder network - Dr Mike Pound. For a background on CNNs it's worth watching this first: • CNN: Convolutional Neu...
    Google Deep Dream • Deep Dream (Google) - ...
    Password Cracking: • Password Cracking - Co...
    Deep Learning & CNNs: • Deep Learning - Comput...
    3D from Selfie: • Selfie to 3D Model - C...
    Papers included in this Computerphile:
    bit.ly/C_FaceAlignment
    bit.ly/C_Landmarks
    bit.ly/C_AaronLongForm
    FCNs, and in a sense encoder decoder networks were first presented here: bit.ly/C_JohnLong
    / computerphile
    / computer_phile
    This video was filmed and edited by Sean Riley.
    Computer Science at the University of Nottingham: bit.ly/nottscomputer
    Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com

Comments • 91

  • @normalnews7271 6 years ago +63

    I would love a Mike Pound playlist. Or at least I would have if I hadn't already watched all the videos with him.

  • @minihjalte 6 years ago +98

    Great animation work on this episode Sean.

    • @Computerphile 6 years ago +20

      Thanks :)

  • @kevon217 1 year ago +2

    Love this channel. Every concept is so intuitively explained.

  • @devontebroncas4967 2 years ago

    I'm writing a proposal reviewing the CodeT5 neural architecture and am so confused about the encoder-decoder technique mentioned there.
    Super stoked to see a Computerphile video on it!

  • @hart1254 3 years ago +2

    You can feel the passion when he speaks until nearly out of breath

  • @tenseikenzx-3559 6 years ago +6

    Another awesome lecture by Dr Mike Pound :D. Dang I wish you were my ML/AI lecturer back when I was learning this stuff.

  • @wuda-io 6 years ago +39

    This guy is the best

  • @ShubhamPatil-ee3vt 1 year ago

    Whoa! What an amazing explanation of such a complex topic! Loved the articulation!!

  • @__someone__3141 1 year ago

    This is the best explanation about U-net I've ever seen.

  • @ehsankiani542 4 years ago

    An excellent and brief description!

  • @undefBehav 6 years ago

    Great talk. If Mike could discuss model interpretability in deep learning for the next one, that would make my day!

  • @xyZenTV 6 years ago

    You guys remembered to make this video! Nice!

  • @micahgilbertcubing5911

    I love the increasing collection of twisty puzzles on the shelf in the background

  • @SuperKnallex 6 years ago +21

    lol, dat face at 5:08 when he wanted to mention the use for military reasons :D

  • @nullnull6032 5 years ago

    You are the best. I can't find this content outside of this awesome channel.

  • @vadrif-draco 1 year ago

    The GAN relation at the end was pretty helpful

  • @BorisZandona 8 months ago

    Teaching is an art. Thank you so much for this video!

  • @rsage_ 6 years ago +7

    It seems like a way to distill an image of identifiable objects into their most basic forms and then use that information to once again layer the identified objects onto less compressed versions of the image. An analogy in reverse might be to have a completed puzzle of an image where you'd identify a few key objects and tag them on a few pieces, then you'd take the puzzle apart and hold on to the key objects and place them in their respective locations on the table. From there, you can start to place the surrounding pieces around each key piece until it's once again understandable.

    • @nobodykid23 6 years ago +3

      Yeah, that pretty much sums it up.
      The other use of an encoder-decoder network is generating synthetic images (by learning the representation in the middle, given by the encoder).

    • @miguelinserni2453 6 years ago

      And then feeding that into a GAN 😈

    • @HappyBeezerStudios 6 years ago

      Very serious key pieces would be the borders and especially the corners.
      And the sky is blue, so blue pieces would usually sit at the top of the puzzle.

  • @Jackisaboss1208 4 years ago

    That’s an awesome explanation. Thanks!

  • @SleeveBlade 6 years ago +4

    GIVE ME THE KNOWLEDGE DOCTOR POUND

  • @vtrandal 2 years ago +1

    Downsampling by choosing the best of them? The max of them? No. First the image must be low-pass filtered, then simply downsampled by discarding pixels. But then I see that you really do want to take the max when downsampling. Very interesting. Your GAN analogy at the end is excellent: the interior is like a generator and the higher resolution layers are like a discriminator.
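
The two kinds of downsampling contrasted in the comment above can be compared on a tiny example. This is only a sketch in plain Python: the 4x4 `feature_map` and both function names are made up for illustration, and a 2x2 box average stands in for a proper low-pass filter.

```python
def max_pool_2x2(img):
    """Keep the largest value in each non-overlapping 2x2 window."""
    return [
        [max(img[r][c], img[r][c + 1], img[r + 1][c], img[r + 1][c + 1])
         for c in range(0, len(img[0]) - 1, 2)]
        for r in range(0, len(img) - 1, 2)
    ]


def mean_pool_2x2(img):
    """Average each 2x2 window: a crude low-pass filter followed by decimation."""
    return [
        [(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4
         for c in range(0, len(img[0]) - 1, 2)]
        for r in range(0, len(img) - 1, 2)
    ]


# Hypothetical feature map: one strong and one weak activation.
feature_map = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]

max_pooled = max_pool_2x2(feature_map)    # [[9, 0], [0, 1]]
mean_pooled = mean_pool_2x2(feature_map)  # [[2.25, 0.0], [0.0, 0.25]]
```

Averaging is the safer choice when the goal is a faithful smaller picture, while taking the max keeps the strongest response of a feature detector at full strength, which is plausibly why it suits the feature maps discussed in the video.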

  • @Henrix1998 6 years ago +8

    I think you forgot some colour correction

  • @Zahlenteufel1 6 years ago +6

    Very interesting!

  • @rohandvivedi 6 years ago

    thank you
    for such great content

  • @GBGSK 5 years ago +2

    Please Computerphile, can we have a playlist for all Dr. Mike Pound videos? :)

  • @rgbplaza5945 6 years ago +5

    Plant science sounds rad! Also, two Mike Pound videos in one week, I'd rather this type of pound than to win the national lottery!

  • @jeffsnox 6 years ago +2

    So basically the down-and-up sampling is doing what two separate systems working collaboratively could do: one to physically locate the item of interest and another to work on it? I'm working on speech recognition from 'images' generated using fast Fourier transforms. Part of the solution involves locating the part of the image that contains the relevant information before inputting that into the recognition neural net. Why would the procedure outlined in the video outperform two independent processes?

  • @lironthethird6710 6 years ago +2

    great channel

  • @davidj3956 3 years ago

    Great video. You remind me so much of James Acaster.

  • @suryavaraprasadalla8511

    Great work. Keep going.

  • @oscarmulin114 6 years ago +1

    By the way, the reason data is brought from encoder to decoder is because of Unpooling which is the (partial) reverse of Pooling.
    So, pooling takes the maximum pixel in its window. So, in normal convnets it's fine, we don't really need to know which pixel exactly got transferred to next layer.
    However when unpooling in decoder, we need to know where that pixel was in the pooling "window" to more accurately upsample. To accomplish this, we get the index of which pixel got pooled and pass it to Unpooling layer.

    • @juggernaut93 6 years ago +1

      Oscar Mulin no, the one shown here works differently, read Jonathan Long's paper about Fully Convolutional Networks
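
The index-passing unpooling described by @oscarmulin114 can be sketched as follows. As the reply points out, the network shown in the video works differently, so this illustrates only the comment's description. It is plain Python with hypothetical function names and a made-up 4x4 input.

```python
def max_pool_with_indices(img):
    """2x2 max pooling that also records where each maximum came from."""
    pooled, indices = [], []
    for r in range(0, len(img) - 1, 2):
        pooled_row, index_row = [], []
        for c in range(0, len(img[0]) - 1, 2):
            # Candidate (value, position) pairs inside the current window.
            window = [(img[r + dr][c + dc], (r + dr, c + dc))
                      for dr in (0, 1) for dc in (0, 1)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, height, width):
    """Put each pooled value back at its recorded position, zeros elsewhere."""
    out = [[0] * width for _ in range(height)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            out[r][c] = value
    return out


# Hypothetical input, chosen so each window has a unique maximum.
image = [
    [1, 2, 0, 0],
    [3, 4, 0, 5],
    [0, 0, 7, 0],
    [6, 0, 0, 0],
]

pooled, indices = max_pool_with_indices(image)  # pooled == [[4, 5], [6, 7]]
restored = max_unpool(pooled, indices, 4, 4)
```

The restored map has the maxima back in their original positions, whereas the non-maximal values (1, 2 and 3 here) are lost, which is why the comment calls unpooling only a partial reverse of pooling.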

  • @Gilgwathir 6 years ago +1

    Holy bananas... now that whole stacked restricted Boltzmann machine stuff makes sense to me! In the slide deck from my prof there was always this double pyramid structure depicted and I was like WHAAAT? You might literally have saved exam points here!

  • @deltadom33 6 years ago

    This is fascinating

  • @kennethcarvalho3684 3 years ago

    brilliant idea

  • @JmanNo42 6 years ago

    Well, this makes more sense to me: outline the raw sketch before you look for objects, like the room, windows, edges of the bookshelf, desk, drawers and so on. Mike is the center object that shades the room view. And then break it down from there. Mike is the blob obscuring the view ;), the neural network is not quite sure what he is but it will find out.

  • @virtuaskimmer6714 6 years ago +1

    I usually just wipe the server with a cloth or something. What difference at this point does it make?

  • @Luffy-1998 6 years ago +6

    While expanding the image from a smaller to a larger size... how do we map the image?

    • @herp_derpingson 5 years ago +3

      It is essentially the inverse of the encoder layer. For images, the encoder has 2D convolutional layers and 2D max-pooling layers. In the decoder these are replaced with 2D deconvolutional layers (which are essentially the transpose of conv2d), while for max pooling we can just copy the intensity of each pixel to all the pixels in the next layer that the max pooling would have covered, if it were facing the other direction.
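
The two upsampling ideas in the reply above can be sketched in a few lines. This is an illustration in plain Python with hypothetical function names: a 2x nearest-neighbour copy for the "copy the intensity" step, and a 1D transposed convolution to keep the "deconvolution" small (real decoders use learned 2D kernels).

```python
def upsample_nearest_2x(img):
    """Copy every pixel into a 2x2 block of the larger output."""
    out = []
    for row in img:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))  # second copy of the widened row
    return out


def conv_transpose_1d(signal, kernel, stride=2):
    """Each input sample adds a scaled copy of the kernel to the output."""
    out = [0] * ((len(signal) - 1) * stride + len(kernel))
    for i, value in enumerate(signal):
        for k, weight in enumerate(kernel):
            out[i * stride + k] += value * weight
    return out


small = [[4, 5], [6, 7]]
large = upsample_nearest_2x(small)
# [[4, 4, 5, 5], [4, 4, 5, 5], [6, 6, 7, 7], [6, 6, 7, 7]]

# With an all-ones kernel the transposed convolution reduces to the same copy.
copied = conv_transpose_1d([1, 2], [1, 1])  # [1, 1, 2, 2]
```

With learned kernel weights instead of ones, the transposed convolution can produce something smoother or sharper than a plain copy, which is the extra freedom it adds.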

  • @herp_derpingson 5 years ago

    1:16 A Max Pool layer cannot move the representation of a dog from the left side of the image to the right. Max pool layers only gather adjacent pixels.

  • @rahuldeora5815 6 years ago

    How can I make this same animation myself for a similar video? The ones at 2:05?

  • @thetommantom 6 years ago

    I always notice the cubes in the background.

  • @IceMetalPunk 6 years ago +1

    When talking about segmentation, I was hoping he'd mention YOLO (You Only Look Once). It's such an interesting bit of technology, which performs object detection on each frame of a video in near-realtime, processing each frame only once, hence its name. And it performs quite well for what it's doing! You can find videos of it on YouTube.

  • @SirWilliamKidney 3 years ago

    Mike Pound: Teaching noobs about computers, when he's not teaching computers about plants. What an interesting person.

  • @radishanim 2 years ago

    Is this the same thing as a UNet?

  • @SubhamMahato39 6 years ago

    next video about GAN please !

  • @MsizeB 2 years ago

    helpful thank you!

  • @armyofthewolves 6 years ago +1

    Dr. Pound looks like the child of Zach Woods and Elijah Wood.
    "Dr. Mike Pounds Wood"

  • @fburton8 6 years ago +3

    Oh, wheat! Lots of wheat... fields of wheat... a tremendous amount of wheat!

    • @qwertyTRiG 6 years ago

      fburton8 Perfect for running through.

    • @samre3006 3 years ago

      That's what we eat. Wheat!

  • @usama57926 1 year ago

    Where can I watch the previous video?

  • @MrSerozka 6 years ago

    I did not understand anything, but it's very interesting

  • @josephdere3654 5 years ago

    u are the best !

  • @LucaBovelli 1 month ago

    So that's basically a U-Net?

  • @_mvr_ 6 years ago +3

    Do a video on ML solving captchas?

  • @levmatta 6 years ago +3

    I think this video was heavily manipulated, it is almost like a green screen is being used.

    • @VoteScientist 6 years ago

      levmatta Yes - on the far right through the window is a white plane with his reflection. Visible intermittently.

  • @user-mv5yw5zy8f 6 years ago

    Can you add subtitles?

  • @NotMarkKnopfler 6 years ago +23

    Him: "this is only one dimension I've drawn here but it's actually two dimensions"
    Me: "okay I give up!"

    • @CGoody564 6 years ago +3

      NotMarkKnopfler lol it's not that hard. The width of the tip of the marker is the width itself, despite him only drawing a "single" line with seemingly no intended width.

    • @oskarkeurulainen6414 6 years ago +2

      It's actually 4 dimensions because you also have the colour channels and the data batch

    • @drdca8263 6 years ago +3

      He just drew it 1d because it is easier to draw. Just imagine the 2d thing that corresponds to the 1d thing.

    • @CGoody564 6 years ago +1

      Oskar Keurulainen not really, because he is only representing the spatial dimensions as he is talking about spatial downsizing.
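
The dimension counting in this thread can be made concrete by tracking shapes alone. The sketch below assumes a (batch, channels, height, width) layout and a pooling factor of 2, neither of which is stated in the video. The helper names are hypothetical, and the channel count is held fixed for simplicity even though real convolutional layers usually change it.

```python
def pooled_shape(shape, factor=2):
    """Shape after spatial pooling: only height and width shrink."""
    batch, channels, height, width = shape
    return (batch, channels, height // factor, width // factor)


def upsampled_shape(shape, factor=2):
    """Shape after spatial upsampling: only height and width grow."""
    batch, channels, height, width = shape
    return (batch, channels, height * factor, width * factor)


# Assumed input: a batch of 8 RGB images, 256x256 pixels each.
shape = (8, 3, 256, 256)

encoder_shapes = [shape]
for _ in range(3):  # three pooling stages on the way down
    shape = pooled_shape(shape)
    encoder_shapes.append(shape)

decoder_shapes = [shape]
for _ in range(3):  # three upsampling stages on the way back up
    shape = upsampled_shape(shape)
    decoder_shapes.append(shape)
```

Here `encoder_shapes[-1]` is `(8, 3, 32, 32)` and `decoder_shapes[-1]` is back at `(8, 3, 256, 256)`: the batch and channel axes exist throughout, but only the two spatial axes are resized, which matches the point that the drawing represents spatial downsizing.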

  • @jelletje8 6 years ago +5

    color correction

    • @mikejohnstonbob935 6 years ago +1

      with color correction, aside from semantic segmentation, you'd also want gradient information to avoid that aliasing when you apply some filter. In this case, it's probably easier to use traditional image processing techniques as gradient and color information is available before you build that convolution pyramid.

    • @ciano5475 6 years ago

      I think he is referring to the unusual color calibration of the video.

  • @mockingbird3809 5 years ago

    I wish he could be my professor. If so, I would sleep on his room's couch and learn great stuff.

  • @Fly0High 6 years ago +13

    This is such beautiful, interesting and useful engineering but I cannot for one second stop thinking of the millions of ways it can be wrongfully used. It's a shame really.

    • @IceMetalPunk 6 years ago +5

      Been watching too much dystopian sci-fi?

    • @Fly0High 6 years ago +3

      sci-fi? You're funny. Actually a couple of weeks back the BBC did a program about how police in the US are using computer software (I assume neural networks) to predict crimes. Search for "BBC The Enquiry: can computers predict crime?"

    • @vaibhav2k13 5 years ago

      Why is that so bad? That can lead to a decrease in crime. As long as the agencies are bound by law to keep that information to themselves I don't see a problem with it.

  • @zacharieetienne5784 6 years ago

    ok

  • @pyramydseven 5 years ago

    Those making the move from analog to IP video, specifically in regulated industries, would benefit from using this video to explain to their cheap-ass check writers why bubble gum and duct tape is not a sustainable solution.

  • @DarthMakroth 6 years ago +1

    143rd!!!

  • @DarthMakroth 6 years ago +1

    3rd comment XD first 7 min

  • @connorfulcher1823 6 years ago +1

    49 views, wow.