Encoder Decoder Network - Computerphile
- Added 12. 06. 2018
- Deep Learning continued - the Encoder-Decoder network - Dr Mike Pound. For a background on CNNs it's worth watching this first: • CNN: Convolutional Neu...
Google Deep Dream • Deep Dream (Google) - ...
Password Cracking: • Password Cracking - Co...
Deep Learning & CNNs: • Deep Learning - Comput...
3D from Selfie: • Selfie to 3D Model - C...
Papers included in this Computerphile:
bit.ly/C_FaceAlignment
bit.ly/C_Landmarks
bit.ly/C_AaronLongForm
FCNs, and in a sense encoder decoder networks were first presented here: bit.ly/C_JohnLong
This video was filmed and edited by Sean Riley.
Computer Science at the University of Nottingham: bit.ly/nottscomputer
Computerphile is a sister project to Brady Haran's Numberphile. More at www.bradyharan.com
I would love a Mike Pound playlist. Or at least I would have if I hadn't already watched all the videos with him.
Great animation work on this episode Sean.
Thanks :)
Love this channel. Every concept is so intuitively explained.
I'm writing a proposal reviewing the CodeT5 neural architecture and am so confused about the encoder-decoder technique mentioned there.
Super stoked to see a Computerphile video on it!
You can feel the passion when he speaks until nearly out of breath
Another awesome lecture by Dr Mike Pound :D. Dang I wish you were my ML/AI lecturer back when I was learning this stuff.
This guy is the best
Whoa! What an amazing explanation to such complex topic! Loved the articulation!!
This is the best explanation about U-net I've ever seen.
Excellent and brief description ever!
great talk. if Mike could discuss the model interpretability in deep learning models for the next one, that would make my day!
You guys remembered to make this video! Nice!
I love the increasing collection of twisty puzzles on the shelf in the background
lol, dat face at 5:08 when he wanted to mention the use for military reasons :D
You are the best, I can't find this content outside of this awesome channel
The GAN relation at the end was pretty helpful
Teaching is an art. Thank you so much for this video!
It seems like a way to distill an image of identifiable objects in their most basic forms and then using that information to once again layer the identified objects onto less compressed versions of the image. An analog reverse to this might be to have a completed puzzle of an image where you'd identify a few key objects and tag them on a few pieces, then you'd take the puzzle apart and hold on to the key objects and place them in their respective locations on the table. From there, you can start to place the surrounding pieces around each key piece until it's once again understandable.
yeah, that pretty much sums it up
the other use of an encoder-decoder network is in generating synthetic images (by learning the representation in the middle, given by the encoder)
And then feeding that into a GAN 😈
Very serious key pieces would be the borders and especially the corners.
And the sky is blue, so blue pieces would usually sit at the top of the puzzle.
That’s an awesome explanation. Thanks!
GIVE ME THE KNOWLEDGE DOCTOR POUND
Downsampling by choosing the best of them? The max of them? No. First, the image must be low-pass filtered then simply downsample by discarding pixels. But then I see that you really do want to take the max when downsampling. Very interesting. Your GAN analogy at the end is excellent: the interior is like a generator and the higher resolution layers are like a discriminator.
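To make the contrast in the comment above concrete, here is a minimal sketch in plain Python of the two ways to halve a 1-D signal. It assumes a width-2 box filter as the low-pass step, which makes "filter then discard" the same thing as average pooling; the function names and the sample signal are made up for illustration and are not from the video.

```python
def avg_pool_1d(signal, window=2):
    """Box low-pass filter, then keep one sample per window (average pooling)."""
    return [sum(signal[i:i + window]) / window
            for i in range(0, len(signal) - window + 1, window)]


def max_pool_1d(signal, window=2):
    """Keep only the strongest response in each window (max pooling)."""
    return [max(signal[i:i + window])
            for i in range(0, len(signal) - window + 1, window)]


# Hypothetical feature responses: one sharp activation and one weak, flat one.
signal = [0, 9, 0, 0, 1, 1, 0, 0]

smoothed = avg_pool_1d(signal)   # the sharp activation is diluted
strongest = max_pool_1d(signal)  # the sharp activation survives at full strength
```

Averaging is the choice that avoids aliasing in a signal-processing sense, while the max keeps a strong feature response intact, which is the behaviour the comment ends up agreeing with.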
I think you forgot some colour correction
Very interesting!
thank you
for such great content
Please Computerphile, can we have a playlist for all Dr. Mike Pound videos? :)
Plant science sounds rad! Also, two Mike Pound videos in one week, I'd rather this type of pound than to win the national lottery!
So basically the down up down sampling is doing what two separate systems working collaboratively could do - one to physically locate the item of interest and another to work on it? I'm working on speech recognition from 'images' generated using fast fourier - part of the solution involves locating the part of the image that contains the relevant information before inputting that into the recognition neural net - why would the procedure outlined in the video outperform two independent processes?
great channel
Great video. You remind me so much of James Acaster.
Great work. Keep going.
By the way, the reason data is brought from encoder to decoder is because of Unpooling which is the (partial) reverse of Pooling.
So, pooling takes the maximum pixel in its window. So, in normal convnets it's fine, we don't really need to know which pixel exactly got transferred to next layer.
However when unpooling in decoder, we need to know where that pixel was in the pooling "window" to more accurately upsample. To accomplish this, we get the index of which pixel got pooled and pass it to Unpooling layer.
Oscar Mulin no, the one shown here works differently, read Jonathan Long's paper about Fully Convolutional Networks
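For reference, the index-passing idea described in this thread can be sketched in plain Python as below. This illustrates that one scheme only; as the reply notes, it is not necessarily what the network in the video does. The function names and the toy image are hypothetical.

```python
def max_pool_with_indices(image, size=2):
    """Max-pool a 2-D list and remember where each maximum came from."""
    pooled, indices = [], []
    for r in range(0, len(image) - size + 1, size):
        pooled_row, index_row = [], []
        for c in range(0, len(image[0]) - size + 1, size):
            # Every (value, position) pair inside the current window.
            window = [(image[r + dr][c + dc], (r + dr, c + dc))
                      for dr in range(size) for dc in range(size)]
            value, position = max(window, key=lambda item: item[0])
            pooled_row.append(value)
            index_row.append(position)
        pooled.append(pooled_row)
        indices.append(index_row)
    return pooled, indices


def max_unpool(pooled, indices, height, width):
    """Put each pooled value back at its recorded position; zeros elsewhere."""
    output = [[0] * width for _ in range(height)]
    for pooled_row, index_row in zip(pooled, indices):
        for value, (r, c) in zip(pooled_row, index_row):
            output[r][c] = value
    return output


image = [[1, 2, 0, 0],
         [3, 4, 0, 5],
         [0, 0, 7, 0],
         [6, 0, 0, 0]]

pooled, indices = max_pool_with_indices(image)  # encoder side
restored = max_unpool(pooled, indices, 4, 4)    # decoder side
```

The restored image is sparse: only the winning pixel of each window comes back, but it comes back in the right place, which is the location information that pooling alone would have thrown away.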
Holy bananas... now that whole stacked restricted Boltzmann machine stuff makes sense to me! In the slide deck from my prof there was always this double pyramid structure depicted and I was like WHAAAT? You might literally have saved exam points here!
This is fascinating
brilliant idea
Well this makes more sense to me: outline the raw sketch before you look for objects, like the room, windows, edges of the bookshelf, desk, drawers and so on. Mike is the center object that shades the room view. And then break it down from there. Mike is the blob obscuring the view ;), the neural network is not quite sure what he is but it will find out.
I usually just wipe the server with a cloth or something. What difference at this point does it make?
While expanding the image from a smaller to a larger size... how do we map the image?
It is essentially the inverse of the encoder. Say for images, in the encoder we have convolutional 2D layers and max pool 2D layers. In the decoder they are replaced with deconvolutional 2D layers (which are essentially the transpose of conv2d), while for max pooling we can just copy the intensity of each pixel over to the pixels in the next layer that the max pooling would have been responsible for, if it were facing the other direction.
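A minimal sketch of both upsampling ideas from the reply above, in plain Python with made-up function names: copying each pixel into the block it would have been pooled from (nearest-neighbour upsampling), and a 1-D transposed convolution, where each input value stamps a scaled copy of the kernel into a longer output. In a real decoder the kernel weights are learned; the fixed kernels here are assumptions chosen for illustration.

```python
def upsample_nearest(image, factor=2):
    """Copy every pixel into a factor x factor block of the larger image."""
    out = []
    for row in image:
        wide = [value for value in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def transposed_conv_1d(signal, kernel, stride=2):
    """Each input value adds a scaled copy of the kernel, `stride` apart."""
    out = [0] * ((len(signal) - 1) * stride + len(kernel))
    for i, value in enumerate(signal):
        for k, weight in enumerate(kernel):
            out[i * stride + k] += value * weight
    return out


small = [[1, 2],
         [3, 4]]
big = upsample_nearest(small)

# With kernel [1, 1] and stride 2 the transposed convolution just repeats values.
repeated = transposed_conv_1d([1, 2, 3], [1, 1])
# A triangular kernel blends neighbours instead, giving a smoother ramp.
blended = transposed_conv_1d([1, 2, 3], [0.5, 1, 0.5])
```

The first kernel shows that plain pixel copying is a special case of transposed convolution, which is one way to see why a learned kernel can do at least as well as fixed copying.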
1:16 A Max Pool layer cannot move the representation of a dog from the left side of the image to the right. Max pool layers only gather adjacent pixels.
How can I make this same animation myself for a similar video? The ones at 2:05?
I always notice the cubes in the background.
When talking about segmentation, I was hoping he'd mention YOLO (You Only Look Once). It's such an interesting bit of technology, which performs object detection on each frame of a video in near-realtime, processing each frame only once, hence its name. And it performs quite well for what it's doing! You can find videos of it on CZcams.
Mike Pound: Teaching noobs about computers, when he's not teaching computers about plants. What an interesting person.
Is this the same thing as a UNet?
next video about GAN please !
helpful thank you!
Dr. Pound looks like the child of Zach Woods and Elijah Wood.
"Dr. Mike Pounds Wood"
Oh, wheat! Lots of wheat... fields of wheat... a tremendous amount of wheat!
fburton8 Perfect for running through.
That's what we eat. Wheat!
Where can i watch previous video?
+1
I did not understand anything, but it's very interesting
u are the best !
so thats basically a u-net?
Do a video on ML solving captchas?
I think this video was heavily manipulated, it is almost like a green screen is being used.
levmatta Yes - on the far right through the window is a white plane with his reflection. Visible intermittently.
can you add subtitles?
Him: "this is only one dimension I've drawn here but it's actually two dimensions"
Me: "okay I give up!"
NotMarkKnopfler lol it's not that hard. The width of the tip of the marker is the width itself, despite him only drawing a "single" line with seemingly no intended width.
It's actually 4 dimensions because you also have the colour channels and the data batch
He just drew it 1d because it is easier to draw. Just imagine the 2d thing that corresponds to the 1d thing.
Oskar Keurulainen not really, because he is only representing the spatial dimensions as he is talking about spatial downsizing.
color correction
with color correction, aside from semantic segmentation, you'd also want gradient information to avoid that aliasing when you apply some filter. In this case, it's probably easier to use traditional image processing techniques as gradient and color information is available before you build that convolution pyramid.
I think he is referring to the unusual color calibration of the video.
I Wish He Could Be My Professor. If so, I will Sleep at his Room's couch and Learn Great Stuff.
This is such beautiful, interesting and useful engineering but I cannot for one second stop thinking of the millions of ways it can be wrongfully used. It's a shame really.
Been watching too much dystopian sci-fi?
sci-fi? You're funny. Actually a couple of weeks back the BBC did a program about how police in the US are using computer software (I assume neural networks) to predict crimes. Search for "BBC The Enquiry: can computers predict crime?"
Why is that so bad? That can lead to a decrease in crime. As long as the agencies are bound by law to keep that information to themselves I don't see a problem with it.
ok
Those making the move from analog to IP video, specifically in regulated industries, would benefit from using this video to explain to their cheap ass check writers why bubble gum and duct tape is not a sustainable solution.
143rd!!!
3rd comment XD first 7 min
49 views, wow.