Fast R-CNN Explained | ROI Pooling

ExplainingAI

1,795 views

Added 13 Sep 2024

Comments • 12

@kelvin-uh7tf 3 months ago
nice.
@thangphu6044 2 months ago
Do RCNN and Fast RCNN sample proposals into batches before passing them to the CNN?
@Explaining-AI 2 months ago
For RCNN, yes. For Fast RCNN it doesn't really matter (whether sampling is done before or after), because we only need one forward pass through the CNN per image. ROI pooling on that feature map is then done only for the sampled proposals in the batch.
@thangphu6044 2 months ago
@Explaining-AI Ah! I mean: are all proposals sampled into batches before being fed to the classifier and bbox regression? (Does RCNN take all proposals one by one, while Fast RCNN takes them batch by batch?)
@Explaining-AI 2 months ago
@thangphu6044 I am not sure I understand what you mean by "sampled into batches". I have added some detail on the training of both. Can you please tell me which sampling you are referring to?
For RCNN:
1. We fetch all proposals for our dataset (2K proposals x number of images).
2. For each training iteration, we sample 128 proposals; this becomes our batch.
3. This batch is fed to the CNN (with fc6 and fc7) to get 128 x 4096 features.
4. These are then passed to a 4096 x N_classes classification layer to get logits for all 128 proposals.
5. We then also do the SVM and bbox regression training.
For Fast RCNN:
1. Each training iteration takes 2 images.
2. We pass this batch of two images to the CNN to get feature maps for the batch.
3. We sample N proposals from each of these two images.
4. ROI pooling is done to get a 2N x 512 x 7 x 7 output = 2N x 25088.
5. This 2N x 25088 output is then passed to fc6, fc7 and the classification layers to get 2N x N_classes outputs.
For inference, there is no sampling. So, for example, in Fast RCNN inference, step 4 is done for all 2000 proposals to get a 2000 x 25088 output.
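The shape arithmetic in the Fast RCNN steps above can be checked with a small sketch. This is only bookkeeping, not a network: the function name fast_rcnn_shapes, the default of 21 classes, and the choice of 64 proposals per image are assumptions made for illustration; the 512 channels, 7 x 7 pooled size and 2 images per iteration come from the steps above.

```python
from math import prod


def fast_rcnn_shapes(rois_per_image, n_images=2, channels=512, pooled=7, n_classes=21):
    """Return (pooled, flattened, logits) shapes for one Fast RCNN pass.

    n_classes=21 is an assumed value (e.g. 20 object classes + background).
    """
    n_rois = n_images * rois_per_image            # 2N sampled proposals
    pooled_shape = (n_rois, channels, pooled, pooled)
    flat_shape = (n_rois, prod(pooled_shape[1:]))  # 512 * 7 * 7 = 25088
    logits_shape = (n_rois, n_classes)
    return pooled_shape, flat_shape, logits_shape


# Training: N = 64 proposals from each of 2 images (N is an assumed value).
pooled_shape, flat_shape, logits_shape = fast_rcnn_shapes(rois_per_image=64)
print(pooled_shape)   # (128, 512, 7, 7)
print(flat_shape)     # (128, 25088)
print(logits_shape)   # (128, 21)

# Inference on a single image: no sampling, all 2000 proposals are pooled.
print(fast_rcnn_shapes(rois_per_image=2000, n_images=1)[1])  # (2000, 25088)
```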
@thangphu6044 2 months ago
@Explaining-AI It's step 2 in RCNN and step 3 in Fast RCNN. (In the video, you talked about sampling some non-background or background proposals and then feeding them to the FC layers.)
@Explaining-AI 2 months ago
@thangphu6044 Yes. During training, given an image, we only pass a sample of proposals (their features, actually) to the fc layers. As you can see, for RCNN we only use 128 (out of 2000), and the same goes for Fast RCNN.
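A minimal sketch of that sampling step, under stated assumptions: the function name sample_proposals is hypothetical, label 0 is taken to mean background, and the 25% foreground share is an assumed setting rather than something stated in this thread. Only the batch size of 128 out of roughly 2000 proposals comes from the reply above.

```python
import random


def sample_proposals(labels, batch_size=128, fg_fraction=0.25, seed=0):
    """Pick proposal indices for one training batch.

    labels[i] is the class assigned to proposal i; 0 means background
    (an assumed convention). fg_fraction is an assumed setting.
    """
    rng = random.Random(seed)
    fg = [i for i, lab in enumerate(labels) if lab != 0]
    bg = [i for i, lab in enumerate(labels) if lab == 0]
    # Take at most fg_fraction of the batch from foreground proposals,
    # then fill the remainder with background proposals.
    n_fg = min(int(batch_size * fg_fraction), len(fg))
    n_bg = min(batch_size - n_fg, len(bg))
    return rng.sample(fg, n_fg) + rng.sample(bg, n_bg)


# Toy setup: 2000 proposals, of which 100 overlap an object.
labels = [1] * 100 + [0] * 1900
batch = sample_proposals(labels)
print(len(batch))                            # 128
print(sum(labels[i] != 0 for i in batch))    # 32
```

Only the features of these sampled proposals reach the fc layers during training; at inference all proposals are kept.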
@anshumansinha5874 2 months ago
Hi, I couldn't understand the training of the RPN. I get that we have the ground truth proposals from mapping the original bounding boxes to the final latent convolution output. But that's only the ground truth; what does the CNN itself predict so that we can compare it with the ground truth?
At 14:55 you mentioned that the inputs to the network are images and a list of proposals. Are these random initial proposals that go through the network and get penalised at the output so that the model learns the correct proposals? If they are a random initial list of proposals, how do we know the number of proposals beforehand at initialisation? Or can we take that as prior information from the actual number of ground truth boxes?
@Explaining-AI 2 months ago
For Fast RCNN there is no RPN. That's actually introduced in Faster RCNN (the next video in this series, in which I get into the RPN and its training).
Fast RCNN still uses selective search proposals. It's just that, unlike RCNN, where we feed every selective search proposal to the CNN separately (2000 forward passes), we now feed the image only once and use ROI pooling to get features for those proposals.
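To make the "one forward pass, then ROI pooling per proposal" idea concrete, here is a toy sketch in plain Python. It assumes a single-channel feature map, ROIs already given in feature-map coordinates with inclusive corners, and a 2 x 2 output instead of 7 x 7. The function name roi_max_pool and the floor/ceil bin-edge rule are illustrative choices; real implementations differ in how they round bin boundaries.

```python
def roi_max_pool(feature_map, roi, out_size=2):
    """Max-pool the region roi = (x1, y1, x2, y2), corners inclusive,
    of a 2D feature map into an out_size x out_size grid."""
    x1, y1, x2, y2 = roi
    h, w = y2 - y1 + 1, x2 - x1 + 1

    def edges(length, i):
        # Bin i spans [floor(i*length/out), ceil((i+1)*length/out)),
        # so bins are never empty and together cover the whole ROI.
        start = (i * length) // out_size
        end = ((i + 1) * length + out_size - 1) // out_size
        return start, end

    out = []
    for i in range(out_size):
        r0, r1 = edges(h, i)
        row = []
        for j in range(out_size):
            c0, c1 = edges(w, j)
            row.append(max(feature_map[y1 + r][x1 + c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        out.append(row)
    return out


# The feature map is computed once per image (here just a 4 x 4 toy grid) ...
feature_map = [[r * 4 + c for c in range(4)] for r in range(4)]

# ... and every proposal is pooled from that same map to a fixed size.
print(roi_max_pool(feature_map, (0, 0, 3, 3)))  # [[5, 7], [13, 15]]
print(roi_max_pool(feature_map, (1, 1, 3, 2)))  # [[6, 7], [10, 11]]
```

Both proposals end up with the same 2 x 2 output size even though their regions differ, which is what lets the fc layers accept them as one batch.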
