C4W3L08 Anchor Boxes

  • Added 17 Jul 2024
  • Take the Deep Learning Specialization: bit.ly/2TtgW58
    Check out all our courses: www.deeplearning.ai
    Subscribe to The Batch, our weekly newsletter: www.deeplearning.ai/thebatch
    Follow us:
    Twitter: / deeplearningai_
    Facebook: / deeplearninghq
    Linkedin: / deeplearningai

Comments • 45

  • @nischalkhadgi8128 5 years ago +2

    Great one. Was really helpful. Hope you put some demonstration as well.

  • @MG5350 4 years ago +41

    I feel that there are a lot of intricacies that are not explained. Great lecture hands down, but I'm starting to feel that I need concrete examples or implementation to understand many of these subtleties.

    • @dhidhi1000 4 years ago +7

      This is a video series; there are about 9 other videos explaining this.

    • @davidvultur8704 2 years ago

      czcams.com/play/PL_IHmaMAvkVxdDOBRg2CbcJBq9SY7ZUvs.html This seems to be the one

    • @ykpoff 3 months ago

      @@dhidhi1000 so?

  • @heejuneAhn 3 years ago +1

    Thank you, Prof. Ng, I learned a lot. One question or request for clarification: should the anchor box shapes be taken into account in the network's receptive field? Would we then need to use non-square convolutional filters? Thanks.

  • @koeficientas 5 years ago

    If I have only 2 classes, can I give a hardwired anchor for each class per grid cell and drop c1, c2, c3? Then the output vector could be y = [pc1, bx, by, bh, bw, pc2, bx, by, bh, bw], where pc1 is the probability of a pedestrian and pc2 is the probability of a car.

  • @keweml3544 3 years ago +2

    I think anchor box algorithm is for those problems lying somewhere between image classification and pixel classification. Recognizing an object that is either the entire image or a pixel is really tricky.

  • @marcoburkhardt6496 3 years ago

    just good. thanks a lot :)

  • @sandipansarkar9211 3 years ago

    nice explanation

  • @sanjivgautam9063 4 years ago +1

    The anchor box concept is not entirely clear, although this is hands down the best explanation to date. I want to share a few ideas here. During training, the bx, by, bh, bw in the label are normalized relative to the grid cell: bx and by lie between 0 and 1, while bh and bw can be greater than 1. For each of those "normalized" bounding boxes, we determine which of the predefined anchor boxes is the most suitable. How do we define "suitability"? By IoU. So if we choose 5 anchor boxes, we compare the normalized bounding box against all 5 and assign it to the one with the highest IoU, as Andrew Ng explains above. The loss function is a bit of a headache to explain in a comment, and it would have been great if Andrew had explained it himself. Never mind though, we are getting videos from the AI god himself!
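
A minimal sketch of the matching step described in this comment, not the course's reference code. It assumes boxes and anchors are compared by shape only, as (width, height) pairs aligned at a common corner. The function names and the two anchor shapes are made up for illustration.

```python
def shape_iou(box_wh, anchor_wh):
    """IoU of two (width, height) shapes that share the same corner."""
    inter = min(box_wh[0], anchor_wh[0]) * min(box_wh[1], anchor_wh[1])
    union = box_wh[0] * box_wh[1] + anchor_wh[0] * anchor_wh[1] - inter
    return inter / union


def best_anchor(box_wh, anchors):
    """Index of the anchor whose shape has the highest IoU with the box."""
    return max(range(len(anchors)), key=lambda i: shape_iou(box_wh, anchors[i]))


# Hypothetical anchors: index 0 is tall (pedestrian-like), index 1 is wide (car-like).
anchors = [(0.6, 1.8), (1.9, 0.7)]
tall_match = best_anchor((0.5, 1.6), anchors)   # a tall labeled box
wide_match = best_anchor((2.0, 0.8), anchors)   # a wide labeled box
```

With these made-up shapes, the tall box is assigned to anchor 0 and the wide box to anchor 1.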

  • @nidhihada1122 5 years ago +2

    One doubt. If we are specifying bx, by, bh, bw, then we can specify any anchor box. So for an image where two objects fall in the same grid cell and share the same anchor box shape, this could still be handled by putting their respective bx, by, bh, bw in the output, since both anchor boxes have their own bx, by, bh, bw. I could not understand why Andrew says this case cannot be handled.

    • @MARTIN-101 3 months ago

      i figured it out yesterday. but i forgot it again 😂😂

  • @MARTIN-101 3 months ago

    Is there a way to detect objects without anchors?
    Like attaching 2 MLP heads to a convolutional base,
    one head for classification and another for regression...
    Has this been implemented in research?

  • @nithinmesingerme6976 2 years ago

    Since the sizes of the anchor boxes are fixed, how does this work for the same kind of object when one is very close and another is very far away?

  • @anujk.9893 4 years ago +1

    If we define the shape and size of the anchor boxes, won't we need only 2 outputs to identify the box? bx and by would be enough; we should not need bh and bw?
    Please explain if someone knows.

    • @tomvandewiele7031 4 years ago +1

      We predict an arbitrary height and width so we do still have to output Bh and Bw. With anchor boxes, the IoU is used to pick the best matching anchor box shape of the labeled data. The target shape (together with Bx, By and the class) is only set as a target for the best matching anchor box.
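
A rough sketch of how the target for one grid cell could be filled in once the best-matching anchor is known. It assumes the layout [pc, bx, by, bh, bw, c1, c2, c3] per anchor with two anchors, so 16 values per cell. The helper name `build_cell_target` is hypothetical, and the slot of the unused anchor is simply left at zero here.

```python
def build_cell_target(anchor_index, box, class_id, num_anchors=2, num_classes=3):
    """Target vector for one grid cell; only the matched anchor's slot is filled.

    box is (bx, by, bh, bw); class_id is a 0-based class index.
    """
    slot = 5 + num_classes                 # pc + 4 box values + class one-hot
    y = [0.0] * (num_anchors * slot)       # all anchors start out empty (pc = 0)
    start = anchor_index * slot
    bx, by, bh, bw = box
    y[start:start + 5] = [1.0, bx, by, bh, bw]
    y[start + 5 + class_id] = 1.0          # one-hot class label
    return y


# A car (class index 1, i.e. c2) matched to the second anchor (index 1).
y = build_cell_target(1, (0.5, 0.4, 0.3, 0.9), class_id=1)
```

The arbitrary bh and bw of the labeled box still go into the target; the anchor only decides which slot receives them.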

  • @adityarajora7219 4 years ago +5

    How can it predict a bounding box larger than the grid cell? Please explain, if anyone knows YOLO.

    • @sanjivgautam9063 4 years ago +3

      Here is the thing. bx and by fall between 0 and 1. However, bw and bh (width and height) can have values greater than 1, so any object that extends beyond the grid cell is still covered by bh and bw. Did you get the point? In one of his videos, he explains how bx and by fall between 0 and 1 while bh and bw can go higher than 1.
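
A small sketch of this encoding, under the assumption that the labeled box is given as a center and size in fractions of the whole image and that the grid is S x S. The function name `encode_box` is made up for illustration.

```python
def encode_box(cx, cy, w, h, grid_size):
    """Convert an image-relative box into grid-cell-relative coordinates.

    cx, cy, w, h are fractions of the full image (0..1).
    Returns (row, col, (bx, by, bh, bw)) for the cell containing the center.
    """
    col = min(int(cx * grid_size), grid_size - 1)   # cell holding the midpoint
    row = min(int(cy * grid_size), grid_size - 1)
    bx = cx * grid_size - col    # offset inside the cell, between 0 and 1
    by = cy * grid_size - row
    bw = w * grid_size           # measured in cell widths, may exceed 1
    bh = h * grid_size           # measured in cell heights, may exceed 1
    return row, col, (bx, by, bh, bw)


# A wide object centered in a 3 x 3 grid, spanning 90% of the image width.
row, col, (bx, by, bh, bw) = encode_box(0.5, 0.5, 0.9, 0.4, grid_size=3)
```

Here the midpoint lands in the center cell with bx = by = 0.5, while bw is about 2.7 cell widths, so the box extends well beyond its own cell.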

    • @adityarajora7219 4 years ago +1

      @@sanjivgautam9063 Thanks, but I still didn't get the intuition. Could you give a reference to that video?

    • @sanjivgautam9063 4 years ago +4

      @@adityarajora7219 czcams.com/video/gKreZOUi-O0/video.html. I think you are following a playlist that doesn't have one video in it. The video in this link explains the bounding box rules.

  • @TheKovosh 4 years ago +1

    if I have a fixed size anchor box, then what is the point of bw and bh

  • @rijulsingh9803 3 years ago

    So the minimum bound on number of anchor boxes is the number of classes present? Also, is there a way to optimize the size of anchor boxes? I'm a little confused here. Everything else here is crystal clear, thank you so much for this tutorial!

    • @polimetakrylanmetylu2483 2 years ago +1

      If I understand it correctly, as for 1, no: you can specify any number of anchor boxes, and each one will output its own class predictions. You can also specify just one, or any arbitrarily low or high number of them. There is no relation between the number of classes and the number of anchor boxes.
      As for 2, your NN will not output the entire bounding box; instead it outputs a correction to an anchor box. The anchors have to be defined when you create the model. What you can do is collect every bounding box from your dataset as a width-height pair, and either plot the pairs and look at them, or run some clustering algorithm to find good sizes.
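
One way the clustering suggestion could look, as a pure-Python sketch rather than a tuned implementation. It assumes k-means over (width, height) pairs with shape IoU as the similarity measure, which is a choice made here and not something stated in the comment. The function names and the toy data are invented.

```python
def shape_iou(a, b):
    """IoU of two (width, height) shapes aligned at a common corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)


def cluster_anchor_shapes(shapes, k, iterations=50):
    """k-means over (width, height) pairs, assigning by highest IoU."""
    # Deterministic start: k shapes spread evenly over the area-sorted list.
    ordered = sorted(shapes, key=lambda s: s[0] * s[1])
    step = len(ordered) / k
    centers = [ordered[int(i * step + step / 2)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for s in shapes:
            nearest = max(range(k), key=lambda i: shape_iou(s, centers[i]))
            groups[nearest].append(s)
        # New center = mean width and height of the group (kept if group is empty).
        updated = [
            (sum(s[0] for s in g) / len(g), sum(s[1] for s in g) / len(g)) if g else c
            for g, c in zip(groups, centers)
        ]
        if updated == centers:
            break
        centers = updated
    return centers


# Toy data: three tall boxes and three wide boxes.
shapes = [(1.0, 3.0), (1.2, 2.8), (0.9, 3.1), (3.0, 1.0), (2.8, 1.2), (3.1, 0.9)]
anchors = sorted(cluster_anchor_shapes(shapes, k=2))
```

On this toy data the two resulting anchors are one tall and one wide shape, each the mean of its group.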

  • @dota2islife262 5 years ago

    what is the name of the course on Coursera

    • @maxbaugh9372 3 years ago

      Deep Learning Specialization - Course 4: Convolutional Neural Networks

  • @TheKovosh 4 years ago +4

    One video is missing; that's why I have trouble understanding the rest.

    • @rohitborra2507 4 years ago

      if u find it please keep the link bro

    • @aymannaeem22 3 years ago +4

      czcams.com/video/gKreZOUi-O0/video.html&list=PL_IHmaMAvkVxdDOBRg2CbcJBq9SY7ZUvs&t=656

  • @ganonlight 3 years ago +1

    These anchor boxes seem more like a workaround than an actual solution tbh

    • @vishaljain4915 3 years ago

      Agreed, do you have a better idea

    • @ganonlight 3 years ago +1

      @@vishaljain4915 No not really

    • @vishaljain4915 3 years ago

      @@ganonlight 😂😂😂 me neither aha

    • @ganonlight 3 years ago

      @@vishaljain4915 😅

    • @akashkewar 3 years ago +4

      Anchor boxes are one of many approaches to object detection. Algorithms like "CornerNet" don't use anchor boxes to locate objects but keypoints. Some algorithms, like Pose2Seg, also use pose estimation and/or semantic segmentation to give you pretty accurate bounding box predictions. Just google "anchorless object detection". Also, tbh most of the stuff you see in machine learning is a "workaround", but it's magic to see it work so well. There is no silver bullet that solves all problems; machine learning is all about choosing the right tools and being creative with the problem at hand.

  • @guardrepresenter5099 5 years ago

    What is pc, and how can pc be known to be 0 or 1 while c1, c2, c3 are still unknown?

    • @adityarajora7219 4 years ago +1

      pc is the probability that there is "something" in the cell, and c1, c2, c3 describe what this "something" actually is.

  • @EranM 5 years ago +5

    0:25 right in the nuts

    • @HabibRK 5 years ago

      it's a she

    • @lovemormus 4 years ago

      @@HabibRK how do you know it's a she

  • @ShubhamKumar-me7xy 2 years ago

    Mid point of pedestrian :xd

  • @sandipansarkar9211 3 years ago

    nice explanation