SEER explained: Vision Models more Robust & Fair when pretrained on UNCURATED images!?

  • Published 29 Aug 2024

Comments • 15

  • @beresandris
    2 years ago +6

    Really nice video! However, you mixed up two RegNet papers (sadly, there are quite a few with the same model name). This architecture is from "Designing Network Design Spaces", also by FAIR, and it is a pure CNN without any RNN.
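
For anyone who wants to check this point: torchvision ships the RegNet family from "Designing Network Design Spaces". Below is a minimal sketch that loads a RegNet and verifies it contains no recurrent modules; the regnet_y_16gf variant is an illustrative assumption, not necessarily SEER's exact backbone.

```python
# Minimal sketch: torchvision's RegNet is a pure CNN (stem + conv stages + head).
# The regnet_y_16gf variant is chosen for illustration only.
import torch
from torchvision.models import regnet_y_16gf

model = regnet_y_16gf()                  # randomly initialized, no download needed
x = torch.randn(1, 3, 224, 224)          # dummy image batch
logits = model(x)                        # single feed-forward pass, no recurrence
print(logits.shape)                      # torch.Size([1, 1000])

# Every module is convolutional / pooling / linear -- no RNN cells anywhere:
assert not any(isinstance(m, torch.nn.RNNBase) for m in model.modules())
```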

  • @mndhamod
    2 years ago +3

    Superb investigation, Ms. Coffee Bean!

  • @louisrose7823
    2 years ago +3

    Very interesting!

  • @psebsellars
    2 years ago +4

    Great review. The paper, in my opinion, is definitely not as transparent as it could be, and is slightly deceiving. There is no evidence, as you say, that uncurated data is safe to use in any way.

  • @toyuyn
    2 years ago +8

    "yeah, our dataset is totally uncurated...by us"
    misleading as the paper is, you can't deny that the "fairer" data used is public and easily accessible
    so maybe instead of jumping straight to curation, we should be taking a careful look at what is available and possible cheaper alternatives

    • @AICoffeeBreak
      2 years ago +4

      You're wise. I should have cut you into the video saying this.

  • @Mrbits01
    2 years ago +5

    I always say that criticism of a paper is easier to come by than praise. But, pardon my French, the bull definitely went number 2 on this one.
    Like, you trained on 1 billion (!) images, collected from a platform that internally culls images taken by people, of mostly people (!!), from all across the globe, and you find that embeddings pre-trained on this dataset are "fairer" than those trained on a frickin' object-recognition dataset ~1000x smaller, because they do better at gender and skin-tone retrieval on a target dataset with 20k face images (UTKFace).
    What's worse is that there's no analysis of what they're claiming.
    Like, who designed this study, lol. I bet they just went "how can we turn this into a paper", decided to double down on a popular topic (while not answering any question about that topic), and then wrote everything around that.
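
A rough sketch of what "retrieval" evaluation means here, under my reading (not necessarily the paper's exact protocol): embed the face images, then check whether each image's nearest neighbors share its gender or skin-tone label. The helper name, embedding dimension, and dataset size below are made up for illustration.

```python
# Hypothetical precision@k retrieval check over attribute labels.
import torch
import torch.nn.functional as F

def retrieval_precision_at_k(embeddings: torch.Tensor, labels: torch.Tensor, k: int = 10) -> float:
    """Fraction of each query's k nearest neighbors sharing its label."""
    emb = F.normalize(embeddings, dim=1)        # cosine similarity via dot product
    sims = emb @ emb.t()
    sims.fill_diagonal_(float("-inf"))          # exclude self-matches
    knn = sims.topk(k, dim=1).indices           # (N, k) neighbor indices
    hits = (labels[knn] == labels.unsqueeze(1)).float()
    return hits.mean().item()

# Dummy stand-ins for face embeddings and a binary attribute label:
emb = torch.randn(1000, 256)
labels = torch.randint(0, 2, (1000,))
print(retrieval_precision_at_k(emb, labels, k=10))
```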

  • @luisrperaza
    6 months ago +1

    Thank you for your videos. I watched the entire Transformers series; all of them are really good.

  • @sifatmd
    2 years ago +3

    Please do a video on PaLM from Google

  • @tildarusso
    2 years ago +2

    I guess the reason it works is that pre-training on these uncurated images helps capture more generic features in the low-layer convolutional feature maps, hence "less biased" compared to standard training sets (say, ImageNet). Therefore, later layers trained towards a specific purpose exhibit better results.
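
A minimal sketch of the idea in this comment: keep the pretrained convolutional features frozen as generic extractors and train only a new task-specific head. The regnet_y_16gf backbone and its ImageNet weights are stand-ins; SEER's actual checkpoints were released separately (e.g., via VISSL), so this is illustrative only.

```python
# Freeze a pretrained backbone, fine-tune only a fresh classification head.
import torch
import torch.nn as nn
from torchvision.models import regnet_y_16gf, RegNet_Y_16GF_Weights

backbone = regnet_y_16gf(weights=RegNet_Y_16GF_Weights.DEFAULT)
for p in backbone.parameters():
    p.requires_grad = False                       # keep generic features fixed

num_classes = 10                                  # hypothetical downstream task
backbone.fc = nn.Linear(backbone.fc.in_features, num_classes)  # new trainable head

optimizer = torch.optim.AdamW(backbone.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

x = torch.randn(8, 3, 224, 224)                   # dummy batch
y = torch.randint(0, num_classes, (8,))
loss = criterion(backbone(x), y)                  # gradients flow into the head only
loss.backward()
optimizer.step()
```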

  • @jonathansum9084
    2 years ago +2

    So the solution is that it was trained on Instagram images?