FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence

  • added on 18. 08. 2024
  • FixMatch is a simple, yet surprisingly effective approach to semi-supervised learning. It combines two previous methods in a clever way and achieves state-of-the-art in regimes with few and very few labeled examples.
    Paper: arxiv.org/abs/...
    Code: github.com/goo...
    Abstract:
    Semi-supervised learning (SSL) provides an effective means of leveraging unlabeled data to improve a model's performance. In this paper, we demonstrate the power of a simple combination of two common SSL methods: consistency regularization and pseudo-labeling. Our algorithm, FixMatch, first generates pseudo-labels using the model's predictions on weakly-augmented unlabeled images. For a given image, the pseudo-label is only retained if the model produces a high-confidence prediction. The model is then trained to predict the pseudo-label when fed a strongly-augmented version of the same image. Despite its simplicity, we show that FixMatch achieves state-of-the-art performance across a variety of standard semi-supervised learning benchmarks, including 94.93% accuracy on CIFAR-10 with 250 labels and 88.61% accuracy with 40 -- just 4 labels per class. Since FixMatch bears many similarities to existing SSL methods that achieve worse performance, we carry out an extensive ablation study to tease apart the experimental factors that are most important to FixMatch's success. We make our code available at this https URL.
    Authors: Kihyuk Sohn, David Berthelot, Chun-Liang Li, Zizhao Zhang, Nicholas Carlini, Ekin D. Cubuk, Alex Kurakin, Han Zhang, Colin Raffel
    Links:
    CZcams: / yannickilcher
    Twitter: / ykilcher
    BitChute: www.bitchute.c...
    Minds: www.minds.com/...
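The unlabeled-data step described in the abstract can be sketched roughly as follows. This is a minimal illustration in plain Python, not the authors' implementation: the function names, the default threshold of 0.95, and the choice to average over the whole unlabeled batch (including discarded examples) are assumptions made for this sketch, and the logits are assumed to come from some classifier applied to the weakly and strongly augmented views of the same images.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixmatch_unlabeled_loss(weak_logits, strong_logits, threshold=0.95):
    """Sketch of the pseudo-label consistency loss on unlabeled images.

    weak_logits[i] and strong_logits[i] are hypothetical model outputs for
    the weakly and strongly augmented views of unlabeled image i.
    Returns (loss, number_of_examples_kept).
    """
    total = 0.0
    kept = 0
    for weak, strong in zip(weak_logits, strong_logits):
        probs = softmax(weak)
        confidence = max(probs)
        # Retain the pseudo-label only for high-confidence predictions.
        if confidence < threshold:
            continue
        pseudo_label = probs.index(confidence)
        # Cross-entropy of the strong-view prediction against the pseudo-label.
        total += -math.log(softmax(strong)[pseudo_label])
        kept += 1
    n = len(weak_logits)
    # Assumption: average over the full batch, so discarded examples count as 0.
    return (total / n if n else 0.0), kept


if __name__ == "__main__":
    weak = [[10.0, 0.0, 0.0], [0.1, 0.0, 0.0]]    # confident, then unconfident
    strong = [[0.0, 0.0, 0.0], [5.0, 0.0, 0.0]]
    print(fixmatch_unlabeled_loss(weak, strong))
```

In this toy batch only the first image passes the threshold, so the second contributes nothing to the loss. In a real training loop this term would be added to the ordinary supervised cross-entropy on the labeled images.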

Comments • 21

  • @manuelpariente2288
    @manuelpariente2288 4 years ago +22

    Thanks again :-)
    Loved the critique at the end.
    Also, nice of them to report these results; lots of papers would hide them to make it seem like the method brought all the gains!

  • @shrinathdeshpande5004
    @shrinathdeshpande5004 4 years ago +8

    definitely one of the best ways to explain a paper!! Kudos to you

  • @herp_derpingson
    @herp_derpingson 4 years ago +19

    78% accuracy from 1 image per class. This blew my mind.
    What a time to be alive.

    • @TeoZarkopafilis
      @TeoZarkopafilis 4 years ago +6

      HOLD ON TO YOUR PAPERS

    • @meudta293
      @meudta293 4 years ago +1

      my brain matter is all over the floor right now hhh

    • @matthewtang1489
      @matthewtang1489 4 years ago +1

      @@TeoZarkopafilis Woah! A fellow scholar here!

  • @sora4222
    @sora4222 1 year ago

    I loved the critique at the end. Thanks.

  • @hungdungnguyen8258
    @hungdungnguyen8258 3 months ago

    well explained. Thank you

  • @hihiendru
    @hihiendru 4 years ago +1

    Just like UDA, the emphasis is on the way you augment. And poor UDA got rejected. PS: LOVE your breakdowns, please keep them coming.

  • @jurischaber6935
    @jurischaber6935 1 year ago

    Thanks again...Great teacher for us students. 🙂

  • @AmitKumar-ts8br
    @AmitKumar-ts8br 3 years ago

    Really nice explanation and concise...

  • @vishalahuja2502
    @vishalahuja2502 3 years ago +1

    Yannic, nice coverage of the paper. I have one question: at 15:05, you explain that the pseudo-label is used only if the confidence is above a certain threshold (which is also a hyperparameter). Where is the confidence coming from? It is well known that the confidence score coming out of softmax is not reliable. Can you please explain?

  • @tengotooborn
    @tengotooborn 3 years ago

    Something which I find weird: isn’t a constant pseudolabel always correct? It seems that there are only positive examples in the scheme which uses the unlabeled data, and so there is nothing in the loss which forces the model to not always output the same pseudolabel for everything.
    Yes, one can argue that this would fail the supervised loss, but then the question becomes “how is the supervised loss weighted w.r.t. the unsupervised loss”. In any case, it seems that one would also desire to have negative examples in the unsupervised case

  • @NooBiNAcTioN1334
    @NooBiNAcTioN1334 2 years ago

    Fantastic!

  • @reginaldanderson7218
    @reginaldanderson7218 4 years ago +1

    Nice edit

  • @ramonbullock6630
    @ramonbullock6630 4 years ago +1

    I love this content :D

  • @christianleininger2954

    Really Good Job please keep going

  • @abhishekmaiti8332
    @abhishekmaiti8332 4 years ago +1

    In what order do they train the model? Do they feed the labelled images first and then the unlabelled ones? Also, can two unlabelled images of the same class get different pseudo-labels?

    • @YannicKilcher
      @YannicKilcher 4 years ago +4

      I think they do everything at the same time. I guess the labelled images can also go the unlabelled way, yes. But not the other way around, obviously :)

  • @Manu-lc4ob
    @Manu-lc4ob 4 years ago +1

    What software are you using to annotate papers, Yannic? I am using Margin notes but it does not seem as smooth.

  • @Dr.Z.Moravcik-inventor-of-AGI

    Google again, wow! 😂