Transfer learning and Transformer models (ML Tech Talks)

  • Date added: 24. 07. 2024
  • In this session of Machine Learning Tech Talks, Iulia Turc, a Software Engineer at Google Research, walks us through the recent history of natural language processing, including the current state-of-the-art architecture, the Transformer.
    0:00 - Intro
    1:07 - Encoding text
    8:21 - Language modeling & transformers
    29:46 - Transfer learning & BERT
    43:55 - Conclusion
    Catch more ML Tech Talks → goo.gle/ml-tech-talks
    Subscribe to TensorFlow → goo.gle/TensorFlow
  • Science & Technology

Comments • 88

  • @klammer75
    @klammer75 3 years ago +100

    This was one of the best videos I’ve seen explaining transformers and NL models… well done, and I look forward to the other videos in the series!🍻

  • @timoose3960
    @timoose3960 3 years ago +3

    The comparison between attention heads and CNN filters made so much sense!

  • @BiswajitGhosh-wg6qj
    @BiswajitGhosh-wg6qj 3 years ago +10

    Please, @Tensorflow Team, continue with this ML Tech Talks lecture series!

  • @miladchenaghlou5278
    @miladchenaghlou5278 2 years ago +1

    After reading about language models, word embeddings, transformers, etc. for a month, this video put everything in order for me. Thanks!

  • @moseswai-mingwong8307
    @moseswai-mingwong8307 2 years ago +3

    Thank you for the awesome talk on all the main NLP models, in particular, the great explanation of the Transformer model!

  • @OtRatsaphong
    @OtRatsaphong 2 years ago +3

    Great overview and explanation of the Transformer network. I am just starting my exploration into NLP, and this talk has saved me lots of time. I now know that this is where I need to be focusing my attention. Thank you 👍🙏😍

  • @correctmeifiamwrong5862
    @correctmeifiamwrong5862 2 years ago +1

    Great video. The first Transformer explanation that (correctly) does not use the encoder/decoder diagram from the Transformer paper, well done!
    Additionally, talking about the exact outputs (using only one output for predictions) was very helpful.

  • @harryz7973
    @harryz7973 2 years ago +1

    Best YouTube NLP walkthrough without cutting corners. Best delivery as well.

  • @jorgegines1802
    @jorgegines1802 1 year ago +3

    A crystal-clear explanation of Transformers. Papers are in many cases very difficult to follow. Pointing out the important omitted details which are critical to the model, even if not explained, is very useful. Many out there try to explain transformers without having a clue what they are. Clearly, this is not the case. Thanks, in its deepest tokenized meaning, for sharing your knowledge. BTW, the last programming tip is really helpful. A small hands-on demo of using BERT (or any flavor of BERT) with a classifier for a particular application would be amazing for another video.

  • @paygood
    @paygood 1 year ago

    I believe this video provides the most comprehensive explanation of transformers.

  • @maudentable
    @maudentable 1 year ago

    This is the best video on YouTube that introduces transformer models.

  • @kanikaagarwal6150
    @kanikaagarwal6150 2 years ago +1

    One of the best explanations I have come across on transformers. Thanks!

  • @JTedam
    @JTedam 1 year ago

    So clear. This is one of the best videos explaining transformer architecture.

  • @inteligenciamilgrau
    @inteligenciamilgrau 2 months ago

    Returning to this video after a year, I finally fully understand what is going on (or at least watched the entire video in a row)!! Maybe one more time to make it stick, then on to the next part!! Thanxx

  • @deepaksadulla8974
    @deepaksadulla8974 3 years ago +1

    Best explanation so far of the attention/QKV concept... I was searching for a good way to visualize it. Thanks a ton!!

  • @karanacharya18
    @karanacharya18 1 year ago

    Very well explained! Thank you very much. Especially loved the comparison between CV kernels and multiple QKV parameters.

  • @sergiobdbd
    @sergiobdbd 1 year ago

    One of the best explanations of transformers that I've seen!

  • @awadheshkumarsrivastava4407

    This video initialised my attention weights for transformers.

  • @GBlunted
    @GBlunted 1 year ago +9

    She's so good! I've watched a few videos attempting to explain this self-attention version of transformers, and this one is by far the best in so many aspects: actual deep understanding of the architecture at the top, followed closely by coherent communication of concepts, a good script, presentation and graphics! I hope she narrates more videos like this... I'm about to search and find out lol! 🧐🤞 🤓

  • @ArtificialWisdomCloud
    @ArtificialWisdomCloud 1 year ago +1

    Mapping to geometry is pro. I have thought since my education about 40 years ago that current mathematics is taught incorrectly. Here is a pro example of how math should be taught!

  • @jocalvo
    @jocalvo 2 years ago

    Wow, that explanation actually dispelled many of my questions. Thanks a lot, Iulia!

  • @goelnikhils
    @goelnikhils 1 year ago

    Best video on transfer learning. So much clarity.

  • @jacobyoung2045
    @jacobyoung2045 3 years ago +3

    44:40: Thanks for your attention 😁

  • @shivibhatia1613
    @shivibhatia1613 8 months ago

    Hands down the best explanation, and this after watching so many videos. Terrific! Looking forward to some videos on understanding Bard and its fine-tuning.

  • @josephpareti9156
    @josephpareti9156 2 years ago

    Awesome; the very BEST explanation of self-attention and transformers.

  • @rwp8033
    @rwp8033 3 years ago

    Great video! It would be nice to have a video on reinforcement learning in future ML Tech Talks.

  • @geshi7121
    @geshi7121 3 years ago +1

    The explanation is so clear, thank you.

  • @aikw5946
    @aikw5946 1 year ago

    Thank you very much! Great video and very well explained. Yes, a video about sentiment-analysis fine-tuning would be amazing!

  • @WARATEL_114_30
    @WARATEL_114_30 3 years ago +2

    Very straightforward.
    Thank you so much

  • @Vinicius-nd8nz
    @Vinicius-nd8nz 1 year ago

    Great presentation! Really easy-to-understand explanations of some hard topics, thank you.

  • @user-or7ji5hv8y
    @user-or7ji5hv8y 3 years ago +2

    Great presentation. Really well structured.

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 1 year ago

    My favorite explanation so far. Great job.

  • @PavelTverdunov
    @PavelTverdunov 2 years ago

    Super professional explanation of the topic! Excellent work!

  • @EngRiadAlmadani
    @EngRiadAlmadani 3 years ago +1

    It's a very important library in NLP. Great work!

  • @davidobembe5302
    @davidobembe5302 3 years ago +2

    Very clear explanation. Thank youuu

  • @josephpareti9156
    @josephpareti9156 2 years ago

    At minute 35 the video describes transfer learning, and it is said that during the fine-tuning phase ALL the parameters are adjusted, not only the classifier parameters. Is that right? In contrast, when using a pre-trained deep network for a specific image classification task, I froze all parameters belonging to the CNN and allowed only the classifier parameters to vary.
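
    (For reference: both regimes from the comment above, as a minimal runnable Keras sketch. The dense "backbone" is a stand-in for the pre-trained BERT encoder from the talk, an assumption made only to keep the snippet self-contained.)

      import tensorflow as tf

      # Stand-in for a pre-trained encoder; the talk uses BERT, but a
      # small dense stack keeps this sketch runnable on its own.
      backbone = tf.keras.Sequential(
          [tf.keras.layers.Dense(64, activation="relu"),
           tf.keras.layers.Dense(32, activation="relu")],
          name="pretrained_backbone")
      head = tf.keras.layers.Dense(2, name="classifier_head")

      # Regime from the talk (BERT-style): every weight stays trainable
      # and is nudged with a small learning rate.
      backbone.trainable = True
      # Regime from the comment (common for CNNs): freeze the backbone
      # so only the classifier head learns.
      # backbone.trainable = False

      model = tf.keras.Sequential([backbone, head])
      model.compile(
          optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),
          loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))

    Both regimes are legitimate: the BERT paper fine-tunes all weights end-to-end, which typically helps accuracy at a higher compute cost, while freezing the backbone is cheaper and remains common in vision.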

  • @jvishnuiitm123
    @jvishnuiitm123 2 years ago

    Excellent presentation of complex NLP topic.

  • @irfanyaqub9643
    @irfanyaqub9643 1 year ago

    She has done an incredible job.

  • @toplizard
    @toplizard 7 months ago

    This is very beautifully explained!

  • @PaulFishwick
    @PaulFishwick 3 years ago

    Agreed with all. This person should take the lead for other Google educational videos.

  • @devanshbatra5267
    @devanshbatra5267 1 year ago

    Thanks a ton for the explanation! Just wanted to ask: how do we arrive at the values for the matrices K, V and Q?
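
    (For reference: the values are not chosen by hand. W_Q, W_K and W_V are ordinary weight matrices, initialized randomly and learned by backpropagation like any other layer, so Q, K and V are just three learned projections of the same input. A minimal single-head sketch with illustrative dimensions, not the exact code from the talk:)

      import tensorflow as tf

      seq_len, d_model, d_head = 6, 16, 8
      x = tf.random.normal([seq_len, d_model])     # token embeddings

      # Learned projection matrices: trained by gradient descent,
      # never set by hand.
      w_q = tf.Variable(tf.random.normal([d_model, d_head]))
      w_k = tf.Variable(tf.random.normal([d_model, d_head]))
      w_v = tf.Variable(tf.random.normal([d_model, d_head]))

      q = x @ w_q                                  # one query per token
      k = x @ w_k                                  # one key per token
      v = x @ w_v                                  # one value per token

      # Scaled dot-product attention ("Attention Is All You Need").
      scores = q @ tf.transpose(k) / tf.sqrt(tf.cast(d_head, tf.float32))
      weights = tf.nn.softmax(scores, axis=-1)     # each row sums to 1
      context = weights @ v                        # attended output per token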

  • @gunnarw97
    @gunnarw97 1 year ago +1

    Great explanations, thank you so much for this video!

  • @zedudli
    @zedudli 1 year ago

    That was super interesting. Very clear explanation

  • @Jacob011
    @Jacob011 3 years ago +1

    I expected some wishy-washy feel-good "explanation", but I'm pleasantly surprised. So far the best explanation. Goes after the relevant distinguishing key features of the transformers without getting bogged down in unnecessary details.

  • @SanataniAryavrat
    @SanataniAryavrat 3 years ago +1

    Awesome.. great explanation. Thanks.

  • @bryanbosire
    @bryanbosire 3 years ago +1

    Great Presentation

  • @jimnason7293
    @jimnason7293 1 year ago

    Very nice topic discussion! Thank you 🙂

  • @haneulkim4902
    @haneulkim4902 2 years ago

    Amazing talk! very informative. Thank you :)

  • @pohkeamtan9876
    @pohkeamtan9876 3 years ago

    Excellent teaching !

  • @parsarahimi71
    @parsarahimi71 3 years ago +1

    Crystal clear .. Tnx

  • @FarisSkt
    @FarisSkt 3 years ago +1

    amazing video !

  • @davedurbin813
    @davedurbin813 2 years ago

    Great talk, really clear, thanks!
    Also, I see what you did there: "Thanks for your attention" 🤣

  • @santhoshkrishnan6269
    @santhoshkrishnan6269 1 year ago

    Great Explanation

  • @herbertk9266
    @herbertk9266 3 years ago +2

    Thank you

  • @ThomasYangLi
    @ThomasYangLi 1 year ago

    very good presentation!

  • @user-he4rv6zx5m
    @user-he4rv6zx5m 1 year ago +1

    well done video!

  • @lbognini
    @lbognini 3 years ago

    Simply great! 👏👏👏

  • @ManzoorAliRA
    @ManzoorAliRA 1 year ago

    Simply awesome

  • @it-series-music
    @it-series-music 1 year ago

    Can someone explain the inputs dict shown in the code at 42:15?
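
    (For reference, without the exact snippet from 42:15 at hand: assuming it is the usual Hugging Face BERT tokenizer output, the dict holds three parallel tensors, one row per sentence:)

      from transformers import BertTokenizer

      tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
      inputs = tokenizer("This talk was great!", return_tensors="tf")

      # input_ids      -- WordPiece token ids, wrapped in [CLS] ... [SEP]
      # token_type_ids -- segment ids (0 = first sentence, 1 = second)
      # attention_mask -- 1 for real tokens, 0 for padding
      print({name: tensor.shape for name, tensor in inputs.items()})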

  • @sanjaybhatikar
    @sanjaybhatikar 8 months ago

    Nice, thank you ❤

  • @chavdarpapazov4423
    @chavdarpapazov4423 3 years ago +3

    Great presentation! Are the slides available for download? This would be fantastic. Thank you.

  • @joekakone
    @joekakone 3 years ago +1

    Thank you for sharing!

  • @amitjain9389
    @amitjain9389 1 year ago

    Where can I get the slides for this talk? Great talk!

  • @satyajit1512
    @satyajit1512 2 years ago

    Great slides.

  • @Randomize-md3bt
    @Randomize-md3bt 1 year ago

    I came here from the tutorials section of the official TensorFlow webpage, but I got caught by her beauty.

  • @saurabhkumar-yf1vs
    @saurabhkumar-yf1vs 2 years ago

    real help, thanks.

  • @OnionKnight541
    @OnionKnight541 1 year ago +1

    very nice

  • @fahemhamou6170
    @fahemhamou6170 1 year ago

    My sincere greetings. Thank you very much.

  • @vunguyenthai4366
    @vunguyenthai4366 3 years ago

    nice video

  • @jantuitman
    @jantuitman 1 year ago

    This is a fairly good presentation. There are some areas where it summarizes to the point of becoming almost misleading, or at least very questionable:
    1. Several other sources that I read claim that the BERT layers have to be frozen during fine-tuning, so I think it is still open for debate what the right thing to do is there.
    2. This presentation glosses over the outputs of the pretraining phase. I think the output corresponding to the CLS token is pretrained with the next-sentence-prediction task. So is that output layer dropped entirely in the fine-tuning task? Otherwise I don't see how the CLS token output would be a good input for sentiment classification.
    3. The presentation suggests that the initial non-contextual token step is also trainable and fine-tunable. Isn't it just fixed byte-pair encodings? I know these depend on letter frequencies in the language, but can they be trained in-process with BERT?
    4. This presentation very silently equates transformers with transformer encoders, and thus drops the fact that transformers can also be decoders. I think all initial transformers were trained on sequence-to-sequence transformation; the decoders were then trained on next-token prediction, giving rise to things like GPT, whereas the encoders were trained on a combination of masked-token prediction and next-sentence prediction, giving rise to the BERT-like models.
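
    (For reference on point 2: in the standard BERT fine-tuning recipe, the pretraining heads are indeed discarded and a freshly initialized classifier reads the encoder's [CLS] output. A minimal sketch; the random tensor is a stand-in for BERT's final hidden states, an assumption made to keep the snippet self-contained:)

      import tensorflow as tf

      # Stand-in for the encoder's final hidden states,
      # shape [batch, seq_len, hidden].
      batch, seq_len, hidden = 2, 12, 16
      sequence_output = tf.random.normal([batch, seq_len, hidden])

      cls_vectors = sequence_output[:, 0, :]   # hidden state at [CLS]
      classifier = tf.keras.layers.Dense(2)    # new layer, trained in fine-tuning
      logits = classifier(cls_vectors)         # e.g. positive/negative sentiment
      print(logits.shape)                      # (2, 2)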

  • @JTedam
    @JTedam 1 year ago

    Iulia,
    Your presentation has triggered a Eureka moment in me. What makes a great training video? Can AI help answer that? Here is a suggestion: get a collection of videos and rank them by review comments. Using a large language model, find patterns and features and see whether there are correlations between the features and the views and review rankings. The model should be unsupervised. Some of the features can be extracted from comments.

  • @shakilkhan4306
    @shakilkhan4306 9 months ago +1

    Interesting

  • @koushikroy6259
    @koushikroy6259 2 years ago

    Thanks for your ATTENTION 🤗🤗.. Pun intended! 44:39

  • @ravipratapmishra7013
    @ravipratapmishra7013 2 years ago

    Please provide the slides

  • @billyblackburn864
    @billyblackburn864 1 year ago

    i love it

  • @jdcrunchman999
    @jdcrunchman999 1 year ago

    Where can I get the GitHub file?

  • @FinnBrownc
    @FinnBrownc 1 year ago

    This is a positive comment. YouTube should let it past its sentiment filter.

  • @arnablaha
    @arnablaha 3 months ago

    Immaculate!

  • @algogirl2846
    @algogirl2846 2 years ago

    👍🏻👍🏻👌

  • @alikhatami6610
    @alikhatami6610 2 years ago

    Okay, what you are saying is completely vague. For example, for the query matrix you mentioned "some other representation" (why do we need another representation at all?).

  • @enes-the-cat-father
    @enes-the-cat-father 3 years ago

    Thanks for not calling Sentiment Classification "Sentiment Analysis"!

  • @babaka1850
    @babaka1850 2 months ago

    Sorry to say, but this was not very good. Key information is missing, mostly the WHYs: why is there a need for Query and Key matrices? What is the main function of these matrices? How does the attention function alter the feed-forward NNs?

  • @mohammadmousavi1
    @mohammadmousavi1 2 years ago

    I always find the presenter's face distracting when it is on the slides… can you just talk over the slides instead of covering them with the presenter's face?