What Are Word and Sentence Embeddings?

  • Added 7 Jul 2024
  • This video is part of LLM University
    docs.cohere.com/docs/text-emb...
    Sentence and word embeddings are the bread and butter of language models. Here is a very simple introduction to what they are by Luis Serrano.
    Bio:
    Luis Serrano is the lead of developer relations at Co:here. Previously he was a research scientist and educator in machine learning and quantum computing. Luis did his PhD in mathematics at the University of Michigan before moving to Silicon Valley to work at companies such as Google and Apple. Luis is the author of the Amazon best-seller "Grokking Machine Learning", in which he explains machine learning clearly and concisely, and he is the creator of the educational YouTube channel "Serrano.Academy", with over 100K subscribers and 5M views.
    ===
    Resources:
    Blog post: txt.cohere.ai/sentence-word-e...
    Learn more: / luisserrano
  • Science & Technology

Comments • 10

  • @carlosperezcpe • 1 year ago +1

    Hey, thanks for the video. I'm looking to do this exercise with around 13k to 15k news articles. Is there an efficient way of doing it with Cohere?
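One general pattern for a job of that size is to embed the articles in fixed-size batches. The sketch below uses a placeholder `embed_batch` function rather than Cohere's actual client, so the function name and batch size are assumptions, not the real API:

```python
# Hypothetical sketch: embedding ~13k articles in fixed-size batches.
# `embed_batch` is a placeholder for a real embedding API call;
# it is NOT Cohere's actual client.

def embed_batch(texts):
    # Placeholder: a real implementation would call an embedding
    # service and return one vector per input text.
    return [[float(len(t))] for t in texts]

def embed_all(texts, batch_size=96):
    """Embed a large list of texts in fixed-size batches."""
    vectors = []
    for start in range(0, len(texts), batch_size):
        vectors.extend(embed_batch(texts[start:start + batch_size]))
    return vectors

articles = [f"article {i}" for i in range(13_000)]
embeddings = embed_all(articles)
print(len(embeddings))  # one vector per article
```

Batching keeps each request small while still covering the whole corpus; the right batch size depends on the provider's limits.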

  • @nguyenvinh2298 • 2 months ago

    I can understand word embedding as passing the numerical form of a word through the embedding network to get the word embedding. But is a sentence embedding a combination of word embeddings?
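One simple and common way to combine them is mean pooling, i.e. averaging the word vectors. Modern models typically learn sentence embeddings directly instead, but the averaging idea can be sketched as follows (the vectors here are toy values made up for illustration):

```python
import numpy as np

# Toy word embeddings (values made up for illustration).
word_vectors = {
    "the": np.array([0.1, 0.0, 0.2]),
    "cat": np.array([0.9, 0.4, 0.1]),
    "sat": np.array([0.2, 0.8, 0.3]),
}

def sentence_embedding(words):
    """Mean-pool the word vectors: one simple way to combine
    word embeddings into a single sentence embedding."""
    return np.mean([word_vectors[w] for w in words], axis=0)

vec = sentence_embedding(["the", "cat", "sat"])
print(vec)  # same dimensionality as the word vectors
```

Mean pooling ignores word order, which is why trained sentence encoders usually outperform it; it is just the simplest baseline answer to the question above.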

  • @user-zi2lh9df7e • 8 days ago

    Does this mean we can start the process in, say, Swahili and then convert to English, so as not to lose the particular characteristics of Swahili that are lost when one starts with English?

  • @berkk1993 • 11 months ago

    good

  • @user-wr4yl7tx3w • 1 year ago +1

    If one is given embeddings, can one go ahead and use those embeddings for any NLP model?

    • @SerranoAcademy • 1 year ago

      Absolutely! They can be used for classification, generation, clustering, basically any ML model that you'd like to train. Embeddings can be used as a preprocessing method, where instead of text, now you have vectors, and you train models on those vectors.
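A minimal sketch of that preprocessing idea, using random vectors as stand-ins for precomputed text embeddings and scikit-learn's LogisticRegression as the downstream model (both are illustrative choices, not anything prescribed in the video):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Stand-in for precomputed text embeddings: two classes of
# 16-dimensional vectors centred at different points.
X = np.vstack([rng.normal(-1.0, 0.5, (50, 16)),
               rng.normal(+1.0, 0.5, (50, 16))])
y = np.array([0] * 50 + [1] * 50)

# Once text is turned into vectors, any off-the-shelf model
# can be trained directly on those vectors.
clf = LogisticRegression().fit(X, y)
print(clf.score(X, y))  # well-separated clusters -> high accuracy
```

The same pattern works for clustering or nearest-neighbour search: the embedding step replaces raw text with vectors, and everything downstream operates on the vectors.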

    • @JimmyGarzon • 1 year ago

      @SerranoAcademy Thanks for this wonderful video! Just to make sure, though: if one creates embeddings using Cohere, say, can they then be used in any model, say BERT or GPT-J, somehow? Or are we talking about NLP tasks?

  • @user-wr4yl7tx3w • 1 year ago +1

    Curious: how are the embeddings actually calculated?

    • @SerranoAcademy • 1 year ago +2

      Great question! There are many ways to calculate them, some with neural networks. Check out Word2Vec to learn them (a future video will come out on that, stay tuned!)
      The idea is that when two words appear in similar sentences a lot, the model will slowly join them, and when they don't, the model will slowly separate them. In that way, similar words end up close to each other. A neural network, for example, can be trained to predict the neighbours of a word, and the last layer of the neural network can be used for the embedding.
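A toy sketch of that neighbour-prediction idea (illustration only, not real Word2Vec: the tiny corpus, dimensions, and learning rate are made up, and the input weight matrix is taken as the embedding table, which is one common choice):

```python
import numpy as np

# Toy skip-gram-style sketch: train a tiny network to predict a
# word's neighbour; the learned input weights become the embeddings.

corpus = [["cat", "sat", "mat"], ["dog", "sat", "mat"]]
vocab = sorted({w for s in corpus for w in s})
idx = {w: i for i, w in enumerate(vocab)}
V, D = len(vocab), 4

# (center, context) pairs from a window of 1.
pairs = [(idx[s[i]], idx[s[j]])
         for s in corpus for i in range(len(s))
         for j in (i - 1, i + 1) if 0 <= j < len(s)]

rng = np.random.default_rng(0)
W_in = rng.normal(0, 0.1, (V, D))   # the embedding table lives here
W_out = rng.normal(0, 0.1, (D, V))

for _ in range(200):
    for c, o in pairs:
        h = W_in[c]                    # hidden layer = center word's embedding
        logits = h @ W_out
        p = np.exp(logits - logits.max())
        p /= p.sum()                   # softmax over the vocabulary
        grad = p.copy()
        grad[o] -= 1.0                 # cross-entropy gradient w.r.t. logits
        g_in = W_out @ grad            # compute before updating W_out
        W_out -= 0.1 * np.outer(h, grad)
        W_in[c] -= 0.1 * g_in

# Words that appear in similar contexts drift toward similar vectors.
def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cos(W_in[idx["cat"]], W_in[idx["dog"]]))
```

Real Word2Vec uses tricks like negative sampling to avoid the full softmax, but the core loop is the same: predict neighbours, then keep a weight matrix as the embeddings.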

  • @bastabey2652 • 1 year ago +1

    I assumed that since the World Cup is in Qatar, it makes sense to add an Arabic translation, unless there are issues with Arabic support.