Word Embeddings

  • Added 14. 07. 2017
  • Word embeddings are one of the coolest things you can do with Machine Learning right now.
    Try the web app: embeddings.macheads101.com
    Word2vec paper: arxiv.org/abs/1301.3781
    GloVe paper: nlp.stanford.edu/pubs/glove.pdf
    GloVe webpage: nlp.stanford.edu/projects/glove/
    Other resources:
    www.aclweb.org/anthology/Q15-1016
    en.wikipedia.org/wiki/Word_em...
  • Science and technology

Comments • 215

  • @oscarmvl
    @oscarmvl 1 year ago +7

    This is so relevant in 2023, timeless explanation!

  • @sau002
    @sau002 5 years ago +93

    You spent a considerable number of minutes explaining the nature of the problem before presenting the solution to the problem - I like that approach.

  • @amreshgiri4933
    @amreshgiri4933 5 years ago +18

    You're a genius. I was struggling to understand the word embedding concept through Stanford University videos. Your explanation and pace are much better. Thanks. Keep making such videos.

  • @myfolder4561
    @myfolder4561 5 months ago

    Glad to have come across this while looking for materials to learn about word embeddings, to understand how text prompts work in Stable Diffusion text-to-image models in 2023. You're a great teacher. A lot of videos on this topic across YouTube are full of jargon without clear explanation. In 2023 this video is still highly relevant to the current state of the technology.

  • @AshwinVel
    @AshwinVel 6 years ago

    I honestly think this is a really good explanation of word embeddings. It breaks down the nitty-gritty involved in word2vec and co-occurrence. I’ve read a couple of articles and watched a few videos, but yours is by far the easiest to comprehend. Thank you so much.
    Cheers from Malaysia!

  • @artemkoren9582
    @artemkoren9582 5 years ago

    I've gone through several embedding explanations until I arrived at this one. Well done, finally all pieces make sense. Thanks!

  • @longship44
    @longship44 4 years ago +7

    You are very good at explaining pretty complex concepts. I appreciate the time you took to do this, it was very informative.

  • @panchoingham
    @panchoingham 6 years ago +38

    Dude, seriously, thank you. I am not a CS major but I got absurdly interested in AI during my last years of college. It's inspiring to see your self-learnt enthusiasm and it gives me strength to follow my interest. Keep up the good work and thank you!
    Cheers from Buenos Aires, Argentina

  • @12sandy345
    @12sandy345 6 years ago +1

    An exceptional lecture! What I loved about it is its focus on implementation and results, which really helps build good intuition, garner interest, and motivate digging deeper into the math details later. Most of us get bogged down in the math before knowing how powerful and useful the results are. Thank you again.

  • @jabusch24
    @jabusch24 5 years ago +1

    This is really well explained. Best word2vec explanation I've seen on YouTube so far.

  • @coolshoos
    @coolshoos 6 years ago

    Glad to see a relatively new video from you guys. I'm an old-time fan. And this is exactly what I'm attempting to learn right now.

  • @lakshmisairamthubati9080

    Probably the most clear explanation of word2vec. Thanks for the video.

  • @fiddlepants5947
    @fiddlepants5947 5 years ago +5

    Humble, concise, brilliant... Subscribed!

  • @salutoitoi
    @salutoitoi 3 years ago +1

    I recently started learning NLP, and the part of word embeddings was just not clear at all. It makes sense now.
    Thanks a lot! You've won a subscriber.

  • @fhypnos912
    @fhypnos912 2 years ago

    True genius knows how to explain a complex concept in a really simple and intuitive way. Salute.

  • @haridotvenkat
    @haridotvenkat 6 years ago

    Excellent work. I was looking for such an explanation on word embeddings & I am happy that I found this. Thanks.

  • @jessicas2978
    @jessicas2978 4 years ago

    Thank you so much for your video! It's the best learning material I can find on youtube.

  • @govinda1993
    @govinda1993 5 years ago

    I really appreciated the emphasis you placed on word embeddings rather than on word2vec.

  • @partheshsoni1905
    @partheshsoni1905 5 years ago

    I liked the way you explained...crisp and clear!

  • @johnandersontorresmosquera1156

    Awesome explanation, after hours of looking for good stuff to understand word embeddings. Thanks!

  • @sreeramv112
    @sreeramv112 5 years ago

    For someone who knows nothing and wants to know everything in ML, this is simply an awesome explanation.

  • @naveenkalhan95
    @naveenkalhan95 4 years ago

    Amazing, man... I was going through tens and tens of online resources to understand what word embeddings are! The way you explained it made me subscribe to your channel right away. Very well done. Thank you very much.

  • @saksham01
    @saksham01 1 year ago

    I haven't seen a better explanation on this. Thank you. This was really good.

  • @Aviator168
    @Aviator168 4 years ago +1

    Great video. I was having difficulty understanding 'context'. You explained it clearly. Thank you.

  • @vijeta268
    @vijeta268 4 years ago

    Your explanation was very clear and simple, thanks for making this video.

  • @blancheporter1289
    @blancheporter1289 4 years ago +1

    one word, awesome. Thanks a lot for the video.

  • @garbour456
    @garbour456 6 years ago

    Awesome video man. Extremely well presented. I'm impressed with your presentation skills. thanks for the video

  • @kenchu764
    @kenchu764 2 years ago +6

    I just started learning ML concepts, and this video helped tremendously with word embeddings. You got another subscriber. I do have a question though. In your example, how did you decide on 64 as the other dimension of your factored matrices? Would a larger number there give you a better word embedding?

  • @varunverma744
    @varunverma744 5 years ago +1

    That was an amazing explanation. Thank you!

  • @sadeebahsan4804
    @sadeebahsan4804 5 years ago

    This is really intuitive. In most places I got answers like "representing words with vectors", which wasn't helpful. Now I think I have a proper idea.

  • @charlieangkor8649
    @charlieangkor8649 3 years ago +2

    Good lecture. It includes important cues, like what he finds cool and that this is the best example of what can be done, which lets the listener organize the information hierarchically. Not like some university lectures, where a monolithic dump of text just flows out and we don’t know what is important to remember and what is not.

  • @nishankbani3257
    @nishankbani3257 6 years ago

    Informative, interesting. Raised my interest in the topic of word embedding

  • @joshuafishman9002
    @joshuafishman9002 7 years ago +12

    I'm glad you made this video. Now I don't have to download the vector for every word on twitter.

    • @sufyanqadeer2705
      @sufyanqadeer2705 6 years ago

      Hello, friend. I need the word file that was available on this site. The link is not working now. Please help me and send me the word vector file.
      link : embeddings.macheads101.com/

    • @sufyanqadeer2705
      @sufyanqadeer2705 6 years ago

      My Email : sufyan.ali7272@gmail.com

  • @Alkis05
    @Alkis05 3 years ago +2

    More generally, word2vec is nothing more than graph2vec. Sentences can be seen as random walks in the English-language graph, in which each word is a node and every word is connected to other words. The strength of these connections depends on how frequently the words appear in the same context. Seeing this as a graph allows you to run network analysis and see what other kinds of information you can extract from it. Done right, you might even be able to estimate the connections for words that didn't appear in the training set and try to update the model to make it better.
    Or use the word embedding and try to embed sentences and see how that goes.

  • @alfital2
    @alfital2 3 years ago

    Awesome explanation, thanks.

  • @poonritchie
    @poonritchie 6 years ago

    I am just learning about embedding layers and luckily I ran into this video.

  • @camilaferraz8153
    @camilaferraz8153 2 years ago

    Thanks for sharing! It helped a lot!

  • @radanici
    @radanici 5 years ago

    Just starting to venture into this world. Thanks for the explanation. I have a medical background, but no background in computer science. So this gives me a little bit of hope in learning something totally new.

  • @emenikeanigbogu9368
    @emenikeanigbogu9368 4 years ago

    Amazing man. Thank you for your time!

  • @meijiishin5650
    @meijiishin5650 1 year ago +3

    Fun fact: This guy went on to work at OpenAI and is one of the creators of DALL-E 2.

    • @GodofStories
      @GodofStories 1 year ago

      Haha, nice. As soon as I saw him speak for 5 seconds and saw the timestamp of 5 years ago, I typed this: "If this guy isn't already a founder of a leading company in this AI wave, I'll be disappointed. But hey, most of the smartest people don't always see success. And fame and money aren't everything."
      Glad to see I was right, haha. I figured there was a high probability, considering this was 5 years ago and assuming nothing else in the universe interfered with his life trajectory. The way this guy talks and looks basically shouts young and motivated, or hungry, which should mean he is a big-time guy now.

    • @LokeshSharma-me5pg
      @LokeshSharma-me5pg 1 year ago

      no wonder a man like him can do the job...

  • @blenderpanzi
    @blenderpanzi 5 months ago

    8:08 Thank you! This explained the missing piece to me. Multiple other videos on that topic were missing this nice and easy to understand diagram.

  • @darsh_shukla
    @darsh_shukla 6 years ago

    Man you are my teacher from now onwards.

  • @smritidey2942
    @smritidey2942 5 years ago

    Oh man, you are excellent at explaining word2vec. Hoping to see some more on text NLP.

  • @vanbap
    @vanbap 2 years ago

    I really appreciate this video, sir!

  • @BlockDesignz
    @BlockDesignz 4 years ago

    This was brilliant. Keep on creating.

  • @keres993
    @keres993 4 years ago

    Brilliant explanation! Thank you!

  • @poonritchie
    @poonritchie 6 years ago

    Hi Macheads, I have run into so many hopelessly disappointing video presenters or live "trainers" who just talk to themselves. Even worse, they make you confused about things you already know, haha (even ones from the top IT corporations). I hope you can talk at our upcoming training session, just to demonstrate what a quality presentation of tech ideas is. You have an inborn ability to explain and motivate.

  • @sniperas96
    @sniperas96 2 years ago

    Still, in 2022, a much clearer explanation than my professor's in my master's program.

  • @yoniziv
    @yoniziv 3 years ago +2

    This is gold! thank you (from the future :-))

  • @barteksielicki7276
    @barteksielicki7276 6 years ago +1

    Great explanation!

  • @vidurwadhwa6897
    @vidurwadhwa6897 6 years ago

    Great explanation!! Thanks a lot

  • @manedinesh
    @manedinesh 5 years ago

    Word2vec vs. GloVe very well explained, thank you.

  • @ritik84629
    @ritik84629 1 year ago +1

    Temperature: 5 years ago
    Feels like 15 years ago

  • @reactorscience
    @reactorscience 4 years ago

    Great explanation!!!

  • @lenant
    @lenant 5 years ago

    Very nice explanation, thanks

  • @azai.mp4
    @azai.mp4 6 years ago +3

    I'm wondering if something similar to Disentangled Variational Autoencoding could be used to improve a word2vec embedding. I'm not quite sure of the details, but it seems DVA has an effect similar to techniques like factor analysis and principal component analysis, producing a latent space whose dimensions are more akin to real "separate" dimensions. For example, it produces separate dimensions for the italic-ness and boldness of a written digit, as seen in the paper Disentangled Variational Auto-Encoder for Semi-supervised Learning by Yang Li et al. (I would link it but YouTube has a history of assuming comments with links in them are spam.)
    If that technique translates well to word vectors, it could for example result in a model where "maleness" is its own dimension, i.e. "man" - "woman" ~= "king" - "queen" ~= (0, 0, 0, 1, 0, 0, ...) (a large vector that is parallel to one of the dimensions).
    Another interesting venture would be to pre-process the data using an NLP library, so that different forms of the same lemma are already grouped together by default, and so that homographs can be separated. It could also expose information that a skip-gram or bag-of-words model would miss, such as dependency / sentence structure.
    I really ought to get my hands dirty some time instead of just thinking about this stuff in my head...

  • @user-fy5go3rh8p
    @user-fy5go3rh8p 3 years ago

    I don't get why there are so many dislikes; the explanation is great.

  • @adage3256
    @adage3256 5 years ago

    Awesome recap!

  • @TestTest-tj4nt
    @TestTest-tj4nt 6 months ago

    The app is still up, impressive.

  • @NahinAndroid
    @NahinAndroid 3 years ago

    Beautiful, great work

  • @amardeepganguly6676
    @amardeepganguly6676 4 years ago

    Amazing explanation, brother. Thank you.

  • @ROHAN0APK
    @ROHAN0APK 6 years ago

    Brilliant video! Thanks :)

  • @Nova-Rift
    @Nova-Rift 2 years ago +1

    very well explained imo

  • @anamqureshi5263
    @anamqureshi5263 2 years ago +1

    Great Explanation!

    • @AmeerHamza-jy5ml
      @AmeerHamza-jy5ml 2 years ago

      Hope you are doing well. I'm interested in doing this in the Urdu language. I hope you will collaborate with me on it. Thanks.

  • @medhj9679
    @medhj9679 4 years ago

    Thanks, man! Good explanation.

  • @yeahorightbro
    @yeahorightbro 6 years ago +1

    Just checked out the web app and was wondering how you put that together? Django? It is brilliant!

  • @nimeshsingh9271
    @nimeshsingh9271 4 years ago

    This is a much better explanation than what is available in some of the paid courses.

  • @mahdip.4674
    @mahdip.4674 6 years ago

    Thanks for the video. I have seen GloVe models that contain stop words, which means that at least they do not remove them. I assume one can remove them and create two different models or sets of vectors. If so, I assume there is not that much room to talk about the precision of the two approaches. Right?
    The other thing is that in the case of embeddings we do not apply stemming or other similar techniques, since the process works largely at the context level. Right?

  • @kanwar793
    @kanwar793 7 years ago

    I'm a regular follower. Keep it up!!

  • @ghazibenyoussef8424
    @ghazibenyoussef8424 1 year ago +1

    I'm new to AI, but impressed by what is happening now and by the hype around NLP and deep learning. I'm teaching myself all of this. I really like what you have done, downloading tweets and building word embeddings. Is it possible to get access to your source code?
    Thanks

  • @bananakiu
    @bananakiu 3 years ago

    great video!

  • @jaysaha1967
    @jaysaha1967 4 years ago +1

    The website is really cool🔥

  • @cristianjuarez1086
    @cristianjuarez1086 2 years ago

    I wish I could understand word embeddings as well as you do. I'm still a beginner for now, but this is what I want to become. I also share your love for word embeddings, especially because I want to develop an NLP or language model that generates answers, but that's too ambitious at the moment.

  • @ridhwanfranc
    @ridhwanfranc 3 years ago

    amazing video bro

  • @deniscandido4116
    @deniscandido4116 6 years ago

    Hello, did you invest time in learning all the calculus, like doing partial derivatives by hand, or can you abstract that away? I kind of slip when I see mathematical content... but I'm able to build a CNN in TensorFlow without problems. Is this the painful way?

  • @nssSmooge
    @nssSmooge 5 years ago

    So far I have only been able to use a normal DTM with tf-idf to compare speeches given at the UN, using text2vec in R. Not sure if I used it correctly though. It has an option for GloVe too, which I want to try out because I am a beginner and started in R. Python is so confusing for me, and don't get me started on the gensim packages and docs.

  • @techynerdz9566
    @techynerdz9566 6 years ago

    Hey, how long did it take to download all that Twitter data? Also, did you run it in the cloud or directly on your Mac? If you ran it in the cloud, which service provider did you use? Thanks for your videos.

  • @shivamraj2649
    @shivamraj2649 6 years ago

    You are doing a very good thing, buddy. Keep it up 😊😊

  • @vaibhavvaghela6234
    @vaibhavvaghela6234 6 years ago +2

    Why have you stopped making videos man. Miss your vids

  • @WenjunLv
    @WenjunLv 4 years ago

    How could embeddings be applied to other fields?
    Are embeddings applicable to sparse features?

  • @juleswombat5309
    @juleswombat5309 5 years ago

    That was pretty awesome. It would be nice to see some code.

  • @hamidkhalil9598
    @hamidkhalil9598 5 years ago

    subscribed after 1 minute!

  • @gabriellevaillant5153
    @gabriellevaillant5153 4 years ago

    If I understood correctly, you have to multiply the first weight matrix (between the input layer and the hidden layer) by each word vector (composed of 0s and 1s) to obtain the embedding matrix? (With the weight matrix obtained through backpropagation etc.?)
    Thanks :)
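
The question above can be checked on a toy case. The following is a minimal sketch, not the video's code: the vocabulary, the matrix `W`, and the helper names are all made up for illustration. It assumes the usual word2vec layout, where the input is a one-hot vector and `W` has one row per vocabulary word, so the product simply picks out that word's row.

```python
# Toy illustration: a one-hot input times the first weight matrix W
# just selects one row of W. All values here are made up.

vocab = ["king", "queen", "man", "woman"]

# W has one row per vocabulary word and one column per embedding dimension.
W = [
    [0.9, 0.1, 0.4],   # king
    [0.8, 0.9, 0.4],   # queen
    [0.1, 0.1, 0.7],   # man
    [0.1, 0.9, 0.7],   # woman
]

def one_hot(word):
    """Return a vector of zeros with a single 1 at the word's index."""
    return [1.0 if w == word else 0.0 for w in vocab]

def vec_times_matrix(v, m):
    """Multiply a row vector v (length n) by an n x d matrix m."""
    return [sum(v[i] * m[i][j] for i in range(len(v))) for j in range(len(m[0]))]

hidden = vec_times_matrix(one_hot("queen"), W)
lookup = W[vocab.index("queen")]
print(hidden == lookup)  # prints True: the product is just a row lookup
```

Under that assumption, the rows of the trained `W` are the word vectors, which is why implementations typically skip the multiplication and index into the matrix directly.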

  • @matthisc5100
    @matthisc5100 1 year ago

    nice video, thanks

  • @alexanderblumin6659
    @alexanderblumin6659 2 years ago

    Very helpful!!!

  • @piyalikarmakar5979
    @piyalikarmakar5979 2 years ago

    Sir, I have one question: what exactly does the output layer predict? The embedding of the input word or the context of the input word?

  • @Pakrdjdjdnsnsmskzozovyff

    Great Job!

  • @CarlJohnson-jj9ic
    @CarlJohnson-jj9ic 5 years ago

    Thank you! What else can i do to learn and engage?

  • @bieberssaman3805
    @bieberssaman3805 4 years ago

    Wow Great Video.

  • @ugurkaraaslan9285
    @ugurkaraaslan9285 3 years ago

    Thank you very much. How can we decide the number of features in the embedding matrix? When you deal with colors you have 3 features (R, G, B), but how can I define features for a 10,000-word corpus? Thanks in advance.

  • @jaradcollier2677
    @jaradcollier2677 6 years ago

    Is this really how word embeddings work? Like, I'm blown away that this complex thing I thought word embeddings were is just a word co-occurrence matrix decomposed into a smaller matrix. Is there more to it than that, or is that the gist? More specifically: take text (documents) and transform it into tf-idf 1-word tokens. Take the dot product of the tf-idf matrix with itself to get a square matrix (the co-occurrence matrix). Take that and decompose it into, say, 64 components. Each of those rows is a word vector? The entire matrix is the word embedding at that point?
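
For readers following this question, here is a minimal sketch of the counting step only. It assumes a sliding context window over tokens rather than the tf-idf product proposed in the comment; the `cooccurrence` function, the window size, and the two-sentence corpus are hypothetical. The factorization into a smaller matrix (for example 64 columns) is deliberately left out.

```python
from collections import defaultdict

def cooccurrence(sentences, window=2):
    """Count how often each pair of words appears within `window` tokens."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[word][tokens[j]] += 1
    return counts

# Made-up corpus: "king" and "queen" occur in identical contexts.
corpus = [
    "the king rules the land",
    "the queen rules the land",
]
counts = cooccurrence(corpus)
print(dict(counts["king"]))
print(dict(counts["queen"]))
```

In this toy corpus "king" and "queen" end up with identical rows, which is the property a later factorization step would turn into nearby vectors.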

  • @ghazybajak
    @ghazybajak 5 years ago

    Please, I need some help with a Mac. This message appears on my screen: "SystemExtr se ha cerrado inesperadamente" (SystemExtr quit unexpectedly).
    Thanks so much

  • @jamesang7861
    @jamesang7861 4 years ago

    thank you!

  • @sufyanqadeer2705
    @sufyanqadeer2705 6 years ago

    What hardware do you use to train these models?

  • @bilalchandiabaloch8464

    But I am confused about the weight matrices W and W': where and how do we acquire these weights?

  • @SagarSharma-ce6jg
    @SagarSharma-ce6jg 5 years ago

    Hi, it is indeed great...
    You mentioned downloading Twitter data.
    Is it possible that I could get this data?
    I'm working on a similar sentiment analysis project.

  • @FuZZbaLLbee
    @FuZZbaLLbee 6 years ago

    I think the famous calculation example is
    King - man + woman = queen
    Update: ah I see, you were referring to the example image on the GloVe website.
    Anyway, interesting stuff. I will take a look at GloVe as well. I'm currently trying to see if I can make a resumé / job listing similarity model. I guess I can do something with the cosine distance between terms in the resumé and terms in the job listing. Ideas are welcome.
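
The analogy arithmetic and the cosine comparison mentioned above can be sketched as follows. This is an illustration under stated assumptions: the 3-dimensional vectors are hand-made so the analogy works, and `cosine` and `analogy` are hypothetical helper names, not part of word2vec or GloVe. Real embeddings would be loaded from a trained model.

```python
import math

# Hand-made 3-dimensional vectors, for illustration only.
vectors = {
    "king":  [0.9, 0.1, 0.8],
    "queen": [0.9, 0.9, 0.8],
    "man":   [0.1, 0.1, 0.7],
    "woman": [0.1, 0.9, 0.7],
    "apple": [0.5, 0.5, 0.0],
}

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def analogy(a, b, c):
    """Return the word closest to vec(a) - vec(b) + vec(c), excluding inputs."""
    target = [x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])]
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(target, vectors[w]))

print(analogy("king", "man", "woman"))  # prints queen
```

The same `cosine` function could be applied to pairs of term vectors from a resumé and a job listing; how to aggregate those pairwise scores into one document-level score is a separate design choice not covered here.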

  • @rasyaramesh7433
    @rasyaramesh7433 4 years ago +4

    omg you look kinda like hiccup from how to train your dragon xD also thank you so much you literally just saved my life

  • @WaqarAli-jc6zg
    @WaqarAli-jc6zg 4 years ago

    Hi, I have downloaded GloVe from nlp.stanford.edu/projects/glove/ and want to use it for word embeddings on a different dataset. But I am unable to compile the source with "cd GloVe-1.2 && make". Can anybody please guide me on how to make it work?

  • @yeahorightbro
    @yeahorightbro 6 years ago

    Do you plan on doing a video on ConvNets?