NLP Demystified 6: TF-IDF and Simple Document Search
- Added 2. 08. 2024
- Course playlist: • Natural Language Proce...
We look at the problems with the bag-of-words approach covered previously, then use an improved technique, TF-IDF, to overcome them. In the demo, we'll use spaCy and scikit-learn to create TF-IDF vectors and build a simple document search engine.
Colab notebook: colab.research.google.com/git...
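For readers who want a feel for the demo before opening the notebook, here is a minimal sketch of TF-IDF document search with scikit-learn. It is not the notebook's code: the toy corpus, the `search` helper, and the `top_k` parameter are made up for illustration, and the spaCy preprocessing step mentioned above is left out to keep the example self-contained.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy corpus, invented for this sketch.
corpus = [
    "The cat sat on the mat.",
    "Dogs chase cats around the yard.",
    "Stock markets fell sharply on Monday.",
]

# Learn the vocabulary and IDF weights, then turn each document
# into a TF-IDF vector (one row per document).
vectorizer = TfidfVectorizer()
doc_vectors = vectorizer.fit_transform(corpus)


def search(query, top_k=2):
    """Return the top_k documents ranked by cosine similarity to the query."""
    # The query must be transformed with the same fitted vectorizer
    # so it lands in the same vector space as the documents.
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, doc_vectors).ravel()
    ranked = scores.argsort()[::-1][:top_k]
    return [(corpus[i], float(scores[i])) for i in ranked]


results = search("stock markets")
for text, score in results:
    print(f"{score:.3f}  {text}")
```

Note that the default scikit-learn tokenizer does no lemmatization, so "cat" and "cats" count as different terms here. Normalizing tokens first (for example with spaCy) is one way to address that.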
Timestamps:
00:00:00 TF-IDF
00:00:15 The problem with binary/frequency bag-of-words
00:01:03 Using relative frequency instead
00:01:50 Term Frequency (TF)
00:03:14 Inverse Document Frequency (IDF)
00:03:54 Getting a word's TF-IDF score
00:04:52 Variations of TF-IDF
00:05:49 DEMO: creating TF-IDF vectors with scikit-learn
00:08:41 DEMO: querying a corpus and ranking results
00:11:04 Benefits and shortcomings of TF-IDF
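The TF, IDF, and scoring steps listed above can be sketched by hand in a few lines. This assumes one common textbook variant (TF as a term's count divided by document length, IDF as the natural log of the number of documents over the number of documents containing the term); the video and scikit-learn may use other variants. The tokenized documents and function names are hypothetical.

```python
import math
from collections import Counter

# Hypothetical pre-tokenized documents, invented for this sketch.
docs = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "sat", "on", "the", "log"],
    ["the", "cat", "chased", "the", "dog"],
]


def term_frequency(term, doc):
    """Relative frequency of a term within a single document."""
    return Counter(doc)[term] / len(doc)


def inverse_document_frequency(term, docs):
    """log(N / df): terms found in fewer documents get a higher weight."""
    df = sum(1 for doc in docs if term in doc)
    return math.log(len(docs) / df) if df else 0.0


def tf_idf(term, doc, docs):
    return term_frequency(term, doc) * inverse_document_frequency(term, docs)


# "the" occurs in every document, so its IDF (and its score) is zero.
score_the = tf_idf("the", docs[0], docs)
# "mat" occurs in only one document, so it scores higher.
score_mat = tf_idf("mat", docs[0], docs)
print(score_the, score_mat)
```

Under this variant, a word that appears in every document contributes nothing to the score, which is how TF-IDF downweights very common words that a plain frequency bag-of-words would overcount.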
This video is part of Natural Language Processing Demystified, a free, accessible course on NLP.
Visit www.nlpdemystified.org/ to learn more.
You are my hero. I'm sure you will achieve a lot of social awareness soon enough. The job is amazing, I love it!
Thank you!
Thank you!
Thanks for your vids
Why aren't university teachers like you?
Please bring end-to-end NLP project videos.