NLP Demystified 13: Recurrent Neural Networks and Language Models
- Published: Aug 2, 2024
- Course playlist: • Natural Language Proce...
We'll learn how to get computers to generate text through a technique called recurrence. We'll also look at the weaknesses of the bag-of-words approaches we've seen so far, how to capture the information in word order, and in the demo, we'll build a part-of-speech tagger and text-generating language model.
Colab notebook: colab.research.google.com/git...
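The core idea of recurrence described above — processing a sequence one element at a time while carrying a hidden state forward — can be sketched in a few lines. This is a toy scalar illustration with made-up weights, not the notebook's code:

```python
import math

def simple_rnn_step(x, h_prev, w_x, w_h, b):
    # One recurrence step: the new hidden state mixes the current
    # input with the previous hidden state through a nonlinearity.
    return math.tanh(w_x * x + w_h * h_prev + b)

# Process a sequence element by element, carrying the state forward.
inputs = [0.5, -0.1, 0.8]
h = 0.0  # initial hidden state
for x in inputs:
    h = simple_rnn_step(x, h, w_x=0.7, w_h=0.3, b=0.0)

# The final hidden state summarizes the whole sequence, in order —
# something a bag-of-words representation cannot do.
print(h)
```

Real RNN layers do the same thing with weight matrices and vector states instead of scalars.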
Timestamps
00:00:00 Recurrent Neural Networks
00:00:23 The problem with bag-of-words techniques
00:02:28 Using recurrence to process text as a sequence
00:07:53 Backpropagation with RNNs
00:12:03 RNNs vs other sequence processing techniques
00:13:08 Introducing Language Models
00:14:37 Training RNN-based language models
00:17:40 Text generation with RNN-based language models
00:19:44 Evaluating language models with Perplexity
00:20:54 The shortcomings of simple RNNs
00:22:48 Capturing long-range dependencies with LSTMs
00:27:20 Multilayer and bidirectional RNNs
00:29:58 DEMO: Building a Part-of-Speech Tagger with a bidirectional LSTM
00:42:22 DEMO: Building a language model with a stacked LSTM
00:58:04 Different RNN setups
This video is part of Natural Language Processing Demystified --a free, accessible course on NLP.
Visit www.nlpdemystified.org/ to learn more.
A really super mix of theory, concepts, math, and coding. Highly underrated.
Thanks, Hazem! I was going for a particular mix that explored the subject at multiple levels. It's good to hear it resonated with you.
I hope you provide a full and detailed course on neural networks. You're the best.
Excellent video! You really bring this subject to life
I'd really love to see your videos on other topics like CNNs and GANs.
At 31:20, what is the `+/` operator? I've never seen that before in Python and I can't find much on google
That's just a backslash (`\`), Python's line continuation character:
www.google.com/search?q=python+multiline+with+backslash
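For reference, a trailing backslash tells Python the statement continues on the next line. A trivial standalone example (not from the notebook):

```python
# The backslash at the end of the first line joins the two
# physical lines into one logical statement.
total = 1 + 2 + \
        3 + 4
print(total)  # → 10
```

Note that inside parentheses, brackets, or braces, no backslash is needed — lines continue implicitly.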
Why didn't you use an embedding layer? What would its purpose be here? What would change if we added one?
You'll find the answer in this comment cell:
colab.research.google.com/github/nitinpunjabi/nlp-demystified/blob/main/notebooks/nlpdemystified_recurrent_neural_networks.ipynb#scrollTo=2DgNpgicAMbr
"We're not using embeddings for the input. We can, but since this is a character model with just a few dozen possible choices, we can get away with one-hot encoding. There's also no reason to think a particular letter should be closer to another in vector space as we would want in a word-level model."
I haven't tried with an embedding layer (give it a shot!). My prediction for this particular example is that it wouldn't make much of a difference since this is a character-level model.
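To illustrate why one-hot encoding suffices at the character level, here's a minimal sketch. The vocabulary and helper are hypothetical, not the notebook's code:

```python
# Build a character vocabulary from a tiny corpus and
# one-hot encode each character.
text = "hello"
vocab = sorted(set(text))                      # ['e', 'h', 'l', 'o']
char_to_idx = {c: i for i, c in enumerate(vocab)}

def one_hot(c):
    # A vector of zeros with a single 1 at the character's index.
    # With only a few dozen characters, these vectors stay small,
    # so a learned embedding buys little.
    vec = [0] * len(vocab)
    vec[char_to_idx[c]] = 1
    return vec

encoded = [one_hot(c) for c in text]
print(encoded[0])  # 'h' → [0, 1, 0, 0]
```

With a word-level vocabulary of tens of thousands of entries, one-hot vectors become huge and carry no notion of similarity, which is where an embedding layer pays off.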
@futuremojo Thank you very much for the answer.