NLP Demystified 13: Recurrent Neural Networks and Language Models

  • Published 2 Aug 2024
  • Course playlist: • Natural Language Proce...
    We'll learn how to get computers to generate text through a technique called recurrence. We'll also look at the weaknesses of the bag-of-words approaches we've seen so far and at how to capture the information in word order. In the demos, we'll build a part-of-speech tagger and a text-generating language model (a minimal code sketch follows this description).
    Colab notebook: colab.research.google.com/git...
    Timestamps
    00:00:00 Recurrent Neural Networks
    00:00:23 The problem with bag-of-words techniques
    00:02:28 Using recurrence to process text as a sequence
    00:07:53 Backpropagation with RNNs
    00:12:03 RNNs vs other sequence processing techniques
    00:13:08 Introducing Language Models
    00:14:37 Training RNN-based language models
    00:17:40 Text generation with RNN-based language models
    00:19:44 Evaluating language models with Perplexity
    00:20:54 The shortcomings of simple RNNs
    00:22:48 Capturing long-range dependencies with LSTMs
    00:27:20 Multilayer and bidirectional RNNs
    00:29:58 DEMO: Building a Part-of-Speech Tagger with a bidirectional LSTM
    00:42:22 DEMO: Building a language model with a stacked LSTM
    00:58:04 Different RNN setups
    This video is part of Natural Language Processing Demystified, a free, accessible course on NLP.
    Visit www.nlpdemystified.org/ to learn more.
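    As a taste of what the demo covers, here is a minimal sketch of a character-level recurrent language model, assuming TensorFlow/Keras as in the Colab notebook. The corpus, layer sizes, and training settings below are illustrative, not the notebook's actual code.

```python
import numpy as np
import tensorflow as tf

# Toy corpus and character vocabulary (illustrative; the notebook uses
# its own dataset).
text = "the quick brown fox jumps over the lazy dog. " * 50
chars = sorted(set(text))
char2idx = {c: i for i, c in enumerate(chars)}

SEQ_LEN = 40

# Build (input, target) pairs: the target is the input window shifted
# one character ahead, so the model learns next-character prediction.
encoded = np.array([char2idx[c] for c in text])
inputs, targets = [], []
for i in range(len(encoded) - SEQ_LEN):
    inputs.append(encoded[i:i + SEQ_LEN])
    targets.append(encoded[i + 1:i + SEQ_LEN + 1])
X = tf.one_hot(np.array(inputs), depth=len(chars))  # one-hot chars, as in the demo
y = np.array(targets)

# A recurrent language model: the LSTM reads the sequence one step at a
# time, and the Dense layer predicts a distribution over the next char.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(128, return_sequences=True,
                         input_shape=(None, len(chars))),
    tf.keras.layers.Dense(len(chars), activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=3, batch_size=64)

# Generate text by repeatedly sampling the next character and feeding
# it back in: recurrence at inference time.
generated = list("the quick ")
for _ in range(100):
    window = [char2idx[c] for c in generated[-SEQ_LEN:]]
    probs = model.predict(tf.one_hot([window], depth=len(chars)), verbose=0)[0, -1]
    probs = probs.astype("float64")
    probs /= probs.sum()  # renormalize so np.random.choice accepts it
    generated.append(chars[np.random.choice(len(chars), p=probs)])
print("".join(generated))
```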

Comments • 12

  • @HazemAzim
    @HazemAzim 1 year ago +7

    Really super mix of theory, concepts, math, and then coding. Highly underrated.

    • @futuremojo
      @futuremojo 1 year ago +2

      Thanks, Hazem! I was going for a particular mix that explored the subject at multiple levels. It's good to hear it resonated with you.

  • @danialb9894
    @danialb9894 1 year ago +1

    I hope you provide a full and detailed course on neural networks. You're the best.

  • @samuelcortinhas4877
    @samuelcortinhas4877 1 year ago

    Excellent video! You really bring this subject to life.

  • @adityashukla9840
    @adityashukla9840 11 days ago

    I'd really love to see your videos on other topics like CNNs and GANs.

  • @moistnar
    @moistnar 1 year ago

    At 31:20, what is the `+/` operator? I've never seen that before in Python and I can't find much on Google.

    • @futuremojo
      @futuremojo 1 year ago +1

      The slash there is Python's line-continuation character: a backslash (`\`) at the end of a line continues the statement onto the next line:
      www.google.com/search?q=pyton+multiline+with+forward+slash
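      A minimal, self-contained example of that continuation (the variable names are made up for illustration):

```python
loss_term_1, loss_term_2 = 0.25, 0.75

# A trailing backslash lets the statement continue on the next line...
total = loss_term_1 + \
        loss_term_2

# ...which is exactly equivalent to writing it on one line.
assert total == loss_term_1 + loss_term_2
```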

  • @KemalCanKara
    @KemalCanKara 1 year ago +1

    Why didn't you use an embedding layer? What is its purpose here? What would have changed if we had added one?

    • @futuremojo
      @futuremojo 1 year ago

      You'll find the answer in this comment cell:
      colab.research.google.com/github/nitinpunjabi/nlp-demystified/blob/main/notebooks/nlpdemystified_recurrent_neural_networks.ipynb#scrollTo=2DgNpgicAMbr
      "We're not using embeddings for the input. We can, but since this is a character model with just a few dozen possible choices, we can get away with one-hot encoding. There's also no reason to think a particular letter should be closer to another in vector space as we would want in a word-level model."
      I haven't tried it with an embedding layer (give it a shot!). My prediction for this particular example is that it wouldn't make much of a difference, since this is a character-level model.
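      To make the two options concrete, here is a hedged sketch in Keras (layer sizes are illustrative, not the notebook's): option 1 feeds one-hot character vectors straight into the LSTM, as the demo does; option 2 adds a trainable Embedding layer and takes integer character IDs instead.

```python
import tensorflow as tf

VOCAB = 40  # a few dozen possible characters (illustrative size)

# Option 1: one-hot character vectors fed directly into the LSTM.
one_hot_model = tf.keras.Sequential([
    tf.keras.layers.LSTM(128, return_sequences=True,
                         input_shape=(None, VOCAB)),
    tf.keras.layers.Dense(VOCAB, activation="softmax"),
])

# Option 2: integer character IDs passed through a trainable Embedding.
# For a word-level model this usually helps; for a tiny character
# vocabulary, it may make little difference.
embedding_model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=VOCAB, output_dim=32),
    tf.keras.layers.LSTM(128, return_sequences=True),
    tf.keras.layers.Dense(VOCAB, activation="softmax"),
])
```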

    • @KemalCanKara
      @KemalCanKara 1 year ago +1

      @futuremojo Thank you very much for the answer.