NLP Demystified 11: Essential Training Techniques for Neural Networks

  • Published 2 Aug 2024
  • Course playlist: • Natural Language Proce...
    In our previous deep dive into neural networks, we looked at the core mechanisms behind how they learn. In this video, we'll explore the additional details involved in training them effectively.
    We'll look at how to converge faster to a minimum, when to use certain activation functions, when and how to scale our features, and what deep learning is ultimately about (a rough optimizer sketch follows the notebook link below).
    We'll also apply our knowledge by building a simple deep learning model for text classification (see the model sketch after the timestamps), and this will mark our return to NLP for the rest of the course.
    Colab notebook: colab.research.google.com/git...
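    As a rough illustration of the optimization ideas covered in the first half of the video (mini-batch stochastic gradient descent plus momentum), here is a minimal NumPy sketch. The toy linear-regression data, learning rate, and momentum coefficient are made-up values for the example, not something taken from the video or the notebook.

```python
import numpy as np

# Toy data for a linear model y = 3x + 2 plus noise (made up for illustration).
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(256, 1))
y = 3.0 * X[:, 0] + 2.0 + 0.1 * rng.normal(size=256)

w, b = 0.0, 0.0          # model parameters
vw, vb = 0.0, 0.0        # momentum ("velocity") terms
lr, beta = 0.1, 0.9      # learning rate and momentum coefficient
batch_size, epochs = 32, 50

for epoch in range(epochs):
    # Shuffle and split the data into mini-batches each epoch.
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        rows = idx[start:start + batch_size]
        xb, yb = X[rows, 0], y[rows]

        # Gradients of mean squared error w.r.t. w and b on this mini-batch only.
        err = (w * xb + b) - yb
        gw = 2.0 * np.mean(err * xb)
        gb = 2.0 * np.mean(err)

        # Momentum: keep a decaying sum of past gradients and step in that
        # smoothed direction, which damps oscillation and speeds convergence.
        vw = beta * vw + gw
        vb = beta * vb + gb
        w -= lr * vw
        b -= lr * vb

print(f"learned w={w:.2f}, b={b:.2f} (target: w=3, b=2)")
```

    Adaptive optimizers such as RMSProp and Adam, covered next in the video, replace the fixed learning rate above with per-parameter step sizes based on running gradient statistics.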
    Timestamps
    00:00:00 Neural Networks II
    00:01:09 Mini-batch stochastic gradient descent
    00:03:55 Finding an effective learning rate
    00:06:15 Using a learning schedule
    00:07:35 Complex loss surfaces and local minima
    00:09:12 Adding momentum to gradient descent
    00:12:50 Adaptive optimizers (RMSProp and Adam)
    00:15:08 Local minima are rarely a problem
    00:15:21 Activation functions (sigmoid, tanh, and relu)
    00:19:35 Weight initialization techniques (Xavier/Glorot and He)
    00:21:25 Feature scaling (normalization and standardization)
    00:23:28 Batch normalization for training stability
    00:28:26 Regularization (early stopping, L1, L2, and dropout)
    00:33:11 DEMO: building a basic deep learning model for NLP
    00:56:19 Deep learning is about learning representations
    00:58:18 Sensible defaults when building deep learning models
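    The demo in the Colab notebook uses TensorFlow/Keras. As a hedged sketch of the "sensible defaults" summarized at the end of the video (ReLU hidden layers with He initialization, Adam as the optimizer, dropout for regularization, and early stopping), a tiny text classifier might look roughly like the following; the toy corpus, layer sizes, and vectorizer settings are illustrative assumptions, not the notebook's actual code.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers

# Tiny made-up corpus; the real demo uses a proper labelled dataset.
texts = ["great movie", "terrible plot", "loved it", "waste of time"]
labels = np.array([1, 0, 1, 0])

# Bag-of-words (multi-hot) vectorization as a simple baseline representation.
vectorizer = layers.TextVectorization(max_tokens=1000, output_mode="multi_hot")
vectorizer.adapt(texts)
X = vectorizer(np.array(texts)).numpy()

# Sensible defaults: ReLU activations with He initialization, dropout between
# layers, and a sigmoid output for binary classification.
model = tf.keras.Sequential([
    layers.Dense(64, activation="relu", kernel_initializer="he_normal"),
    layers.Dropout(0.5),
    layers.Dense(64, activation="relu", kernel_initializer="he_normal"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Early stopping halts training once validation loss stops improving and
# restores the best weights seen so far.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True)

model.fit(X, labels, validation_split=0.25, epochs=20,
          batch_size=2, callbacks=[early_stop], verbose=0)
```

    In practice you would swap the toy corpus for a real labelled dataset and tune the layer widths, dropout rate, and vocabulary size.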
    This video is part of Natural Language Processing Demystified, a free, accessible course on NLP.
    Visit www.nlpdemystified.org/ to learn more.

Comments • 10

  • @user-ie5gj9bh4p
    @user-ie5gj9bh4p 1 year ago +2

    Your course is gold!

  • @richardharris9708
    @richardharris9708 2 years ago +5

    Thanks so much for your work on this series. Just a note: you need at least 10 GB of free RAM for the notebook to complete without crashing (at least on my machine). It's a good idea to close any unnecessary programs before running it.

    • @futuremojo
      @futuremojo 2 years ago +1

      Thank you for sharing this tip. Yep, I see a spike to about 4.5GB of RAM on the free Colab tier the first time the bag-of-words is converted to a sparse tensor before being garbage-collected. This cell here: bit.ly/3cMh7of.

  • @kevinoudelet
    @kevinoudelet 5 months ago +1

    Thank you !!!

  • @renjua
    @renjua 1 year ago +1

    Many thanks for this excellent video.

  • @caiyu538
    @caiyu538 1 year ago +1

    Great lectures.

  • @amparoconsuelo9451
    @amparoconsuelo9451 9 months ago +1

    It was ONLY your video that made me realize that it will take me years to study AI programming. And by the time those years have passed, I will again have to learn new variants and models and familiarize myself with new Python libraries and modules. Will I ever catch up? Thanks for "revealing the secrets". How about an interactive tutorial where a student inputs different variables and watches how Python responds? 56:01

    • @gokharol
      @gokharol 9 months ago

      Why 10 years? That seems unnecessary.

  • @chenola
    @chenola 7 months ago

    You kind of sound like Casually Explained