Lecture 56 - Finding the Latent Factors | Stanford University

  • Published 25 Jul 2024
  • 🔔 Stay Connected! Get the latest insights on Artificial Intelligence (AI) 🧠, Natural Language Processing (NLP) 📝, and Large Language Models (LLMs) 🤖. Follow @mtnayeem on Twitter 🐦 for real-time updates, news, and discussions in the field.
    Check out the following interesting papers. Happy learning!
    Paper Title: "On the Role of Reviewer Expertise in Temporal Review Helpfulness Prediction"
    Paper: aclanthology.org/2023.finding...
    Dataset: huggingface.co/datasets/tafse...
    Paper Title: "Abstractive Unsupervised Multi-Document Summarization using Paraphrastic Sentence Fusion"
    Paper: aclanthology.org/C18-1102/
    Paper Title: "Extract with Order for Coherent Multi-Document Summarization"
    Paper: aclanthology.org/W17-2407.pdf
    Paper Title: "Paraphrastic Fusion for Abstractive Multi-Sentence Compression Generation"
    Paper: dl.acm.org/doi/abs/10.1145/31...
    Paper Title: "Neural Diverse Abstractive Sentence Compression Generation"
    Paper: link.springer.com/chapter/10....
  • Science & Technology

Comments • 11

  • @samirelzein1095
    @samirelzein1095 1 year ago

    Clear, thanks!

  • @shaoboliu
    @shaoboliu 3 years ago

    Thank you! Very clearly explained!

  • @btsjiminface
    @btsjiminface 4 years ago +1

    Amazing lecturer

  • @MrNAGATO139
    @MrNAGATO139 5 years ago +1

    well explained, thank you

  • @vslaykovsky
    @vslaykovsky 3 years ago +1

    11:45 There might be a small mistake in the explanation. Normally the gradient is computed for all parameters of the model before any of the weights are updated: first calculate delta(P) and delta(Q), then update P and Q (see the sketch below this thread).

    • @santosluiza314
      @santosluiza314 8 months ago

      You're describing more of an analytical approach: in math, we usually take the gradient with respect to all parameters and solve the resulting system for the values where the gradient is 0. He is giving the algorithmic way of doing that, which is more of a computer science method, called alternating minimization: you fix one factor to solve for the other, then fix the other to solve for the first. Here are some slides from MIT that might help: www.mit.edu/~rakhlin/6.883/lectures/lecture07.pdf
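
A minimal NumPy sketch of the two update schemes discussed in this thread, assuming a small dense set-up; the names (P, Q, ratings, lam, lr) are illustrative and not from the lecture. The first function computes the gradients with respect to both factor matrices from the current values before updating either one (the parent comment's point); the second fixes one factor matrix and solves for the other in closed form, i.e. alternating least squares, the alternating minimization described in the reply.

```python
# Sketch of two update schemes for the regularized objective
#   sum_(u,i) (r_ui - p_u . q_i)^2 + lam * (||P||^2 + ||Q||^2)
# Names and the dense NumPy set-up are illustrative, not from the lecture.
import numpy as np

def full_gradient_step(P, Q, ratings, lam=0.1, lr=0.01):
    """Compute dP and dQ from the *current* P and Q, then update both."""
    dP = 2 * lam * P                       # gradient of the regularization term
    dQ = 2 * lam * Q
    for u, i, r in ratings:                # ratings: iterable of (user, item, rating)
        err = r - P[u] @ Q[i]
        dP[u] -= 2 * err * Q[i]
        dQ[i] -= 2 * err * P[u]
    # both factor matrices are updated using gradients taken at the old values
    return P - lr * dP, Q - lr * dQ

def als_round(P, Q, ratings, lam=0.1):
    """Alternating minimization: fix Q and solve for each user vector in closed
    form, then fix P and solve for each item vector. Updates P and Q in place."""
    k = P.shape[1]
    by_user, by_item = {}, {}
    for u, i, r in ratings:
        by_user.setdefault(u, []).append((i, r))
        by_item.setdefault(i, []).append((u, r))
    for u, rated in by_user.items():       # P-step: a ridge regression per user
        A = lam * np.eye(k) + sum(np.outer(Q[i], Q[i]) for i, _ in rated)
        b = sum(r * Q[i] for i, r in rated)
        P[u] = np.linalg.solve(A, b)
    for i, rated in by_item.items():       # Q-step: a ridge regression per item
        A = lam * np.eye(k) + sum(np.outer(P[u], P[u]) for u, _ in rated)
        b = sum(r * P[u] for u, r in rated)
        Q[i] = np.linalg.solve(A, b)
    return P, Q
```

Repeated calls to either function on the same (P, Q, ratings) should reduce the regularized squared error on the observed ratings.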

  • @Niels1234321
    @Niels1234321 7 years ago

    What makes the gradient descent used here a stochastic gradient descent?

  • @15jorada
    @15jorada 1 year ago +1

    Neat! So if I understand correctly, lambda should be a function of the data that you have on the user. If you have a user with a lot of ratings, the latent factor model would be able to more confidently understand what content should be suggested to the user, and the user will have a low lambda because, in the content space, the model won't need to correct as much for lack of data. On the other hand, if there is no data on the user, the lambda will essentially give the user more generic content.
    Am I understanding this right?

    • @raj-nq8ke
      @raj-nq8ke 1 year ago +1

      No, lambda is a regularization hyperparameter; it is fixed for the whole model and does not change per user.
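
To make the reply concrete, here is a small illustrative sketch; the helper names (`squared_error`, `pick_lambda`, `fit`) and the candidate grid are assumptions, not from the lecture. Lambda enters the objective as a single, global regularization weight, and its value is typically picked once by comparing held-out error across a few candidates rather than adjusted per user.

```python
# Illustrative only: lambda is one global regularization weight in
#   sum_(u,i) (r_ui - p_u . q_i)^2 + lam * (||P||^2 + ||Q||^2),
# usually chosen once by validation, never per user.
import numpy as np

def squared_error(P, Q, ratings, lam=0.0):
    """Sum of squared rating errors, plus the regularization term when lam > 0."""
    err = sum((r - P[u] @ Q[i]) ** 2 for u, i, r in ratings)
    return err + lam * (np.sum(P ** 2) + np.sum(Q ** 2))

def pick_lambda(fit, train, valid, candidates=(0.01, 0.1, 1.0, 10.0)):
    """fit(train, lam) is any training routine returning (P, Q); keep the
    candidate lambda with the lowest unregularized error on the held-out set."""
    return min(candidates, key=lambda lam: squared_error(*fit(train, lam), valid))
```

Whichever candidate wins is then held fixed for every user and item during training, which is what "hyperparameter" means here.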

  • @noneofyoureffingbizness5806

    THANK YOU