Building an Implicit Recommendation Engine with Spark with Sophie Watson (Red Hat)

  • Published 9 Jul 2024
  • Many of today’s most engaging - and commercially important - applications provide personalised experiences to users. Collaborative filtering algorithms capture the commonality between users and enable applications to make personalised recommendations quickly and efficiently. The Alternating Least Squares (ALS) algorithm is still deemed the industry standard in collaborative filtering. In this talk Sophie will show you how to implement ALS using Apache Spark to build your own recommendation engine.
    Sophie will show that, by splitting the recommendation engine into multiple cooperating services, it is possible to reduce the system’s complexity and produce a robust collaborative filtering platform with support for continuous model training. In this presentation you will learn how to build a recommendation system for the case where recorded data is explicitly given as a rating, as well as for the case where the data is less succinct. You will walk away from this talk with the knowledge and tools needed to implement your own recommendation system using collaborative filtering and microservices.
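
The implicit-feedback ALS the abstract refers to (from Hu, Koren & Volinsky, "Collaborative Filtering for Implicit Feedback Datasets") can be sketched outside Spark in a few lines of NumPy. This is a minimal illustration of the algorithm, not the Spark implementation, and the hyperparameter values (rank, alpha, reg) are arbitrary choices:

```python
import numpy as np

def implicit_als(R, rank=2, alpha=40.0, reg=0.1, iters=10, seed=0):
    """Implicit-feedback ALS sketch (Hu, Koren & Volinsky 2008).

    R: (n_users, n_items) matrix of raw counts (e.g. play counts).
    Preference p = 1 where r > 0; confidence c = 1 + alpha * r.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = (R > 0).astype(float)   # binary preferences
    C = 1.0 + alpha * R         # confidence weights (1 for unobserved)
    X = rng.normal(scale=0.1, size=(n_users, rank))  # user factors
    Y = rng.normal(scale=0.1, size=(n_items, rank))  # item factors
    I = reg * np.eye(rank)
    for _ in range(iters):
        # fix item factors, solve (Y^T C_u Y + reg I) x_u = Y^T C_u p_u per user
        for u in range(n_users):
            Cu = np.diag(C[u])
            X[u] = np.linalg.solve(Y.T @ Cu @ Y + I, Y.T @ Cu @ P[u])
        # then fix user factors and solve the symmetric problem per item
        for i in range(n_items):
            Ci = np.diag(C[:, i])
            Y[i] = np.linalg.solve(X.T @ Ci @ X + I, X.T @ Ci @ P[:, i])
    return X, Y

# toy play-count matrix: 3 users x 4 items
R = np.array([[5., 0., 3., 0.],
              [0., 2., 0., 4.],
              [4., 0., 5., 0.]])
X, Y = implicit_als(R)
scores = X @ Y.T  # predicted preference scores per (user, item)
```

In Spark itself the same model is fit with `pyspark.ml.recommendation.ALS(implicitPrefs=True, alpha=...)`; the per-user/per-item least-squares solves above are what the "alternating" in ALS refers to.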
    About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
  • Science & Technology

Comments • 13

  • @medwards1086
    @medwards1086 1 year ago

    What a surprise to notice the speaker was someone I knew! Was great, thanks Sophie. 😊

  • @jamespaz4333
    @jamespaz4333 2 months ago

    Can we use more than one feature, apart from number of plays (for instance)?

  • @EvanZamir
    @EvanZamir 4 years ago +3

    It's odd that 'trainImplicit' is not mentioned in the Spark docs!

  • @Atlas-ck9vm
    @Atlas-ck9vm 4 years ago

    Great Talk!

  • @chzigkol
    @chzigkol 5 years ago +1

    Very helpful video. I have one question: when implicitPrefs is True, does ALS assign a confidence of 1 to unrated items, or should you include them as zero values in the dataset? Thanks!
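
For context on this question: in the Hu-Koren formulation that Spark's `implicitPrefs` mode is based on, an unobserved item's confidence falls out as 1 automatically (c = 1 + alpha·r with r = 0), so zero-count rows do not need to be materialised in the dataset. A minimal sketch of the transform (`ALPHA` is an arbitrary illustrative value):

```python
ALPHA = 40.0  # confidence scaling factor, a tunable hyperparameter

def confidence(r):
    """Confidence weight from a raw implicit count r (c = 1 + alpha * r)."""
    return 1.0 + ALPHA * r

def preference(r):
    """Binary preference: did the user interact with the item at all?"""
    return 1.0 if r > 0 else 0.0

# An unrated item (r = 0) gets preference 0 with minimal confidence 1,
# so it still contributes (weakly) to the loss without being in the data.
assert confidence(0) == 1.0 and preference(0) == 0.0
```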

  • @np10k
    @np10k 3 years ago +2

    In the video the speaker uses MSE to evaluate the quality of the algorithm and to perform some parameter tuning. But in the case of implicit feedback - as the authors of the cited paper (Collaborative Filtering for Implicit Feedback Datasets) state - it is better to use the Mean Percentile Rank, isn't it?

    • @JC-gp6bd
      @JC-gp6bd 2 years ago +1

      Yes MSE makes no sense here!!
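
For readers curious about the metric discussed in this thread: Mean Percentile Rank (MPR) scores each held-out (user, item) pair by the item's percentile position in that user's ranked recommendation list (0.0 = top, 1.0 = bottom), so lower is better. A small sketch, with illustrative function names and data; it assumes each user has at least two candidate items:

```python
def mean_percentile_rank(user_scores, held_out):
    """MPR as in Hu, Koren & Volinsky: average percentile rank of each
    held-out item within its user's score-sorted recommendation list."""
    total = 0.0
    for user, item in held_out:
        scores = user_scores[user]  # dict: item -> predicted score
        ranked = sorted(scores, key=scores.get, reverse=True)
        # percentile position in [0, 1]; 0 means ranked first
        total += ranked.index(item) / (len(ranked) - 1)
    return total / len(held_out)

scores = {"u1": {"a": 0.9, "b": 0.5, "c": 0.1}}
# held-out item "a" is ranked first for u1, so its percentile rank is 0.0
assert mean_percentile_rank(scores, [("u1", "a")]) == 0.0
```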

  • @tandavme
    @tandavme 5 years ago +1

    When r is implicit, which values can it take? Is the range 0..1 good? (Going to read the original paper, chapter `Preliminaries`.)

    • @Xnaarkhoo
      @Xnaarkhoo 4 years ago

      I don't think it should matter - in the end it is a matrix decomposition method - I guess you can look into the convergence or the loss function. I think in the example the values are not limited to [0, 1].

  • @GeorgeSut
    @GeorgeSut 1 day ago

    Epic Brit accent

  • @GK-oj3cn
    @GK-oj3cn 3 years ago

    Your matrix is wrong: Y should be P * U^(t) in your configuration (if rows in Y are products and columns are users).

  • @M1ntAll
    @M1ntAll 1 year ago

    Great talk, but the manner of speaking is barely comprehensible for non-native English speakers.