Building an Implicit Recommendation Engine with Spark with Sophie Watson (Red Hat)
- Published 9 Jul 2024
- Many of today’s most engaging - and commercially important - applications provide personalised experiences to users. Collaborative filtering algorithms capture the commonality between users and enable applications to make personalised recommendations quickly and efficiently. The Alternating Least Squares (ALS) algorithm is still deemed the industry standard in collaborative filtering. In this talk Sophie will show you how to implement ALS using Apache Spark to build your own recommendation engine.
Sophie will show that, by splitting the recommendation engine into multiple cooperating services, it is possible to reduce the system's complexity and produce a robust collaborative filtering platform with support for continuous model training. In this presentation you will learn how to build a recommendation system both for the case where recorded data is given explicitly as a rating, and for the case where the feedback is only implicit (for example, play counts or clicks). You will walk away from this talk with the knowledge and tools needed to implement your own recommendation system using collaborative filtering and microservices.
About: Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.
Read more here: databricks.com/product/unifie...
Connect with us:
Website: databricks.com
Facebook: databricksinc
Twitter: databricks
LinkedIn: databricks
Instagram: databricksinc

Databricks is proud to announce that Gartner has named us a Leader in both the 2021 Magic Quadrant for Cloud Database Management Systems and the 2021 Magic Quadrant for Data Science and Machine Learning Platforms. Download the reports here: databricks.com/databricks-nam...
What a surprise to notice the speaker was someone I knew! Was great, thanks Sophie. 😊
can we have more than one feature apart from number of plays (for instance)?
It's odd that 'trainImplicit' is not mentioned in the Spark docs!
Great Talk!
Very helpful video. I have one question: when implicitPrefs is True, does ALS assign a confidence of 1 to unrated items, or should you include them as zero values in the dataset? Thanks!
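In the Hu, Koren & Volinsky formulation, unobserved entries have preference r = 0 and therefore confidence c = 1 + αr = 1, and the solver accounts for them through the precomputed YᵀY term rather than materialised zeros, so you do not add zero rows yourself. The NumPy sketch below illustrates one user-factor update under that formulation (it is a hand-written illustration of the paper's equations, not Spark's actual implementation):

```python
import numpy as np

def update_user_factor(Y, item_idx, r_obs, alpha=40.0, lam=0.1):
    """One ALS user-factor update for implicit feedback:
        x_u = (Y^T C_u Y + lam*I)^{-1} Y^T C_u p_u
    Only items the user interacted with are passed in; all other
    items enter with preference 0 and confidence 1 via the shared
    Y^T Y term, so no explicit zero entries are needed.
    """
    f = Y.shape[1]
    YtY = Y.T @ Y                        # covers ALL items at confidence 1
    Yu = Y[item_idx]                     # factors of observed items only
    Cu_minus_I = np.diag(alpha * r_obs)  # extra confidence for observed items
    A = YtY + Yu.T @ Cu_minus_I @ Yu + lam * np.eye(f)
    b = Yu.T @ (1.0 + alpha * r_obs)     # p_ui = 1 for observed items
    return np.linalg.solve(A, b)
```

The cost per user therefore scales with the number of observed items, not with the full item catalogue, which is what makes the method practical on sparse implicit data.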
In the video the speaker uses MSE to evaluate the quality of the algorithm and to tune some parameters. But in the case of implicit feedback, as the authors of the cited article (Collaborative Filtering for Implicit Feedback Datasets) state, isn't it better to use the Mean Percentile Rank?
Yes MSE makes no sense here!!
When r is implicit, what values can it take? Is the range 0..1 good? (Going to read the original paper, chapter 'Preliminaries'.)
I don't think it should matter - in the end it is a matrix decomposition method - I guess you can look into the convergence or the loss function. I think in the example the values are not limited to [0, 1].
Epic Brit accent
Your matrix is wrong: Y should be P * U^(T) in your configuration (if rows in Y are products and columns are users).
Great talk but the manner of talking is barely comprehensible for non-native English speakers.
Brit haters gonna hate...