05e Machine Learning: Shapley Value
- added 28. 06. 2024
- I extend the discussion on feature ranking and selection with the Shapley Value (1953). Adapted from game theory, it is a useful tool for feature ranking and for supporting explainable machine learning.
The interactive examples are available @ git.io/Jt1os and a more complete workflow with feature ranking is available @ git.io/fjm4p.
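The Shapley value averages each feature's marginal contribution over all orderings of the features. A minimal sketch of that definition (the `value` function here is a hypothetical placeholder for the model's expected prediction given a known feature subset, not code from the lecture):

```python
from itertools import permutations

# Minimal sketch of exact Shapley feature attribution: average each
# feature's marginal contribution over all feature orderings.
# `value(S)` is a hypothetical placeholder for the model's expected
# prediction when only the features in S are known.
def shapley_values(features, value):
    phi = {f: 0.0 for f in features}
    orders = list(permutations(features))
    for order in orders:
        known = frozenset()
        for f in order:
            phi[f] += value(known | {f}) - value(known)
            known |= {f}
    return {f: phi[f] / len(orders) for f in phi}

# Toy additive game: the Shapley values recover each feature's weight.
weights = {'x1': 2.0, 'x2': 5.0}
print(shapley_values(['x1', 'x2'], lambda S: sum(weights[f] for f in S)))
# {'x1': 2.0, 'x2': 5.0}
```

Exact enumeration is exponential in the number of features, which is why practical tools approximate this average by sampling orderings.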
You are a hero! Never saw a clearer explanation in this detail. Thank you Michael!
Best lecture on the Shapley Value for ML in the whole of YouTube, thank you so much sir. All the OR folks just made it look so complicated. Your explanation is 100!
Thanks a lot for detailing with examples. Very few people have such a gift for imparting knowledge.
It's a clear explanation and very helpful. Much appreciated.
You saved my life after searching so many introductions on Google.
Excellent video, Professor. May I get access to your PPT slides? Thank you so much for making such amazing lectures open source and accessible.
Thank you for recording and sharing this explanation. It was very helpful!
Thank you for making this video. It helps me a lot to understand this!
Thanks for the amazing tutorial. Just want to point out the typo at 16:20 in case someone like me gets confused there.
The 3rd order should be:
- x3: f(x3) - E(Y)
- x4: f(x3, x4) - f(x3)
- x1: f(x3, x4, x1) - f(x3, x4)
Besides, at 20:56 there's a missing factorial symbol (!) after (|F| - |S| - 1); it should be easy to tell from the later slides.
Also, I was a bit confused why there isn't an f(X2 = x2, X3 = x3) - f(X2 = x2) term at 24:25?
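The corrected third-order contributions telescope: summed, they recover f(x3, x4, x1) - E(Y). A quick check with made-up numbers (the values of v(S) below are hypothetical, not from the lecture):

```python
# Made-up values for v(S) = f(features in S), with v({}) = E(Y),
# to check that the corrected 3rd-order contributions telescope.
v = {frozenset(): 10.0,                      # E(Y)
     frozenset({'x3'}): 12.0,                # f(x3)
     frozenset({'x3', 'x4'}): 15.0,          # f(x3, x4)
     frozenset({'x3', 'x4', 'x1'}): 21.0}    # f(x3, x4, x1)

contribs = [
    v[frozenset({'x3'})] - v[frozenset()],                          # x3
    v[frozenset({'x3', 'x4'})] - v[frozenset({'x3'})],              # x4
    v[frozenset({'x3', 'x4', 'x1'})] - v[frozenset({'x3', 'x4'})],  # x1
]
print(contribs)  # [2.0, 3.0, 6.0], summing to f(x3, x4, x1) - E(Y) = 11.0
```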
Brilliant video! Really helped me to grasp the concept. Many thanks! Isn't there a missing factorial in the weighted average expression?
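The weight in question is |S|! (|F| - |S| - 1)! / |F|!, so yes, the second factorial is the one reported missing on the slide. A quick sanity check that these weights sum to 1 for a fixed feature:

```python
from math import comb, factorial, isclose

# Shapley weight of a coalition S (excluding feature i) in feature set F:
# |S|! * (|F| - |S| - 1)! / |F|!  -- note the factorial on the middle
# term, which is the one reported missing on the slide.
def shapley_weight(s_size, f_size):
    return factorial(s_size) * factorial(f_size - s_size - 1) / factorial(f_size)

# Sanity check: for a fixed feature i, the weights over all subsets
# S of F \ {i} must sum to 1.
F = 4
total = sum(comb(F - 1, s) * shapley_weight(s, F) for s in range(F))
print(isclose(total, 1.0))  # True
```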
Very clear. Thanks
Thanks for your great lecture! I'm wondering how we cope with multicollinearity here? i.e., if we determine the weights on the training set, due to multicollinearity a potentially "valuable" feature may have its weight shrunk to 0 in the presence of other "slightly more valuable" features. Out of sample, the marginal contribution of this feature will then be based on the shrunken training weight, and hence not reflect its true potential for predictive performance 🤔
very helpful! thanks
The hero we need.
Super!
Great video class!! I have one doubt: it seems to me that at 19:05 the f(x1, x2, x3, x4 = E[x4]) approximation works only if f(x) is linear. I think a more general approximation of f(x1, x2, x3) is to fix x1, x2, x3 at the local input values and average f(x) over x4.
Yes, you are right. I have seen this simplification many times and was confused about it. It works for linear models but not for all models.
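The distinction can be seen with a toy nonlinear model (hypothetical, not from the lecture): with an interaction and a squared term, f evaluated at E[x4] differs from f averaged over x4, while for a linear f the two coincide.

```python
import random

random.seed(0)

# Hypothetical nonlinear model: the x1*x4 interaction and the x4**2
# term make E[f(x1, x4)] over x4 differ from f(x1, E[x4]).
def f(x1, x4):
    return x1 * x4 + x4 ** 2

x1 = 2.0
x4_samples = [random.gauss(0.0, 1.0) for _ in range(100_000)]
mean_x4 = sum(x4_samples) / len(x4_samples)

plug_in = f(x1, mean_x4)                                   # fix x4 at E[x4]
marginalized = sum(f(x1, x4) for x4 in x4_samples) / len(x4_samples)

# plug_in is near 0, marginalized near 1 (it picks up Var(x4) = 1);
# for a linear f the two estimates would coincide.
print(round(plug_in, 2), round(marginalized, 2))
```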
If applied to a linear regression model, how is the interpretation of the SHAP values different from the coefficients of the standardized variables, or from the partial R^2?
How is the interpretation of the ALE plots different from the SHAP dependence plot?
amazing
Thanks for this! Do you think that the Shapley approach largely solves the ML interpretability challenge?
Also, if ML models are generally superior to linear and logistic regression in prediction accuracy, would it be appropriate to say that ML models coupled with Shapley values are superior to linear/logistic regression coefficients for understanding causal inference?
Why is the case where no one contributes anything included in Player 1's value?
At 17:28, do the sequences refer to combinations or permutations? i.e., is x1, x2, x4 the same as x1, x4, x2? It seems like you are referring to combinations, based on 16:43.
At 9:18, why can't we just split the $120k in the ratio 30:70? If we do that, Player 1 gets $36k. What's wrong with that approach?
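For comparison, here is a sketch of the Shapley split for a two-player game consistent with the numbers in this question (standalone values of $30k and $70k and a grand coalition worth $120k are assumptions, not taken from the video). The Shapley value splits the $20k of synergy equally, giving Player 1 $40k rather than the proportional $36k:

```python
from itertools import permutations

# Hypothetical characteristic function consistent with the numbers in
# the question: standalone values 30 and 70 (in $k), coalition worth 120.
v = {frozenset(): 0, frozenset({1}): 30,
     frozenset({2}): 70, frozenset({1, 2}): 120}

def shapley(players, v):
    phi = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            phi[p] += v[coalition | {p}] - v[coalition]
            coalition |= {p}
    return {p: phi[p] / len(orders) for p in phi}

print(shapley([1, 2], v))  # {1: 40.0, 2: 80.0} vs proportional 36/84
```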