SHAP - What Is Your Model Telling You? Interpret CatBoost Regression and Classification Outputs

  • Added on June 1, 2024
  • Let's understand our models with SHAP ("SHapley Additive exPlanations") using Python and CatBoost. We go over two hands-on examples, one regression and one classification, and analyze the SHAP summary plots. It's that powerful! And fun!
    For source code:
    www.viralml.com/video-content...
    Signup for my newsletter and more: www.viralml.com
    Connect on Twitter: / amunategui
    My books on Amazon:
    Python Web Work - Prototyping Guide for Makers: Use HTML5 Templates, Serve Dynamic Content, Build Machine Learning Web Apps, Grow Audiences, Conquer the World
    amzn.to/2veZtnB
    MVP Light Stack Field Guide: Take Your Python MVP to the Web as Quickly and Cheaply as Possible:
    amzn.to/2Q4Hiay
    The Little Book of Fundamental Indicators: Hands-On Market Analysis with Python: Find Your Market Bearings with Python, Jupyter Notebooks, and Freely Available Data:
    amzn.to/2DERG3d
    Monetizing Machine Learning: Quickly Turn Python ML Ideas into Web Applications on the Serverless Cloud:
    amzn.to/2PV3GCV
    Grow Your Web Brand, Visibility & Traffic Organically: 5 Years of amunategui.github.Io and the Lessons I Learned from Growing My Online Community from the Ground Up:
    amzn.to/2JDEU91
    Fringe Tactics - Finding Motivation in Unusual Places: Alternative Ways of Coaxing Motivation Using Raw Inspiration, Fear, and In-Your-Face Logic
    amzn.to/2DYWQas
    Create Income Streams with Online Classes: Design Classes That Generate Long-Term Revenue:
    amzn.to/2VToEHK
    Defense Against The Dark Digital Attacks: How to Protect Your Identity and Workflow in 2019:
    amzn.to/2Jw1AYS
  • Science and technology

Comments • 44

  • @quasenerd5476 3 years ago +20

    8:24 SHAP interpretation begins
    Thank you for the video!

  • @pavloseimskyi2413 1 year ago +6

    Great video, thanks! Just one little note: At 5:08 when you impute the age, this is actually data leakage. To avoid it, you should only use the average age of the training set in both training and validation sets. Cheers!

    • @asdasvcxvwe5114 1 year ago +1

      Thanks for pointing that out. But I have a question: why not use the training average in the training set and the validation average in the validation set?
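
The imputation point raised in this thread can be sketched as follows. This is a minimal illustration on made-up data, not the code from the video: the toy `Age` column and the 5/3 split are assumptions. The fill value is learned from the training rows only and then reused on the validation rows.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the video's dataset; "Age" has gaps.
df = pd.DataFrame({"Age": [22.0, np.nan, 35.0, 58.0, np.nan, 41.0, 19.0, np.nan]})

# Split first, so no validation row influences the imputation statistic.
train = df.iloc[:5].copy()
valid = df.iloc[5:].copy()

# Learn the fill value from the training rows only.
train_mean_age = train["Age"].mean()

# Apply the same training-derived value to both splits.
train["Age"] = train["Age"].fillna(train_mean_age)
valid["Age"] = valid["Age"].fillna(train_mean_age)
```

On the follow-up question: one common argument against a separate validation average is that imputation is then no longer a fixed part of the fitted pipeline. At prediction time a single new row has no average of its own, so the fill value would have to come from the training data anyway.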

  • @himanshubhusanrath212 2 years ago +2

    Thank you so much for such a clear explanation of SHAP values and their interpretation.

  • @quasenerd5476 3 years ago +16

    16:50 I think there may be a mistake here. The x-axis does not give the size of the effect in the model's output unit. There is a nonlinear relationship between the SHAP values for an example's features and the prediction the model makes for that example.
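
A small numeric sketch of the nonlinearity described above, assuming the classifier's SHAP values are expressed in raw log-odds (which is typically the case for tree-based classifiers explained on their raw output). The base value and SHAP values below are invented for illustration.

```python
import numpy as np

def sigmoid(z):
    """Map a log-odds score to a probability."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical numbers: a base value and per-feature SHAP values in log-odds.
base_value = -0.5
shap_values = np.array([1.2, -0.3, 0.4])

# SHAP values are additive on the raw (log-odds) scale...
log_odds = base_value + shap_values.sum()
# ...and only the final sum is passed through the sigmoid.
probability = sigmoid(log_odds)

# The same +1.0 contribution moves the probability by different amounts
# depending on where the rest of the score already sits.
shift_near_zero = sigmoid(0.0 + 1.0) - sigmoid(0.0)
shift_far_out = sigmoid(3.0 + 1.0) - sigmoid(3.0)
print(probability, shift_near_zero, shift_far_out)
```

Under this assumption, a point on the summary plot's x-axis is a shift in log-odds, so it cannot be read directly as a change in predicted probability.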

  • @user-pb5su9zb9g 4 years ago +3

    Thank you man, really interesting material. I will look and read more about it! You are a great teacher.

  • @xbsd 3 years ago +1

    Clear explanations, excellent work!

  • @stephenhobbs948 4 years ago +1

    Thank you! Your videos are very interesting. I look forward to more.

  • @opalkabert 4 years ago +2

    @amunatequi the plot you don't like is in fact the local plot, whose purpose is to explain why an individual got a particular prediction. What you explained is the global explanation for the entire model. So in the case of credit decision making, the local explanation may be important. This is Albert.

    • @viralml 4 years ago +2

      Hey Albert - always good to hear from you - thanks for clarifying this - will be helpful to me and many others!

    • @viralml 4 years ago +1

      And thanks for watching!!

    • @opalkabert 4 years ago +2

      As always, it was a great post. I am sure there will be a follow-up on deep learning and SHAP, such as using it with tf.keras.
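
The local versus global distinction raised at the top of this thread can be illustrated with a few lines of NumPy. The SHAP matrix and feature names below are hypothetical, chosen so that a feature with modest global weight dominates one individual's explanation, as might happen in a credit decision.

```python
import numpy as np

# Hypothetical feature names and SHAP matrix (rows = applicants, cols = features).
feature_names = ["income", "age", "recent_default"]
shap_matrix = np.array([
    [0.5, -0.1, 0.0],
    [-0.6, 0.1, 0.0],
    [0.4, 0.2, 0.0],
    [-0.1, 0.1, 0.9],
])

# Global view: mean absolute SHAP value per feature across all rows,
# which is the ordering a summary plot is usually sorted by.
global_importance = np.abs(shap_matrix).mean(axis=0)
top_global = feature_names[int(np.argmax(global_importance))]

# Local view: the contributions for one individual (the last applicant),
# which is what a single-sample force plot visualizes.
local_contributions = shap_matrix[3]
top_local = feature_names[int(np.argmax(np.abs(local_contributions)))]

print(top_global, top_local)
```

Here the global ranking is led by `income`, while the last applicant's prediction is driven mostly by `recent_default`, which is why both views can be worth reporting.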

  • @seanbarrett1681 2 years ago

    Love you man, great video, exactly what I was looking for

  • @jhonnyespinozabryson8241 3 years ago +1

    Thanks for sharing Manuel

  • @rajvernekar8605 1 year ago +1

    Thank you for this video. Very helpful!

  • @bheeshmak.s5125 1 month ago

    Great explanation..

  • @nancyzhang6790 1 year ago +1

    Great talk. Thanks. BTW, I don't know how Shapley was connected with Washington Univ. He got his A.B. from Harvard and Ph.D. from Princeton.

  • @vincentdonghoonlee4497 4 years ago +1

    Thank you Manuel for sharing your insight!

    • @viralml 4 years ago +1

      Thanks DHLee! (and thanks for pointing out the wrong source code)

  • @prathameshdinkar2966 3 years ago +1

    Thanks! Very nice video.

  • @alexisparenty9445 1 year ago

    Manuel, you're the best!

  • @smeagolita1 3 years ago +1

    Great video!

  • @danielinflam3s 3 years ago +1

    loved it!

  • @juanete69 1 year ago

    Hello.
    If we apply SHAP to a linear regression model... are those Phi_i equivalent to the coefficients of the regression model? Do they also take into account the variance as the p-values do?
    How is the SHAP value for a variable different from the partial R^2?
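
As a partial answer to the first question, here is a minimal sketch for a linear model under the assumption that features are treated as independent. In that setting the SHAP value of feature i for one row is phi_i = beta_i * (x_i - mean(x_i)), so it scales with the coefficient but is not equal to it. The data and coefficients below are synthetic.

```python
import numpy as np

# Synthetic data and a hypothetical fitted linear model.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
beta = np.array([2.0, -1.0, 0.5])
intercept = 4.0

def predict(features):
    """Linear model prediction: intercept + features @ beta."""
    return intercept + features @ beta

# Base value: the average prediction over the background data.
base_value = predict(X).mean()

# Per-row, per-feature SHAP values under the independence assumption.
phi = beta * (X - X.mean(axis=0))

# Additivity: base value plus a row's SHAP values recovers its prediction.
reconstructed = base_value + phi.sum(axis=1)
print(np.allclose(reconstructed, predict(X)))
```

The values in `phi` are per-row contributions in the units of the model output. They describe how the fitted model uses each feature and carry no sampling uncertainty, so they do not play the role of p-values.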

  • @fliederblumen1843 3 years ago

    Thanks for the video. Can SHAP be used for LSTM model interpretation? There seems to be a problem due to the 3D tensor format of the LSTM output.

  • @albertoaltamirano5462 3 years ago +1

    Thank you very much, Manuel. This topic is very interesting; it had been recommended to me, and I am still learning. Regards

  • 4 years ago +2

    Kudos Manuel! What about using SHAP or LIME for error analysis?

  • @michaeljohn8835 3 years ago

    This video was really helpful! I was wondering, how would you interpret the SHAP graph when you have variables that don’t have a low/high value? Would you have to encode your variables a certain way in order to do this?

  • @jonimatix 3 years ago

    Great video, your material deserves more coverage!
    Is there a way to download the ipynb notebook?

  • @viralml 4 years ago +1

    Sorry - had the wrong link to code - fixed now - thanks!

  • @pratikpratik8495 3 years ago

    I can apply the shap library and interpret the chart, but what is the final report that comes out of it? What would management or users expect from it? I can't show this chart to a non-technical person. Is there a report that can be generated to draw conclusions?

  • @jardelvieira8742 1 year ago

    I have a problem when I try to use force_plot for multiple samples: "NotImplementedError: matplotlib = True is not yet supported for force plots with multiple samples!". Can you help me?

  • @hifredyo1773 1 year ago

    Why is the base value / expected value for the classification problem negative when the problem is a binary classification? I thought the expected value was the mean, so how could it be negative if the only possible values for the target variable are 0 or 1?
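
One likely explanation, assuming the explainer works on the classifier's raw output: the base value is then an average of log-odds scores, not of 0/1 labels or probabilities. Log-odds are negative whenever the probability is below 0.5. The predicted probabilities below are invented for illustration.

```python
import numpy as np

# Hypothetical predicted probabilities for five rows of background data.
predicted_probabilities = np.array([0.10, 0.25, 0.30, 0.60, 0.15])

# Raw scores on the log-odds scale: log(p / (1 - p)).
raw_scores = np.log(predicted_probabilities / (1.0 - predicted_probabilities))

# Averaging the raw scores gives a base value on the log-odds scale.
base_value = raw_scores.mean()

# Mapping it back through the sigmoid returns a value between 0 and 1.
base_probability = 1.0 / (1.0 + np.exp(-base_value))
print(base_value, base_probability)
```

With most of these probabilities below 0.5, the averaged log-odds come out negative even though every probability lies between 0 and 1.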

  • @MadhurDevkota 2 years ago +1

    Thanks for the great SHAP workout. Is it only me, or does he look and sound like the Bill Burr of data science? lol

    • @viralml 2 years ago +1

      Haha thanks, I'll take that as a compliment as I am a big fan!

  • @KN-tx7sd 2 years ago

    Could the same be done in R? Thanks

  • @shaythuramelangkovan5800

    Hi sir, why is it grey at 10:24?

  • @user-he7jw9uc4d 3 years ago

    Hey Manuel, thank you for a great instructional video. I followed your code, but at the end there was a problem. How do I solve this? The line model_regressor = CatBoostRegressor(**params) raises NameError: name 'params' is not defined. Thanks

  • @btcthousand5188 2 years ago

    It seems your code for handling categorical variables is wrong. You can tell from your SHAP plot that neither CHAS nor RAD is considered :) It has something to do with the "# convert to values to string" piece of code.

  • @hoaxuan7074 3 years ago

    Ankit Patel breaking bad YT video. And then tell me ReLU is not a switch. f(x)=x is connect, f(x)=0 is disconnect. A light switch in your house is binary on off, yet connects and disconnects a continuously variable AC voltage signal.
    AI462 neural networks.