Cost Complexity Pruning (Theory + Code)

  • Added Dec 6, 2023
  • Cost complexity pruning (CCP) made easy! In this video we explain pruning, specifically cost complexity pruning, and augment our previous (manual) tree implementation in Python to incorporate CCP (again, using only numpy and built-in functions).
    Trees Playlist: bit.ly/MeerkatStatisticsTrees
    Become a member and get full access to this online course:
    meerkatstatistics.com/courses...
    ** 🎉 Special CZcams 60% Discount on Yearly Plan - valid for the 1st 100 subscribers; Voucher code: First100 🎉 **
    "Decision Trees" Mini Course Outline:
    * Course Materials
    * Introduction to Decision Trees
    * Split Criteria
    * Stop Criteria, Categorical Data, Missing Values and Implementation Details
    * Build a Decision Tree from scratch in Python using numpy
    * Code - Moving to a class implementation, entropy
    * Code - Building the tree using a stack and a queue
    * Code - Regression Trees
    * Cost Complexity Pruning - Theory and Code
    If you’re looking for statistical consultation, work on interesting projects, or a training workshop, visit my website meerkatstatistics.com/ or contact me directly at david@meerkatstatistics.com
    ~~~~~ SUPPORT ~~~~~
    Paypal me: paypal.me/MeerkatStatistics
    ~~~~~~~~~~~~~~~~~

Comments • 2

  • @zenkya-sin1117, 7 months ago

    Hello, first, thank you very much for your video; it was very clear. I have a question about the following derivation: I don't understand why we replace R(T − T_t) − R(T) with R(t) − R(T_t), and why we replace |f(T − T_t)| − |f(T)| with 1 − |f(T_t)|.
    R_α(T − T_t) − R_α(T) = R(T − T_t) − R(T) + α(|f(T − T_t)| − |f(T)|) = R(t) − R(T_t) + α(1 − |f(T_t)|)
    Thanks a lot !

    • @MeerkatStatistics, 7 months ago

      This is not from my video... I normally don't answer these kinds of questions. If you wish to have a private class with me, you can contact me via my website.
      Since this is a small question, I will point out that R_α(T) is defined as R(T) + α|T|, where |T| is the number of leaves of T. Notice that R(T − T_t) − R(T) equals R(t) − R(T_t): R is a sum over leaf nodes, and the full and pruned trees share every leaf outside the subtree rooted at t, so the only difference is that the pruned tree has the single leaf t where the full tree has the leaves of T_t. Likewise, |T − T_t| − |T| is the number of leaves in the pruned tree minus the number of leaves in the full tree, which is indeed 1 − |T_t|.
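
The identity discussed above can be checked numerically on a toy tree. The sketch below is only an illustration, not the code from the video: the `Node` class, the helper names (`leaves`, `tree_cost`, `cost_complexity`, `prune_at`), and all cost values are made up for this example. It assumes each node stores R(node), the cost it would contribute if it were a leaf. The last lines also compute the value of α at which pruning at t neither raises nor lowers the cost-complexity measure, which follows from setting the right-hand side of the identity to zero.

```python
import math

class Node:
    """Toy tree node (hypothetical); `cost` is R(node), its cost as a leaf."""
    def __init__(self, cost, children=()):
        self.cost = cost
        self.children = list(children)

def leaves(node):
    """Return the leaf nodes of the (sub)tree rooted at `node`."""
    if not node.children:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]

def tree_cost(node):
    """R(T): sum of the leaf costs of the tree rooted at `node`."""
    return sum(leaf.cost for leaf in leaves(node))

def cost_complexity(node, alpha):
    """R_alpha(T) = R(T) + alpha * (number of leaves of T)."""
    return tree_cost(node) + alpha * len(leaves(node))

def prune_at(node, t):
    """Copy the tree, collapsing the subtree rooted at `t` into a single leaf."""
    if node is t:
        return Node(node.cost)
    return Node(node.cost, [prune_at(child, t) for child in node.children])

# Made-up costs: t is an internal node whose subtree T_t has three leaves.
t = Node(0.30, [Node(0.10), Node(0.05, [Node(0.02), Node(0.01)])])
root = Node(0.50, [t, Node(0.15)])
pruned = prune_at(root, t)  # plays the role of T - T_t

alpha = 0.05
lhs = cost_complexity(pruned, alpha) - cost_complexity(root, alpha)
rhs = (t.cost - tree_cost(t)) + alpha * (1 - len(leaves(t)))
assert math.isclose(lhs, rhs, abs_tol=1e-12)

# Setting the right-hand side to zero gives the break-even alpha for node t.
g_t = (t.cost - tree_cost(t)) / (len(leaves(t)) - 1)
assert math.isclose(cost_complexity(pruned, g_t),
                    cost_complexity(root, g_t), abs_tol=1e-12)
```

For any α above `g_t` the pruned tree has the lower cost-complexity measure, and for any α below it the full tree does.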