How to train XGBoost models in Python

  • Published on 15 Jan 2023
  • Welcome to the "How to train XGBoost models in Python" tutorial. You'll build an XGBoost classifier model on an example dataset, step by step.
    By following this tutorial, you’ll learn:
    ✅ What XGBoost is (vs. the gradient tree boosting algorithm)
    ✅ How to build an XGBoost model (classifier) in Python, step by step:
    - Step #1: Explore and prep data
    - Step #2: Build a training pipeline
    - Step #3: Set up hyperparameter tuning (cross-validation)
    - Step #4: Train the XGBoost model
    - Step #5: Evaluate the model and make predictions
    - Step #6: Measure feature importance (optional)
    If you want to use Python to create XGBoost models to make predictions, this practical tutorial will get you started.
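The gradient tree boosting idea behind the first point above can be sketched in plain Python. This is a toy illustration only, not XGBoost itself and not the code from the video: it fits one-split trees (stumps) to the residuals of a squared-error loss, whereas XGBoost adds regularization and second-order gradient information on top of the same additive scheme. All function names and the data are made up for illustration.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Return (threshold, left_value, right_value) minimising squared error."""
    best = None
    # Skip the largest x so the right-hand leaf is never empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_value, right_value = mean(left), mean(right)
        error = sum((r - left_value) ** 2 for r in left)
        error += sum((r - right_value) ** 2 for r in right)
        if best is None or error < best[0]:
            best = (error, threshold, left_value, right_value)
    return best[1:]


def predict_stump(stump, x):
    """Evaluate a single stump at x."""
    threshold, left_value, right_value = stump
    return left_value if x <= threshold else right_value


def fit_boosting(xs, ys, n_rounds=20, learning_rate=0.3):
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = mean(ys)
    preds = [base] * len(xs)
    stumps = []
    for _ in range(n_rounds):
        # For squared error, the residual is the negative gradient of the loss.
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict(model, x):
    """Sum the base value and the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + sum(learning_rate * predict_stump(s, x) for s in stumps)


def mse(ys, preds):
    return mean((y - p) ** 2 for y, p in zip(ys, preds))


# Made-up one-feature data with three rough levels.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 5.2, 5.1, 4.9]

model = fit_boosting(xs, ys)
baseline_mse = mse(ys, [mean(ys)] * len(ys))
boosted_mse = mse(ys, [predict(model, x) for x in xs])
print(f"baseline MSE: {baseline_mse:.3f}, boosted MSE: {boosted_mse:.3f}")
```

Each round only corrects what the previous rounds got wrong, and the learning rate shrinks every correction; the xgboost package exposes the same two knobs as the number of trees and `learning_rate`.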
    GitHub Repo with code: github.com/liannewriting/YouT...
    Technologies that will be used:
    ☑️ JupyterLab (Notebook)
    ☑️ pandas
    ☑️ scikit-learn (sklearn)
    ☑️ category_encoders
    ☑️ xgboost Python package
    ☑️ scikit-optimize (skopt)
    Links mentioned in the video
    ► Bank marketing dataset: archive.ics.uci.edu/ml/datase...
    ► What is gradient boosting in machine learning tutorial: fundamentals explained: www.justintodata.com/gradient...
    ► To learn Python basics, take our course Python for Data Analysis with projects: www.justintodata.com/courses/...
    ► sklearn pipeline: scikit-learn.org/stable/modul...
    ► Target Encoder: contrib.scikit-learn.org/cate...
    ► XGBClassifier documentation with hyperparameters definition: xgboost.readthedocs.io/en/sta...
    There's also an article version of the same content. If you prefer reading, please check it out. How to build XGBoost models in Python: www.justintodata.com/xgboost-...
    For more data science materials, check out our website, Just into Data: justintodata.com/
  • Science & Technology

Comments • 21

  • @TheHorn89
    @TheHorn89 6 months ago +3

    Love your calm explanation style and right level of detail for a youtube tutorial - thank you!

  • @8shounak
    @8shounak 2 months ago

    Love the tutorial and the in-depth explanation. Thanks

  • @samihamine906
    @samihamine906 1 year ago +4

    Fantastic explanation! Your clear and engaging content has certainly earned you a new subscriber. I'm thrilled to have discovered your channel and I'm eager to see more insightful videos on Machine Learning. Keep up the incredible work! 💐

  • @paulodoi6941
    @paulodoi6941 7 months ago +1

    Great stuff

  • @user-hj6zn8js3i
    @user-hj6zn8js3i 8 months ago

    Thanks a lot!

  • @bakerb-rz6lv
    @bakerb-rz6lv 1 year ago +1

    Love from China!

  • @natural8471
    @natural8471 7 months ago +1

    Thank you!

  • @dianafarhat9479
    @dianafarhat9479 2 months ago

    Great tutorial, but I have a question. Why did you change the result column to 0's and 1's if there's a target encoder? Can we keep them categorical?

  • @azingo2313
    @azingo2313 10 months ago +1

    What is the F-score here? Can you please explain the final step?

  • @kylecheung9302
    @kylecheung9302 4 months ago

    How do you interpret the prediction results? The results are all real numbers. Can you look at each prediction on its own, or do you have to evaluate them as a whole? For instance, if person X's target is 0.45, what does that tell me? And what do negative values mean?
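One possible reading of real-valued, sometimes negative outputs is that they are raw margin scores (log-odds) rather than probabilities, which is what xgboost returns when prediction is requested with `output_margin=True` under a logistic objective. That is an assumption about the setup described in this comment, not something the video confirms. Under that assumption each prediction can be read on its own: the logistic function turns a score into a probability, so 0.45 maps to roughly 0.61 and any negative score maps to a probability below 0.5. The scores and the 0.5 cutoff below are hypothetical.

```python
import math


def margin_to_probability(margin: float) -> float:
    """Map a raw log-odds score to a probability with the logistic function."""
    return 1.0 / (1.0 + math.exp(-margin))


# Hypothetical raw scores; real ones would come from the trained model.
raw_scores = [-2.0, 0.0, 0.45]

probabilities = [margin_to_probability(score) for score in raw_scores]

# Assumed decision rule: predict the positive class at probability >= 0.5.
labels = [int(p >= 0.5) for p in probabilities]

for score, p, label in zip(raw_scores, probabilities, labels):
    print(f"score={score:+.2f}  probability={p:.3f}  label={label}")
```

If the numbers instead came from a regression model, this conversion does not apply.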

  • @Cantblendthis
    @Cantblendthis 3 months ago

    I get a warning at the training step. np.int has been deprecated and removed, so I can't continue as it doesn't run (no warnings that could be ignored). What do I need to solve this? Thanks.
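`np.int` was deprecated in NumPy 1.20 and removed in NumPy 1.24, so this error usually means some installed package still uses the old alias. Upgrading that package is the cleaner fix. As a stopgap, and assuming the failing code only needs `np.int` to behave like the built-in `int` (which is all the alias ever was), a shim like the following can be run before importing the affected library:

```python
import numpy as np

# Restore the removed alias only if this NumPy version no longer provides it.
# On NumPy 1.20 to 1.23 the alias still exists (with a deprecation warning),
# so nothing is changed there.
if not hasattr(np, "int"):
    np.int = int

# Code written against the old alias now works again.
print(np.int("42"))
```

Which package triggers the error is not stated in the comment, so the shim is deliberately generic.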

  • @hritwijkamble9988
    @hritwijkamble9988 8 months ago

    My model is not training. The program gets stuck at opt.fit(x_train,y_train) and does not move forward from there. What's happening?

  • @edsonmisaelastorgacastro9170

    Why do you use Real or Integer for your hyperparameters? Thanks!!!

  • @bakerb-rz6lv
    @bakerb-rz6lv 1 year ago +4

    When I run "opt.fit(...)", it fails with "ValueError: multiclass format is not supported". How do I fix it?

    • @langwang9130
      @langwang9130 10 months ago

      same here

    • @dennislam1501
      @dennislam1501 9 months ago

      You may need to read the TargetEncoder documentation to find out more. He did not use sklearn's one-hot or ordinal encoder.

    • @-uz
      @-uz 3 months ago

      @langwang9130 you have to set a parameter telling xgb to use multiple classes
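One possible cause of this error, assuming the search is scored with a binary-only metric such as ROC AUC (the scoring choice is not stated in this thread), is a target column with more than two distinct values. A quick check before calling `opt.fit` is to count the distinct labels and map a two-class target to 0 and 1. The helper name, the `"yes"` positive label, and the sample labels are hypothetical.

```python
def encode_binary_target(labels, positive="yes"):
    """Map a two-class target to 0/1, or fail loudly if it is not binary."""
    classes = sorted(set(labels))
    if len(classes) != 2:
        raise ValueError(f"expected 2 classes, found {len(classes)}: {classes}")
    if positive not in classes:
        raise ValueError(f"positive label {positive!r} not found in {classes}")
    return [1 if label == positive else 0 for label in labels]


# Made-up labels standing in for the real target column.
y_raw = ["no", "yes", "no", "no", "yes"]
y = encode_binary_target(y_raw)
print(y)
```

If the check raises because the target really has three or more classes, a binary metric cannot be used as-is and the model has to be configured for multiclass prediction instead.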

  • @VincentvanWitteloostuyn
    @VincentvanWitteloostuyn 7 months ago

    Why not include the euribor3m interest rate? It seems a strong predictor given the type of conversion for a bank, and the data bears this out.
    Train: 0.794
    Test: 0.811

    • @justintodata
      @justintodata 7 months ago +2

      Hi Vincent, we didn't really focus on what features to include since this is more of a demo of the xgboost model:) Thanks for bringing it up

    • @1993Redemption
      @1993Redemption 3 months ago

      Then include it in your model. Choosing which columns (or features) to include is a matter of user judgement and domain knowledge, so it doesn't have much to do with making a better model in a mathematical sense, since XGBoost is already so robust. If including it makes the model better, great, put it in.