How to train XGBoost models in Python
- Added 15 Jan 2023
- Welcome to the How to Train XGBoost Models in Python tutorial. You'll build an XGBoost classifier model with an example dataset, step by step.
By following this tutorial, you’ll learn:
✅What XGBoost is (vs. the gradient tree boosting algorithm)
✅How to build an XGBoost model (classifier) in Python, step by step:
- Step #1: Explore and prep data
- Step #2: Build a training pipeline
- Step #3: Set up hyperparameter tuning (cross-validation)
- Step #4: Train the XGBoost model
- Step #5: Evaluate the model and make predictions
- Step #6: Measure feature importance (optional)
If you want to use Python to create XGBoost models and make predictions, this practical tutorial will get you started.
GitHub Repo with code: github.com/liannewriting/YouT...
Technologies that will be used:
☑️ JupyterLab (Notebook)
☑️ pandas
☑️ scikit-learn (sklearn)
☑️ category_encoders
☑️ xgboost Python package
☑️ scikit-optimize (skopt)
Links mentioned in the video
► Bank marketing dataset: archive.ics.uci.edu/ml/datase...
► What is gradient boosting in machine learning tutorial: fundamentals explained: www.justintodata.com/gradient...
► To learn Python basics, take our course Python for Data Analysis with projects: www.justintodata.com/courses/...
► sklearn pipeline: scikit-learn.org/stable/modul...
► Target Encoder: contrib.scikit-learn.org/cate...
► XGBClassifier documentation with hyperparameters definition: xgboost.readthedocs.io/en/sta...
There's also an article version of the same content. If you prefer reading, please check it out. How to build XGBoost models in Python: www.justintodata.com/xgboost-...
To get access to more data science materials, check out our website Just into Data: justintodata.com/
Love your calm explanation style and the right level of detail for a YouTube tutorial - thank you!
Love the tutorial and the in-depth explanation. Thanks!
Fantastic explanation! Your clear and engaging content has certainly earned you a new subscriber. I'm thrilled to have discovered your channel and I'm eager to see more insightful videos on Machine Learning. Keep up the incredible work! 💐
Great stuff
Thanks a lot!
Love from China!
Thank you!
Great tutorial, but I have a question. Why did you change the result column to 0s and 1s if there's a target encoder? Can we keep it categorical?
What is the F score here? Can you please explain the final step?
How do you interpret the prediction results? The results are all real numbers; can you look at each prediction on its own, or do you have to evaluate them as a whole? For instance, if person X's target is 0.45, what does that tell me? And what does a negative value in the results mean?
I get an error at the training step: np.int has been deprecated and removed, so I can't continue, as it doesn't run (it's not a warning that can be ignored). What do I need to do to solve this? Thanks.
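(A reply, with the caveat that the exact cause depends on your installed versions: this error typically comes from an older library, often scikit-optimize, still using the `np.int` alias that NumPy removed in version 1.24. The cleaner fix is upgrading the offending package; a quick notebook workaround is restoring the alias before training:)

```python
import numpy as np

# NumPy 1.24 removed the deprecated np.int alias (it was just the
# built-in int). Restoring it lets older library code run again.
# Upgrading the library that still uses np.int is the better long-term fix.
if not hasattr(np, "int"):
    np.int = int
```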
My model is not training. I mean, the program is stuck at opt.fit(x_train, y_train) and doesn't move forward from there. What's happening?
Why do you use Real or Integer for your hyperparameters? Thanks!
When I run opt.fit(...), it fails with "ValueError: multiclass format is not supported". How do I fix it?
Same here
You may need to read the TargetEncoder documentation to find out more. He did not use sklearn's one-hot or ordinal encoder.
@langwang9130 You have to set a parameter to tell XGBoost to use multiple classes.
Why not include euribor3m interest rates? It seems a strong predictor for this kind of bank conversion, and it's also borne out in the data:
Train: 0.794
Test: 0.811
Hi Vincent, we didn't really focus on which features to include, since this is more a demo of the XGBoost model :) Thanks for bringing it up!
Then include it in your model. Choosing which columns (features) to include comes down to user judgment and domain knowledge, so it doesn't contribute much to making a better model in a mathematical sense, since XGBoost is already so robust. If including it makes the model better, great, put it in.