Predict childcare costs in US counties with xgboost and early stopping

Julia Silge

3,485 views


Added on 19 August 2024

Comments • 20

@michaelmahoney3806 1 year ago ⁺⁵
I don't believe that I have ever watched one of your videos that I didn't come away with some new nugget. Thanks, Julia!
@hesamseraj 1 year ago ⁺¹
As always, thank you for such a great screencast.
@wilrivera2987 1 year ago
Dream job: to work at Posit.
@tofreddy 1 year ago
I stumbled into your channel. Thank you for the teachable moment.
@carvalhoribeiro 1 year ago
Very, very useful. Thank you so much, Julia!
@geralgariza7199 1 year ago
Nice work! Well done!
@CaribouDataScience 1 year ago
Thanks, that was interesting!
@djangoworldwide7925 1 year ago
Hey, rsample::validation_set does not exist anymore. As of 24-06-2023 we can use validation_split/time_split/group_validation_split. I had a feeling it was validation_split anyway, but I wonder, maybe I should use the dev version?
@anselmekouame1913 1 year ago
Hi Julia, how might multicollinearity affect a machine learning model? If multicollinearity is found, should we remove variables that are highly correlated?
@JuliaSilge 1 year ago ⁺⁴
If you are using a linear model, correlated features can be a big problem! In cases like that, you would want to remove features that are highly correlated with other ones, or use something like PCA. Check out feature engineering approaches like these:
recipes.tidymodels.org/reference/step_corr.html
recipes.tidymodels.org/reference/step_pca.html
Tree-based models tend to do OK with correlated features and it often doesn't really help to handle them in a special way. Just crank it on through the model!
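For anyone who wants to try the two recipe steps linked above, here is a minimal sketch. The toy data frame and its column names (`y`, `x1`, `x2`, `x3`) are made up for illustration and are not from the video; the 0.9 threshold and `num_comp = 2` are arbitrary choices, not recommendations.

```r
library(recipes)

set.seed(234)

# Hypothetical toy data: x2 is almost a copy of x1, x3 is independent noise
x1 <- rnorm(200)
toy <- data.frame(
  y  = rnorm(200),
  x1 = x1,
  x2 = x1 + rnorm(200, sd = 0.01),
  x3 = rnorm(200)
)

# Option 1: drop predictors whose absolute correlation exceeds the threshold
corr_rec <- recipe(y ~ ., data = toy) |>
  step_corr(all_numeric_predictors(), threshold = 0.9)

corr_baked <- bake(prep(corr_rec), new_data = NULL)
names(corr_baked)  # one of x1 / x2 has been removed

# Option 2: replace the predictors with a small number of principal components
pca_rec <- recipe(y ~ ., data = toy) |>
  step_normalize(all_numeric_predictors()) |>
  step_pca(all_numeric_predictors(), num_comp = 2)

pca_baked <- bake(prep(pca_rec), new_data = NULL)
names(pca_baked)  # the outcome plus PC1 and PC2
```

Either recipe can then be combined with a linear model in a workflow. For a tree-based model such as xgboost, these steps can usually be skipped, as noted above.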
@anselmekouame1913 1 year ago
@JuliaSilge Thanks a bunch.
@omoniyitemitope6113 5 months ago
Hi, I have a dataset with 35 variables and want to run some regression models (RF, xgboost, etc.) on it. I am new to R and would like to know if you offer any online training that I can register for.
@JuliaSilge 5 months ago ⁺¹
I recommend that you work through this:
www.tidymodels.org/start/
And then take a look at this book:
www.tmwr.org/
Good luck!
@omoniyitemitope6113 5 months ago
@JuliaSilge Thanks so much for your response. I followed one of your screencasts and got an rsq of 0.37 for the RF model. Is there anything I can do to improve the fit of my model?
@JuliaSilge 5 months ago
@omoniyitemitope6113 This definitely depends on the specifics of your situation! I recommend that you check out a resource like *Tidy Modeling with R* for digging deeper on the model building process: www.tmwr.org/
@omoniyitemitope6113 5 months ago
@JuliaSilge Thanks for your response. I will go through it. I did something whose statistical implications I don't understand: I took the log of my dependent variable and fit an RF, and to my surprise the % variance explained was 99.74. This looks too good to be true to me.
@danielhallriggins9008 4 months ago
Thanks Julia, love your videos! To get a more accurate sense of performance, would it be helpful to use {spatialsample} to account for spatial autocorrelation?
@JuliaSilge 4 months ago ⁺¹
That would be a great thing to do! This dataset doesn't have explicitly spatial information in it (just county FIPS code) so you would need to join some spatial info together with the original dataset.
@konormccracken 1 year ago
Always grateful for these videos! Though the grating little economist in me screamed a bit when you discounted the fixed-effect of "county" here 🫥
@JuliaSilge 1 year ago
Ah yep! The xgboost algorithm does not have the ability to incorporate fixed effects the way that a multilevel model does, say like those from multilevelmod:
multilevelmod.tidymodels.org/
However, we could still use a resampling approach that takes into account how a given county is in this dataset a bunch of times, to avoid overly optimistic performance estimates. We'd want to switch out `initial_split()` for `group_initial_split()` and `validation_split()` for `group_validation_split()`:
rsample.tidymodels.org/reference/validation_split.html
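The grouped resampling swap described above could look roughly like the sketch below. The toy data and its column names (`county_fips`, `year`, `cost`) are invented for illustration and are not the dataset from the video. Both functions exist in rsample, but newer releases may flag `group_validation_split()` as deprecated in favor of a three-way grouped split, so check the documentation for your installed version.

```r
library(rsample)

set.seed(123)

# Hypothetical toy data: 20 counties, each observed in 5 different years
childcare_toy <- data.frame(
  county_fips = rep(sprintf("%05d", 1:20), each = 5),
  year        = rep(2014:2018, times = 20),
  cost        = rnorm(100, mean = 150, sd = 30)
)

# Keep all rows for a given county on the same side of the train/test split
county_split <- group_initial_split(childcare_toy, group = county_fips, prop = 0.8)
county_train <- training(county_split)
county_test  <- testing(county_split)

# No county should appear in both training and testing
shared_test <- intersect(county_train$county_fips, county_test$county_fips)
length(shared_test)

# Same idea for the validation set used for early stopping
county_val <- group_validation_split(county_train, group = county_fips, prop = 0.8)
val_split  <- county_val$splits[[1]]
shared_val <- intersect(
  analysis(val_split)$county_fips,
  assessment(val_split)$county_fips
)
length(shared_val)
```

Because whole counties are held out together, the performance estimate reflects how the model does on counties it has not seen, which is usually a less optimistic number than a row-wise split gives.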
