Predict ratings for chocolate with tidymodels
- Added 11. 09. 2024
- Get started with feature engineering for text using #TidyTuesday data on chocolate ratings, transforming language to be used in machine learning algorithms. This is a good screencast for folks who are newer to tidymodels. Check out the code on my blog: juliasilge.com...
These videos of yours are, hands down, the best tidymodels series there is! I would love to see some time series modeling soon! Thanks for all the effort put into this, Julia! You are a blessing!
As always, very insightful, and I learn a lot from your videos!
Thanks for sharing this valuable knowledge, Julia! Very interesting
I love your videos - they're informative, well structured and fun :) Would you ever make videos for a broader audience? I think you could do a great job getting people excited to use R
Awesome! Thank you so much for these amazing videos, Julia!
You two are both great at statistics content; I follow you both
@Levy957 😍😍😍
Great video, really enjoyed coding along with this. I'm about halfway through Tidy Modeling with R now and am having so much fun building models with it. It would be nice to see you do more time series based videos sometime. Thanks.
As always... impressive
Hi Julia, great video as always. I wonder if you have any package or material suggestions for a beginner to microsimulation modelling with R?
This is not my area of expertise but I saw a talk by the author of simmer a while back that was great. You might check that out: r-simmer.org/
Awesome!!
I think I may have just fallen in love with the cuteness, as much as, of course, with the superbly introduced and taught ML code...
Julia, could you please help? In the final code chunk I am receiving the error "no tidy method for objects of class ranger." In other words, it won't run tidy()
No worries, I figured it out. Love your videos. Keep making them!
What package is the template part of?
The tidymodels metapackage: tidymodels.tidymodels.org/
Hi,
one thing that was missed in the data preparation step was to account for "most_memorable_characteristics" consisting of 2 words.
For example "sour fruit", "sour notes", "off note", etc.
What would be the best way to substitute these so that the "unnest_tokens" takes them into account as one?
I think this would work:
chocolate %>%
  # split on commas (plus any spaces after them) so "sour fruit" stays one token
  unnest_tokens(word, most_memorable_characteristics, token = stringr::str_split, pattern = ",\\s*") %>%
  count(word, sort = TRUE)
If you want to see how many times each word was used *per description* instead of most common words overall, I would recommend something like:
chocolate %>%
  mutate(id = row_number()) %>%
  unnest_tokens(word, most_memorable_characteristics) %>%
  distinct(id, word) %>%
  count(word, sort = TRUE)
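To see how the approaches in this thread differ, here is a small sketch on a made-up three-row tibble. The `toy` data and the result names (`by_word`, `by_phrase`, `per_description`) are assumptions for illustration, not the real chocolate dataset or code from the video; it only assumes dplyr, stringr and tidytext are installed.

```r
library(dplyr)
library(tidytext)

# Hypothetical stand-in for the chocolate data (invented rows)
toy <- tibble(
  most_memorable_characteristics = c(
    "sour fruit, cocoa",
    "cocoa, sour notes",
    "sour, sour fruit"
  )
)

# Default word tokenizer: "sour fruit" is split into "sour" and "fruit"
by_word <- toy %>%
  unnest_tokens(word, most_memorable_characteristics) %>%
  count(word, sort = TRUE)

# Comma tokenizer: each comma-separated phrase is kept as one token
by_phrase <- toy %>%
  unnest_tokens(
    phrase, most_memorable_characteristics,
    token = stringr::str_split, pattern = ",\\s*"
  ) %>%
  count(phrase, sort = TRUE)

# Per-description counts: a word is counted at most once per row
per_description <- toy %>%
  mutate(id = row_number()) %>%
  unnest_tokens(word, most_memorable_characteristics) %>%
  distinct(id, word) %>%
  count(word, sort = TRUE)
```

On this toy data, "sour" is counted four times by the word tokenizer but only three times per description, while the comma tokenizer reports "sour fruit" as its own phrase.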