Predict ratings for chocolate with tidymodels

  • Added 11 Sep 2024
  • Get started with feature engineering for text using #TidyTuesday data on chocolate ratings, transforming language to be used in machine learning algorithms. This is a good screencast for folks who are newer to tidymodels. Check out the code on my blog: juliasilge.com...

Comments • 19

  • @asolisca • 2 years ago • +1

    These videos of yours are, hands down, the best series of tidymodels there are! I would love to see some time series modeling soon! Thanks for all the effort put into this, Julia! You are a blessing!

  • @faiazrummankhan5589 • 2 years ago • +2

    As always, very insightful. I learn a great deal from your videos!

  • @dapragotto • 2 years ago • +1

    Thanks for sharing this valuable knowledge, Julia! Very interesting

  • @deanbevitt9501 • 2 years ago • +1

    I love your videos - they're informative, well structured and fun :) Would you ever make videos for a broader audience? I think you could do a great job getting people excited to use R

  • @FernandaPeres • 2 years ago

    Awesome! Thank you so much for these amazing videos, Julia!

    • @Levy957 • 2 years ago

      You two are both great at statistics content; I follow both of you.

    • @FernandaPeres • 2 years ago

      @Levy957 😍😍😍

  • @jaredwsavage • 2 years ago

    Great video, really enjoyed coding along with this. I'm about halfway through Tidy Modeling with R now and am having so much fun building models with it. It would be nice to see you do more time-series-based videos sometime. Thanks.

  • @ammarparmr • 2 years ago • +1

    As always... impressive

  • @mpayne7904 • 2 years ago • +2

    Hi Julia, great video as always. Do you have any package or material suggestions for a beginner in microsimulation modelling with R?

    • @JuliaSilge • 2 years ago

      This is not my area of expertise but I saw a talk by the author of simmer a while back that was great. You might check that out: r-simmer.org/

  • @j7andrew • 1 year ago

    Awesome!!

  • @goodyonsen77 • 2 years ago

    I think I may have just fallen in love with the cuteness, as much as, of course, with the superbly introduced and taught ML code...

  • @jorampodcast • 2 years ago

    Julia, could you please help? In the final code chunk I am getting the error "no tidy method for objects of class ranger." In other words, tidy() won't run.

    • @jorampodcast • 2 years ago • +1

      No worries, I figured it out. Love your videos. Keep making them!

  • @davidjackson7675 • 2 years ago • +1

    What package is the template part of?

    • @JuliaSilge • 2 years ago • +2

      The tidymodels metapackage: tidymodels.tidymodels.org/

  • @argytzak • 2 years ago

    Hi,
    One thing that was missed in the data preparation step is that some entries in "most_memorable_characteristics" consist of two words, for example "sour fruit", "sour notes", "off note", etc.
    What would be the best way to handle these so that "unnest_tokens" treats each one as a single token?
    I think this would work:
    chocolate %>%
      unnest_tokens(word, most_memorable_characteristics, token = stringr::str_split, pattern = ",") %>%
      count(word, sort = TRUE)

    • @JuliaSilge • 2 years ago • +1

      If you want to see how many times each word was used *per description* instead of most common words overall, I would recommend something like:
      chocolate %>%
        mutate(id = row_number()) %>%
        unnest_tokens(word, most_memorable_characteristics) %>%
        distinct(id, word) %>%
        count(word, sort = TRUE)
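
The two ideas in this last exchange (treating comma-separated phrases as single tokens, and counting each token at most once per description) can be combined. What follows is only a minimal sketch, not code from the video or the blog post. It assumes the characteristics are separated by commas, and `chocolate_toy` is a hypothetical stand-in for the real data with made-up values. It uses `unnest_tokens()` with `token = "regex"` rather than the `stringr::str_split` call suggested above.

```r
library(dplyr)
library(stringr)
library(tidytext)

# Hypothetical toy data standing in for the real chocolate ratings
chocolate_toy <- tibble(
  most_memorable_characteristics = c(
    "sour fruit, cocoa",
    "off note, sour fruit",
    "cocoa, nutty, cocoa"
  )
)

phrase_counts <- chocolate_toy %>%
  # Give each description an id so repeats within it can be dropped
  mutate(id = row_number()) %>%
  # Split on commas (plus any following spaces) so "sour fruit" stays whole
  unnest_tokens(word, most_memorable_characteristics,
                token = "regex", pattern = ",\\s*") %>%
  # Trim any stray whitespace left around a phrase
  mutate(word = str_squish(word)) %>%
  # Keep one row per phrase per description
  distinct(id, word) %>%
  # Count how many descriptions mention each phrase
  count(word, sort = TRUE)

phrase_counts
```

In this toy data the third description lists "cocoa" twice, but `distinct(id, word)` means it contributes only once to the count for that phrase.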