Clean your data with R. R programming for beginners.

  • Added 14 Dec 2021
  • If you are an R programming beginner, this video is for you. In it, Dr Greg Martin shows you, step by step, how to clean your dataset before doing any additional analysis. This is part of a series that covers exploring data, cleaning data, manipulating (or wrangling) data, describing data, visualizing data and, finally, analyzing data. The tutorial uses data built into R so you can replicate the work on your computer at home. Dr Martin uses the Tidyverse packages, which provide additional functions like select, filter, mutate, etc. This tutorial also deals with missing data. So if you are a data scientist, or interested in quantitative analysis or research, this is a good video to start with.

Comments • 153

  • @RProgramming101
    @RProgramming101 9 months ago +1

    Get my FREE cheat sheets for R programming and statistics (including transcripts of these lessons) here: www.learnmore365.com/courses/rprogramming-resource-library

  • @GFXHDTV
    @GFXHDTV 26 days ago +6

    00:01 Cleaning your data involves systematic exploration, cleaning, manipulation, visualization, and analysis.
    01:44 Installing packages in R expands functionality.
    05:31 Converting character variable to factor variable in R
    07:31 Using the factor function to swap levels in R
    11:11 Understanding the difference between 'or' and 'and' in filtering data.
    13:03 Handling missing data is crucial for accurate analysis.
    17:02 Understanding how to handle missing values in data sets is crucial for data cleaning in R.
    18:49 Handle missing data with nuanced approach, not just sweeping deletion
    22:16 Identify and handle duplicates in data frames
    24:04 Selecting and filtering data using base R method
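
A rough sketch of several of the steps listed above (factor conversion, "or" versus "and" filters, duplicates, and inspecting missing values). The `people` tibble and its values are hypothetical toy data, not the dataset used in the video, and the code assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data; one row is duplicated and some values are missing
people <- tibble(
  name   = c("Ann", "Bob", "Cy", "Cy", "Dee"),
  gender = c("feminine", "masculine", "masculine", "masculine", NA),
  height = c(165, 182, NA, NA, 150)
)

# Drop exact duplicate rows, then turn the character column into a factor
# with an explicitly chosen level order
deduped <- people %>%
  distinct() %>%
  mutate(gender = factor(gender, levels = c("masculine", "feminine")))

# "or": a row is kept when either condition is TRUE
either <- deduped %>% filter(gender == "masculine" | height < 160)

# "and": a row is kept only when both conditions are TRUE
both <- deduped %>% filter(gender == "masculine" & height < 190)

# Rows with at least one missing value, pulled out for inspection
# rather than deleted wholesale
incomplete <- deduped %>% filter(!complete.cases(.))
```

Note that `filter()` drops rows where the combined condition evaluates to `NA`, which is one reason to look at the incomplete rows before filtering.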

  • @Shawn-gm4cf
    @Shawn-gm4cf 2 years ago +10

    Honestly love all your videos. Detailed explanation and yet simple and straightforward. Keep up the great work.

  • @elliebrown7694
    @elliebrown7694 2 years ago

    Thank you so much for creating so much accessible and engaging content. For a beginner the way you teach is very clear and easy to understand and your passion for R has made me love it even more! (I also appreciate being hyped up for learning by some drum and bass in the intro)

  • @nonoobott8602
    @nonoobott8602 2 years ago +5

    Your tutorials are unarguably the most explicit and practical for the use of R. Beyond using R as a tool, you explain a lot of statistical concepts. Thanks for all you do, I've learned a lot from your channel

  • @danquixote6072
    @danquixote6072 2 years ago +6

    Very useful thank you - especially the section on NAs and recoding. Also appreciate the editing, effects, sound quality and close ups of the code. I’ve recently been using the star wars database for my English teaching lessons to help students with the interrogative. How tall is R2D2? How much does Darth Vader weigh? Etc. One request if possible - Times, Dates and TimeDates. Thank you again, your videos have been very helpful.

  • @antonreinhold8478
    @antonreinhold8478 2 years ago +4

    Thank you so much, you're such a huge help! I don't think I would pass my 'digital data analysis' course without your channel

  • @Junecode
    @Junecode 2 days ago

    Thanks for the lessons so far. Love it

  • @johnrussell5715
    @johnrussell5715 2 years ago +3

    I do love your enthusiasm Greg, it really keeps me interested in watching through to the end!

  • @GallantDanny
    @GallantDanny 1 year ago +2

    Excellent! Thanks for putting out such great content that's not only useful, but easy to follow!

  • @harrisonnash4948
    @harrisonnash4948 2 years ago +9

    Life saver, using this video in a last-minute dash to finish some coursework

  • @annleonard9713
    @annleonard9713 2 years ago

    This is brilliant, I am on my third video and I am amazed at how easy this is made to seem👏

  • @KarstenDrKempf
    @KarstenDrKempf 2 years ago

    Very helpful. The perfect mixture of the background ideas (what data to dismiss) and the R way to do so.
    Hope to see more videos.
    Enjoy X-mas

  • @emiliezeuthen7631
    @emiliezeuthen7631 2 years ago

    Very useful Greg, such a big help for an R-n00b

  • @max5916
    @max5916 2 years ago +26

    We really appreciate your channel, the best on YouTube for learning R. We are looking forward to seeing more, especially on survival analysis and parametric and non-parametric tests

  • @user-bn4hd1qh4c
    @user-bn4hd1qh4c 8 months ago

    Incredible channel, the material is better than any course I have paid for. The delivery and the breakdown of topics into separate videos are perfect for learning. Thank you for sharing your expertise and time.

  • @raulpalomares1092
    @raulpalomares1092 2 years ago

    You're a hell of a teacher! congrats!!

  • @Aaqib..
    @Aaqib.. 2 years ago

    Thanks a lot sir. I can't be grateful enough for your videos

  • @AlexKashie
    @AlexKashie 6 months ago

    I am here for the "super duper easy"... Keep up the great work Dr, thanks.

  • @vivicaanuforo4754
    @vivicaanuforo4754 1 year ago

    YOU ARE SUCH A GOOD TEACHER... THANK YOU

  • @phillippin6699
    @phillippin6699 5 months ago

    I don't usually comment, but this man here is the best I've come across on YouTube. Damn, too good

  • @HiltonT69
    @HiltonT69 2 years ago

    Very clear, useful and interesting. I'm just getting into R and this helped me understand how it can be used for sensible data cleaning.

  • @TaraGhimite
    @TaraGhimite 1 year ago

    Thank you, Dr Martin

  • @Sorjen108
    @Sorjen108 6 months ago

    Easy, Peasy, Lemon Squeezy!
    Best R Programming Channel
    Keep it going!

  • @folashadeolaitan6222
    @folashadeolaitan6222 1 year ago

    You are an awesome teacher. I want to give you a hug right now! 😊 Thank you for making it so easy to follow through.

  • @kamaboko1
    @kamaboko1 1 month ago

    Solid tut. Thank you.

  • @domyndegeya7760
    @domyndegeya7760 1 year ago +1

    Thank you so much Dr Martin. As a beginner, the way you explain R programming makes me love the language more and makes coding easier to deal with. Keep up with more great videos.

    • @RProgramming101
      @RProgramming101 1 year ago +1

      Thank you for the feedback. Glad you enjoyed it! You got this!

  • @raulpalomares1092
    @raulpalomares1092 2 years ago

    Thumbs up for you Greg!

  • @jerryeyong5585
    @jerryeyong5585 18 days ago

    Excellent lectures and a good lecturer also

  • @erpampa94
    @erpampa94 1 year ago

    this tutorial is Excellent! thank you!

  • @yamimartina
    @yamimartina 1 year ago +1

    Hi there, Greg! Thanks a lot for these videos, love your style. I've learned A LOT!

  • @summer7361
    @summer7361 1 year ago

    You sir have an incredible voice for teaching. Glad I found your channel

  • @felipecruz3061
    @felipecruz3061 9 months ago

    Your channel is a life saver, man. Thank you

  • @deniseortiz8567
    @deniseortiz8567 1 year ago

    Out of all the online classes and videos I have done, I WISH I STARTED WITH THIS ONE!! Thank you!

  • @samikzr
    @samikzr 1 year ago

    Much appreciated. Amazing skills in such a simplified way thank you.

  • @shaikhahmedbd
    @shaikhahmedbd 11 months ago

    Thank you so much! love all your videos! simple and straightforward! with detailed explanation.

    • @RProgramming101
      @RProgramming101 10 months ago

      I'm so glad to have you as a subscriber! Thank you for being a part of this community.

  • @noahsalazar2738
    @noahsalazar2738 2 years ago

    You Sir, are a legend!

  • @elenag.224
    @elenag.224 1 year ago

    You really make programming seem "easy-peasy lemon squeezy", keep it up!

  • @yidanjiang7599
    @yidanjiang7599 2 years ago +1

    Truly helpful! Amazing video for tidyverse

  • @altareq8045
    @altareq8045 6 months ago

    best practice lecture for R

  • @haraldurkarlsson1147
    @haraldurkarlsson1147 2 years ago

    Doc Martin!
    As usual, an excellent video! I have one question regarding the recoding. For binary cases, is it not typical to use 0 and 1 in order to be able to do some stats on the data?
    Thanks and Happy Holidays!

  • @IarukaSkYouk
    @IarukaSkYouk 1 year ago

    oh my god man you are a godsend!!!!
    I've been learning R in the google data course from Coursera and they don't teach much.

  • @marianam5181
    @marianam5181 2 months ago

    Thank you!! 👑

  • @sagarlokare5269
    @sagarlokare5269 1 year ago

    Bravo, this content was very good

  • @anujakori
    @anujakori 4 days ago

    Thank you so much. Your videos are very helpful.

  • @congz5101
    @congz5101 2 years ago

    Thank you so much!

  • @RosatiSamuel
    @RosatiSamuel 1 month ago

    GREAT VIDEO

  • @solafajobi
    @solafajobi 1 year ago

    Thank you for this. Very helpful.

  • @tarkanh2519
    @tarkanh2519 1 year ago

    Perfect lecture ! benefited a lot.

  • @user-pu9ll7vd5m
    @user-pu9ll7vd5m 1 month ago

    Excellente~! Thanks -

  • @lauramagangfopossi1770

    Thanks. I really appreciate your video

  • @kamekaze997
    @kamekaze997 1 year ago

    Love your videos, man - they're clear and concise, easy to follow! Would you be open to creating content based on the Google Data Analytics Cert?
    One of the case studies they have is about a fictional bike study called Cyclistic.
    And there is only one person on YT who does it in R (Caribou Data Science) but it's not as seamless or clear as you make your videos out to be! :)

  • @goon5031
    @goon5031 2 years ago

    Love this. I didn't know there was an in-house R dataset for star wars.

  • @MaltePeter
    @MaltePeter 2 years ago +1

    Such good videos for learning R programming and such a nice series. When is the next episode about manipulating your data coming out? Can't wait for it!

  • @Lin-pj5bo
    @Lin-pj5bo 1 year ago

    Thank you very very very much!! Really appreciate your wonderful video. 👍👍👍

  • @citizenhk1040
    @citizenhk1040 1 year ago

    Thank you very much! Very helpful

  • @rnarith855
    @rnarith855 2 years ago

    Appreciate your helpful videos

  • @kamogelokhumalo4792
    @kamogelokhumalo4792 2 years ago

    Thank you very much Sir. This was quite easy to understand as a beginner. Would you kindly maybe make a series of these videos for us beginners because you have many videos and we wouldn't know which video to watch after this one. I hope that makes sense. Anyway thank you a lot for these videos

  • @iblisthemage
    @iblisthemage 11 days ago

    Great content.
    Recode is superseded… we need a new video on this topic 🙂
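
For readers looking for an alternative to `recode()`, one option is dplyr's `case_when()`. This is only a sketch on a hypothetical `genders` tibble (not the dataset from the video), assuming dplyr is installed.

```r
library(dplyr)

# Hypothetical toy data
genders <- tibble(gender = c("masculine", "feminine", NA))

# Each formula reads "condition ~ value"; rows matching no condition get NA
coded <- genders %>%
  mutate(gender_code = case_when(
    gender == "masculine" ~ 1,
    gender == "feminine"  ~ 2
  ))
```

The missing gender stays missing in `gender_code` because neither condition is TRUE for it.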

  • @SuccessGossips
    @SuccessGossips 6 months ago

    love it

  • @shafeen1058
    @shafeen1058 1 year ago

    You explained it thoroughly ❤❤

  • @dollysiharath4205
    @dollysiharath4205 1 year ago

    Thank you!

  • @findthetruth3021
    @findthetruth3021 2 years ago

    Thank you so much, you're such a great help! please show us how to create Dashboards via Shiny. Thanks a lot.

  • @eyadha1
    @eyadha1 1 year ago

    great! thank you very much

  • @namatovufaridah3420
    @namatovufaridah3420 1 year ago

    Thanks a lot.

  • @riptideking
    @riptideking 1 year ago +1

    Thank you for the video sir

  • @DudeGuyWho
    @DudeGuyWho 1 year ago

    Thanks much!

  • @juliablazy4011
    @juliablazy4011 1 year ago

    super thanks

  • @lets_code_this2678
    @lets_code_this2678 10 months ago

    Yo, you are the best programming YouTube channel, bro

    • @RProgramming101
      @RProgramming101 10 months ago

      Wow I appreciate the kind words. Your support encourages me to create more content that you'll enjoy! Thank you

  • @robsonreis76
    @robsonreis76 2 years ago

    Pretty handy tips

  • @adrianareitano3
    @adrianareitano3 2 years ago +1

    Hi,
    Love your videos they are so helpful. Could you do a video on loops in r ?? Thanks!

  • @vincenzo4259
    @vincenzo4259 2 years ago

    Thanks

  • @MrDarkplace22
    @MrDarkplace22 2 years ago

    Just curious, is there any reason why you haven't enabled coloured brackets, as per the last update? It would make the code easier to read

  • @b5lovermore
    @b5lovermore 1 year ago

    When you start to explain how to find complete and incomplete cases at 16:09, what do you do if you want to find incomplete cases for the entire dataset? Would you just omit the "select" portion of the code?

  • @andrewjohnson4352
    @andrewjohnson4352 10 months ago

    Got it!

    • @RProgramming101
      @RProgramming101 10 months ago

      Thank you so much for watching and leaving a comment! I appreciate your support.

  • @amandihiyare1184
    @amandihiyare1184 2 years ago

    Could you do a video on data management using R, please?

  • @user-il8mt2wz9t
    @user-il8mt2wz9t 7 months ago

    Really excellent PT.
    Then how is R associated with Python algorithms?

  • @korman9872
    @korman9872 1 year ago

    Tx sir

  • @krazitired
    @krazitired 3 months ago

    Thank you for this amazing resource. Very helpful for someone like myself who is learning R without any meaningful stats experience aside from a semester at uni.
    Is anyone learning along able to share their experience of using the mutate and recode functions? I haven't had any success using these whilst following along with this video and a previous one when trying to recode the gender to M and F, or 1 and 2. I've had to work around it using:
    starwars %>%
    select(name,gender) %>%
    mutate(gender=if_else(gender=="masculine", "1", "2"))
    But I'd really like to know what I'm doing wrong using recode as I think my code looks the same as Greg's!
    starwars %>%
    select(name,gender) %>%
    mutate(gender=recode(gender, "masculine"=1, "feminine"=2))

  • @davidispiryan5689
    @davidispiryan5689 1 year ago

    Hello Sir, can you please make a video about using the shiny package on medical data?
    Thanks a lot, great videos !!

  • @peterscheerer2346
    @peterscheerer2346 8 months ago

    Can anybody help me with how to disaggregate data that exists in the same column? For instance, in this example let's say you wanted to have a second column (or new variable) for secondary hair color for those values which contain a primary then a secondary hair color, e.g. "brown, grey", "auburn, white" etc. I am actually working on a file which contains addresses, and in many cases the apartment number is not separated into another column. However, it should be for the import into the database I am working on, and therefore I want to try to create logic to clean and disaggregate these pieces into separate columns. Any help would be greatly appreciated.

  • @crazyytha
    @crazyytha 7 months ago

    Thanks so much for the engaging yet very useful video! Can I ask why the following chunk of code is not able to recode the missing value to the assigned value? starwars %>%
    select(name, gender) %>% mutate(gender2= if_else(gender=="masculine",1,if_else(is.na(gender),3,2))). Thanks a lot in advance!
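
A likely explanation for the behaviour asked about here: when `gender` is `NA`, the outer test `gender == "masculine"` is itself `NA`, and `if_else()` returns `NA` for those rows without consulting the inner `is.na()` branch. The sketch below uses a hypothetical `genders` tibble (not the video's dataset) and shows two possible workarounds, testing for `NA` first or using the `missing` argument of `if_else()`.

```r
library(dplyr)

# Hypothetical toy data
genders <- tibble(gender = c("masculine", "feminine", NA))

result <- genders %>%
  mutate(
    # Outer condition is NA for the missing row, so the result is NA
    nested       = if_else(gender == "masculine", 1,
                           if_else(is.na(gender), 3, 2)),
    # Workaround 1: handle the missing case before comparing values
    na_first     = if_else(is.na(gender), 3,
                           if_else(gender == "masculine", 1, 2)),
    # Workaround 2: tell if_else() what to return when the condition is NA
    with_missing = if_else(gender == "masculine", 1, 2, missing = 3)
  )
```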

  • @diddysysavane6006
    @diddysysavane6006 5 months ago

    For the replace NA step, it just filters the dataset but doesn't change anything in the dataset itself. So the dataset remains uncleaned.

  • @reecebinx4191
    @reecebinx4191 8 months ago

    I'm a newbie to R. Is there an open community where people can help you with your work?

  • @nabilafandih
    @nabilafandih 1 year ago

    variable types
    select and filter
    find and deal with missing data
    find and deal with duplicates
    recode values

  • @chizfoodiehub3444
    @chizfoodiehub3444 1 year ago

    What if the integer was stored as a chr data type and you want to change it to an integer or double?

  • @microbemike9693
    @microbemike9693 1 year ago

    How come I am just discovering this channel?

  • @fenysnake
    @fenysnake 1 year ago

    Warning in install.packages("tidyverse") :
    'lib = "C:/Program Files/R/R-4.2.3/library"' is not writable
    Error in install.packages("tidyverse") : unable to install packages
    Do you have any insight into why I get this message? I'm starting with R for a statistics class and I see you recommend this package. My laptop is archaic...

  • @GracieJiuJitsu1015
    @GracieJiuJitsu1015 1 year ago

    I'm having a problem where I want to mutate two variables with values 0, 1 and NA into a new variable with the sum of 0 and 1; however, R in my case counts NA as 0. Is there an easy fix for this, to exclude the NA?

  • @dadandahmanwahidi4814
    @dadandahmanwahidi4814 2 years ago

    Hello, can you make a playlist about OOP in R?

  • @jamesleleji6984
    @jamesleleji6984 1 year ago

    How can you convert data type using mutate in dplyr?
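
One common way to do this is to overwrite the column inside `mutate()` with a base R conversion function such as `as.integer()` or `as.numeric()`. A minimal sketch on a hypothetical `measurements` tibble, assuming dplyr is installed:

```r
library(dplyr)

# Hypothetical toy data: height was read in as character
measurements <- tibble(
  id     = c("a", "b", "c"),
  height = c("172", "96", "202")
)

# Overwrite the column with its integer version; as.numeric() would give doubles
converted <- measurements %>%
  mutate(height = as.integer(height))
```

Values that cannot be parsed as numbers become `NA` with a warning, so it is worth checking the column for new missing values afterwards.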

  • @nitamaitra2921
    @nitamaitra2921 2 years ago

    What is the difference between a data frame and a tibble?

  • @mercywaithira3240onlinemaths

    And how do you undo a command that you had already executed?

  • @DrJohnnyJ
    @DrJohnnyJ 2 years ago +1

    I can't get the line to work: filter(hair_color %in% c(“blond”, “brown”) & height < 180)
    Error: unexpected symbol in " filter(hair_color c"
    > height < 180)
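
Two things may be worth checking with an error like this one: the quotes in the pasted line are typographic (curly) quotes, which R does not accept as string delimiters, and the echoed error text shows no `%in%` between `hair_color` and `c`. A sketch of the same filter typed with plain straight quotes, on a hypothetical `people` tibble rather than the video's dataset:

```r
library(dplyr)

# Hypothetical toy data
people <- tibble(
  hair_color = c("blond", "brown", "black", "brown"),
  height     = c(170, 185, 160, 175)
)

# Straight double quotes and an explicit %in% operator
short_fair <- people %>%
  filter(hair_color %in% c("blond", "brown") & height < 180)
```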

  • @BloomingtonFPV
    @BloomingtonFPV 2 years ago +1

    If you are here from Tom's Bayesian stats class, give this video a thumbs up!

    • @mwegankanda6594
      @mwegankanda6594 5 months ago

      nobody seemed to be. But here's a thumbs up (I'm here cause of stats class too lol)

  • @Ketoswammy
    @Ketoswammy 1 year ago

    Super duper easier in SPSS.

  • @sertansafak2056
    @sertansafak2056 1 year ago

    22:22 There were 5 hair_color values missing. 3 of them were droids, so they don't have hair. You can change those to none, but the other 2 weren't droids; they may have some hair, but that data is missing. Why didn't you remove those two rows?

  • @sriram7845
    @sriram7845 2 months ago

    Never thought Jason Statham is an "R" Expert

  • @thankgodojomah274
    @thankgodojomah274 5 months ago

    I am just getting to know you. You sound so interesting

  • @pipertripp
    @pipertripp 2 years ago +2

    Just a heads up to everyone, at the end of the vid when you're doing the recode bit, you might hit this error:
    Error: Problem with `mutate()` column `gender`.
    i `gender = recode(gender, masculine = 1, feminine = 2)`.
    x unused arguments (masculine = 1, feminine = 2)
    if you get this error, force R to use dplyr's version of recode like this:
    starwars %>%
    select(name, gender) %>%
    mutate(gender_coded = dplyr::recode(gender, "masculine"=1, "feminine"=2))
    I'm not sure why I had to do this, as I had already run the library(tidyverse) command, but replacing recode with dplyr::recode sorted it out.

    • @pipertripp
      @pipertripp 2 years ago

      Figured it out. I also have the "car" package (for the Variance Inflation Factor function "vif" that I'm also using) in my project, so there was a name collision and it was taking recode from car (Companion to Applied Regression) rather than from dplyr.
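
The fix described in this thread can be sketched as follows. When two attached packages export a function with the same name, the one attached later masks the earlier one, and the `package::function` form makes the choice explicit regardless of load order. The `genders` tibble is hypothetical toy data, and the code assumes dplyr is installed (the car package is not needed to run it).

```r
library(dplyr)

# Hypothetical toy data
genders <- tibble(gender = c("masculine", "feminine"))

# dplyr::recode() is called by its fully qualified name, so another
# package's recode() cannot be picked up by accident
coded <- genders %>%
  mutate(gender_coded = dplyr::recode(gender, "masculine" = 1, "feminine" = 2))
```

Running `conflicts()` in the console lists the names that are currently masked, which can help spot this kind of collision.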