Clean your data with R. R programming for beginners.
- Added 14 Dec 2021
- If you are an R programming beginner, this video is for you. In it, Dr Greg Martin shows you, step by step, how to clean your dataset before doing any further analysis. This is part of a series that covers exploring data, cleaning data, manipulating (or wrangling) data, describing data, visualizing data and, finally, analyzing data. The tutorial uses data built into R so you can replicate the work on your computer at home. Dr Martin uses the Tidyverse packages, which provide additional functions like select, filter, mutate etc. This tutorial also deals with missing data. So if you are a data scientist, or interested in quantitative analysis or research, this is a good video to start with.
Get my FREE cheat sheets for R programming and statistics (including transcripts of these lessons) here: www.learnmore365.com/courses/rprogramming-resource-library
Great lesson, thanks
00:01 Cleaning your data involves systematic exploration, cleaning, manipulation, visualization, and analysis.
01:44 Installing packages in R expands functionality.
05:31 Converting character variable to factor variable in R
07:31 Using the factor function to swap levels in R
11:11 Understanding the difference between 'or' and 'and' in filtering data.
13:03 Handling missing data is crucial for accurate analysis.
17:02 Understanding how to handle missing values in data sets is crucial for data cleaning in R.
18:49 Handle missing data with nuanced approach, not just sweeping deletion
22:16 Identify and handle duplicates in data frames
24:04 Selecting and filtering data using base R method
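For anyone following the chapters on factors (05:31 and 07:31), here is a minimal sketch of converting a character variable to a factor and then changing the order of its levels. It assumes the starwars data from dplyr with the gender values "masculine" and "feminine" mentioned in the comments; the object name sw is just for illustration and is not taken from the video.

```r
library(dplyr)

# Work on a small slice of the starwars data (sw is an illustrative name).
sw <- starwars %>% select(name, gender)

# gender starts out as a character variable.
class(sw$gender)

# Convert it to a factor; by default the levels are sorted alphabetically.
sw <- sw %>% mutate(gender = as.factor(gender))
levels(sw$gender)

# Re-state the factor with the levels in the order you want.
sw <- sw %>%
  mutate(gender = factor(gender, levels = c("masculine", "feminine")))
levels(sw$gender)
```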
Honestly love all your videos. Detailed explanation and yet simple and straightforward. Keep up the great work.
Thank you so much for creating so much accessible and engaging content. For a beginner the way you teach is very clear and easy to understand and your passion for R has made me love it even more! (I also appreciate being hyped up for learning by some drum and bass in the intro)
Your tutorials are unarguably the most explicit and practical for the use of R. Beyond using R as a tool, you explain a lot of statistical concepts. Thanks for all you do, I've learned a lot from your channel
Very useful thank you - especially the section on NAs and recoding. Also appreciate the editing, effects, sound quality and close ups of the code. I’ve recently been using the star wars database for my English teaching lessons to help students with the interrogative. How tall is R2D2? How much does Darth Vader weigh? Etc. One request if possible - Times, Dates and TimeDates. Thank you again, your videos have been very helpful.
Thank you so much, you're such a huge help! I don't think I would pass my 'digital data analysis' course without your channel
Happy to help!
Thanks for the lessons so far. Love it
I do love your enthusiasm Greg, it really keeps me interested in watching through to the end!
Thanks John - much appreciated!
Excellent! Thanks for putting out such great content that's not only useful, but easy to follow!
Wow - what a nice thing to say (thanks!!)
Life saver. Using this video in a last-minute dash to finish some coursework
Thanks for the feedback Harrison. Glad I could help.
This is brilliant, I am on my third video and I am amazed at how easy this is made to seem👏
Very helpful. the perfect mixture between the background ideas ( what data to dismiss) and the R way to do so.
Hope to see more videos.
Enjoy X-mas
Very useful Greg, such a big help for an R-n00b
We really appreciate your channel, the best YouTube channel for learning R. We are looking forward to seeing more, especially on survival analysis and parametric and non-parametric tests
Wow, thank you!
Incredible channel, the material is better than any course I have paid for. The delivery and the breakdown of topics into separate videos are perfect for learning. Thank you for sharing your expertise and time.
You're a hell of a teacher! congrats!!
Thanks a lot sir. I can't be grateful enough for your videos
I am here for the "super duper easy"... Keep up the great work Dr, thanks.
YOU ARE SUCH A GOOD TEACHER... THANK YOU
I don't usually comment, but this man here is the best I've come across on YouTube. Damn, too good
Very clear, useful and interesting. I'm just getting into R and this helped me understand how it can be used for sensible data cleaning.
Great to hear! Thanks for watching!
Thank you, Dr Martin
Easy, Peasy, Lemon Squeezy!
Best R Programming Channel
Keep it going!
You are an awesome teacher. I want to give you a hug right now! 😊 Thank you for making it so easy to follow through.
Solid tut. Thank you.
Thank you so much Dr Martin. As a beginner, the way you explain R programming makes me love the language more and makes coding easier to deal with. Keep up with more great videos.
Thank you for the feedback. Glad you enjoyed it! You got this!
Thumbs up for you Greg!
Excellent lectures and a good lecturer also
this tutorial is Excellent! thank you!
Hi there, Greg! Thanks a lot for these videos, love your style. I've learned A LOT!
Great to hear, Yamila! Thank you for the feedback!
You sir have an incredible voice for teaching. Glad I found your channel
Glad to hear it! Thank you!
Your channel is a life saver man. Thank you
Out of all the online classes and videos I have done, I WISH I STARTED WITH THIS ONE!! Thank you!
Glad it was helpful! Thanks for your amazing feedback
Much appreciated. Amazing skills in such a simplified way thank you.
Glad it was helpful! Thanks!
Thank you so much! love all your videos! simple and straightforward! with detailed explanation.
I'm so glad to have you as a subscriber! Thank you for being a part of this community.
You Sir, are a legend!
You really make programming seem "easy-peasy lemon squeezy", keep it up!
Thanks for watching!!! I appreciate your feedback!
Truly helpful! Amazing video for tidyverse
Glad it was helpful!
best practice lecture for R
Doc Martin!
As usual an excellent video! I have one question regarding the recoding. In case of binary cases is it not typical to do 0 and 1 in order to be able to some stats on the data?
Thanks and Happy Holidays!
oh my god man you are a godsent!!!!
I've been learning R in the google data course from Coursera and they don't teach much.
Thank you!! 👑
Bravo, this content was very good
Thank you so much. Your videos are very helpful.
Glad you like them!
Thank you so much!
GREAT VIDEO
Thank you for this. Very helpful.
You're very welcome! Glad it was helpful.
Perfect lecture ! benefited a lot.
Glad it was helpful! Thank you :)
Excellente~! Thanks -
Thanks. I really appreciate your video
You are welcome! Thank you!
Love your videos, man - they're clear and concise, easy to follow! Would you be open to creating content based on the Google Data Analytics Cert?
One of the case studies they have is about a fictional bike study called Cyclistic.
And there is only one person on YT who does it in R (Caribou Data Science) but it's not as seamless or clear as you make your videos out to be! :)
Love this. I didn't know there was an in-house R dataset for star wars.
Yeah - I love it!
Such good videos for learning R programming and such a nice series. When is the next episode about manipulating your data coming out? Can't wait for it!
haha - any day now... (perhaps tonight)
Thank you very very very much!! Really appreciate your wonderful video. 👍👍👍
You are very welcome. Glad you enjoyed it!
Thank you very much! Very helpful
Glad it helped! Thanks for watching!
Appreciate your helpful videos
Happy to help! Thank you.
Thank you very much Sir. This was quite easy to understand as a beginner. Would you kindly maybe make a series of these videos for us beginners because you have many videos and we wouldn't know which video to watch after this one. I hope that makes sense. Anyway thank you a lot for these videos
I will try my best
Great content.
Recode is superseded… we need a new video on this topic 🙂
love it
You explained it thoroughly ❤❤
Glad it was helpful!
Thank you!
You're welcome!
Thank you so much, you're such a great help! please show us how to create Dashboards via Shiny. Thanks a lot.
great! thank you very much
Thanks for the great feedback- Much appreciated !!
Thanks a lot.
Most welcome!
Thank you for the video sir
Most welcome! Thanks for watching.
Thanks much!
You're welcome!
super thanks
Yo, you are the best programming YouTube channel, bro
Wow I appreciate the kind words. Your support encourages me to create more content that you'll enjoy! Thank you
Pretty handy tips
Glad you think so!
Hi,
Love your videos they are so helpful. Could you do a video on loops in r ?? Thanks!
Thanks for the suggestion. Will do.
Thanks
Just curious, is there any reason why you haven't enabled coloured brackets? As per the last update, it would make the code easier to read
When you start to explain how to find complete and incomplete cases at 16:09, what do you do if you want to find incomplete cases for the entire dataset? Would you just omit the "select" portion of the code?
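One possible way to do this, as a sketch rather than the video's own code: drop the select step and apply complete.cases to the whole data frame. The only extra step assumed here is removing the list columns first (starwars has a few, such as films), since complete.cases is meant for ordinary vector columns. The object names sw, incomplete and complete are illustrative.

```r
library(dplyr)

# Keep only the ordinary (non-list) columns of starwars.
sw <- starwars %>% select(!where(is.list))

# Rows with at least one NA in any remaining column.
incomplete <- sw %>% filter(!complete.cases(.))

# Rows with no NA at all.
complete <- sw %>% filter(complete.cases(.))

nrow(incomplete)
nrow(complete)
```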
Got it!
Thank you so much for watching and leaving a comment! I appreciate your support.
Could you do a video on data management using R please
Really excellent PT.
Then how is R associated with Python algorithms?
Tx sir
Welcome!!
Thank you for this amazing resource. Very helpful for someone like myself who is learning R without any meaningful stats experience aside from a semester at uni.
Is anyone learning along able to share their experience of using the mutate and recode functions. I haven't had any success using this whilst following along with this video and a previous one when trying to recode the gender to M and F, or 1 and 2. I've had to work around using :
starwars %>%
select(name,gender) %>%
mutate(gender=if_else(gender=="masculine", "1", "2"))
But I'd really like to know what I'm doing wrong using recode as I think my code looks the same as Greg's!
starwars %>%
select(name,gender) %>%
mutate(gender=recode(gender, "masculine"=1, "feminine"=2))
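A hedged suggestion, since the exact error isn't shown: if another loaded package also defines a function called recode, R may be picking that one up instead of dplyr's. Two things that might be worth trying are calling dplyr::recode explicitly, or sidestepping recode with case_when. The column name gender_coded and the object names below are illustrative only.

```r
library(dplyr)

# Option 1: name the package explicitly so dplyr's recode is used.
coded_recode <- starwars %>%
  select(name, gender) %>%
  mutate(gender_coded = dplyr::recode(gender, "masculine" = 1, "feminine" = 2))

# Option 2: case_when; values matching no condition become NA.
coded_case <- starwars %>%
  select(name, gender) %>%
  mutate(gender_coded = case_when(
    gender == "masculine" ~ 1,
    gender == "feminine"  ~ 2
  ))
```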
Hello Sir, can you please make a video about the package shiny on medical data?
Thanks a lot, great videos !!
Can anybody help me with how to disaggregate data that exists in the same column? i.e. in this example lets say you wanted to have a second column(or new variable) for secondary hair color for those values which contain a primary then a secondary hair color i.e "brown, grey", "auburn, white" etc. I am actually working on a file which contains addresses and in many cases the apartment number is not actually separated into another column. However, it should be for the import into the database I am working on and therefore I want to try to create logic to clean and disaggregate these pieces into separate columns. Any help would be greatly appreciated.
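For the hair colour case, one approach that may work is tidyr's separate, which splits a column on a separator. This is only a sketch: the new column names hair_primary and hair_secondary are made up for illustration, and an address column would likely need a different separator or pattern than the comma used here.

```r
library(dplyr)
library(tidyr)

hair <- starwars %>%
  select(name, hair_color) %>%
  separate(hair_color,
           into = c("hair_primary", "hair_secondary"),
           sep  = ",\\s*",    # split on a comma plus any following spaces
           fill = "right")    # single-colour rows get NA in hair_secondary

# Look at the rows that actually had a secondary colour.
hair %>% filter(!is.na(hair_secondary))
```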
Thanks so much for the engaging yet very useful video! Can i ask why the following chunk of code is not able to recode missing value to the assigned value? starwars %>%
select(name, gender) %>% mutate(gender2= if_else(gender=="masculine",1,if_else(is.na(gender),3,2))). Thanks a lot in advance!
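A likely explanation: when gender is NA, the test gender == "masculine" is itself NA, and if_else returns NA for an NA condition, so the inner if_else result is never used for those rows. A sketch of two ways around it, either testing for NA first with case_when or using if_else's missing argument (the object names coded and coded2 are illustrative):

```r
library(dplyr)

# Option 1: check for NA before comparing values.
coded <- starwars %>%
  select(name, gender) %>%
  mutate(gender2 = case_when(
    is.na(gender)         ~ 3,
    gender == "masculine" ~ 1,
    TRUE                  ~ 2
  ))

# Option 2: tell if_else what to return when the condition is NA.
coded2 <- starwars %>%
  select(name, gender) %>%
  mutate(gender2 = if_else(gender == "masculine", 1, 2, missing = 3))
```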
With replace NA, it just filters the dataset but doesn't change anything in the dataset itself. So the dataset remains uncleaned.
I'm a newbie to R. Is there an open community where people can help you with your work?
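That is how pipes behave in general: a pipeline prints its result but leaves the original data untouched unless the result is assigned to a name. A minimal sketch, assuming tidyr's replace_na and an illustrative object name sw_clean:

```r
library(dplyr)
library(tidyr)

# Assign the result so the cleaned version is kept.
sw_clean <- starwars %>%
  select(name, hair_color) %>%
  mutate(hair_color = replace_na(hair_color, "none"))

sum(is.na(starwars$hair_color))   # the original still has its NAs
sum(is.na(sw_clean$hair_color))   # the saved copy has none
```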
variable types
select and filter
find and deal with missing data
find and deal with duplicates
recode values
What if the integer was a chr data type and you want to change it to an integer or double
How come I am just discovering this channel?
Warning in install.packages("tidyverse") :
'lib = "C:/Program Files/R/R-4.2.3/library"' is not writable
Error in install.packages("tidyverse") : unable to install packages
do you have any insight why I get this message? I'm starting with R for a statistics class and I see you recommend this package? my laptop is archaic...
I'm having a problem where I want to mutate two variables with values 0, 1 and NA into a new variable with the sum of 0 and 1. However, R in my case counts NA as 0. Is there an easy fix to this, to exclude the NA?
Hello can you make playlist about OOP in R ?
How can you convert data type using mutate in dplyr?
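A short sketch of one way to do it: wrap the column in a conversion function such as as.integer, as.numeric or as.character inside mutate. The small data frame df below is made up purely for illustration.

```r
library(dplyr)

# Hypothetical data where numbers were read in as text.
df <- tibble(id = c("1", "2", "3"), score = c("4.5", "6", NA))

converted <- df %>%
  mutate(id    = as.integer(id),     # chr -> int
         score = as.numeric(score))  # chr -> dbl, NA stays NA

glimpse(converted)
```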
what is the difference between a data frame and a tibble
And how do you undo a command that you had already executed?
I can't get the line to work: filter(hair_color %in% c(“blond”, “brown”) & height < 180)
Error: unexpected symbol in " filter(hair_color c"
> height < 180)
same here, showing ERROR in 'filter()'
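One thing worth checking, going only by the line as pasted here: the quotes around blond and brown are curly "smart" quotes, which R does not accept as string delimiters, and the %in% operator needs to be typed in full. A sketch of the same filter with straight quotes (the select step and the name result are added for illustration):

```r
library(dplyr)

result <- starwars %>%
  select(name, hair_color, height) %>%
  filter(hair_color %in% c("blond", "brown") & height < 180)

result
```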
If you are here from Tom's Bayesian stats class, give this video a thumbs up!
nobody seemed to be. But here's a thumbs up (I'm here cause of stats class too lol)
Super duper easier in SPSS.
22:22 There were 5 missing hair_color values. 3 of them were droids, so they don't have hair, and you can change those to none. But the other 2 weren't droids; they may have some hair, but that data is missing. Why didn't you remove those two rows?
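If you want to treat the two groups differently, one option is to recode only the droids and leave the other missing values as NA. This is a sketch rather than what the video does; it assumes the species column labels droids as "Droid", and sw_hair is an illustrative name.

```r
library(dplyr)

sw_hair <- starwars %>%
  select(name, species, hair_color) %>%
  mutate(hair_color = if_else(is.na(hair_color) & species %in% "Droid",
                              "none",        # droids: no hair
                              hair_color))   # everyone else: unchanged

# Rows where hair colour is still genuinely unknown.
sw_hair %>% filter(is.na(hair_color))
```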
Never thought Jason Statham is an "R" Expert
I am just getting to know you. You sound so interesting
Just a heads up to everyone, at the end of the vid when you're doing the recode bit, you might hit this error:
Error: Problem with `mutate()` column `gender`.
i `gender = recode(gender, masculine = 1, feminine = 2)`.
x unused arguments (masculine = 1, feminine = 2)
if you get this error, force R to use dplyr's version of recode like this:
starwars %>%
select(name, gender) %>%
mutate(gender_coded = dplyr::recode(gender, "masculine"=1, "feminine"=2))
I'm not sure why I had to do this, as I had already run the library(tidyverse) command, but replacing recode with dplyr::recode sorted it out.
Figured it out. I also have the "car" package (for the Variance Inflation Factor function "vif" that I'm also using) in my project, so there was a name collision and it was taking recode from car (Companion to Applied Regression) rather than from dplyr.