I had been searching for hours for how to do oversampling in R, and your explanation was the only one that worked for me, haha! Thank you very much, you saved my semester!
Thanks. Love how simple you made it look. It's great to have a base R explanation before using a package to do the same.
Thanks once again Mario - your videos are always concise!
Amazing tutorial. Great job. You gained a subscriber! Thanks for the content and keep it up!
Thank you! This helped me in my econometrics class. From Brazil!
Your videos are extremely useful! ❤
Very helpful, thanks!
Thank you very much for the hands-on tutorial on the binary class imbalance problem. Could you please also do a video tutorial on handling class imbalance in a multiclass problem, where there are 5 or 6 classes to identify and the data is not balanced among them? Thanks again for the video.
Thank you so much, you are the best in the neighborhood!
This was very helpful
Thank you very much, this has been very helpful. I have a question, though: in your opinion, which one is better, and why?
It depends on the amount of data available. If you have thousands of observations and the imbalance is less extreme than 20%/80%, do nothing. Otherwise, try everything and compare the results to see the impact of the imbalance. Sometimes the right answer is to learn more about your data. But there's no clear-cut answer here...
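A quick way to check both conditions from the reply above (sample size and how extreme the imbalance is) in base R. This is only a sketch: the label vector `y` is made-up toy data, and the cutoffs of 1000 rows and a 20% minority share are one reading of "thousands of observations" and "20%/80%", not fixed rules.

```r
# Toy labels standing in for a real outcome column (hypothetical data).
y <- factor(c(rep("no", 850), rep("yes", 150)))

n_obs          <- length(y)
class_share    <- prop.table(table(y))  # share of each class
minority_share <- min(class_share)      # share of the rarest class

# Assumed cutoffs, taken loosely from the rule of thumb in the reply.
enough_data    <- n_obs >= 1000
severe         <- minority_share <= 0.20
leave_as_is    <- enough_data && !severe

print(class_share)
print(leave_as_is)
```

With these toy labels the minority class is 15% of the data, so the rule of thumb would suggest trying resampling and comparing results.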
Isn't sampling essentially changing the original dataset? If we predict the Survived class on these sampled data, would that mean anything for the original dataset?
It's not exactly changing the dataset, only the proportion of some observations in it. It's not guaranteed to work, but it does most of the time.
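One way to see what "changing the proportion" means in practice is to oversample only the training rows and leave a held-out test set with the original proportions, so predictions are still judged against data that looks like the original. The sketch below uses base R and a made-up data frame; the `Survived` and `Age` columns and the 70/30 split are assumptions for illustration, not the exact steps from the video.

```r
set.seed(42)

# Toy data standing in for the real dataset (hypothetical values).
df <- data.frame(
  Survived = factor(c(rep("no", 80), rep("yes", 20))),
  Age      = round(runif(100, 1, 80))
)

# Hold out a test set BEFORE resampling so it keeps the original proportions.
test_idx <- sample(nrow(df), 30)
test     <- df[test_idx, ]
train    <- df[-test_idx, ]

# Oversample the minority class in the training set only.
counts   <- table(train$Survived)
minority <- names(counts)[which.min(counts)]
n_extra  <- max(counts) - min(counts)
min_rows <- which(train$Survived == minority)

# Index through sample.int() to avoid sample()'s special case for a single number.
extra     <- min_rows[sample.int(length(min_rows), n_extra, replace = TRUE)]
train_bal <- rbind(train, train[extra, ])

print(table(train_bal$Survived))  # classes are now equal in the training set
print(table(test$Survived))       # the test set was not resampled
```

The extra rows are exact copies of existing minority rows, so no new information is created; only the class weights seen by the model change.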
I have multiple classes to resample and the resulting class counts are not equal. What should I do?
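For the multiclass case, one possible approach is to resample every class up to the size of the largest one, class by class, instead of treating the problem as a single minority versus majority split. The helper `oversample_to_max()` below is a hypothetical name written for this sketch (it is not from the video or from any package), and the data frame is toy data.

```r
set.seed(1)

# Toy data with five unevenly sized classes (hypothetical values).
df <- data.frame(
  class = factor(rep(c("a", "b", "c", "d", "e"), times = c(50, 30, 10, 7, 3))),
  x     = rnorm(100)
)

# Hypothetical helper: oversample each class, with replacement,
# until it has as many rows as the largest class.
oversample_to_max <- function(data, label) {
  target <- max(table(data[[label]]))
  groups <- split(seq_len(nrow(data)), data[[label]])
  groups <- groups[lengths(groups) > 0]  # skip factor levels with no rows
  idx <- unlist(lapply(groups, function(rows) {
    n_extra <- target - length(rows)
    extra   <- rows[sample.int(length(rows), n_extra, replace = TRUE)]
    c(rows, extra)  # keep every original row, then add the resampled copies
  }), use.names = FALSE)
  data[idx, , drop = FALSE]
}

balanced <- oversample_to_max(df, "class")
print(table(balanced$class))  # every class now has 50 rows
```

If the counts still come out unequal, it is worth checking whether the resampling was applied per class or only to the smallest class; balancing only one class leaves the others at their original sizes.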
But doing this creates so many duplicates... this is affecting me. I'm getting NAs when running my machine learning code (using an LDA model).