Handling Imbalanced Datasets in Python with Stratified Split, SMOTE and Random Oversampling

  • Added 7 May 2022
  • In this video, we discuss handling imbalanced datasets in a classification context using several sampling techniques in Python.
    We begin with a stratified split so that the training and test sets preserve the class proportions of the full dataset. We then address the imbalance itself with two oversampling techniques: SMOTE, which creates synthetic minority-class observations, and Random Oversampling, which duplicates randomly selected minority-class instances. Both rely on the imbalanced-learn library (imblearn).
    The full Python notebook is available on GitHub at the following link if you want to follow along. github.com/SuperDataWorld/Pyt...
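
The three steps described above can be illustrated with a small from-scratch sketch. This is not the notebook's code and not the imblearn implementation (the video relies on imblearn for SMOTE and Random Oversampling); it uses only the Python standard library so the mechanics are visible. The function names `stratified_split`, `random_oversample` and `smote_like`, the toy data, and the parameters `test_size`, `k` and `seed` are all assumptions made for illustration.

```python
import random
from collections import Counter, defaultdict


def stratified_split(X, y, test_size=0.2, seed=0):
    """Split per class so train and test keep the original class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return ([X[i] for i in train_idx], [y[i] for i in train_idx],
            [X[i] for i in test_idx], [y[i] for i in test_idx])


def random_oversample(X, y, seed=0):
    """Duplicate randomly chosen rows of each smaller class up to the largest class size."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(rows))
            y_out.append(label)
    return X_out, y_out


def squared_distance(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def smote_like(X, y, k=3, seed=0):
    """Simplified SMOTE idea: interpolate between a minority row and a near neighbour."""
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        rows = [x for x, lab in zip(X, y) if lab == label]
        if len(rows) < 2:
            continue  # interpolation needs at least two rows of the class
        for _ in range(target - n):
            base = rng.choice(rows)
            others = [r for r in rows if r is not base]
            neighbours = sorted(others, key=lambda r: squared_distance(base, r))[:k]
            neighbour = rng.choice(neighbours)
            gap = rng.random()  # position along the segment from base to neighbour
            X_out.append([b + gap * (m - b) for b, m in zip(base, neighbour)])
            y_out.append(label)
    return X_out, y_out


# Toy imbalanced data (hypothetical): 90 majority rows, 10 minority rows.
data_rng = random.Random(42)
X = ([[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(90)]
     + [[data_rng.gauss(3, 1), data_rng.gauss(3, 1)] for _ in range(10)])
y = [0] * 90 + [1] * 10

X_train, y_train, X_test, y_test = stratified_split(X, y, test_size=0.2)

# Oversample the training set only, so the test set keeps the real class balance.
X_ros, y_ros = random_oversample(X_train, y_train)
X_sm, y_sm = smote_like(X_train, y_train)

print("train:", Counter(y_train), "test:", Counter(y_test))
print("random oversampling:", Counter(y_ros))
print("SMOTE-like:", Counter(y_sm))
```

With this toy data the split keeps the 9:1 ratio in both sets (72/8 in training, 18/2 in test), and both oversamplers bring the training set to 72/72. In practice the same steps would normally be done with `train_test_split(..., stratify=y)` from scikit-learn and the `SMOTE` and `RandomOverSampler` classes from `imblearn.over_sampling`, each applied with `fit_resample(X_train, y_train)`.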

Comments • 2

  • @VanithaSRA 1 year ago +1

    Good video. My doubt regarding the stratified split and SMOTE techniques is cleared; I was confused before about which one to use. Thanks.