Imbalanced Data in Machine Learning | Undersampling | Oversampling | SMOTE

  • Added on 27 June 2024
  • Imbalanced data refers to datasets in which the class distribution is heavily skewed, with one class significantly outnumbering the others. Handling the imbalance matters because a model trained on such data tends to be biased toward the majority class and to perform poorly on minority classes. This video covers four ways to address class imbalance: undersampling (reducing the number of majority-class samples), oversampling (increasing the number of minority-class samples), SMOTE (Synthetic Minority Over-sampling Technique), and ensemble methods (combining multiple models). These techniques help mitigate bias and improve predictive performance on minority classes.
    Code - colab.research.google.com/dri...
    ⌚Time Stamps⌚
    00:00 - Intro
    00:54 - What is Imbalanced Data?
    04:10 - Problems with Imbalanced Data
    08:00 - Imbalanced Data Demo
    11:13 - Why is studying imbalanced data important?
    16:58 - Undersampling
    25:56 - Oversampling
    31:06 - SMOTE
    42:43 - Ensemble Learning
    47:06 - Cost-Sensitive Learning
    51:30 - Other techniques
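
The three resampling ideas named in the description can be sketched in a few lines of plain Python. This is only a minimal illustration on made-up 2-D toy data, not the code from the video's Colab notebook; the function names (`random_undersample`, `random_oversample`, `smote_like`) are hypothetical, and the SMOTE part shows only the core interpolation step rather than a full library implementation.

```python
import random

def random_undersample(majority, minority, rng):
    """Drop majority rows at random until both classes have the same size."""
    return rng.sample(majority, len(minority)), list(minority)

def random_oversample(majority, minority, rng):
    """Duplicate minority rows (sampling with replacement) up to the majority size."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra

def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic points, each interpolated between a minority
    point and one of its k nearest minority neighbours (the core SMOTE idea)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the chosen point, excluding itself
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic

rng = random.Random(0)
# Toy data (an assumption): 95 majority points near (0, 0), 5 minority points near (3, 3).
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(95)]
minority = [(rng.gauss(3, 0.5), rng.gauss(3, 0.5)) for _ in range(5)]

under_maj, under_min = random_undersample(majority, minority, rng)
over_maj, over_min = random_oversample(majority, minority, rng)
smote_min = minority + smote_like(minority, n_new=90, k=3, rng=rng)

print(len(under_maj), len(under_min))  # 5 5
print(len(over_maj), len(over_min))    # 95 95
print(len(majority), len(smote_min))   # 95 95
```

Undersampling discards information from the majority class, and plain oversampling repeats the same minority rows, whereas the SMOTE-style step adds new points that lie between existing minority points. In practice, resampling of this kind is normally applied to the training split only, so that the test split keeps the original class ratio.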

Comments • 40

  • @campusx-official 1 month ago +20

    I had to reupload this video because I forgot to include the part on ensemble techniques due to an editing error in the previous upload. Check timestamps.

    • @Ashishkumar-id1nn 1 month ago

      Sir, please make a video on the difference between encoding and embedding

    • @mohitjoshi8984 1 month ago

      Sir, please make a video on A/B testing.

    • @AMANRAJ-dt8gu 1 month ago

      I am writing to request your assistance in creating videos that delve into metaheuristic approaches, such as genetic algorithms, ant colony optimization, and others. It has come to my attention that there is a noticeable scarcity of resources covering these topics on platforms like YouTube.

    • @sagarbp-2854 1 month ago

      Sir, make a video about A/B testing.

  • @advaitdanade7538 1 month ago +3

    Thank you, sir, for the best series on YouTube. I just completed it in 2 months by watching 4 hours daily at 1.5x speed.

  • @shripaddeshpande5766 1 month ago +1

    Another fantastic video by Nitish! Wonderful!!!

  • @manikarnikatiwari199 1 month ago +2

    Thank you so much, Nitish 😊 You are the best in everything 🎉 Thanks for being my teacher 😊🙏

  • @user-mg5fk7mf5c 1 month ago

    I understood everything sir
    Thank you so much
    You are the best

  • @vinayakvijay108 1 month ago

    Awesome Content

  • @divyakarlapudi 1 month ago

    Thank you so much for this video, very helpful, sir 🤌

  • @balrajprajesh6473 1 month ago

    Thank you very much sir

  • @nsbipritam9682 8 days ago

    very helpful video

  • @Sulehri226 1 month ago

    Thanks Sir

  • @mukeshrajpurohit5593 1 month ago

    Hi Sir,
    Big fan!!
    I was searching for a video on class imbalance, and you uploaded this at just the right time.
    I am training an ANN model for customer churn prediction, and my dataset has a 96:4 class imbalance. I have tried upsampling, downsampling, SMOTE, SMOTE-ENN, and class weights, but none of them gave promising results: the model still fails to predict the minority class well, and recall is very low. What should be done in such a case, where the model is not predicting well on the minority class? I also trained an XGBoost classifier, but that model did not perform well either.
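
As a rough illustration of why a 96:4 split like the one described above is hard to evaluate, the sketch below compares two hypothetical churn models on 1,000 made-up customers. All counts other than the 96:4 ratio are assumptions chosen for the example, and the helper names are not from any library.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

def scores(y_true, y_pred):
    """Return (accuracy, precision, recall); undefined ratios are reported as 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall

# 1,000 customers at the 96:4 ratio: 960 stay (label 0), 40 churn (label 1).
y_true = [0] * 960 + [1] * 40

# Model A (hypothetical): always predicts the majority class.
always_majority = [0] * 1000

# Model B (hypothetical): raises 100 false alarms but catches 30 of the 40 churners.
catches_churn = [1] * 100 + [0] * 860 + [1] * 30 + [0] * 10

base_acc, base_prec, base_rec = scores(y_true, always_majority)
alt_acc, alt_prec, alt_rec = scores(y_true, catches_churn)

print(f"A: accuracy={base_acc:.2f} recall={base_rec:.2f}")  # A: accuracy=0.96 recall=0.00
print(f"B: accuracy={alt_acc:.2f} recall={alt_rec:.2f}")    # B: accuracy=0.89 recall=0.75
```

Model A scores higher on accuracy while finding no churners at all, which is why precision and recall on the minority class are usually the numbers to track at this ratio.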

  • @himanshurathod4086 1 month ago +1

    Please continue your LLM/transformers series, and also please upload videos on NLP NER and topic modeling.

  • @wamiqmushtaq2825 1 month ago +1

    Sir, please do a session on cross-validation. There's no separate video on cross-validation in the ML playlist.

  • @not_amanullah 1 month ago

    Thanks

  • @chandrimapramanick1111 1 month ago

    Sir, I truly admire your work and love all of your videos; I am learning so much from them. Thank you!!!
    I have one question: at the end of the video you said that in spam filtering a false positive is the critical error. But if a message is spam and is classified as not spam (a false negative), isn't that the critical case? False negatives are generally considered more dangerous here because they can expose the recipient to potential harm.

    • @RajatTomar-r7i 2 days ago

      I think a false positive is more critical because it may send an important email to the spam folder, which is more harmful than letting some spam emails through as if they were important.
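
One way to frame the disagreement above is that the "critical" error depends on the cost assigned to each mistake. The sketch below uses two hypothetical spam filters and two hypothetical cost settings; every number is an assumption made up for illustration, with spam treated as the positive class.

```python
def total_cost(fp, fn, cost_fp, cost_fn):
    """Weighted error cost: fp = real mail sent to spam, fn = spam delivered to the inbox."""
    return fp * cost_fp + fn * cost_fn

# Two hypothetical filters evaluated on the same mailbox.
strict = {"fp": 20, "fn": 5}    # aggressive: blocks more spam, loses more real mail
lenient = {"fp": 2, "fn": 40}   # cautious: rarely loses real mail, lets more spam through

# View 1: losing a real email is 10x worse than seeing one spam message.
view1 = {name: total_cost(f["fp"], f["fn"], cost_fp=10, cost_fn=1)
         for name, f in (("strict", strict), ("lenient", lenient))}

# View 2: a delivered spam message (e.g. phishing) is 10x worse than a lost email.
view2 = {name: total_cost(f["fp"], f["fn"], cost_fp=1, cost_fn=10)
         for name, f in (("strict", strict), ("lenient", lenient))}

print(view1)  # {'strict': 205, 'lenient': 60}
print(view2)  # {'strict': 70, 'lenient': 402}
```

Under the first set of costs the lenient filter is cheaper, and under the second the strict filter is, so both commenters can be right for different cost assumptions.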

  • @haroonmalik2195 1 month ago

    Sir, also make a video on multi-label classification problems.

  • @uditbhandari5791 1 month ago +1

    Sir, when will you start a new batch for DSMP?

  • @tusharshukla9361 1 month ago

    Nitish sir, please update your Machine Learning Roadmap and add links to your new videos. (We want more and more of your videos.)

  • @parth.mandaliya 1 month ago +2

    Please make a new video on transformers 🙏

  • @soumyaranjandas7394 1 month ago

    Dear Nitish sir, please make a video on how to fine-tune a Llama LLM on our custom data.

  • @souvik5560 1 month ago

    Nitish: at 7:00, it should be "testing data" for determining the accuracy. Am I correct?

  • @muhammadikram375 1 month ago +1

    Sir, please do some work on the MLOps playlist.

  • @bhushansonawane5915 9 days ago

    Hello sir, how can I connect with you? I need urgent help, please.

  • @pujarameet9699 7 days ago

    Is this series complete, or is anything remaining, sir?

  • @not_amanullah 1 month ago

    🖤

  • @user-vj3nx7sh8r 20 days ago

    By the time I reached the end of the playlist, it felt like you had gone from young to old.

  • @anandshaw-ie3qk 1 month ago

    It's better

  • @mohitnemade5320 1 month ago

    Nitish bhai, your knowledge is perfect, but the videos are so long that we can't watch them all the way through even when we want to. Please try to make shorter videos 🙏🤝👍

  • @AbhishekGupta-te3fe 1 month ago +1

    Everyone is talking about LLMs, and you are still stuck on machine learning.

    • @abhinavkale4632 1 month ago +1

      Brother, Nitish sir is covering LLM videos too. To us, these concepts are still gold, and they are used everywhere.

    • @AbhishekGupta-te3fe 1 month ago

      @abhinavkale4632 Brother, I have all of sir's videos on my laptop; so far, the videos have only covered the history of LLMs.

    • @omsaikommawar 1 month ago

      From an interviewer's perspective, an imbalanced dataset is a common topic in interviews. Focusing on simple topics can increase your chances of success in cracking the interview.

  • @AbhishekGupta-te3fe 1 month ago

    Sir, you are far behind.