Difference Between fit(), transform(), fit_transform() and predict() methods in Scikit-Learn

  • Added on 11 Apr 2021
  • Hello All,
    iNeuron is launching the Affordable Advanced Deep Learning, OpenCV and NLP (DLCVNLP) course. This batch starts on 17th April, with live sessions from 12:30 pm to 2:30 pm IST on Saturdays and Sundays.
    Prerequisites: Python and basic Machine Learning
    The course fee is INR 3000 + 18% GST.
    Download the syllabus and fill in the form to reserve a seat
    ineuron1.viewpage.co/DLCVNLPAPRIL
    In case of any queries, you can contact the numbers below.
    8788503778
    6260726925
    9538303385
    8660034247
    9880055539
    -------------------------------------------------------------------------------------------------------------------------
    ⭐ Kite is a free AI-powered coding assistant that will help you code faster and smarter. The Kite plugin integrates with all the top editors and IDEs to give you smart completions and documentation while you’re typing. I've been using Kite for a few months and I love it! www.kite.com/get-kite/?...
    All playlists on my channel
    Interview Playlist: • Machine Learning Inter...
    Complete DL Playlist: • Complete Road Map To P...
    Julia Playlist: • Tutorial 1- Introducti...
    Complete ML Playlist : • Complete Machine Learn...
    Complete NLP Playlist: • Natural Language Proce...
    Docker End To End Implementation: • Docker End to End Impl...
    Live stream Playlist: • Pytorch
    Machine Learning Pipelines: • Docker End to End Impl...
    Pytorch Playlist: • Pytorch
    Feature Engineering : • Feature Engineering
    Live Projects : • Live Projects
    Kaggle competition : • Kaggle Competitions
    Mongodb with Python : • MongoDb with Python
    MySQL With Python : • MYSQL Database With Py...
    Deployment Architectures: • Deployment Architectur...
    Amazon sagemaker : • Amazon SageMaker
    Please donate if you want to support the channel through GPay UPID,
    Gpay: krishnaik06@okicici
    Telegram link: t.me/joinchat/N77M7xRvYUd403D...
    Please join as a member in my channel to get additional benefits like materials in Data Science, live streaming for Members and many more
    / @krishnaik06
    Please also subscribe to my other channel
    / @krishnaikhindi
    Connect with me here:
    Twitter: / krishnaik06
    Facebook: / krishnaik06
    instagram: / krishnaik06

Comments • 138

  • @jammerules80
    @jammerules80 3 months ago +3

    Thank you for the clear explanation. I spent $10K to learn ML-AI from UC Berkeley and yet I could not understand this concept before this video. Job well done!

  • @nigamaveena4211
    @nigamaveena4211 3 years ago +34

    I always get confused about fit(), transform() and fit_transform(). Thank you, sir. You are like a saviour to many people like me.

    • @cocgamingstar6990
      @cocgamingstar6990 1 year ago

      I still don't get it.

    • @pratikghute2343
      @pratikghute2343 1 year ago +3

      @cocgamingstar6990 See, bro: we use .fit in two scenarios. The first is scaling, which is part of the data preprocessing step: scaler.fit_transform(xtrain) and scaler.transform(xtest). The second is model training: model.fit(xtrain, ytrain), where fit learns parameters such as the slope and y-intercept.
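The two uses of fit described in this reply can be sketched end to end. This is a minimal illustration, assuming scikit-learn and NumPy are installed; the toy data, the choice of LinearRegression, and all variable names are made up for the example and do not come from the video.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Toy data (hypothetical): y depends exactly linearly on two features.
rng = np.random.default_rng(0)
X = rng.uniform(0, 100, size=(40, 2))
y = 3 * X[:, 0] + 2 * X[:, 1] + 5

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scenario 1: preprocessing. fit learns the mean/std from the train data only.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit + transform on train
X_test_scaled = scaler.transform(X_test)        # transform only on test

# Scenario 2: model training. fit learns the coefficients and intercept.
model = LinearRegression()
model.fit(X_train_scaled, y_train)

# predict applies the learned coefficients to the scaled test features.
y_pred = model.predict(X_test_scaled)
```

Because the toy target is exactly linear, the predictions match y_test up to floating-point error; real data would of course leave some residual.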

  • @mariuszwiesiolek9340
    @mariuszwiesiolek9340 2 years ago +16

    This was fantastic, I really got the essence of not only when, how, and why to use fit(), transform(), fit_transform(), predict() but in the context I was looking for!

  • @survivor9367
    @survivor9367 3 years ago +10

    I was actually searching for this in other videos, as it was not available in your playlist. You just added it. Thank you so much, sir.

  • @alessandrodf888
    @alessandrodf888 3 years ago +8

    Batman, Superman and Krish Naik

  • @ashraf_isb
    @ashraf_isb 1 month ago

    old videos are gold, thanks for this krish

  • @jiaxinxie8794
    @jiaxinxie8794 2 years ago

    This is clear, in-depth, comprehensive, and helpful! Thank you so much!

  • @devinpython5555
    @devinpython5555 3 years ago +17

    Summary of this (intuition):
    For train data: fit creates the formula for every feature in the dataset; transform then applies that formula to the data.
    For test data: the formula already exists, so just transform accordingly.

    • @srujankumar637
      @srujankumar637 2 years ago

      But fit() has not been applied to the test data, so how does transform() produce values?
      The mean (mu) and std. dev. have to be calculated for the test data using fit().

    • @srujankumar637
      @srujankumar637 2 years ago

      Or is the data distribution almost the same for train and test, so the mean and std. dev. are the same for both?
      And once fit() has computed the values on the train data, are those values reused to transform the test data?
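The question in this thread (how transform can work on test data that was never fitted) may be easier to see with a hand-written scaler. The sketch below uses a hypothetical MiniStandardScaler class written only for illustration; it is not scikit-learn's implementation, although scikit-learn's StandardScaler similarly keeps its learned statistics in the mean_ and scale_ attributes after fit.

```python
from statistics import fmean, pstdev


class MiniStandardScaler:
    """Hypothetical single-feature scaler, for illustration only."""

    def fit(self, values):
        # fit only computes and stores the statistics; it changes no data.
        self.mean_ = fmean(values)
        self.std_ = pstdev(values)  # population std, as StandardScaler uses
        return self

    def transform(self, values):
        # transform reuses whatever statistics fit stored earlier.
        return [(v - self.mean_) / self.std_ for v in values]

    def fit_transform(self, values):
        return self.fit(values).transform(values)


train = [10.0, 20.0, 30.0, 40.0]
test = [25.0, 50.0]

scaler = MiniStandardScaler()
train_scaled = scaler.fit_transform(train)  # mean and std learned here
test_scaled = scaler.transform(test)        # no new statistics computed
```

The test value 25.0 equals the stored train mean, so it maps to 0.0. The test data never needs its own mu and sigma because the object remembers the ones learned from the train data.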

  • @gomathic9557
    @gomathic9557 3 years ago +4

    This is a very useful video, Krish. Now I have a clear idea of fit and transform. Thanks for sharing this useful information.

  • @rohitkatkar3736
    @rohitkatkar3736 2 years ago

    I was not clear about why we use fit_transform for train data and only transform for test. Now I understand this concept.
    Thank you!!!

  • @abdulrahiman8111
    @abdulrahiman8111 2 years ago +1

    Hi Krish. Your video was really informative and helped me understand the requirement as well as the difference between fit(), transform(), fit_transform() very well. Thank you

  • @harikrishna-harrypth
    @harikrishna-harrypth 3 years ago +1

    Krish, you are a LEGEND!!!!!!!!!!!! Thanks much for making these enlightening tutorials!!!!!!!

  • @saurabhbarasiya4721
    @saurabhbarasiya4721 3 years ago +3

    Thanks for this

  • @optimusagha3553
    @optimusagha3553 2 years ago

    Simple and straightforward! Thanks!!👏

  • @notmimul
    @notmimul 2 years ago +1

    God bless you!!! Your videos make everything simple.

  • @Annisa-yc5zp
    @Annisa-yc5zp 2 years ago

    Hello, you're such a good teacher! This helped me a lot. Thank you!

  • @biswajitnayaak
    @biswajitnayaak 2 years ago

    Crystal clear and detailed. Awesome. Keep it up.

  • @jamalhasanzakarneh9837
    @jamalhasanzakarneh9837 3 years ago +5

    Thank you Krish; it is just another beautiful video of your very helpful videos

  • @akshayg.p8201
    @akshayg.p8201 1 year ago

    Krish sir, I got a much better idea of the fit(), transform(), fit_transform() and predict() methods. Thanks a lot.

  • @01kumarr
    @01kumarr 2 years ago

    In just one shot you cleared all my doubts. Thanks 👍

  • @AromonChannel
    @AromonChannel 3 years ago +1

    Thank you so much krish naik! i've been trying to understand this and you explain it in the very easy way, so we can easily understand it, thank you!!!!!!!!!!!!!!!!!!!!!

  • @parikshithh4991
    @parikshithh4991 3 years ago +3

    Beautifully Explained

  • @nguyenthituyetnhung1780
    @nguyenthituyetnhung1780 9 months ago +1

    Such a clear explanation that even I can understand the machine learning process. Thanks a lot.

  • @eitanshirman9072
    @eitanshirman9072 2 years ago +1

    Thank you so much for such a brilliant explanation!

  • @soajack
    @soajack 3 years ago +1

    Clearly Explained ! Thanks a lot !!!

  • @shaelanderchauhan1963
    @shaelanderchauhan1963 3 years ago +1

    Thanks a lot you have contributed a lot to this community

  • @suvamgupta2914
    @suvamgupta2914 2 years ago +2

    Hats off sir!! Your explanation is of God level 💯 Thank you sir ❤️

  • @doumansarouei6994
    @doumansarouei6994 2 years ago

    Thank you so much for all of the valuable content you shared!

  • @zaindeen4490
    @zaindeen4490 2 months ago

    Thank you so much krish sir. It was quite informative!
    I was searching for this kind of video but wasn't able to find it
    Thanks for all of your great efforts ❤

  • @shashikarathnasinghe1241
    @shashikarathnasinghe1241 2 years ago +1

    Thank you so much. I was struggling to understand this concept. Super well explained.

  • @theforester_
    @theforester_ 2 years ago

    awesome video thanks very much. big shout out to all indians out there helping out the world. big greetings from brazil

  • @gangeshwarinetam413
    @gangeshwarinetam413 2 years ago

    Thank you, sir. I always got confused, but now it's clear. Thank you so much.

  • @aishwaryanarkar2954
    @aishwaryanarkar2954 3 years ago +5

    THANK YOU KRISH AMAZINGGGG BLESSINGS TO YOU

  • @saheedajayi7352
    @saheedajayi7352 2 years ago

    Again, Thank you Krish, well explained.

  • @bhargavikoti4208
    @bhargavikoti4208 3 years ago +4

    Finally😁..Thanks for uploading

  • @Harshpatel-uw2dw
    @Harshpatel-uw2dw 4 months ago

    It's an amazing video. I came away with a great understanding, and the concept was very easy to follow. Thank you, sir.

  • @jlxip
    @jlxip 3 years ago +1

    Thank you so much, this helped me a lot :)

  • @bkpusprajkumar8744
    @bkpusprajkumar8744 3 years ago +2

    Thank you so much, sir for this lecture.

  • @sabaamanollahi5901
    @sabaamanollahi5901 2 years ago

    Excellent Explanation !!!!

  • @akshitagoel3099
    @akshitagoel3099 2 years ago

    Well Explained. Really very informative. Thankyou so much :)

  • @bluejadoo6912
    @bluejadoo6912 1 year ago

    thank you for clearing my doubts sir

  • @robertoespinoza199
    @robertoespinoza199 3 years ago +1

    thanks so much for the value of your videos 💯💯

  • @darshanayenkar
    @darshanayenkar 1 year ago

    you have cleared my concept

  • @KiranGunda-ph7df
    @KiranGunda-ph7df 2 months ago

    Superbbb explanation brother...

  • @ajayjaadu42
    @ajayjaadu42 1 year ago +1

    Sir, you explain so well. Thank you for this.

  • @rishabhkumar-qs3jb
    @rishabhkumar-qs3jb 3 years ago +2

    Awesome explanation :)

  • @ajaykuruba1738
    @ajaykuruba1738 3 years ago +4

    Hi Krish
    It would be really helpful if you create a playlist on tensorflow serving and tensorflow lite.

  • @BytemeMaybe
    @BytemeMaybe 7 months ago

    amazing explanation, thx bro

  • @heliyahasani6859
    @heliyahasani6859 2 years ago +1

    I love you man you are a game changer god bless you please load more videos

  • @ayaansk99
    @ayaansk99 2 years ago

    Its very helpful video sir
    Thanks for guiding

  • @oyesinghji7910
    @oyesinghji7910 2 years ago +1

    Hi Krish, can you make a full video on the complete deployment process, including all the steps?

  • @pavanviswanadhapalli3512

    Compared to all other channels, your classes are so detailed and very understandable.
    So, sir, can you please make a complete video on PCA? Please, sir.

  • @muhammadzeerakkhan6300
    @muhammadzeerakkhan6300 3 years ago +1

    Great explanation and intuition (Y)

  • @naeymaislamph.d9976
    @naeymaislamph.d9976 2 years ago

    Excellent!

  • @hiral9591
    @hiral9591 2 years ago

    It's amazing👍

  • @shivamshinde9810
    @shivamshinde9810 2 years ago +1

    very helpful!! Thanks!!

  • @pranaypakhale
    @pranaypakhale 3 years ago +3

    Can you please make a video on the different types of transformation, viz. StandardScaler, MinMaxScaler, etc., and when to use which?

  • @louerleseigneur4532
    @louerleseigneur4532 3 years ago

    Thanks Krish

  • @vanditha07
    @vanditha07 3 years ago

    Thank you so much!!

  • @1111Shahad
    @1111Shahad 2 years ago

    Thank you Krish

  • @sanyamsharma3940
    @sanyamsharma3940 2 years ago

    You are amazing !

  • @rkkcode
    @rkkcode 2 years ago

    Thank you .

  • @mangkhongsai9029
    @mangkhongsai9029 1 year ago

    Thank you so much...

  • @anirbanc88
    @anirbanc88 1 year ago

    superb

  • @adipurnomo5683
    @adipurnomo5683 2 years ago +1

    Classifiers that rely on distances usually need the dataset normalized before it is fed to the model.
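A small worked example may show why distance-based classifiers care about scaling. The sketch below is plain Python with made-up samples (age in years, income in rupees); the min_max_scale helper is hypothetical and only mimics the idea behind min-max normalization.

```python
import math

# Hypothetical samples: (age in years, income in rupees).
a = (25, 50_000)
b = (60, 51_000)  # very different age, similar income
c = (26, 58_000)  # similar age, different income


def euclidean(p, q):
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))


# On raw values the income column dominates the distance.
raw_ab = euclidean(a, b)
raw_ac = euclidean(a, c)


def min_max_scale(points):
    """Rescale every column to the range [0, 1]."""
    columns = list(zip(*points))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        tuple((v - lo) / (hi - lo) for v, lo, hi in zip(p, lows, highs))
        for p in points
    ]


sa, sb, sc = min_max_scale([a, b, c])
scaled_ab = euclidean(sa, sb)
scaled_ac = euclidean(sa, sc)
```

On the raw values, b looks far closer to a than c does, purely because income is measured in much larger numbers than age. After scaling, the two distances are nearly equal, so both features contribute comparably.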

  • @tonnysaha7676
    @tonnysaha7676 3 years ago +1

    Thank you very much sir🙏

  • @arshad1781
    @arshad1781 2 years ago

    Nice 👍

  • @aryankaushik3761
    @aryankaushik3761 2 years ago

    Thank you sir

  • @kumarprince5054
    @kumarprince5054 3 years ago

    Thanks

  • @tatendaVIDZ90
    @tatendaVIDZ90 2 years ago

    this is beautiful

  • @milliesadie486
    @milliesadie486 1 year ago

    thank you

  • @priyadarshanr9950
    @priyadarshanr9950 2 years ago +2

    Sir, I understand that it puts the test data in the same format as the train data, but how does the transform function help overcome overfitting?

  • @adityasharma5876
    @adityasharma5876 3 years ago +1

    Hi Krish, please make a video on the difference between map(), flat_map() and apply() in tf.data.Dataset.

  • @preetamchakravarty
    @preetamchakravarty 1 year ago

    Can you state the screen recording software and the settings you have used for this recording? Thank you.

  • @paulkang2806
    @paulkang2806 2 years ago +1

    If you fit the scalers and normalizers (mean, stdev) on the training data and then apply them to the test data, isn't that related to data leakage?

  • @krishnabhadke6161
    @krishnabhadke6161 2 years ago

    nice sir

  • @munawarabbasi9683
    @munawarabbasi9683 1 year ago

    Thanks for making it a complete halwa.

  • @bivasbisht1244
    @bivasbisht1244 1 year ago

    amazing

  • @adipurnomo5683
    @adipurnomo5683 2 years ago +1

    13:46 Sir, what about real-world applications, where we use unseen data instead of test data? Does the unseen data need to be normalized before it is fed to the model?
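One common way to handle unseen data is to keep the fitted scaler alongside the model and call only transform at inference time. The following is a sketch under stated assumptions: scikit-learn and NumPy are installed, the tiny dataset and LogisticRegression are made up for illustration, and pickle is used here only as one possible persistence mechanism (the video does not specify one).

```python
import pickle

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Hypothetical training data: two features on very different scales.
X_train = np.array([[1.0, 200.0], [2.0, 180.0], [8.0, 40.0], [9.0, 30.0]])
y_train = np.array([0, 0, 1, 1])

scaler = StandardScaler().fit(X_train)
model = LogisticRegression().fit(scaler.transform(X_train), y_train)

# Persist both fitted objects together (bytes here; a file in practice).
blob = pickle.dumps({"scaler": scaler, "model": model})

# Later, at inference time, on data the model has never seen:
loaded = pickle.loads(blob)
X_new = np.array([[8.5, 35.0]])
X_new_scaled = loaded["scaler"].transform(X_new)  # transform only, no fit
prediction = loaded["model"].predict(X_new_scaled)
```

The unseen row goes through exactly the same mean and standard deviation that the model saw during training, which is why the scaler has to be saved and not re-fitted.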

  • @kiranvanukuri9382
    @kiranvanukuri9382 3 years ago +1

    Please make a video on image recognition in a Jupyter notebook and on deployment techniques, with an in-depth explanation.

  • @akashkumar-bq7cl
    @akashkumar-bq7cl 3 years ago +1

    Hi Krish, what will happen if I apply fit_transform to my test data as well? What will the outcome be, and why shouldn't we do it? Is it because a new mean and s.d. would be calculated for the test data, whereas we need the same mean, s.d. and formula from the train data applied to the test data as well? Is that the reason we use only transform? I just did not get this part. As for the rest of the video, I'm so happy to get so much content in just half an hour, and for free. God bless you. Please help.
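What re-fitting on the test data would do can be checked with a few numbers. This is a plain-Python sketch with made-up values; the fit and transform helpers are hypothetical stand-ins for a standard scaler, not scikit-learn functions.

```python
from statistics import fmean, pstdev


def fit(values):
    """Return the statistics a standard scaler would learn."""
    return fmean(values), pstdev(values)


def transform(values, mean, std):
    return [(v - mean) / std for v in values]


train = [10.0, 20.0, 30.0, 40.0]  # mean 25
test = [30.0, 40.0, 50.0]         # mean 40

train_mean, train_std = fit(train)

# Recommended: scale the test data with the train statistics.
test_ok = transform(test, train_mean, train_std)

# Mistake: re-fit on the test data (the effect of fit_transform on test).
test_mean, test_std = fit(test)
test_refit = transform(test, test_mean, test_std)

# The raw value 30.0 appears in both sets; compare its scaled versions.
value_in_train = transform([30.0], train_mean, train_std)[0]
```

With train statistics, the raw value 30.0 gets the same scaled value in both sets. After re-fitting on the test set it becomes negative, so the model would receive a different number for the same raw input than the one it was trained on.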

  • @shivu.sonwane4429
    @shivu.sonwane4429 3 years ago +3

    fit_transform is used on the training data, but only transform on testing/new data.
    This applies the same transformation to both sets of data, which creates consistent columns and prevents data leakage. Leakage means learning something from the testing data, which is not allowed.

  • @hanscesa5678
    @hanscesa5678 1 year ago +1

    So where should you use fit(), transform(), fit_transform() during a K-Fold Cross Validation? Before CV or During CV?
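One widely used answer to this question is to put the scaler and the model into a scikit-learn Pipeline and pass the whole pipeline to the cross-validation routine, so that the scaler is re-fitted on the training part of every fold. The sketch below assumes scikit-learn is installed; the iris dataset, LogisticRegression, and the fold settings are arbitrary choices for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# The pipeline calls fit_transform on each fold's training part and
# only transform on that fold's validation part.
pipeline = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(pipeline, X, y, cv=cv)  # one accuracy per fold
```

Scaling the full dataset once before cross-validation would let every validation fold influence the scaler's statistics, which is the leakage the pipeline avoids.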

  • @subhashvarma4551
    @subhashvarma4551 3 years ago +9

    Sir, if we apply the same mean when transforming the test data as in the train data, this may be a case of data leakage, where we leak information from train to test. That might not be preferable in a real-time scenario, as future data should be totally unknown to the train data. We should also perform a fit_transform on the test data in such cases. I need your thoughts on this.

    • @mouleshm210
      @mouleshm210 3 years ago +1

      No, bro. We need to be cautious only about data leakage from test to train data: future data parameters such as the mean or min/max values must not leak during preprocessing. That's why we call only transform() on the test data.

    • @naiduvinay4911
      @naiduvinay4911 2 years ago

      Thank You, understood

  • @manishaundale7458
    @manishaundale7458 2 years ago +1

    If the unique values in the train data and the test data are different, how can we apply a label encoder with fit and transform?
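A possible way to deal with categories that appear only in the test data is sketched below. It assumes scikit-learn 0.24 or later, where OrdinalEncoder accepts handle_unknown="use_encoded_value"; the city names and the choice of -1 as the placeholder code are made up for illustration. LabelEncoder is shown only to demonstrate that its transform raises an error on unseen labels.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

train_cities = np.array(["Delhi", "Mumbai", "Pune"], dtype=object)
test_cities = np.array(["Mumbai", "Chennai"], dtype=object)  # Chennai is new

# LabelEncoder fitted on train rejects labels it has never seen.
label_encoder = LabelEncoder().fit(train_cities)
try:
    label_encoder.transform(test_cities)
    label_encoder_failed = False
except ValueError:
    label_encoder_failed = True

# OrdinalEncoder can map unseen categories to a reserved code instead.
ordinal = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
ordinal.fit(train_cities.reshape(-1, 1))          # fit on train only
encoded_test = ordinal.transform(test_cities.reshape(-1, 1))
```

Here "Mumbai" keeps the code it was given during fit, and "Chennai" is mapped to -1. Whether a placeholder code is acceptable depends on the model; it is one option, not the only one.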

  • @rezapourbahreini4473
    @rezapourbahreini4473 2 years ago

    Thank you for your tutorial. There's one serious issue that I want to address here. As far as I know, we're not allowed to do anything that results in leakage from test data to train data. So when you do a fit_transform on the train data and save the parameters in the scaler, it's okay to scale the test data with that very scaler, but not the other way around! Otherwise the mean and s.d. of the test data would leak into training. The results would then always look better, but only because of the cheating, not because the model really improved. So be careful with the order of steps when scaling train and test data.

    • @swethanandyala
      @swethanandyala 1 year ago

      I feel the same. We have to fit and transform on the test data as well, to avoid data leakage.

    • @omsonawane2848
      @omsonawane2848 1 year ago

      What if we fit on the whole data and then split and transform the train and test data? This way the test data will not depend on the training parameters, and no data leakage will occur.
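Whether fitting on the whole dataset before splitting is harmless can be checked numerically. The plain-Python sketch below uses made-up values in which the held-out rows are much larger than the training rows; it only illustrates how the scaling statistics change and is not taken from the video.

```python
from statistics import fmean, pstdev

train = [10.0, 20.0, 30.0, 40.0]
test = [100.0, 120.0]  # held-out rows with much larger values

# Fit before the split: the statistics also see the test rows.
all_mean = fmean(train + test)
all_std = pstdev(train + test)

# Fit after the split: the statistics come from the train rows only.
train_mean = fmean(train)
train_std = pstdev(train)

leaky_train = [(v - all_mean) / all_std for v in train]
clean_train = [(v - train_mean) / train_std for v in train]
```

With the pre-split statistics, every scaled training value ends up below zero, because the mean was pulled upward by the test rows. The training features therefore already carry information about the held-out data, which is what the leakage concern in this thread refers to.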

  • @gunamrit
    @gunamrit 2 years ago

    magician!

  • @harshagrawal5613
    @harshagrawal5613 3 years ago

    100% clear

  • @khaboninamasemola1970
    @khaboninamasemola1970 2 years ago

    You saved my backside with this video. Thank you.

  • @nagamohan1412
    @nagamohan1412 3 years ago

    Hi Krish, I am Naga Mohan. I want to use data science or data analytics for my father's agricultural land, but I don't know how to start, and I am very confused. I have no data, and I don't know how to create my own data for my farmland. Can you please give me tips on how to start the project and how to create the data? We have 2 acres of paddy land and 2 acres of banana land.

  • @pragavipul1563
    @pragavipul1563 2 years ago

    what is the writing pad you use ?

  • @merveozdas1193
    @merveozdas1193 2 years ago

    Which platform did you use to teach this lesson? You are able to write with your pen so smoothly.

  • @frankdearr2772
    @frankdearr2772 1 year ago +2

    Hi, I understood well what you explained, but could you tell me WHY y_train is not scaled like X_train?
    For me, that is because the values are like false or true. If the y_train values were different, like 10, 5, 41, 5.8, etc., I think I would have to scale y_train?
    Please show me the way on this small question about your video :))
    Thanks for your great video about that topic
    Laurent

    • @kavanadeshpande9690
      @kavanadeshpande9690 1 year ago +1

      Hi, as far as I know, scaling the dependent feature is not necessary for a classification problem, where it has low cardinality. For regression, if we scale the dependent feature, the Mean Squared Error automatically gets scaled as well.

    • @frankdearr2772
      @frankdearr2772 1 year ago

      @kavanadeshpande9690 Thanks, great information. That gives me the right way to go ahead.
      Please have a nice day :)
      Laurent

    • @frankdearr2772
      @frankdearr2772 1 year ago

      @@kavanadeshpande9690 Hi, thanks a lot for your answer.. I understand better now :)
      Please have a nice day
      Laurent
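The remark in this thread that scaling a regression target also scales the Mean Squared Error can be verified with simple arithmetic. The plain-Python sketch below uses made-up targets and predictions, and a hypothetical scale factor of 100.

```python
def mse(y_true, y_pred):
    """Mean squared error of two equally long sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 190.0, 320.0]

scale = 100.0  # hypothetical factor: divide the target by 100

mse_raw = mse(y_true, y_pred)
mse_scaled = mse(
    [v / scale for v in y_true],
    [v / scale for v in y_pred],
)
```

Dividing the target by 100 divides the MSE by 100 squared, so an error reported on a scaled target is not directly comparable with one reported in the original units. For classification, the labels are categories and not magnitudes, which is consistent with the reply above that they are left unscaled.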

  • @RAHUDAS
    @RAHUDAS 2 years ago

    Where should outlier detection go in your data processing chain?

  • @yashub9580
    @yashub9580 1 year ago

    Sir, can you please tell me how to resolve this error: "Deprecated distribution is specified in `adstock__tv_pipe__carryover__strength` of param_distributions. Rejecting this because it may cause unexpected behavior. Please use new distributions such as FloatDistribution etc."

  • @nellitharun8466
    @nellitharun8466 3 years ago +1

    Sir, I am unable to access your GitHub files/code. I am learning Python from 12 April, 10:00 am.

  • @shadmanansari5750
    @shadmanansari5750 1 year ago

    Hi, you mentioned that fit_transform() is applied to the training data and only transform() to the test data. So, in the case of StandardScaler, fit_transform(train) will hold the mean and std dev of the train data, and then we use the same mean and std dev on the test data.
    Shouldn't we apply fit() to the entire data to calculate the mean and standard deviation of the entire data, then transform(train) and transform(test)? Please clarify.

  • @SUMITKUMAR-qi6mz
    @SUMITKUMAR-qi6mz 3 years ago +2

    I have 1 year of experience in customer service at a BPO, but I want to become a data scientist. I'm finding it difficult to get a job in that field because they ask for experience in data science. Please help me with how to present my resume to get a job.

  • @subhamsaha2235
    @subhamsaha2235 3 years ago +1

    Sir, you didn't explain one thing. If we apply fit and transform to X_train, which for StandardScaler means fit (calculating mu and sigma) and then transform (applying the z formula to every value), and ONLY transform to X_test, which means mu and sigma are not calculated, then how does it transform the values? I think fit must also contain something else that is used to teach the model. Kindly clear my doubt. Thank you.

    • @saikiranreddykondapalli279
      @saikiranreddykondapalli279 2 years ago

      While transforming the test data, we are actually using the mu and sigma values of the train data and comparing the predictions on the transformed test data with the actual values (this is what he actually means). But this is wrong: we can't use the mu and sigma values of other data. So it is always better to split only after the whole dataset has been fit and transformed; then it is quite valid to compare the predicted and actual test values.

  • @unmeshmandal3071
    @unmeshmandal3071 2 years ago

    What if first I scale the whole X table and then split using train_test_split?