Video není dostupné.
Omlouváme se.

How to Find Outliers with Excel

Sdílet
Vložit
  • čas přidán 2. 12. 2017
  • Find outliers in your data in minutes by leveraging built-in functions in Excel.

Komentáře • 123

  • @dinadawood
    @dinadawood Před 3 lety +10

    I had to go through a set of over 300 data points, this helped me so much, I'm so glad I found this video just in time. Thank you!!

  • @quinndougan
    @quinndougan Před 3 lety +2

    Writing an Exam right now using excel and this saved my life!! Thank you!!

  • @PayneDeathz
    @PayneDeathz Před 4 lety +2

    I regret not opening this video first. Great video!!

  • @rynolin342
    @rynolin342 Před 4 lety +6

    This is going to help me so much on my math test. Thank you.

  • @isabellaforguson
    @isabellaforguson Před 4 lety +12

    This helped me so much I can’t thank you enough!!

  • @garinehmelikian8814
    @garinehmelikian8814 Před 3 lety +1

    Thank for making this video, you are using my favorite PP template!

  • @nickbabulal2526
    @nickbabulal2526 Před 3 lety +1

    Thank you so much this is the best video to explain this you just earned yourself a subscriber

  • @jorge960621
    @jorge960621 Před 4 lety +2

    Thank you for such a nice and patient explanation!

  • @tjdeleon13
    @tjdeleon13 Před 2 lety

    This was awesome. I had a whole page 9 columns wide, saved me so much work. Thank you

  • @tomasvesely6447
    @tomasvesely6447 Před 6 lety +8

    Thank you for the video. Just a quick question why are we multiplying by 1.5?
    Thanks!

    • @datascienceds7965
      @datascienceds7965 Před 5 lety +9

      Outliers live outside the inner quartile range. By statistical definition, they live 1.5 times below or above your inner quartile range.( copied from the link he ut in the descriotin box)

  • @mslx9496
    @mslx9496 Před 4 lety +1

    If I have no negative values, I asume that I have to use the absolute value for the lower bound, right?

  • @madoomadoo6382
    @madoomadoo6382 Před 4 lety

    I really love your explanations.Thanks a bunch!

  • @mimiallen7873
    @mimiallen7873 Před 3 lety

    Sir also thank you so much for this video, gosh this made it so much simpler to understand how to calculate!!!

  • @linspositivevibesonly1046
    @linspositivevibesonly1046 Před 7 měsíci

    This video helped so much❤

  • @nigelhutchinson5332
    @nigelhutchinson5332 Před rokem +1

    Thanks for this great explanation! Curious what the 1.5 factor is as it relates to the upper and lower bounds? Is this some type of outlier exclusion factor? Or just a suitable one based on the data set?

    • @absentdata
      @absentdata  Před rokem +2

      1.5 is a constant that is used in this statistical formula. Think of it like PI

    • @damionc
      @damionc Před rokem +1

      @@absentdata I found that I had to change that 1.5 to 1 for the dataset I was working with for it to pick up other values I considered outliers as well. Will do some more research on that factor and what it is exactly but I had to change it to get a better result.

  • @ynn9173
    @ynn9173 Před 3 lety +2

    When can outliers be positive and when can they be negative for the data set?

    • @absentdata
      @absentdata  Před 3 lety +1

      It depends on the data set. For example even if you had a outlier boundaries that's negative, it doesn't always apply. For examples a child birth dataset might not use any negative outliers then you would just use 0 as your range.

  • @kadichidu4171
    @kadichidu4171 Před 2 lety

    Please make one complete tutorial video on Excel from basic to advanced

  • @igi-chan
    @igi-chan Před 3 lety +1

    Thank you so much! this is so well made! it explains so well! you're the best!!! :D

  • @albert276
    @albert276 Před 3 lety +1

    how do we know we want 1.5?

  • @mariaalejandraramirezmarti7724

    You are the best! really really helpful!

  • @Tfol077
    @Tfol077 Před 2 lety

    This was super helpful brother thank you so much

  • @zg5672
    @zg5672 Před 2 lety +1

    Thanks dude u saved a depressed student about to end it all

    • @absentdata
      @absentdata  Před 2 lety

      I am glad this brightened up your day!!!

    • @zg5672
      @zg5672 Před 2 lety +1

      @@absentdata no. it brightened up my life.

    • @tentacleprincess5772
      @tentacleprincess5772 Před 2 lety +1

      Keep going dude! Stay strong!

    • @zg5672
      @zg5672 Před 2 lety

      @@tentacleprincess5772 i am strong now because of absent data

  • @biankaolivier8207
    @biankaolivier8207 Před 5 lety +2

    This helped me alot, thanks!

  • @zzz11221
    @zzz11221 Před 7 měsíci

    BTW dont forget to sort the data first this will be crucial at bigger datasets as it will be problematic otherwise
    2

  • @catherineshepperdley7951

    Thank you - This was really helpful!

  • @tammykonarski6528
    @tammykonarski6528 Před 5 lety +2

    Thank you, thank you, thank you!!

  • @sirtopinhat9217
    @sirtopinhat9217 Před 4 lety

    I like this video, very helpful thankyou

  • @ragexbabayega4058
    @ragexbabayega4058 Před 2 lety

    What if the lower quartile is negative in value, then our IQR turns in a really large value of Q3+Q1, will this affect the values of outliers?

    • @absentdata
      @absentdata  Před 2 lety +1

      No it should be fine.. However it depends on the data. Should you have negatives or not might be a better question

  • @eddyrealon1105
    @eddyrealon1105 Před 3 lety

    What is the basis for the 1.5 multiplied to the IQR? What is the rationale for that? Thank you!

    • @absentdata
      @absentdata  Před 3 lety +4

      This is part of the accepted original formula. Essentially quartiles can be scaled to standard deviations. The math equates 1.5×IQR to under 3 standard deviations from the mean. So the 1.5 was designed in the formula to capture the majority of outliers assuming the data is normally distributed.

  • @deepankumarr4256
    @deepankumarr4256 Před 2 lety

    How this works on positive only data. I have list of numbers coming daily say. I want to know if the trends are inline.
    What is the best way to use that. If there is any dip or spike in the trend, how to t find out.

  • @tvoyodelo2858
    @tvoyodelo2858 Před 4 lety

    Extremely helpful!

  • @Hertesler
    @Hertesler Před 5 lety

    Thanks for this great video!

  • @aqsanaveed7976
    @aqsanaveed7976 Před 3 lety

    can we delete the outliers and then proceed by leaving the spaces blank or do we have to fill them?

    • @absentdata
      @absentdata  Před 3 lety

      You could do that by using an if function

  • @lBubbleable
    @lBubbleable Před 3 lety

    I found this very helpful, thank you! I'm trying to compare different business with their respective values to find the most "different" one. Is this method ok to use in that case?

    • @absentdata
      @absentdata  Před 3 lety +2

      In a sense yes, you may be doing a univariate analysis of the three businesses. For example you may be looking at Revenue across all the three businesses in comparing their distributions. You can compare the shape of the distribution whether it's a normal distribution or a very skewed distribution. Skewed distributions are affected by outliers.

    • @lBubbleable
      @lBubbleable Před 3 lety

      @@absentdata Thank you so much!!

  • @pipertaylor3641
    @pipertaylor3641 Před 4 lety +1

    this is a lifesaver

  • @kelsichesnut
    @kelsichesnut Před 6 lety

    Super helpful, thank you!

  • @StellaNimas
    @StellaNimas Před 3 lety

    Hii! can i use this for a likert scale data?

  • @simonbreecher8542
    @simonbreecher8542 Před 3 lety +1

    why do we multiply by 1.5?

    • @absentdata
      @absentdata  Před 3 lety +1

      This is standard multiplier for outlier detection. However you can increase it

  • @loverose4732
    @loverose4732 Před 3 lety

    Thank you so much, sir.

  • @williamtully6161
    @williamtully6161 Před 5 lety

    Fantastic! Awesome Thank you!!

  • @hienle-kc6qw
    @hienle-kc6qw Před 3 lety

    Can i just plug in whatever the Q1 and Q3 I want? Thank you!

    • @absentdata
      @absentdata  Před 3 lety

      Welll not really because that invalidates the need for calculating the quartiles. However, I think the approach you are trying to take is more maual where you are setting the boundaries for the outliers yourself.

    • @hienle-kc6qw
      @hienle-kc6qw Před 3 lety +1

      @@absentdata yep, I work with big listings at my work and we look for values that out of range based on unique base values. Thank you!

  • @adamatraore1883
    @adamatraore1883 Před 8 měsíci

    Hello, I tried to use the function "OR", I couldn't get the right answer. Instead I used the Function "AND". Thanks for the Video

    • @absentdata
      @absentdata  Před 8 měsíci +1

      Thanks I'm glad you solved the issue in the end

  • @kasy26x0x
    @kasy26x0x Před 5 lety

    Thanks for the video!

  • @jedn1660
    @jedn1660 Před 5 lety

    Very helpful, thank you

  • @payammadani3781
    @payammadani3781 Před 4 lety

    liked and subscribed. a ton thanks

  • @leonamoliveira7034
    @leonamoliveira7034 Před 5 lety

    Thank you for your help

  • @mimiallen7873
    @mimiallen7873 Před 3 lety

    Can you tell me in general what the reason is why a cell won't calculate?

    • @absentdata
      @absentdata  Před 3 lety +1

      Its hard to say without having the actual file. There are a host of reasons..
      data is not a number
      incorrect formula

    • @mimiallen7873
      @mimiallen7873 Před 3 lety

      @@absentdata Ok thank you for that!

  • @gabrielvaz7014
    @gabrielvaz7014 Před 3 lety

    amazing. thank you!

  • @iroha283
    @iroha283 Před 2 lety

    What does it mean when u say click F4?

    • @iroha283
      @iroha283 Před 2 lety

      Is it on the keyboard? What if I’m using excel on an iPad

    • @absentdata
      @absentdata  Před 2 lety

      ⌘⌥R or Command T in Mac I believe it will help you lock the position of the cells. BTW I've never had a Mac so I am hoping Google is correct :)

  • @manas806
    @manas806 Před 3 lety

    how to count the number of true and false after this for the outliers

  • @Tubeytime
    @Tubeytime Před 3 lety

    No fluff, thanks

  • @evachristine2138
    @evachristine2138 Před 4 lety

    I have over 20,000 columns of data, how do i do this without doing each one seperately?

    • @absentdata
      @absentdata  Před 4 lety

      Well a column should be seen as a unique dimension in your data. so if you have 20,000 columns. By Excel only has 16,384. I would question the structure of your data. I would see big you can melt the columns down to rows

  • @akifrahman
    @akifrahman Před 3 lety

    I tried to do this method but I would always get negative values for the lower bound so I can't tell if there's an outlier or not

    • @absentdata
      @absentdata  Před 3 lety

      Probably indicates that your data is very normally distributed with no outliers

    • @akifrahman
      @akifrahman Před 3 lety

      @@absentdata would you mind if I showed you my data and say if you think I have correctly identified the outliers?

    • @akifrahman
      @akifrahman Před 3 lety

      @@absentdata also my data are decimals so like 0.0347 etc. Would that affect the method at all?

  • @adewaletijani7023
    @adewaletijani7023 Před 2 lety

    Is 1.5 a constant ?

    • @absentdata
      @absentdata  Před 2 lety

      Yes, its a best practice to use 1.5. However, you can alter it but then release that you will lose data. It conforms to a standard normal distribution which means that it's a reliable way of saying that most of your data falls under this about 97.5%

  • @VyTran-nu3on
    @VyTran-nu3on Před 2 lety

    thank you!

  • @dileep3809
    @dileep3809 Před 2 lety

    Can U please prepare a proper statistics series.i went through many videos but I like the way you teach

    • @absentdata
      @absentdata  Před 2 lety

      I am glad you enjoy the way I teach. Yes, I want to do a purely statistic playlist. Hopefully there is enough interest :)

  • @kameshkumar1239
    @kameshkumar1239 Před 4 lety

    Thank you so much

  • @Metia_9898
    @Metia_9898 Před 3 lety +2

    THANK YOU BRUHHHHH

  • @JassiLive2810
    @JassiLive2810 Před 4 lety +1

    BRO THERE IS SOMETHING WRONG IN THIS DATA....BECAUSE WHENEVER I JUST APPLIED A DIFFERENT FORMULA TO FIND FIRST AND 3RD QUARTILE I JUST GET DIFFERENT VALUES WHICH IS NOT POSSIBLE

  • @mahmoudfaouzichaoubi5855

    شكراً، Tank you

  • @ellereal8216
    @ellereal8216 Před 4 lety

    How to apply this to 3 data points only? Can you help?

    • @absentdata
      @absentdata  Před 4 lety

      Do you just 3 rows of data? That would be the best use of this.

    • @ellereal8216
      @ellereal8216 Před 4 lety

      @@absentdata No. I mean just three data. I conducted an analysis with 3 trials so I generated 3 data points. And I need to know whether I can average all those three and report it.
      Edited** What I meant with "analysis" is "experiment"

    • @absentdata
      @absentdata  Před 4 lety

      @@ellereal8216 Yes that could work for your specific application.

    • @ellereal8216
      @ellereal8216 Před 4 lety

      @@absentdata Is that the best option for my case to use to identify the outlier? Can you advise me some other stat tools I can easily apply on a calculator.

    • @absentdata
      @absentdata  Před 4 lety

      @@ellereal8216 Unfortunately I can't really prescribe what to do cause I would need to know your data and context. However I think you can just explain your logic when presenting your findings. You can also can also use percentiles.

  • @TheGollo10
    @TheGollo10 Před 4 lety

    What is the name of the test?

    • @mickeylandcaster3
      @mickeylandcaster3 Před 4 lety

      I am not sure the test has a name. It's more of a technique to classify the borders of your data.

  • @vivekkoli4479
    @vivekkoli4479 Před rokem

    Why 1.5?

  • @karanpatil8218
    @karanpatil8218 Před 4 měsíci

    WHY HAVE YOU USE 1.5* IN UPPER BOUND

    • @absentdata
      @absentdata  Před 4 měsíci

      1.5 is a constant value that I used in the equation

  • @ask-wj2pf
    @ask-wj2pf Před rokem

    Thankyou

  • @ghaidferas6146
    @ghaidferas6146 Před rokem

    why the 1.5?

    • @absentdata
      @absentdata  Před rokem

      Its a statistical formula. Its similar to using a constant like 3.14 for PI

  • @prasannakumar-qv8nf
    @prasannakumar-qv8nf Před 4 lety

    why do you multiply 1.5 to upper bound

    • @fuad471
      @fuad471 Před 4 lety

      it is just a statistical rule

  • @wilsonbrown3521
    @wilsonbrown3521 Před 4 lety +4

    How to calculate outliers by Kermit the Frog

  • @ryujuano8414
    @ryujuano8414 Před 6 lety

    F4 (absolute value) won't work in windows :(

  • @marissakachadoorian4772

    I have a mac and f4 also does not work

  • @erniestremathic2290
    @erniestremathic2290 Před 3 lety

    Hello Sir. If the Result Show True. Meaning That we have to delete the data (true only) or the significance true data? (Im Using Panel Data).
    For eg:
    2013 TRUE apple.co
    2014 False apple.co
    2015 TRUE apple.co
    2016 False apple.co
    2017 False apple.co
    SHOULD I dELETE ALL.. or just the true ?

  • @Window20230
    @Window20230 Před 3 lety

    Very helpful video, a quick question: Can I use this method to emit outliers in a degradation trendline? or this method is for data that fit normal distribution or some other distribution only?

    • @absentdata
      @absentdata  Před 3 lety +1

      Yes, absolutely. However then you are going to be dealing with standard deviation for normal distribution. This would be a different approach

    • @Window20230
      @Window20230 Před 3 lety

      @@absentdata Thanks!

  • @berto_osorio_fl
    @berto_osorio_fl Před 3 lety

    thank you!