Boxplots & Outliers in SPSS - Identify and Deal with Outliers (4-8)

  • added 6 July 2024
  • The boxplot serves up a great deal of information about both the center and spread of the data, allowing us to identify skewness and outliers, in a form that is both easy to interpret and easy to compare to other distributions. It is the graphical equivalent of the five-number summary. All of that in a simple graph made of just a few lines.
    This video teaches the following concepts and techniques:
    SPSS Chart Builder
    Boxplots
    Outliers
    Daniel, T., & Tivener, K. (2016). Effects of sharing clickers in an active learning environment. Educational Technology & Society, 19(3), 260-268. Retrieved from www.j-ets.net/ETS/journals/19...
    Link to a Google Drive folder with all of the files that I use in the videos including the Bear Handout and the Clickers.sav dataset. As I add new files, they will appear here, as well.
    drive.google.com/drive/folder...
    Table of Contents:
    01:30 - The Box
    01:50 - The Whiskers
    06:48 - Correct the Outlier
    07:08 - Univariate vs. Multivariate
    08:48 - When To Delete Outliers
    09:32 - Correct the Outlier
    09:53 - What Does The Boxplot Show?
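
    As a rough illustration of the five-number summary that the description mentions, here is a minimal Python sketch. Python, the function name, and the made-up scores are assumptions for illustration only; the video itself works in SPSS, whose boxplots may use a slightly different quartile convention (Tukey's hinges).

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    data = sorted(values)
    # "inclusive" is one of several quartile conventions; other software
    # can give slightly different Q1 and Q3 on the same data.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

scores = list(range(1, 10))  # hypothetical data: 1, 2, ..., 9
print(five_number_summary(scores))
```

    The box spans Q1 to Q3, the line inside the box is the median, and the whiskers reach toward the minimum and maximum unless outliers are present.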
  • Howto & Style

Comments • 61

  • @beckycanty
    @beckycanty 2 years ago +7

    I learned more from this guy in 12 minutes than I did in six weeks from my hopeless professor in an introductory stats course lmao. Thank you Dr. Todd!!

    • @ResearchByDesign
      @ResearchByDesign 2 years ago

      Very pleased to know that the videos are helpful. Thanks for watching. Good luck in your class.

  • @bloneric
    @bloneric 4 years ago +1

    Thank you so much, Dr. Daniel, for helping me understand when to get rid of outliers or simply keep them in the analysis.

  • @enchnated
    @enchnated 1 year ago

    You are amazing. Your explanations are passionate, thorough, and just beautifully delivered; you've made the abstract come to life.

  • @ThePookie25
    @ThePookie25 7 years ago +1

    This video is great! And explained just perfectly.

  • @elmakkiamiri3912
    @elmakkiamiri3912 1 year ago

    Thank you, Dr. Daniel. I really do appreciate your professional work.

  • @naylianas8159
    @naylianas8159 4 years ago +1

    Thanks a lot, Dr. Daniel, for sharing. Really useful for a beginner like me.

  • @ikosimisimo3363
    @ikosimisimo3363 5 years ago

    I can't thank you enough! This video was helpful.

  • @duduetsangsemele159
    @duduetsangsemele159 4 months ago

    Thanks a lot, Doc (thumbs up). Excellent teaching, sir.

  • @bogdanmanole8427
    @bogdanmanole8427 6 years ago +1

    Thank you for the explanation. Looking sharp.

    • @Espina907
      @Espina907 4 years ago

      I LOVE Dr. Todd! His voice is soothing and is helping me in grad stats. Am crushing on Dr. Todd Daniel. Smart men are dreamy...😍😍😍😍 Must focus on stats.. 🤓

  • @aimanhalim2150
    @aimanhalim2150 6 years ago +2

    I want to thank you for this great video. I've been thinking about how to deal with outliers in my Likert scale questions. This video is perfect! Thank you so much for sharing such useful knowledge. :-)

    • @ResearchByDesign
      @ResearchByDesign 6 years ago +1

      Good point. If you have an outlier in a single Likert item, you have a data entry error. If you have an outlier after combining multiple items, then you have a true multivariate outlier.

  • @helennguyen652
    @helennguyen652 2 years ago

    Dear Professor Todd, thank you very much for your great clips. I was able to understand statistical concepts within a week and complete my master's degree assignment thanks to your simplified explanations. Please take care and I wish you all the best!

    • @ResearchByDesign
      @ResearchByDesign 2 years ago +1

      You are very welcome! Thank you for the wonderful comment and well wishes

  • @emmanuelakpaklikwasi4300

    You are the best

  • @dsavkay
    @dsavkay 3 months ago

    Thanks! 💯

  • @sadilanmohammed6091
    @sadilanmohammed6091 4 years ago

    My analysis got done man. Thank you.

  • @adebolaadefurin8
    @adebolaadefurin8 4 years ago

    Great video!! I found the picture of the crazy outlier guy very funny though!

    • @ResearchByDesign
      @ResearchByDesign 4 years ago +1

      Thanks for the comment. I agree that humor helps reinforce the ideas and clarify what an outlier is.

  • @chiaradadamo7050
    @chiaradadamo7050 3 years ago

    Thank you again for sharing your videos! I have a question. Is there a way to calculate whether any outlier values I identify are more than 3 times the IQR beyond the nearest quartile value? Or do I simply just rely on whether the outlier is represented by an asterisk or a circle?
    Thank you again!
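
    One way to check the "3 times the IQR" question by hand is sketched below in Python. This is an illustration under stated assumptions, not the channel's method: the function name and sample data are hypothetical, and the quartiles use Python's "inclusive" convention, which can differ slightly from the hinges SPSS uses for its boxplots. In SPSS boxplots, circles mark cases between 1.5 and 3 box lengths from the edge of the box, and asterisks mark cases more than 3 box lengths away.

```python
import statistics

def classify_outliers(values):
    """Label each value 'ok', 'outlier' (circle) or 'extreme' (asterisk)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    labels = []
    for x in values:
        # Distance beyond the nearest quartile; zero when x is inside the box.
        distance = max(q1 - x, x - q3, 0)
        if distance > 3 * iqr:
            labels.append("extreme")
        elif distance > 1.5 * iqr:
            labels.append("outlier")
        else:
            labels.append("ok")
    return labels

data = list(range(1, 10)) + [20, 40]  # hypothetical scores
for value, label in zip(data, classify_outliers(data)):
    print(value, label)
```

    With these made-up scores, 20 falls between 1.5 and 3 IQRs above the third quartile and 40 falls more than 3 IQRs above it.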

  • @kobayashiakato1044
    @kobayashiakato1044 2 years ago

    extremely str8 to the point and useful

  • @Dr.UdaraSenarathne
    @Dr.UdaraSenarathne 2 years ago

    Thank you very much!

  • @saruttayajitprapai5366

    Thank god this video exists

  • @alizahita6881
    @alizahita6881 3 years ago

    Sir, THANK YOU!!!

  • @crazystatistician
    @crazystatistician 1 year ago

    Hello,
    Thank you very much for this eye-opening video! I learned lots of things. I have a question. I checked for univariate outliers in SPSS and found some asterisk (extreme) outliers. When I checked them in the data, these asterisk outliers ranged from the lowest (e.g., 1) to the highest (e.g., 7) values. They are legitimate values, although they were shown as asterisks in the box plot. In this case, should I remove these outliers, or should I keep them in the data because they are legitimate?
    Thank you very much in advance!
    Best,

    • @ResearchByDesign
      @ResearchByDesign 1 year ago

      That sounds correct. If your data points only go from 1 to 7, you may have a skewed variable, but you do not have extreme outliers because they have been bounded by the measurement scale (1-7). Check out this video and the one after it for more details: czcams.com/video/4CNLHO3xOyc/video.html

  • @worldofinformation815
    @worldofinformation815 2 years ago

    Thank you Sir

  • @famavevershima2006
    @famavevershima2006 3 years ago

    Thanks Dr. Daniel. I want to ask these questions.
    - Does the boxplot indicate the median? Does it also take note of the mean?
    - Can boxplots be used to check continuous variables for outliers?
    Thank You.

    • @ResearchByDesign
      @ResearchByDesign 3 years ago +1

      The box plot just uses the median (50th percentile), and yes, boxplots are a great way to check for outliers because SPSS labels the outlier cases for you.

  • @saminos9
    @saminos9 4 years ago

    Thank you for a great video. One question, if anyone could answer: when creating your chart at 03:20, you have gender on the x axis. Is there any way you can have gender (that is, male and female) as two boxplots and then a third one, "total" (that is, the whole data set), as a separate boxplot, but all in the same graph?

    • @ResearchByDesign
      @ResearchByDesign 3 years ago

      Good question...I don't know of any way to have both a split box plot and the combined (non-split) box plot on the same graph. I usually just create them separately and then combine them in the paper into a single graph.

    • @saminos9
      @saminos9 3 years ago

      @ResearchByDesign Thank you. What do you mean by "in the paper"?

  • @kamleshpatel6998
    @kamleshpatel6998 4 years ago

    Informative video, thanks sir.

  • @claudiakrogmeier557
    @claudiakrogmeier557 3 years ago

    What does it mean if I have values which are not shown on my boxplot? For example, I can see on my boxplot that my .4 is considered an outlier, however I have values at .7 and .8, but .7 and .8 are not even shown on the boxplot. Is this an SPSS bug? Thank you

    • @ResearchByDesign
      @ResearchByDesign 3 years ago

      Hmmm, not sure...the outliers may or may not be labeled, but if they are in the variable, they should be displayed on the box plot. Not sure what to tell you. Hope you get it worked out. (Do a Frequencies on the same variable and see if you notice something unusual)

  • @nandaeldya917
    @nandaeldya917 3 years ago

    Hi, sir.
    May I ask you sir? Is there a reference citation for the outliers with boxplot? Thank you

    • @ResearchByDesign
      @ResearchByDesign 3 years ago

      I don't recall what stats text I originally learned that from, but I know that it is in the Andy Field text (Discovering Statistics Using SPSS) and in the Tabachnick & Fidell text, Using Multivariate Statistics. Both are excellent resources.

    • @nandaeldya917
      @nandaeldya917 3 years ago

      Ok, thank you so much sir

  • @hotTamale2629
    @hotTamale2629 5 years ago

    You really need to use an example with multiple data sets, because when you remove the outliers the data shifts, and this should be noted. Also, if the selected data are only scale variables, then the x axis cannot be used and the box plot is not feasible. How do you remove outliers in those situations?

    • @ResearchByDesign
      @ResearchByDesign 5 years ago

      Thanks for the comment. I think that both the raw and clean datasets are available in the google folder, but I will check. Link to the folder is in the description and channel art. Let me see if I can answer your other question: when you have scale data and are looking for outliers, they will show up in histograms, stem-and-leaf, and box plots. In SPSS, I use the Explore command because it has an option to look for outliers. There are also tests like Mahalanobis or just converting to z-scores that can help identify outliers in scale numeric data. Hope that helps.
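
      The z-score check mentioned in this reply can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS Explore procedure: the function name, the cutoff of 3, and the sample data are all assumptions, and cutoffs vary by textbook. Mahalanobis distance, which applies to several variables at once, is not shown.

```python
import statistics

def zscore_outliers(values, cutoff=3.0):
    """Return the values whose absolute z-score exceeds the cutoff."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    return [x for x in values if abs((x - mean) / sd) > cutoff]

# Hypothetical scale scores: twenty ordinary values and one extreme one.
scores = [10] * 10 + [12] * 10 + [100]
print(zscore_outliers(scores))
```

      One caveat with this approach: an extreme score inflates the mean and standard deviation it is judged against, so z-scores tend to miss outliers in very small samples.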

  • @bhuviranga1298
    @bhuviranga1298 3 years ago

    How come values of 1 are considered outliers but values of 7 are not? They are both at the extreme ends of our scale, yet the values for cases 27 and 43 were both 1 and we still have values of 7 in the data set. Btw, I love your videos and your intuitive approach. Thank you for making it easier for me to understand SPSS and stats.

    • @ResearchByDesign
      @ResearchByDesign 3 years ago

      The identification of an outlier depends on the other data. So the score may not be extreme compared to other scores around it. When you combine multiple items into a single score, you will be more likely to find extreme values. Thanks for watching the videos!
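
      To make the point that an outlier depends on the other data concrete, here is a small Python sketch with made-up 1-to-7 ratings that cluster near the top of the scale. The data, the function name, and the 1.5 x IQR fence rule with Python's "inclusive" quartiles are assumptions for illustration; SPSS may compute the box edges slightly differently.

```python
import statistics

def tukey_fences(values, k=1.5):
    """Return the (lower, upper) fences at k IQRs beyond the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr

# Hypothetical ratings: two 1s, six 5s, ten 6s, eight 7s.
ratings = [1, 1] + [5] * 6 + [6] * 10 + [7] * 8
low, high = tukey_fences(ratings)
flagged = sorted({x for x in ratings if x < low or x > high})
print(low, high, flagged)
```

      Here the quartiles are 5 and 7, so the fences sit at 2 and 10. A score of 1 falls below the lower fence, while a score of 7 sits on the edge of the box and cannot be flagged.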

  • @miscelleneoustubes
    @miscelleneoustubes 1 year ago

    Hi Prof, could you explain more about winsorizing, please? An example would be great! Many thanks!

    • @ResearchByDesign
      @ResearchByDesign 1 year ago +1

      No problem...winsorizing (after biostatistician Charles P. Winsor) is cutting the outlier down to the next most reasonable value. If the data are 1, 2, 3, 5, 6, 35... then 35 gets cut down to 7. It is still the highest value but no longer has the leverage of an outlier. Works well on income values. This new video goes into more detail: czcams.com/video/Mf9R-OKQUrU/video.html
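
      The example in this reply (35 becomes 7) can be reproduced with the Python sketch below. It is only one possible reading of the reply: outliers are detected with 1.5 x IQR fences and then set one unit beyond the most extreme remaining score. The function name, the fence rule, and the step of 1 are assumptions; other definitions of winsorizing replace the outlier with the nearest retained value itself (6 here) or with a chosen percentile.

```python
import statistics

def winsorize_to_neighbor(values, k=1.5, step=1):
    """Pull outliers in to one step beyond the nearest non-outlying score."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    # Scores inside the fences define the replacement values.
    inside = [x for x in values if low <= x <= high]
    floor, ceiling = min(inside) - step, max(inside) + step
    return [ceiling if x > high else floor if x < low else x for x in values]

print(winsorize_to_neighbor([1, 2, 3, 5, 6, 35]))
```

      The adjusted score stays the highest in the data set, so its rank is preserved, but it no longer pulls the mean and standard deviation as strongly.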

    • @miscelleneoustubes
      @miscelleneoustubes 1 year ago +1

      @ResearchByDesign Thank you, professor. I have never studied statistics in my life and am currently doing a master's where no stats course is offered. However, I feel more confident in analyzing data and handling SPSS, all because of your excellent videos, and I still recommend your lectures to my colleagues. I have watched all of your videos that are relevant to me more than three times. Many thanks to you.

    • @miscelleneoustubes
      @miscelleneoustubes 1 year ago

      @ResearchByDesign I have an age variable that has an outlier. Would it be okay to winsorize it? Thanks.

  • @blackmambazo5528
    @blackmambazo5528 6 years ago

    Where can I get the Clickers.sav file?

    • @ResearchByDesign
      @ResearchByDesign 6 years ago

      I'm working on adding all data sets to the RStats Institute website. For now, email me at Missouri State University and I will send you a copy

  • @sarahohara4022
    @sarahohara4022 2 years ago

    Does anyone know how to find how many and what percentage of values are outliers? I'm in a data mining class and am working with a very large data set.

    • @ResearchByDesign
      @ResearchByDesign 2 years ago

      I am working on some videos about data cleaning for the semester. Send me an email ToddDaniel at MissouriState.edu and I will send you my notes on outliers. It's too much for a comment reply. Good luck.
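
      For readers with the same question, a rough way to count outliers and express them as a percentage is sketched below in Python. This is not the author's notes: the function name, the 1.5 x IQR rule, the "inclusive" quartile convention, and the sample data are all assumptions, and a different outlier definition would give different counts.

```python
import statistics

def outlier_share(values, k=1.5):
    """Return (count, percentage) of values beyond k IQRs from the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    count = sum(1 for x in values if x < low or x > high)
    return count, 100 * count / len(values)

data = list(range(1, 19)) + [100, 200]  # hypothetical: 20 values, 2 extreme
count, percent = outlier_share(data)
print(f"{count} outliers ({percent:.1f}% of {len(data)} values)")
```

      The same function works on a large data set, because it only needs the two quartiles and a single pass over the values.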

  • @risawhite2644
    @risawhite2644 2 years ago

    Overall, it saved me. But it would be better with more detailed information about the first option for dealing with outliers.

    • @ResearchByDesign
      @ResearchByDesign 2 years ago

      I'm on it...working on a new set of videos and I will make sure the new box plot script includes more detail. Thanks.

  • @zaneonpower8135
    @zaneonpower8135 2 years ago

    Isn't this gameonzz? I recognize, hmm, abang Iman.