Correlation Doesn't Equal Causation: Crash Course Statistics #8

  • Added 13. 03. 2018
  • Today we’re going to talk about data relationships and what we can learn from them. We’ll focus on correlation, which is a measure of how two variables move together, and we’ll also introduce some useful statistical terms you’ve probably heard of like regression coefficient, correlation coefficient (r), and r^2. But first, we’ll need to introduce a useful way to represent bivariate continuous data - the scatter plot. The scatter plot has been called “the most useful invention in the history of statistical graphics” but that doesn’t necessarily mean it can tell us everything. Just because two data sets move together doesn’t necessarily mean one CAUSES the other. This gives us one of the most important tenets of statistics: correlation does not imply causation.
    Crash Course is on Patreon! You can support us directly by signing up at / crashcourse
    Thanks to the following Patrons for their generous monthly contributions that help keep Crash Course free for everyone forever:
    Mark Brouwer, Justin Zingsheim, Nickie Miskell Jr., Jessica Wode, Eric Prestemon, Kathrin Benoit, Tom Trval, Jason Saslow, Nathan Taylor, Divonne Holmes à Court, Brian Thomas Gossett, Khaled El Shalakany, Indika Siriwardena, Robert Kunz, SR Foxley, Sam Ferguson, Yasenia Cruz, Daniel Baulig, Eric Koslow, Caleb Weeks, Tim Curwick, Evren Türkmenoğlu, Alexander Tamas, D.A. Noe, Shawn Arnold, mark austin, Ruth Perez, Malcolm Callis, Ken Penttinen, Advait Shinde, Cody Carpenter, Annamaria Herrera, William McGraw, Bader AlGhamdi, Vaso, Melissa Briski, Joey Quek, Andrei Krishkevich, Rachel Bright, Alex S, Mayumi Maeda, Kathy & Tim Philip, Montather, Jirat, Eric Kitchen, Moritz Schmidt, Ian Dundore, Chris Peters,, Sandra Aft, Steve Marshall
    --
    Want to find Crash Course elsewhere on the internet?
    Facebook - / youtubecrashcourse
    Twitter - / thecrashcourse
    Tumblr - / thecrashcourse
    Support Crash Course on Patreon: / crashcourse
    CC Kids: / crashcoursekids

Comments • 234

  • @snowballeffect7812
    @snowballeffect7812 6 years ago +352

    This needs to be mandatory viewing for EVERYONE.

  • @sugami82
    @sugami82 6 years ago +255

    "Correlation does not equal causation" was my old stats teacher's favourite phrase along with "always interpolate, never extrapolate." :)

    • @xsaberfaye
      @xsaberfaye 6 years ago +15

      Extrapolation is actually necessary in certain circumstances though - for example predicting growth of global human population, economic forecasts, environmental forecasts regarding climate change.... anything that has to do with the future.

    • @Unordung
      @Unordung 6 years ago +6

      Post hoc ergo propter hoc!

  • @SilortheBlade
    @SilortheBlade 6 years ago +216

    Bah. I know my rock keeps away tigers because I have never seen a tiger for as long as I have had it.

    • @Tuckems
      @Tuckems 5 years ago +2

      SilortheBlade Makes sense to me

  • @Pfhorrest
    @Pfhorrest 6 years ago +223

    Nicolas Cage movies are correlated through yet another unmentioned variable: summer. Nicolas Cage is an action movie star. Action movies are generally targeted for summer releases. Summer is also hot, which is the cause behind air conditioner sales and swimming, the latter of which is of course the cause of drowning.

    • @aido92
      @aido92 6 years ago +57

      Pfhorrest Or it could be that people who have endured a Nicolas Cage movie are more likely to drown themselves ...

    • @polyjohn3425
      @polyjohn3425 6 years ago +11

      That's true, but the data shows a close correlation over multiple years, not just over the seasons of a given year. It just so happens that the summers of years with more Nicolas Cage movies also have more drownings.
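
A minimal sketch of the lurking-variable idea raised in this thread, using simulated numbers only. The variable names (`heat`, `ac_sales`, `swimmers`) and the model itself are made up for illustration and are not data from the video. Under this assumed model the two series share a common cause but do not influence each other, so the raw correlation should come out strong (around 0.8 in expectation) and should largely vanish once the common cause is regressed out.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def residuals(ys, xs):
    """Residuals of the least-squares line that predicts ys from xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

random.seed(0)  # fixed seed so the simulated numbers are repeatable
n = 5000

# Made-up model: "heat" drives both series; neither series affects the other.
heat = [random.gauss(0, 1) for _ in range(n)]
ac_sales = [h + random.gauss(0, 0.5) for h in heat]
swimmers = [h + random.gauss(0, 0.5) for h in heat]

r_raw = pearson_r(ac_sales, swimmers)

# Remove the part of each series explained by heat, then correlate the leftovers.
r_controlled = pearson_r(residuals(ac_sales, heat), residuals(swimmers, heat))

print(f"raw r = {r_raw:.2f}")
print(f"r after controlling for heat = {r_controlled:.2f}")
```

This only shows that a shared cause can produce a correlation. As the reply above notes, a seasonal explanation would not by itself account for a pattern across years.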

  • @qilinxue989
    @qilinxue989 6 years ago +644

    *Me:* I used to think correlation implied causation.
    *Me:* Then I watched this video. Now I don't.
    *Friend:* Sounds like the video helped.
    *Me:* Well, maybe.

    • @polyjohn3425
      @polyjohn3425 6 years ago +24

      lol. Well, probably.

    • @jedisentinel4879
      @jedisentinel4879 6 years ago +20

      The video explains that it's not because two elements are correlated that one is the cause of the other. One '''can''' be the cause, but it's not logical to infer it just from their correlation. It was not the floor itself that broke the glass, even though it is related to the breaking; it was its impact with the glass, '''caused''' by gravity.

    • @verdatum
      @verdatum 6 years ago +25

      XKCD is a pretty good comic :)

    • @HerodotusVon
      @HerodotusVon 6 years ago +8

      Kachimbo somebody missed the joke

    • @noobnoobyify
      @noobnoobyify 6 years ago +10

      Herodotus Von 8428 no, someone got the joke, but felt the need to expand our knowledge.

  • @murphygreen8484
    @murphygreen8484 6 years ago +23

    This has been my favorite CrashCourse season by far. Really enjoying the material and the host!

  • @aude1979
    @aude1979 6 years ago +13

    A class on nonlinear relationships would be FANTASTIC :) And more classes in general (e.g., on general versus mixed effects models, GAMs, etc.). Thank you for your dynamism!

  • @jesusosegueda422
    @jesusosegueda422 6 years ago +6

    Crash Course, thank you so much. This awesome course is definitively above the curve!

  • @BlackCatGodess
    @BlackCatGodess 6 years ago +31

    Puppy cat! I didn't know that they'd made a stuffed animal of him. This has greatly improved my day.

  • @laterkater4213
    @laterkater4213 6 years ago +30

    Better explanation than my university level stats class. 👍

  • @Deedj1
    @Deedj1 6 years ago +23

    Everyone needs to see this! Just because things seem connected on the surface doesn’t mean they’re related, and vice versa!

  • @txt3567
    @txt3567 5 years ago +2

    Thank you so much for sharing. You're so much better at explaining than my professor.

  • @ginohobayan001
    @ginohobayan001 1 year ago

    Thank you!!!
    Learned so much from this video.

  • @gymotc
    @gymotc 4 years ago +1

    Excellent video! Thank you!!!

  • @earth2ellie
    @earth2ellie 5 years ago +4

    “Mr. Fluffy misses you.”
    *pouts thinking of the cat I don’t have missing me*

  • @hem89180
    @hem89180 6 years ago

    Love the series!!!

  • @aaronmarks9366
    @aaronmarks9366 5 years ago +9

    I wish all my scatterplots ended up making pictures of dinosaurs.

  • @greyareaRK1
    @greyareaRK1 6 years ago +45

    I haven't watched Nicolas Cage movies, AND I haven't drowned. Aha!

  • @MaureenMurphy_
    @MaureenMurphy_ 6 years ago

    Thank you for thissss!!

  • @JackieChenpi
    @JackieChenpi 6 years ago +20

    Watching Stat for fun again.

  • @xionpentagast
    @xionpentagast 6 years ago

    Loved it!

  • @user-hb2rt7ek8x
    @user-hb2rt7ek8x 6 years ago

    How wonderfully you explain things! Nothing even needs translating! (Originally posted in Russian, deliberately.)

  • @xmems
    @xmems 6 years ago +12

    Love this upload 😍

  • @akankshaandadityasingh9888
    @akankshaandadityasingh9888 6 years ago +71

    When she apologises for using imperial units......

  • @aaronmarks9366
    @aaronmarks9366 5 years ago +6

    "Air Cons, and Con Airs"
    Amazing

  • @mielthebee
    @mielthebee 1 year ago +4

    "..if people blink more when they're lying!"
    Our Professor: 😳

  • @user-ic6gv2ih3t
    @user-ic6gv2ih3t 5 years ago

    Great video, very helpful for learning statistics.

  • @GameOver321
    @GameOver321 1 year ago +1

    wow! Thank you

  • @darrenreuben4222
    @darrenreuben4222 6 years ago

    this was an awesome video

  • @gamereditor59ner22
    @gamereditor59ner22 6 years ago

    Great video!!!😊

  • @maruisaiahnapa7381
    @maruisaiahnapa7381 6 years ago +1

    I was JUST reading up on this in class! 😂

  • @daniels4209
    @daniels4209 6 years ago

    Thank You.

  • @verdatum
    @verdatum 6 years ago +1

    Anecdotally, after playing Simpsons: Hit & Run (a GTA clone), I genuinely drove more recklessly for a little while. Not like I got into an accident, but like I was cutting corners tighter, and being a little heavier on the pedal. I had to work at it to knock it off. Really really good game though.

  • @Angelusloco15
    @Angelusloco15 5 years ago

    EXCELLENT!

  • @luminias.upscmentor
    @luminias.upscmentor 6 years ago

    The gain in my knowledge is perfectly correlated with the number of Crash Course videos I watch, with a correlation coefficient of exactly +1 #CrashCourse ..... 😁😁😁

  • @KASSISHOT
    @KASSISHOT 6 years ago +11

    Every time I see one of these videos I look at the view count and know that there are that many more people out there who are better educated about this topic, and that makes me very optimistic for the future. Keep up the great work, guys!

  • @ramseszeeman4076
    @ramseszeeman4076 1 year ago +2

    Without you guys I would not pass my exams. Thank you so much!

  • @rkpetry
    @rkpetry 6 years ago +2

    ...how do you fit a regression line through a circle (or fat ellipse) on a 2D scatter plot...
    ...how do you define accuracy where there are fewer data points, even though the fitted-curve looks similar, (do you overlay random information certitude measure sigma bars)...
    *_...(in case you missed the first question: flip the plot axes for a different regression line...)_*
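
A minimal sketch of the "flip the plot axes" point above, on a small made-up data set (the numbers are illustrative, not from the video). It fits two ordinary least-squares lines, one predicting y from x and one predicting x from y. The two lines coincide only when the points are perfectly collinear, and the product of the two slopes equals r^2.

```python
import math

def centered_sums(xs, ys):
    """Return (Sxx, Syy, Sxy): sums of squares and cross-products about the means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxx, syy, sxy

# Hypothetical, loosely scattered points.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 1, 4, 3, 6, 5]

sxx, syy, sxy = centered_sums(xs, ys)

slope_y_on_x = sxy / sxx   # minimizes vertical distances (y predicted from x)
slope_x_on_y = sxy / syy   # minimizes horizontal distances (x predicted from y)
r = sxy / math.sqrt(sxx * syy)

# Drawn on the same x-y axes, the x-on-y line has slope 1 / slope_x_on_y.
slope_flipped_on_plot = 1 / slope_x_on_y

print(f"y-on-x slope:            {slope_y_on_x:.3f}")
print(f"x-on-y slope (as drawn): {slope_flipped_on_plot:.3f}")
print(f"r^2:                     {r ** 2:.3f}")
print(f"product of both slopes:  {slope_y_on_x * slope_x_on_y:.3f}")
```

For a round cloud r is near zero, so the y-on-x line is nearly flat and the x-on-y line is nearly vertical, which is one way to see that neither line describes such data well.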

  • @lovepeople951
    @lovepeople951 6 years ago +11

    Thank u Crash Course

  • @wenhong5852
    @wenhong5852 5 years ago +2

    Watching this video at work, miss my cat. Burst into tears

  • @easysnake205
    @easysnake205 6 years ago +29

    I feel some people go so far in this argument that they seem to argue that correlation disproves causation.
    E.g. "That's only correlation, it doesn't prove causation, obviously you are wrong."
    Yes, correlation doesn't prove causation, but it most definitely does not disprove causation. Further, it might suggest causation, or that a 3rd factor is causing both phenomena to occur. It's frustrating to give data in an argument, only to have the other side counter with, "That's only correlation, it doesn't prove causation, you are wrong."

  • @NataPal
    @NataPal 6 years ago

    i love this

  • @youknuckle
    @youknuckle 6 years ago

    Love this video and the channel, also - @1:43 You've spelled eruptions wrong...

  • @teen-at-heart
    @teen-at-heart 6 years ago

    Good episode, but some things would need exercise and ‘usage’ in order to be memorized well and longer-term, like r and r squared.

  • @ComedyCorner619
    @ComedyCorner619 6 years ago +1

    Hello great video

  • @MrGustaphe
    @MrGustaphe 6 years ago

    The example of changing the units on the y-axis is only relevant if you're not doing your dimensional analysis properly. If the slope of the feet-feet plot is 0.5, then the slope of the meter-feet plot is 0.15 m/foot, which is still 0.5 once the units are accounted for.
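
A minimal sketch of the unit-change point, with made-up height pairs in feet (illustrative values, not Pearson's data). Converting only the y-axis to meters rescales the slope by the conversion factor, while the correlation coefficient, which is unitless, stays the same.

```python
import math

FT_TO_M = 0.3048  # exact definition of the international foot in meters

def slope_and_r(xs, ys):
    """Least-squares slope of y on x, and the Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx, sxy / math.sqrt(sxx * syy)

# Hypothetical paired heights, both in feet.
x_ft = [5.5, 5.8, 6.0, 5.6, 6.2]
y_ft = [5.6, 5.7, 6.1, 5.7, 6.0]

# Same y values expressed in meters; x stays in feet.
y_m = [y * FT_TO_M for y in y_ft]

slope_ft, r_ft = slope_and_r(x_ft, y_ft)
slope_m, r_m = slope_and_r(x_ft, y_m)

print(f"slope (ft per ft): {slope_ft:.4f}, r = {r_ft:.4f}")
print(f"slope (m per ft):  {slope_m:.4f}, r = {r_m:.4f}")
```

The slope carries units, so its numeric value depends on them. The correlation coefficient does not, which is why it is the more convenient measure for comparing how tightly different data sets cluster around a line.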

  • @ElforTheLandstander
    @ElforTheLandstander 6 years ago

    This was the funniest Crash Course video I've ever seen. Her comedic timing is excellent. Though I still don't know if that clever mayor was a man or a woman.

  • @BCsenge97
    @BCsenge97 4 years ago

    I love this channel

  • @davidsweeney111
    @davidsweeney111 6 years ago +18

    This needs to be essential viewing for EVERYONE.

  • @h0rban
    @h0rban 6 years ago

    You mentioned that a steep line can have a strong correlation, but no graphic supported it. Emphasis for viewers: slope and correlation are different things.

  • @flippersnyder
    @flippersnyder 6 years ago +4

    So this was great. You are definitely one of my favorite crash course hosts. And I took statistics back in 1994. I have one question that boggles me. When and who is right, who determines the reality or that there is causation?
    Example .... cigarette smoking and lung health. The negative effects are clearly visible, the correlation is there ... but is it really the cause? When and how do we get to a positive causality?
    Or is it left to the interpreter? Or is it just all relative? Or by the end of the day it's meaningless and everyone can make the statement "correlation doesn't equal causality" and your data and beautiful charts and correlations just fizzle out?

    • @24680kong
      @24680kong 6 years ago +2

      That's the tricky part! Ultimately they all need to be interpreted. Overall, there is no true "proof", just higher levels of confidence. I am confident that the city of Paris exists, even though I've never been there. The process generally starts by asking "is this even possible?" and "Does this make some sense?" Then you can go back and try to find some other cause of the data you got. Eventually, you have to do experiments carefully. But even well-planned experiments can have hiccups and biases (there have been many cases of seemingly high-confidence experiments not being repeatable by other professionals). Often, multiple experimenters need to come up with the same results on their own (and usually with their own equipment) before the scientific community is convinced. Overall, it's a difficult and time-consuming process.

    • @sammyinengland
      @sammyinengland 6 years ago +3

      In health data like the lung example, there is a set of criteria called the Bradford Hill criteria. Google it. These are criteria for determining whether something can be considered causation. It is not a checklist: you still need to do your own scientific interpretation. But it’s a good way to get an idea of whether the data you're looking at implies causation or not. The criteria are: effect size, consistency, specificity, temporality, biological plausibility, dose-response relationship, coherence, analogous results. Interestingly, Bradford Hill, who came up with this list, is the same Hill who co-authored the original Doll and Hill paper that established the link between smoking and lung cancer!

  • @thegodofalldragons
    @thegodofalldragons 6 years ago +2

    I've seen people both conflate correlation with causation in situations that are clearly coincidence and insist that correlation does not equal causation when the pattern of cause and effect is obvious.

  • @tvvt005
    @tvvt005 19 days ago

    Just noticed puppycat on her table! 💗

  • @maddijackson134
    @maddijackson134 6 years ago +2

    Please do more literature!!

  • @AnanthaSKrishnan
    @AnanthaSKrishnan 5 years ago

    @crash course team, not all the graphs in the Datasaurus Dozen shown at the end seem to have the same correlation coefficient. A few look like they have r=1, a few r=0. Please correct me if I'm wrong.

  • @UnknownRefrigerator
    @UnknownRefrigerator 6 years ago +1

    I love this series! However, you made one small lie: R^2 does not have to be between zero and one, but can in fact be negative.
    You spoke of mx + b, but failed to mention how the value of b is determined (if it is chosen horribly wrong, it can give you negative R^2 values, because the estimated model is then worse than random).
    Keep up the series! :)

    • @mishadonchenko4362
      @mishadonchenko4362 6 years ago +4

      Squares of real numbers are always nonnegative, by definition. They can never be less than zero -- the square of -5 is 25, for example.
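
A minimal sketch that may reconcile the two comments above, under one common convention: the coefficient of determination defined as R^2 = 1 - SS_res / SS_tot. For a least-squares line with an intercept, this equals the square of the correlation coefficient r and so lies between 0 and 1. For an arbitrary line that fits worse than simply predicting the mean, the same formula gives a negative number. The data and the "bad" intercept below are made up for illustration.

```python
import math

def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
sxx = sum((x - mx) ** 2 for x in xs)
syy = sum((y - my) ** 2 for y in ys)
sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))

# Least-squares line y = m*x + b.
m = sxy / sxx
b = my - m * mx
r = sxy / math.sqrt(sxx * syy)

r2_fit = r_squared(ys, [m * x + b for x in xs])

# Same slope, but a deliberately bad intercept (an arbitrary choice).
bad_b = 10.0
r2_bad = r_squared(ys, [m * x + bad_b for x in xs])

print(f"r^2 (squared correlation):       {r ** 2:.3f}")
print(f"R^2 of the least-squares line:   {r2_fit:.3f}")
print(f"R^2 of the badly placed line:    {r2_bad:.3f}")
```

So the squared correlation r^2 is never negative, while R^2 computed for a model that was not fit by least squares can be.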

  • @tvit
    @tvit 6 years ago

    Those movie computer tick noises (when charts are presented) drive me mad, and I don't even have EQ in my setup to damp them down. Good vid though!

  • @MakeMeThinkAgain
    @MakeMeThinkAgain 6 years ago +4

    I was TRICKED into watching this by the title. How hard would it be to add, "WARNING! THIS IS STATISTICS, DWEEB" to what appears on my temptation screen?
    It was really good.

  • @amohamoud3992
    @amohamoud3992 6 years ago +1

    While taking my stats course I started sleep talking and explained the empirical rule to my mom

  • @MrCanada4evr
    @MrCanada4evr 5 years ago +1

    Cool-Cage Act; hilarious.

  • @comareja4
    @comareja4 6 years ago

    It was hilarious, the data presented by the reporter.

  • @HrishikeshPalande
    @HrishikeshPalande 4 years ago

    We don't predict the temperature in Fahrenheit; we calculate it using the formula (c*9/5)+32

  • @mariafranciscalopez3694
    @mariafranciscalopez3694 5 years ago +1

    Me: focus, you have a test this week
    Also me: OMG PUPPYCAT!!

  • @elijahsassercollins3685
    @elijahsassercollins3685 4 years ago +3

    now go teach the media this so they can stop blaming video games for all the world's problems

  • @diamoniqueallen2231
    @diamoniqueallen2231 5 years ago +2

    The Bee and Puppy-cat doll in the back is sooo cute

  • @kevinye1041
    @kevinye1041 4 years ago +2

    Squared correlation r^2
    Line of regression
    Can anyone explain standard deviation in a little more depth? I'm still not sure what information it tells us in a scatter plot.

  • @NaihanchinKempo
    @NaihanchinKempo 6 years ago

    wish you'd touch on poker. Math and Data is very important in poker

  • @twiggyvlogs6441
    @twiggyvlogs6441 6 years ago

    Any chance of crash course architecture (history of?)

  • @rkpetry
    @rkpetry 6 years ago +1

    *_...there'd be a negative-correlation where reducing air conditioning increases swimming..._*
    *_...or, an overriding 'cause' leading to watching-speeding or doing-it, another, negrelation..._*
    *_...so...what's the mathematically-concisely-stated-statistical-rule for causality-guessing..._*
    *_...(making statistics, like modulo arithmetic: where compounded moduli may get better)..._*

  • @PatrickMichaelOLeary
    @PatrickMichaelOLeary 5 years ago +1

    At 0:27, it must have taken everything you had to not blink.

  • @SangoProductions213
    @SangoProductions213 6 years ago +5

    Correlation does not necessarily state that causation is found between two variables.
    However, don't walk away thinking correlation disproves causation. This isn't politics. There are more than two possibilities. (There are in politics too, but ignore that.) Thanks, and have a good day.
    As a final note: Time taken to get from point a to point b is negatively correlated with speed. There is (by definition no less) causation there.

    • @verdatum
      @verdatum 6 years ago +1

      Sango, that's a good tip. But I fear that addressing people as the "scientifically illiterate" might not be the best way to get your message across. (What I would give for Crash Course: Rhetoric).

    • @SangoProductions213
      @SangoProductions213 6 years ago

      Everyone was illiterate (scientific and otherwise) at one point. It is one's duty to make sure they do not continue to be.

    • @xsaberfaye
      @xsaberfaye 6 years ago

      There is no causation only chaos.

    • @verdatum
      @verdatum 6 years ago +1

      It is absolutely true that everyone begins illiterate, and there should be no shame in that. However, referring to people as such can cause them to misinterpret your message as being condescending, even though you had no intention to be that way. Regardless, they are now slighted, and in retaliation, they ignore your advice, no matter how reasonable it was.

    • @verdatum
      @verdatum 6 years ago

      kaizersabre, there is no Dana, only ZOOL.

  • @badcookies308
    @badcookies308 1 year ago

    PuppyCattttt!!!! so cute

  • @renovationgaming5438
    @renovationgaming5438 6 years ago +1

    The first eruption scatter plot has a typo

  • @malteeaser101
    @malteeaser101 4 years ago

    If A caused B then there is a correlation between A and B.
    The rising of the Sun caused the eating of an ice cream by John.
    Therefore, there is a correlation between the rising of the Sun and the eating of an ice cream by John.
    My question is, how would you quantify those events and plot the correlation between them on a graph? Would I count the number of times these events occurred? What if an event only causes another once? What if John died after the first ice cream? Can we still say that there was a correlation?

  • @StKozlovsky
    @StKozlovsky 4 years ago

    3:12
    > Hummer, the epitome of in-your-face Americanness
    > Russian license plate

  • @childfs6865
    @childfs6865 6 years ago +18

    Comment containing the word EVERYONE in caps lock.

  • @PatrickMichaelOLeary
    @PatrickMichaelOLeary 5 years ago

    In Pearson's study, did he take into account that people often shrink as they get older?

  • @fame2011xoxo
    @fame2011xoxo 6 years ago

    Does anyone know how to interpret a Bland-Altman analysis?

  • @brittbrat756
    @brittbrat756 5 years ago

    omg! PUPPYCAT 😭💗

  • @dbuyandelger
    @dbuyandelger 6 years ago

    Hmm. I may have needed this video 2 years ago when I was toiling in the halls of grad school

  • @yetigriff
    @yetigriff 6 years ago +12

    That's not the graph Jim Carrey and Jenny McCarthy showed me.

  • @ternvall
    @ternvall 6 years ago +4

    y = mx + b , is this some American standard? In Sweden it's y=kx+m

    • @HeinerS
      @HeinerS 6 years ago +1

      It doesn't really matter either way. The general consensus is that the last letters from the Latin alphabet, i.e. x, y and z, are used as placeholders for unknown quantities, whereas letters from the beginning (e.g. a, b and c) or middle (e.g. k, l, m and n) are used as placeholders for known quantities (to be supplied or deduced when doing a specific example). The placeholders for known quantities may be different in different countries for many reasons (ease of pronunciation, legibility, tradition, etc.). Tradition usually also means that often the same equation uses different placeholders in math and physics. Example: in Math class they may use y = ax + b, in Physics class they may use y = mx + c, just because ... (and then of course in the kinetic equations this becomes e.g. v = at + v0 representing physical quantities).

  • @robertpalumbo9089
    @robertpalumbo9089 6 years ago

    Wow this is my doctor and his funny science

  • @AdamShaiken
    @AdamShaiken 6 years ago +8

    This was very interesting...though, I wonder, just how significant it is ? Can you give me a chi squared on that ?

  • @redstone8513
    @redstone8513 6 years ago

    1:20 They spelt eruptions wrong on the y-axis...

  • @surajshahi4966
    @surajshahi4966 4 years ago

    Beginning and end were in a circle.

  • @ThePeaceableKingdom
    @ThePeaceableKingdom 6 years ago

    The other shoe never dropped! So what does equal causation? :)

  • @nickwilsonxc
    @nickwilsonxc 5 years ago +1

    I’ll have you know that my cat, Mr. Whiskers, loves me.

  • @omarkhalaf7014
    @omarkhalaf7014 4 years ago +3

    Wait... Technically everything is connected. Maybe 2 variables are correlated even though it doesn't make sense that they cause each other, but that happens because these 2 variables are connected to other variables that we didn't observe, yet these variables can indirectly influence the relationship between the main 2 variables we are comparing. So I guess that means, one way or another, correlation DOES imply causation. Error 404

  • @DudeWhoSaysDeez
    @DudeWhoSaysDeez 6 years ago

    Are regression lines ever parabolic?
    What would be some examples if so?

    • @dabomba1951
      @dabomba1951 6 years ago

      One example: the optimum angle for maximum range. Range in terms of angle would have a turning point around 45 degrees, where it reaches its max range and then goes back down.

    • @chelseaparlett8069
      @chelseaparlett8069 6 years ago

      you can use a parabola to fit data. It would be polynomial regression where your x's are taken to various powers. Sometimes it's really useful to do so, since often data isn't perfectly linear.
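
A minimal sketch of the polynomial-regression idea in this thread, applied to the launch-angle example. It assumes an idealized projectile with no air resistance, where range is proportional to sin(2 × angle); the launch speed `V0` and the helper `fit_quadratic` are hypothetical names chosen for illustration. The true curve is a sine, not a parabola, so the quadratic is only an approximation, but its turning point still lands at about 45 degrees.

```python
import math

def fit_quadratic(xs, ys):
    """Least-squares fit of y = a*x^2 + b*x + c via the normal equations."""
    # Sums of powers of x, and of x^k * y, that make up the normal equations.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum((x ** k) * y for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented 3x3 system for the unknowns (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda row: abs(m[row][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, 3):
            factor = m[row][col] / m[col][col]
            for k in range(col, 4):
                m[row][k] -= factor * m[col][k]
    # Back substitution.
    coeffs = [0.0, 0.0, 0.0]
    for row in (2, 1, 0):
        known = sum(m[row][k] * coeffs[k] for k in range(row + 1, 3))
        coeffs[row] = (m[row][3] - known) / m[row][row]
    return tuple(coeffs)  # (a, b, c)

G = 9.81    # gravitational acceleration in m/s^2
V0 = 20.0   # hypothetical launch speed in m/s

# Launch angles in degrees, spaced symmetrically around 45.
angles = list(range(5, 90, 10))
# Ideal range of a projectile on flat ground: V0^2 * sin(2*theta) / g.
ranges = [V0 ** 2 / G * math.sin(2 * math.radians(a)) for a in angles]

a, b, c = fit_quadratic(angles, ranges)
vertex_angle = -b / (2 * a)  # angle at which the fitted parabola peaks

print(f"fitted curve: range = {a:.4f}*angle^2 + {b:.4f}*angle + {c:.4f}")
print(f"turning point of the fit: {vertex_angle:.2f} degrees")
```

A straight line fitted to the same points would be nearly flat, since the rise before 45 degrees cancels the fall after it. This is one reason a low linear correlation does not rule out a strong nonlinear relationship.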

  • @z4m01
    @z4m01 6 years ago

    You can "learn" more spurious correlations here:
    www.tylervigen.com/spurious-correlations
    and even discover new correlations here!
    tylervigen.com/discover

  • @gregoryfenn1462
    @gregoryfenn1462 6 years ago

    Can we do a talk on how you DO identify causation, not just rule out plausible causal relations? Or are we taking a Humean view of causation and saying there is no real force of causation at all, just a fixed regularity that humans imagine happens?

    • @CarlyDayDay
      @CarlyDayDay 6 years ago

      I think it requires an experimental study

  • @danielmclaughlin5573
    @danielmclaughlin5573 6 years ago

    Mr. Fluffy does not miss me.
    Mr. Fluffy ran away...

  • @ibnufajar8733
    @ibnufajar8733 6 years ago +2

    does the "r²=0.7" mean that we could predict accurately by 70% ?

  • @unacomn
    @unacomn 6 years ago +5

    I don't know, Nic Cage may be dragging people to the deep after they see his movies. The evidence is there.

  • @moonemonne2318
    @moonemonne2318 4 years ago

    Is that a puppycat omg i wish to have it too

  • @rparl
    @rparl 5 years ago

    But if you have a time turner. . . ?

  • @ZoggFromBetelgeuse
    @ZoggFromBetelgeuse 4 years ago

    I watched this video without having seen the previous ones, and spent a considerable amount of time wondering "what the heck is an 'old faithful eruption' ?"

    • @ZoggFromBetelgeuse
      @ZoggFromBetelgeuse 4 years ago

      (For those who have the same problem: "Old Faithful" seems to be the name of a geyser. I don't know where it is, but when an English YouTube show refers to a location, person, event or sports ritual you have never heard of, you can be pretty sure it's in North America.)

  • @bigpapi3636
    @bigpapi3636 6 years ago

    The narrator is very easy to listen to. Even I understood the content

  • @muhammadabdulhakeem7152

    I would like to confirm: is the equation of a line y=mx+b or y=mx+c?