9. Understanding Experimental Data

  • Published 18 May 2017
  • MIT 6.0002 Introduction to Computational Thinking and Data Science, Fall 2016
    View the complete course: ocw.mit.edu/6-0002F16
    Instructor: Eric Grimson
    Prof. Grimson talks about how to model experimental data in a way that gives a sense of the underlying mechanism and to predict behavior in new settings.
    License: Creative Commons BY-NC-SA
    More information at ocw.mit.edu/terms
    More courses at ocw.mit.edu

Comments • 26

  • @mikets42
    @mikets42 1 year ago +2

    "Regression" does not relate to error minimization. The term first appeared in an article describing the statistics of people's height across generations. If a father was tall, his son would likely be taller than average, but less so than the father, because of "regression to the mean". See The Art of Statistics: Learning from Data by David Spiegelhalter for more details.

  • @shobhamourya8396
    @shobhamourya8396 5 years ago +7

    @44:44 Best-ever explanation of the coefficient of determination R^2 and the variability it explains

  • @haneulkim4902
    @haneulkim4902 3 years ago +1

    Fun, on point, and in-depth lecture. Thank you, MIT.

  • @leixun
    @leixun 3 years ago +4

    *My takeaways:*
    1. An example: spring model 3:43
    2. Coefficient of determination 38:03

  • @nealyee6160
    @nealyee6160 6 years ago +8

    These jokes are so cool that I would hang out with them for sure

  • @ParisienDBS
    @ParisienDBS 7 years ago +4

    Out of curiosity, at 19:01, what would minimizing the area of the triangle result in, as opposed to minimizing the distance y?

    • @mtp1376
      @mtp1376 5 years ago

      Since it contains an x difference, I think the result would not mean anything significant.

    • @rsd2dcc
      @rsd2dcc 5 years ago

      It has nothing to do with the area of the triangle. You are trying to find the best line, the one at minimum distance from the observed values. That means you are trying to minimize the y distance in the picture.
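
The point made in the replies above, that the fit minimizes the vertical (y) distances, can be illustrated with a minimal sketch. It assumes NumPy is available and uses made-up data points that are not from the lecture; `sum_squared_vertical` is a hypothetical helper name, and the objective shown is the usual sum of squared vertical distances that `np.polyfit` minimizes.

```python
import numpy as np

# Hypothetical data scattered around y = 2x + 1 (not from the lecture).
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

def sum_squared_vertical(slope, intercept, x, y):
    """Sum of squared vertical (y) distances from the points to the line."""
    predicted = slope * x + intercept
    return float(np.sum((y - predicted) ** 2))

# np.polyfit with degree 1 returns the least-squares slope and intercept.
slope, intercept = np.polyfit(x, y, 1)
best = sum_squared_vertical(slope, intercept, x, y)

# Nudging the slope away from the fitted value makes the objective worse.
nudged = sum_squared_vertical(slope + 0.1, intercept, x, y)

print(f"fitted line: y = {slope:.2f}x + {intercept:.2f}")
print(f"objective at fit: {best:.4f}, after nudging slope: {nudged:.4f}")
```

Only the y differences enter the objective, so no other line scores lower on it for these points.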

  • @binaria010
    @binaria010 4 years ago +1

    Great lecture!

  • @haneulkim4902
    @haneulkim4902 3 years ago

    @18:37 is he referring to line P?

  • @o3bvv
    @o3bvv 3 years ago +3

    Trivia: when dealing with real data, one might not want R^2 to get close to 1, as that might indicate overfitting, which is really not good, especially for prediction models. The case of the 16-degree polynomial illustrates this nicely.

    • @frankieboyseje
      @frankieboyseje 1 year ago

      Anything over a 5-degree polynomial is extremely rare in mathematics; rather, do a non-parametric / non-linear fit.
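
The overfitting concern raised in this thread can be sketched as follows. This is an illustration under stated assumptions, not the lecture's code: the data are synthetic (a quadratic plus Gaussian noise), the degrees 2 and 8 are arbitrary choices rather than the lecture's 16, and `r_squared` and `make_data` are hypothetical helper names.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination, computed as 1 - RSS/TSS."""
    rss = np.sum((observed - predicted) ** 2)
    tss = np.sum((observed - np.mean(observed)) ** 2)
    return 1.0 - rss / tss

rng = np.random.default_rng(0)  # seeded so repeated runs match

def make_data(n):
    """Synthetic quadratic trend plus noise (an assumption for illustration)."""
    x = np.linspace(-1.0, 1.0, n)
    y = 3.0 * x**2 - x + rng.normal(0.0, 0.3, n)
    return x, y

x_train, y_train = make_data(12)
x_test, y_test = make_data(12)  # same x values, fresh noise

results = {}
for degree in (2, 8):
    coeffs = np.polyfit(x_train, y_train, degree)
    results[degree] = (
        r_squared(y_train, np.polyval(coeffs, x_train)),  # fit quality
        r_squared(y_test, np.polyval(coeffs, x_test)),    # predictive quality
    )
    print(f"degree {degree}: train R^2 = {results[degree][0]:.3f}, "
          f"test R^2 = {results[degree][1]:.3f}")
```

The higher-degree fit never scores lower on the training data, because it contains the lower-degree fit as a special case. The number to watch is R^2 on the fresh data, which is what matters for a prediction model.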

  • @cjlion7081
    @cjlion7081 4 years ago +1

    would have been nice to see the slides

    • @tobalaba
      @tobalaba 4 years ago +5

      Here: ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-0002-introduction-to-computational-thinking-and-data-science-fall-2016/lecture-slides-and-files/MIT6_0002F16_lec9.pdf

  • @kamellogb
    @kamellogb 6 years ago +2

    cracked up with those jokes

  • @quocvu9847
    @quocvu9847 8 months ago

    38:28

  • @Zinzin09
    @Zinzin09 7 years ago +3

    Love the jokes!

  • @danielstankiewicz3747
    @danielstankiewicz3747 3 years ago +2

    ROFL because of the spring joke!

  • @xianhaozhu5315
    @xianhaozhu5315 4 years ago +1

    Not sure if R^2 is always positive.

    • @nbgarrett88
      @nbgarrett88 4 years ago

      R squared is the ratio of explained variance to total variance, so it falls between 0 and 1. It records the amount of variance (error) explained by the model.

    • @fredfeng1518
      @fredfeng1518 4 years ago +4

      By definition (R^2 = 1 - RSS/TSS), R^2 will be negative when the model is worse than a "mean model" (y_hat = y_bar). In general, a model can be arbitrarily bad (RSS >> TSS), so R^2 can certainly be negative.

    • @nbgarrett88
      @nbgarrett88 4 years ago +1

      Thank you, @fredfeng1518. I have been looking into this more to understand it better. Rhetorically, why are we being taught that the range is 0-1? Is it just more practical? Admittedly, I am new to the field and only have a grasp of the basic concepts, but I can find many resources I would consider credible that state R^2 is definitively 0-1: "It's a proportion," "It's a squared term," etc. Is this contentious? Are negative R^2 values more theoretical, and so rare that they aren't worth discussing?
      Anyways, thank you for elucidating the point and setting me straight. I will try to understand this better.

    • @fredfeng1518
      @fredfeng1518 4 years ago +4

      @nbgarrett88 No problem. This is indeed more on the theoretical side. In practice, any useful model would have a positive R^2, because if it performs even worse than the mean model (in which case RSS > TSS, and thus a negative R^2), we could simply pick the mean model instead, which is always at our disposal.
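
The definition quoted in this thread, R^2 = 1 - RSS/TSS, can be checked numerically. The sketch below assumes NumPy and uses made-up observations; the "mean model" and the deliberately bad model are illustrations, and `r_squared` is a hypothetical helper name.

```python
import numpy as np

def r_squared(observed, predicted):
    """Coefficient of determination, computed as 1 - RSS/TSS."""
    rss = np.sum((observed - predicted) ** 2)
    tss = np.sum((observed - np.mean(observed)) ** 2)
    return 1.0 - rss / tss

# Hypothetical observations (not from the lecture).
y = np.array([1.0, 2.0, 3.0, 4.0, 5.0])

perfect_model = y.copy()                # predicts every value exactly
mean_model = np.full_like(y, y.mean())  # always predicts y_bar = 3
bad_model = y[::-1].copy()              # predicts the trend backwards

print(r_squared(y, perfect_model))  # 1.0  (RSS = 0)
print(r_squared(y, mean_model))     # 0.0  (RSS = TSS = 10)
print(r_squared(y, bad_model))      # -3.0 (RSS = 40, TSS = 10)
```

The backwards model has RSS four times TSS, so its R^2 is negative, matching the case described above where a model does worse than the mean model.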

  • @programmer1010
    @programmer1010 1 year ago

    32:28