Confidence Intervals, Clearly Explained!!!

  • Added July 8, 2015
  • Confidence Intervals can be confusing, but with bootstrapping, they are a piece of cake. BAM!
    For a complete index of all the StatQuest videos, check out:
    statquest.org/video-index/
    If you'd like to support StatQuest, please consider...
    Buying The StatQuest Illustrated Guide to Machine Learning!!!
    PDF - statquest.gumroad.com/l/wvtmc
    Paperback - www.amazon.com/dp/B09ZCKR4H6
    Kindle eBook - www.amazon.com/dp/B09ZG79HXC
    Patreon: / statquest
    ...or...
    YouTube Membership: / @statquest
    ...a cool StatQuest t-shirt or sweatshirt:
    shop.spreadshirt.com/statques...
    ...buying one or two of my songs (or go large and get a whole album!)
    joshuastarmer.bandcamp.com/
    ...or just donating to StatQuest!
    www.paypal.me/statquest
    Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
    / joshuastarmer
    #statquest #statistics #confidenceinterval

Comments • 335

  • @statquest
    @statquest 2 years ago +3

    Support StatQuest by buying my book The StatQuest Illustrated Guide to Machine Learning or a Study Guide or Merch!!! statquest.org/statquest-store/

  • @kylebecker5083
    @kylebecker5083 3 years ago +157

    Josh, when I finish the StatQuest Statistics Fundamentals playlist, will you send me a BAM certificate? I want to be BAM certified.

    • @statquest
      @statquest 3 years ago +50

      BAM!!! One day I will make certificates. :)

    • @reetanshukumar1865
      @reetanshukumar1865 11 months ago +4

      @@statquest can I make one for you 👻

    • @Wahkyascene
      @Wahkyascene 2 months ago

      Oh! He will get your BAM certified

  • @joarvat
    @joarvat 4 years ago +170

    Guys, I can't believe you are doing all this. I am trying to break into the field of data science, and your videos are really great because you are doing it in such an entertaining way. Big thank you!

    • @statquest
      @statquest 4 years ago +14

      Thanks and good luck with Data Science! :)

    • @Dupamine
      @Dupamine 4 years ago +3

      how is it going man

    • @joarvat
      @joarvat 4 years ago +1

      @@Dupamine I am still just in the beginning, but I have just started in my first analyst job

    • @applepeel1662
      @applepeel1662 3 years ago

      @@joarvat hey I'm 20 rn and learning all bout statistics for data science. Is it worth it?

    • @joarvat
      @joarvat 3 years ago

      @@applepeel1662 I have got my first analyst job, and then I decided to go through the Data Science track with dataquest.io It's a great program.

  • @kurosakishusuke1748
    @kurosakishusuke1748 2 years ago +7

    Months ago, for my thesis, I had the idea of examining how machine learning predictions vary with different sample orders in the train and test sets when bootstrapping is used. But after spending too much time thinking about how to clearly communicate the CI results to my supervisor, I jumped to this video, and it is exactly what I had been searching for. You have my deep gratitude!

  • @06Amruta
    @06Amruta 3 years ago +23

    Respect and gratitude to you!! Your videos are in my interview prep playlist! Thanks so much for making math understandable!!

    • @statquest
      @statquest 3 years ago +2

      Good luck with your interviews! Let me know how they go.

  • @yonatansegal1615
    @yonatansegal1615 2 years ago +13

    I am a medical student with a Bachelor's in Science, and this is possibly the only stats tutorial I have EVER been able to understand!!! Thank you

  • @marshalljordan2416
    @marshalljordan2416 3 years ago +4

    Thanks for such a clear explanation of bootstrapping and confidence intervals. The two concepts do go together so that understanding bootstrapping makes confidence intervals and their interpretation easy to understand.

  • @annel5546
    @annel5546 3 years ago +16

    I'm currently writing my bachelor thesis and this video helped me a lot, thank you! What I like most is, that it's not too long and on point. Moreover, I'm not a native English speaker, but the video was very clear and easily understandable.

  • @siyuguo3300
    @siyuguo3300 4 years ago +8

    I learned stat for 6 years, and this is the best tutorial about CI. Thank you very much.

  • @vincentlin9926
    @vincentlin9926 2 years ago +2

    You are a true lifesaver for a person like myself who needs such knowledge but never had a chance to get educated in school… thank you.

  • @Birdsneverfly
    @Birdsneverfly 5 years ago +1

    Wonderful series.
    Thank you for sharing your knowledge.

  • @siddhft3001
    @siddhft3001 3 years ago +2

    This is by far one of the best videos I've seen. Thank you so much!

  • @user-ql1qw7gu7y
    @user-ql1qw7gu7y 2 years ago +8

    You know, I am a data scientist and have worked in the banking sphere for 3 years. I noticed your videos in my YouTube recommendation section and was like: "easily explained? Ye ye haha just another video for those who hope to easily learn ML and statistics, well let me watch it during my breakfast". And I was shocked. I realized that the use of Python made me completely blind to some connections between measurements. Sometimes I run tests without any true understanding. For example, those last 10 seconds about "when we should run t-test" were completely new for me! And the same can be said about a lot of your videos. There is always a tiny detail that makes me say "oh, wow, that was something I've never noticed".
    You should definitely run a course on Coursera...

    • @statquest
      @statquest 2 years ago

      Wow!!! Thank you very much!!! :)

  • @nicolethm2002
    @nicolethm2002 7 months ago +1

    This was great. I’m taking stats and finding it complicated, but this broke down the basics the way it was supposed to. Thanks 🙏🏾

  • @healingmyselfalone
    @healingmyselfalone 9 months ago +1

    I usually don't comment on YT videos, but I'm eternally grateful that you are posting such incredibly useful videos. Thank you very much!!! God Bless!!!!

  • @kuanjuchiu9450
    @kuanjuchiu9450 5 years ago +16

    This saves the world, thank you so much

    • @statquest
      @statquest 5 years ago +4

      Thank you! I'm glad I could save the world! I thought only Spider-Man could do that. ;)

  • @z8709
    @z8709 4 years ago +4

    I am also a fan and I highly recommend the videos from StatQuest to the students in my class.

    • @statquest
      @statquest 4 years ago

      Awesome! Thank you very much! :)

  • @leonardogoes683
    @leonardogoes683 5 years ago +12

    This video helped me to better understand the p-value besides the confidence intervals.

    • @billytheweasel
      @billytheweasel 2 years ago

      That _was_ good, very clearly related. The introduction of the term 't-test' threw me however.

  • @rrrprogram8667
    @rrrprogram8667 6 years ago +115

    I am trying to complete all ur videos

  • @CE-wg5gn
    @CE-wg5gn 3 years ago +7

    If I ever finish my PhD, I probably need to credit you for all the knowledge I have about statistics. And I actually learned this stuff beforehand.

    • @statquest
      @statquest 3 years ago +4

      Good luck finishing your PhD! You can do it!!! :)

  • @jorgevalero4819
    @jorgevalero4819 1 year ago +2

    Thanks so much. I have been working on hydrology for many years and finally I understood this valuable concept.

  • @baay81
    @baay81 7 months ago +1

    thanks for sharing. will use this example on my students for sure. will link to the video, of course

  • @SergeySenigov
    @SergeySenigov 9 months ago +1

    Josh, you are a genius. I finally got the idea of how t-tests are made and how confidence intervals and p-values relate to each other! And why one can simply check whether the "0" statistic falls inside the confidence interval!

  • @DreamCodeLove
    @DreamCodeLove 4 years ago +4

    One of the best tutorials I have watched on the net, paid or otherwise...

  • @emelyannett
    @emelyannett 4 years ago +2

    This is so helpful. Thank you

  • @dewinmoonl
    @dewinmoonl 2 years ago +1

    another exciting quest complete!

  • @destinnguon4877
    @destinnguon4877 8 months ago +1

    Excellent, thank you so much!

  • @apnp6787
    @apnp6787 2 years ago +1

    Dude, beautifully and simply explained!

  • @nikolenarepousi8189
    @nikolenarepousi8189 1 year ago +1

    Amazing! Thank youuuu

  • @luchan1638
    @luchan1638 1 year ago +1

    your videos are god sent

  • @redcat7467
    @redcat7467 2 years ago +2

    Mr. Josh Starmer's singing abilities have significantly advanced since 2015.

  • @Hersh0828
    @Hersh0828 2 years ago +1

    You are a godsend Josh!

  • @jjbotha6242
    @jjbotha6242 1 year ago +1

    Excellent!!

  • @viktorsemenov7208
    @viktorsemenov7208 4 months ago +1

    it is brilliant. thanks!

  • @LittleMonsterswiftie
    @LittleMonsterswiftie 3 years ago +1

    Very clear! thank you

  • @ashimay4722
    @ashimay4722 3 years ago +1

    Amazing explanation..!!

  • @manuelargos
    @manuelargos 2 years ago +1

    YOU ARE THE BEST OUT THERE!

  • @Majso11
    @Majso11 2 years ago +4

    Im gonna pass my stat exam thanks to you, you explain it so well :'))))))

  • @marioestrada2233
    @marioestrada2233 2 years ago +1

    Thanks, confidence intervals seemed so tricky! Till now!!!

  • @AcademiaDados
    @AcademiaDados 2 years ago +1

    Beautiful.

  • @response2u
    @response2u 1 year ago +1

    Thank you sir!

  • @brienwashington4019
    @brienwashington4019 2 years ago +1

    This is so simple and eloquent.

  • @xnoreq
    @xnoreq 5 years ago +1

    Your example also shows how backwards confidence intervals and p-values are.
    You already assume a mean of ~26. But you end up calculating a p-value to make a probabilistic statement about the mean being lower than ~21 ... given samples from a distribution with a fixed mean of ~26.

  • @ryan_chew97
    @ryan_chew97 4 years ago +1

    this is the best LOL too simple and easy to understand

  • @henriqueazank5254
    @henriqueazank5254 4 years ago +1

    I'm currently speedrunning all your videos

  • @chrishayward7969
    @chrishayward7969 3 years ago +1

    Fantastic video :)

  • @bnv8514
    @bnv8514 3 years ago +6

    "A 95% confidence interval is just an interval that covers 95% of the means." 😁

  • @littlesparkle3938
    @littlesparkle3938 5 years ago +1

    Thank you so much

  • @mandeepbaluja5401
    @mandeepbaluja5401 4 years ago +1

    Very nice man

  • @ritika.upadhyay
    @ritika.upadhyay 5 years ago +6

    Hi Josh! Great videos (I'm currently on a StatQuest marathon and it has been incredibly helpful!)
    I have a question though. Could you explain the bit about p-value being less than 0.05 in case of the weights of female and male mice? Instinctively I understand that there's a statistical difference between the true means of these two but I'm struggling to relate it to the idea of p-value.
    Thank you!

    • @funny__bean
      @funny__bean 4 years ago

      Same thing occurred to me!! In a 95% confidence interval, the remaining 5% is the part that does not cover the means, right? If so, how does that make the p-value significant?

    • @busyshah
      @busyshah 3 years ago +1

      By definition p-value denotes a probability of (something other which is equally rare + something rarer than null hypothesis) happening. Here the null hypothesis is that the means for both male and female mice are from same population. But we already know that 95% confidence intervals of both male and female means don't coincide. So there is only one possibility left that less than 5% cases will have the possibility of their means coinciding. Which is why p-value is

  • @pupface
    @pupface 1 year ago +1

    Thank you. It's crazy how nobody else seems able to explain this clearly

  • @vahidnajafzadeh4137
    @vahidnajafzadeh4137 3 years ago +1

    I consider myself one of the most stupidest people on earth in learning stats. and yet here I understood the CI concept very well. a big fat thank you to you 😊.

  • @mmk34
    @mmk34 3 years ago

    Josh, I love the line diagrams you use in your illustrations, how do you put these together?

    • @statquest
      @statquest 3 years ago

      For details on how I create my videos, see: czcams.com/video/crLXJG-EAhk/video.html

  • @Tiffahorror
    @Tiffahorror 3 years ago

    Thank you for explaining this like a normal person and not like you're teaching people who already know how to do it.

  • @BibleSamurai
    @BibleSamurai 2 years ago +1

    the humor is great

  • @globalshooky5030
    @globalshooky5030 2 months ago +1

    youre a life saver

  • @foedeer
    @foedeer 1 year ago +1

    You mean the world to me man.

  • @dharam8060
    @dharam8060 5 years ago +1

    How do we calculate 95% cover from the Bootstrap means?

  • @adirozeri7162
    @adirozeri7162 1 year ago

    Thank you so much for the explanation! i have one question tho - could there be more that a single 95 interval for the example above and does it matter? how do you construct it? thanks!

    • @statquest
      @statquest 1 year ago

      Any interval that covers 95% of the bootstrapped means qualifies, but usually you select the one that is centered on the original mean.
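
The reply above can be sketched in a few lines of Python. This is only a minimal illustration, not the method used in the video: the weights are made up, the helper names `bootstrap_means` and `percentile_interval` are hypothetical, and the interval is chosen by cutting off equal tails (the common percentile approach), which is one of the many intervals that cover 95% of the bootstrapped means.

```python
import random
import statistics

def bootstrap_means(sample, n_boot=10_000, seed=0):
    """Means of n_boot resamples, each drawn with replacement and the same size as sample."""
    rng = random.Random(seed)
    n = len(sample)
    return [statistics.fmean(rng.choices(sample, k=n)) for _ in range(n_boot)]

def percentile_interval(values, coverage=0.95):
    """Equal-tailed interval that covers roughly `coverage` of the values."""
    ordered = sorted(values)
    tail = (1.0 - coverage) / 2.0
    lo_idx = round(tail * len(ordered))
    hi_idx = round((1.0 - tail) * len(ordered)) - 1
    return ordered[lo_idx], ordered[hi_idx]

# Hypothetical weights (grams) for 12 female mice; not the video's data.
weights = [20.1, 22.3, 24.8, 25.5, 26.0, 26.4, 27.1, 27.9, 28.3, 29.0, 30.2, 31.5]

means = bootstrap_means(weights)
low, high = percentile_interval(means)
print(f"sample mean = {statistics.fmean(weights):.2f}")
print(f"95% bootstrap interval = ({low:.2f}, {high:.2f})")
```

Changing `coverage` to 0.99 or 0.90 gives the interval for a different confidence level with the same bootstrapped means.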

  • @ellenpasternack9750
    @ellenpasternack9750 3 years ago

    I love this video! I have a question about comparing the confidence intervals for different distributions. Let's say you want to use the confidence intervals from the female vs male mice to make a statement about the confidence interval of the difference in mass between the sexes, how could you go about doing that? As in, 'it would take a really unusual female mouse and a really unusual male mouse, such that the probability of both being chosen is 5% or less, to get a difference more than y or less than z'

    • @statquest
      @statquest 3 years ago +1

      I'm not sure how to do that with confidence intervals, but we could estimate it from the data by randomly selecting a female mouse and a male mouse and measuring the difference in mass. Do that a lot of times (if the dataset is relatively small - do it for every permutation of pairs - if larger, just do a lot of random sampling) and then plot a histogram of those differences and use the histogram to calculate the probabilities of getting differences between y and z.
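
A rough sketch of the pairing idea in the reply above, for the small-dataset case where every female and male pairing can be enumerated. The weights, the bounds `y` and `z`, and the helper `fraction_between` are all hypothetical and only there for illustration.

```python
import itertools

# Hypothetical weights in grams; not data from the video.
female = [20.1, 22.3, 24.8, 25.5, 26.0, 26.4]
male = [30.5, 32.0, 33.4, 34.1, 35.8, 36.2]

# The dataset is small, so enumerate every (female, male) pair.
differences = [m - f for f, m in itertools.product(female, male)]

def fraction_between(values, y, z):
    """Share of values that fall in the closed range [y, z]."""
    return sum(y <= v <= z for v in values) / len(values)

y, z = 5.0, 15.0  # hypothetical bounds on the male-minus-female difference
share = fraction_between(differences, y, z)
print(f"{len(differences)} pairs, fraction with difference in [{y}, {z}] = {share:.3f}")
```

For a larger dataset, the list of all pairs could be replaced by many randomly drawn pairs, as the reply suggests.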

  • @alexandergarcia6479
    @alexandergarcia6479 3 years ago

    Hi joshua, thanks for the video, can you tell me what do you use to make those sample plots? I don't find that tool in python, thank you so much

    • @statquest
      @statquest 3 years ago

      For details on how I create the images, see: czcams.com/video/crLXJG-EAhk/video.html

  • @haoranzhang3993
    @haoranzhang3993 2 years ago +4

    Hi Josh, thank you for the nice video! One quick question: I learnt that the interpretation of a 95% confidence interval is that 95% of confidence intervals will contain the true mean (i.e. if we have n=100 random samples of size 5, about 95 of the confidence intervals will contain the true mean). It seems different from your explanation here?

    • @statquest
      @statquest 2 years ago +3

      It's the same. However, we are arriving at the confidence interval differently and we need to make sure we don't confuse a bootstrapped mean for a population mean. The interval that contains 95% of the bootstrapped means is a 95% CI, and thus, if we repeated the process a bunch of times, 95% of the intervals calculated that way will contain the population mean.

  • @matthewdong9368
    @matthewdong9368 5 years ago +2

    In the example where you want to get the p-value for true mean less than 20, and the result is less than 0.05. Does that mean it's very unlikely that the true mean is less than 20? Thanks!

  • @user-sl9wi7tl4f
    @user-sl9wi7tl4f 4 years ago

    Thank you for your wonderful video, here I have a question. When 95% CIs do not overlap, we can say there is a significant difference between the two sample sets. Can we also conclude there is a significant difference when the SDs of the two sets do not overlap? And how about the SEM? Hope for your reply. :)

    • @statquest
      @statquest 4 years ago

      95% confidence intervals reflect the SEM, rather than the standard deviation of the raw data. For more details, see: czcams.com/video/A82brFpdr9g/video.html

  • @jimmyxu1735
    @jimmyxu1735 4 years ago

    Hello Sir, great videos, thanks. One quick question, is this one tailed or two tailed p-value? if two tailed, then the p-value would be 0.025 given 95% CI. Please clarify, thanks a lot again, J

    • @statquest
      @statquest 4 years ago +1

      95% confidence intervals are not 1 or two tailed p-values, they are intervals. 95% of them will cover the true mean.

  • @rprana12777
    @rprana12777 5 years ago +1

    I like this

  • @shilupangrak1593
    @shilupangrak1593 3 years ago +1

    great!! statquest apps will be a good platform

  • @sinadehesh6884
    @sinadehesh6884 1 year ago

    Love you❤

  • @deuteros
    @deuteros 2 years ago

    Thank you, Josh. Great video. However, I don't know how to calculate the confidence interval. Is it calculated through 2 times the standard deviation of the mean of the sample means?

    • @statquest
      @statquest 2 years ago

      There are lots of formulas for calculating confidence intervals. Conceptually, the easiest one to remember is bootstrapping, however there are lots of other formulas you can use. For details, see: www.statisticshowto.com/probability-and-statistics/confidence-interval/

    • @deuteros
      @deuteros 2 years ago +1

      @@statquest Thanks! I will check that out. Cheers

  • @TheTessatje123
    @TheTessatje123 1 year ago

    Thanks for the video! Are confidence intervals always defined for distributions of the sample means (i.e. means obtained by bootstrapping)? Or can you also calculate them for one single experiment? Or the means of multiple experiments without bootstrapping?

    • @statquest
      @statquest 1 year ago +1

      Because of the central limit theorem, all means are normally distributed, so there is a closed form equation for all confidence intervals based on that and you don't need bootstrapping. In other words, you can calculate the CI with a single set of measurements. However, I believe the concept is easier to understand with bootstrapping.

    • @TheTessatje123
      @TheTessatje123 1 year ago +1

      @@statquest I see, thanks!

  • @alexanderlewzey1102
    @alexanderlewzey1102 5 years ago

    i'm a little confused, is it true that if you use bootstrapping of a sample that will only tell you with what confidence you can state the mean of that particular sample? wouldnt you need to know the population standard deviation to get the confidence intervals for a sample from the population?

    • @alexanderlewzey1102
      @alexanderlewzey1102 5 years ago +1

      i think i've worked it out, i wasn't adjusting the margin of error in accordance with the sample size, i.e. changing the t/z value; that is, when your sample is really small the confidence interval becomes massive to account for that. Either way i would still like to hear your answer if you have time, thanks.

  • @theforester_
    @theforester_ 2 years ago +1

    i lost it when u said u didn't weigh every single female mouse on the planet, just twelve... hahaha thanks anyway

  • @viktormaximiliandistaturus7660
    @viktormaximiliandistaturus7660 7 months ago +1

    you deserve a like

  • @saudzaman1243
    @saudzaman1243 3 years ago +1

    So if p value for a sample is < 0.05, does that imply that the sample is not a good representative of the population?

    • @statquest
      @statquest 3 years ago +2

      It suggests that the sample may come from a different population than the one you think you are collecting it from.

  • @ai1888
    @ai1888 6 years ago

    Is it always true that you don't need any other statistical tests for two distributions with non-overlapping confidence intervals?

  • @mizmayo
    @mizmayo 2 years ago +1

    I like that these are silly :)

  • @minhaoling3056
    @minhaoling3056 2 years ago

    Hi Sir, can you make series of videos for Bayesian inference & Bayesian credible interval ?

    • @statquest
      @statquest 2 years ago

      I hope to do that in the spring.

  • @edydev6775
    @edydev6775 3 years ago

    Josh, although you're crystal clear, I still don't get the following point: according to my interval, I have a range of weights that can be considered an estimate of the true mean. But now I get this weight of 20, and I know that it is outside my interval, so it's very unlikely that it represents a significant difference (it happened by chance). So what do I do next? Discard this sample and run another? What needs to occur for me to say that yes, this value of 20 really shows something that I need to pay attention to?
    And you're saving my as...s with all this simply explained knowledge. I cannot thank you enough. Greetings from Brazil ;)

    • @statquest
      @statquest 3 years ago

      You might need to learn about hypothesis testing to understand the value of the confidence interval. Here's the link: czcams.com/video/0oc49DyA3hU/video.html

  • @maxinelyu2693
    @maxinelyu2693 3 years ago

    The intro kinda reminded me of IT crowd lol!

  • @Reonsi
    @Reonsi 1 year ago

    Why does the 95% CI select some means/values and not others? Does it need to be in the center? If so, how? I would suppose that if you force the mean of the interval to be the mean of all means, it would give you a CI similar to the ones you showed in the video.

    • @statquest
      @statquest 1 year ago

      Traditionally, we center the 95% CI over 95% of the means, but you don't have to do it that way. You just need to cover 95% of the means.

  • @venkilfc
    @venkilfc 2 years ago +1

    Thank you so much Josh, I just watched your videos of standard error and confidence interval. Could you please verify if I understood it correctly?
    95% confidence interval = mean of means ± 2 Standard Error

    • @statquest
      @statquest 2 years ago +2

      It depends on how you calculate it. If you are using bootstrapping, then your method is correct. If you are using a formula to approximate bootstrapping (so you are not using bootstrapping), then you have to appeal to the t-distribution (instead of the standard error). This is because for small sample sizes, the t-distribution is a little wider than a normal distribution, and that compensates for the fact that a small sample size means we have very limited knowledge of what is going on.

    • @venkilfc
      @venkilfc 2 years ago +2

      @@statquest you're a god sent Josh. Thank you 😄
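
To make the difference between the "± 2 standard errors" rule and the t-distribution concrete, here is a small sketch under stated assumptions: the 12 weights are invented, and the constant 2.201 is the standard table value for the 97.5th percentile of a t-distribution with 11 degrees of freedom (it would need to change for any other sample size).

```python
import math
import statistics

# Hypothetical weights (grams) for 12 mice; not the video's data.
weights = [20.1, 22.3, 24.8, 25.5, 26.0, 26.4, 27.1, 27.9, 28.3, 29.0, 30.2, 31.5]

n = len(weights)
mean = statistics.fmean(weights)
se = statistics.stdev(weights) / math.sqrt(n)  # estimated standard error of the mean

# Rule of thumb: mean +/- 2 standard errors.
rough = (mean - 2 * se, mean + 2 * se)

# t-based interval: 2.201 is the 97.5th percentile of the t-distribution
# with n - 1 = 11 degrees of freedom (standard table value).
T_CRIT_DF11 = 2.201
t_based = (mean - T_CRIT_DF11 * se, mean + T_CRIT_DF11 * se)

print(f"mean +/- 2 SE      : ({rough[0]:.2f}, {rough[1]:.2f})")
print(f"t-based (df = 11)  : ({t_based[0]:.2f}, {t_based[1]:.2f})")
```

With only 12 measurements the t-based interval comes out slightly wider, which is the compensation for small samples described in the reply.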

  • @renatamirra5294
    @renatamirra5294 8 months ago

    Any chance you could do a video on frequentist confidence intervals, based on the central limit theorem? Also, with the bootstrap method, is the interpretation that you're 95% confident that the population mean is contained in the interval still valid? Thank you.

    • @statquest
      @statquest 8 months ago +1

      Confidence intervals always have the same interpretation. If we repeated the procedure to calculate the CI a bunch of times, 95% of them would overlap the population mean.

  • @liamhoward2208
    @liamhoward2208 1 year ago

    Hello Josh, @ 3:22 is it correct to say that 95% of all confidence intervals will contain the population mean? I am having a hard time understanding if this interpretation is the same as yours. Also, I am a bit confused about the bootstrapping. How do we construct the interval? How do we adjust the interval for different levels of alpha? Thanks again bro!

    • @statquest
      @statquest 1 year ago +1

      At 3:22 I say that when using bootstrapping a 95% CI is an interval that covers 95% of the (bootstrapped) means. Now, if we made a lot of 95% CIs using this method, then 95% of them would contain the population mean. For more details on bootstrapping, see: czcams.com/video/Xz0x-8-cgaQ/video.html

    • @liamhoward2208
      @liamhoward2208 1 year ago +1

      @@statquest Thank you for the quick reply. It really separates you from the rest.

  • @kal5211
    @kal5211 2 years ago +1

    I am glad my wife hasn't caught me watching these videos.

  • @harithagayathri7185
    @harithagayathri7185 4 years ago

    Hi Josh, a little confused about the p-value here as, if less than 0.05 is considered as less likely to reoccur then why are we considering variables with less than 0.05 as highly significant variables in the regression models?

    • @statquest
      @statquest 4 years ago +1

      A p-value < 0.05 means, in general terms, that the result is probably not due to random chance. Thus, when we do linear regression, a small p-value tells us that the relationship between the independent and dependent variable is probably not due to random chance.

    • @harithagayathri7185
      @harithagayathri7185 4 years ago +1

      @@statquest Thanks Josh 😊😊

  • @chucknor9708
    @chucknor9708 3 years ago

    Does the number of random selections to calculate the bootstrap mean, from the sample need to equal the sample size as it does in your example? i.e. could you have chosen 8 random samples from selection and calculated the mean and bootstrapped mean and repeated this 10000 times?

    • @statquest
      @statquest 3 years ago +1

      The bootstrap sample is always the same size as the original sample.

  • @portiseremacunix
    @portiseremacunix 3 years ago +1

    subscribed!

  • @rahulmukherjee8060
    @rahulmukherjee8060 11 months ago

    How do I check hypothesis for individual distribution sampling?

    • @statquest
      @statquest 11 months ago

      I'm not sure I understand your question. Are you asking about the distribution of the samples? (like, are you asking about the whether or not the data come from a normal distribution?)

  • @ananyaagarwal7108
    @ananyaagarwal7108 1 year ago

    You are simply Awesome !! I have a doubt here though, Let's say the point estimate is the sample mean. We can repeatedly keep taking the sample means and then plot all these sample means in a histogram and we would observe a normal distribution called the sampling distribution of the sample means. The mean of this distribution would be a better estimate of the population mean and its standard deviation, called standard error would be the population standard deviation/sqrt (number of points in a sample). Won't the confidence interval(say 95%) range be (sampling distribution mean - 2 SE,sampling distribution mean + 2 SE) instead of (point estimate - 2 SE,point estimate + 2 SE)?
    Why would we use the sample mean(point estimate) in calculating the confidence interval range? What if that particular sample mean was like an outlier in the sampling distribution of the mean? In that case, doing +/- 2*SE wouldn't be a good judge to measure population mean right?

    • @statquest
      @statquest 1 year ago +1

      The technical definition of a 95% confidence interval is that if we repeat the process a lot of times, calculating 95% CIs each time, 95% of the CIs we calculate will cover the true (population) mean. So, sure, sometimes we get outliers, and our CI is bad, but that is expected about 5% of the time we calculate a 95% CI.

    • @ananyaagarwal7108
      @ananyaagarwal7108 1 year ago

      @@statquest Thanks for responding, Josh ! Does this mean that while calculating the 95% CI, we are assuming that our point estimate (sample mean) is always 1.96 SD away from the population mean(mean of the sampling distribution) ?

    • @statquest
      @statquest 1 year ago

      no

    • @ananyaagarwal7108
      @ananyaagarwal7108 1 year ago

      @@statquest Thanks for responding again :). I'm having a lil bit of tough time connecting all the dots, sorry for the long questions !
      If we are given the pop SD and use z stats to calculate the 95% CI for mu, we say z = (xbar - mu)/(sigma/sqrt(n)) where z = 1.96, xbar is the sample mean or point estimate, and the sample SD can also be computed. Based on the definition of the z score, does this not mean that xbar is 1.96 SD away from mu? In any case, what is the intuition behind using this formula?
      Thanks in advance !
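
One way to build intuition for the z formula in this thread is a small simulation. The sketch below assumes a normal population with a known standard deviation; the values of `mu`, `sigma` and `n`, and the helper names `z_interval` and `coverage`, are hypothetical. The value 1.96 does not say that the sample mean sits exactly 1.96 standard errors from mu. It says that the sample mean lands within 1.96 standard errors of mu about 95% of the time, so an interval of that half-width around the sample mean covers mu about 95% of the time.

```python
import math
import random

def z_interval(sample, sigma, z=1.96):
    """Interval xbar +/- z * sigma / sqrt(n), assuming sigma is known."""
    n = len(sample)
    xbar = sum(sample) / n
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

def coverage(mu=26.0, sigma=3.0, n=12, trials=5000, seed=1):
    """Fraction of repeated experiments whose interval covers the population mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        low, high = z_interval(sample, sigma)
        hits += low <= mu <= high
    return hits / trials

print(f"fraction of intervals covering mu: {coverage():.3f}")
```

Individual intervals built from an unlucky sample miss mu, which is the "about 5% of the time" mentioned in the earlier reply.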

  • @isaacbarbozavilchez6773
    @isaacbarbozavilchez6773 3 years ago +1

    If all teachers could explain like Josh, more people would love statistics

  • @aszx-tv4pq
    @aszx-tv4pq 19 days ago +1

    Heaven Of statistics!

  • @mycotina6438
    @mycotina6438 1 year ago

    I'm wondering if there's a correlation between this method and central limit theorem? Because if my understanding is correct, we can also construct a confidence interval using the latter.

    • @statquest
      @statquest 1 year ago

      The central limit theorem makes it possible to create confidence intervals for the estimate of the mean, but only the mean. In contrast, bootstrapping allows us to create confidence intervals for any statistic we want.

    • @mycotina6438
      @mycotina6438 1 year ago +1

      @@statquest Thanks! It makes so much sense
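
As a rough illustration of the point that bootstrapping works for statistics other than the mean, the sketch below passes the statistic in as a function. The data and the helper name `bootstrap_interval` are hypothetical, and the equal-tailed percentile interval is just one simple way to pick the interval.

```python
import random
import statistics

def bootstrap_interval(sample, statistic, n_boot=5000, coverage=0.95, seed=0):
    """Equal-tailed percentile interval for any statistic of the sample."""
    rng = random.Random(seed)
    n = len(sample)
    # Each resample is drawn with replacement and has the same size as the sample.
    stats = sorted(statistic(rng.choices(sample, k=n)) for _ in range(n_boot))
    tail = (1.0 - coverage) / 2.0
    return stats[round(tail * n_boot)], stats[round((1.0 - tail) * n_boot) - 1]

# Hypothetical weights (grams); not the video's data.
weights = [20.1, 22.3, 24.8, 25.5, 26.0, 26.4, 27.1, 27.9, 28.3, 29.0, 30.2, 31.5]

print("median:", bootstrap_interval(weights, statistics.median))
print("stdev :", bootstrap_interval(weights, statistics.stdev))
```

Swapping in `statistics.fmean` recovers the interval for the mean, so the same function covers the case discussed in the video.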

  • @amnont8724
    @amnont8724 1 year ago

    Hey Josh, so if I build a confidence interval to the mean in a 95% confidence level, 95% of the bootstrapped means will be in the confidence interval, and there's a 95% chance the mean will be in that confidence interval?

    • @statquest
      @statquest 1 year ago +1

      95% of the bootstrapped means will be in the interval, but that doesn't mean there's a 95% chance that the interval covers the true mean.

    • @amnont8724
      @amnont8724 1 year ago

      @@statquest Ok, thanks!

  • @xsli2876
    @xsli2876 4 years ago +1

    Dear Sir: I really enjoy watching your videos. Thank you very much! I still have one question regarding the 'confidence interval'. Let's say that based on the sample of 12 female mice, we have got the 95% confidence interval: 200 grams +/- 30 grams. One interpretation is: for any simple random sample of 12 female mice, its mean weight will be in the range of 170g to 230g, with 95% confidence. I guess if the sample size is 20 (female mice), I cannot say the sample mean will be in this range (170-->230g), with 95% confidence. Correct? (Because the sample size is not 12). Another question: I also intend to say, any random female mouse, its weight will be in this range (170-->230g), with 95% confidence. I cannot say that, correct? Thank you!

    • @statquest
      @statquest 4 years ago +3

      Unfortunately, the language surrounding confidence intervals is very tricky. A 95% Confidence Interval should be interpreted like this: "If I re-do the exact same experiment a lot of times and use the exact same method to calculate the 95% confidence interval, 95% of the intervals will cover the population mean". If you're not familiar with the concept of a "population mean", check out my StatQuest on Population Parameters: czcams.com/video/vikkiwjQqfU/video.html

    • @xsli2876
      @xsli2876 4 years ago

      @@statquest Thank you very much for your reply! It is very helpful. I think if the sample size is large enough (12 is not large enough), the sample mean (x-bar) could be treated as the population mean (mu). If based on the sample statistics, the 95% confidence interval is 170g-->230g, indeed I can get the density curve and make this statement: the weight of a random female mouse will fall in this range, with 95% confidence. Is this correct?

    • @statquest
      @statquest 4 years ago +3

      @@xsli2876 Unfortunately, that statement is not correct. Again, the Confidence Interval only tells us that if we repeated the exact same experiment and used the same method to calculate the confidence intervals, 95% of the intervals would cover the population mean. In other words, the confidence interval tells us about the "mean" and not individual measurements. If you want to make a statement about an individual measurement, like "there is a 95% chance that the weight of a random female mouse will fall within this range", then you need look at the distribution (probably normal distribution), and find the range that 95% of the values fall in. That is the range you are interested in. In math terms, the 95% CI is usually the mean +/- 2 times the standard error. In contrast, for individual measurements, a region where 95% of the measurements fall is the mean +/- 2 times the standard deviation. If you want to learn more about the standard error vs the standard deviation, check out: czcams.com/video/A82brFpdr9g/video.html

    • @xsli2876
      @xsli2876 Před 4 lety +1

      @@statquest Thank you so much. I got it. For individual measurements, a region where 95% of the measurements fall is the population mean (mu) +/- 2 times the population standard deviation (sigma). In real life, we don't know mu or sigma. However, if we have one large sample, we may use the sample mean (x-bar) as the approximate population mean and the sample standard deviation (s) as the approximate population standard deviation.

    • @statquest
      @statquest  Před 4 lety +1

      @@xsli2876 Yes, that is correct. If the sample size is not very large, you can use a t-distribution (which has fatter tails than the normal distribution) to compensate for the uncertainty in your measurements of the mean and standard deviation. However, as your sample size increases, the t-distribution converges to the normal distribution, and then you can just use the sample mean and sample standard deviation.

  • @1tsvaishnav
    @1tsvaishnav Před 4 lety

    Can you give some intuition of prediction interval? What is the difference between confidence interval and prediction interval?

    • @SergeySenigov
      @SergeySenigov Před 7 měsíci

      A PI estimates the range of a random variable (RV) or of some statistic of RVs. Its limits are not random (they are not drawn from samples). For example, for a normal distribution the ~99.7% PI for the RV is mu +/- 3*sigma, and we must _know_ mu and sigma. In contrast, a CI estimates the range of a value that is not random but fixed, the population parameter (mu, for example), even though we don't know its exact value. Its limits are computed from a different sample every time, so they are random.

  • @koustubhmuktibodh4901
    @koustubhmuktibodh4901 Před 29 dny

    Sir, I am looking for a Calculus series because I'm going for an M.S. in Business Analytics.

    • @statquest
      @statquest  Před 29 dny

      3Blue1Brown has an excellent series on calculus. I also believe Khan Academy has some good stuff.

  • @manikdhingra1606
    @manikdhingra1606 Před 4 lety

    Hello Josh, don't have words to describe how amazing your videos are!! A big thanks for that!!!!
    I'm not 100% confident that I got the concept. Here's my understanding; can you please comment?
    Since we don't have the time or money to measure all the female mice in the world, we pick 12 mice and calculate the sample mean. Then we apply the bootstrap to come up with a range (between 22 and 31 units on the X-axis scale) for the population mean of all the female mice on the planet?
    Therefore, a 95% confidence interval means that if we draw a sample of, let's say, another 12 random female mice from the same population, then we are 95% confident that the sample mean will be within the above range?

    • @statquest
      @statquest  Před 4 lety +1

      The 95% CI tells us that if we repeated the process of collecting data and calculating the CI, 95% of the CIs we calculate will overlap the true, population mean. They don't really tell us about future sample means.
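
      A quick way to see that a CI is not a statement about future sample means is to simulate it. The sketch below is an illustration under assumptions: a made-up normal population (mean 200, SD 30), samples of 50, and the mean +/- 1.96 standard errors interval instead of the bootstrap. `mean_ci` and `future_mean_capture` are hypothetical helper names.

```python
import random
import statistics

def mean_ci(sample, z=1.96):
    """Approximate 95% CI for the mean: sample mean +/- z * standard error."""
    m = statistics.fmean(sample)
    se = statistics.stdev(sample) / len(sample) ** 0.5
    return m - z * se, m + z * se

def future_mean_capture(pop_mean=200.0, pop_sd=30.0, n=50,
                        experiments=2000, seed=0):
    """How often does the mean of a NEW sample land inside the CI
    calculated from an earlier sample of the same size?"""
    rng = random.Random(seed)
    hits = 0
    for _ in range(experiments):
        first = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        second = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        lo, hi = mean_ci(first)
        hits += lo <= statistics.fmean(second) <= hi
    return hits / experiments

print(f"share of new sample means inside the earlier CI: {future_mean_capture():.3f}")
```

      In this setup the share comes out noticeably below 0.95, because both the interval and the new sample mean vary from sample to sample.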

    • @manikdhingra1606
      @manikdhingra1606 Před 4 lety +1

      @@statquest Thank you, Josh. I know a similar statement is already made in the comments below, but I was not able to follow it. So I watched the 'Population Parameters' and Confidence Interval videos again, and it all makes sense.

    • @statquest
      @statquest  Před 4 lety

      @@manikdhingra1606 Great! :)

  • @visionarynjy5491
    @visionarynjy5491 Před 8 měsíci

    At 5:02, shouldn't the p-value be 0.025, given that the confidence interval is two-sided?

    • @statquest
      @statquest  Před 8 měsíci

      Sure. The point, however, is that 0.05 is the usual threshold for making a decision about the hypothesis. So as long as we are < 0.05, we will reject the hypothesis.

  • @sdsa007
    @sdsa007 Před 2 lety +3

    My brain is exploding with knowledge, but I don't have a brain tumour. I am happy that you are explaining core concepts. I especially liked the one on entropy. Entropy has lots of narratives associated with it, like Maxwell's demon, the code breakers during WW2, and this wacky idea of replacing energy with entropy to unify Einstein's general relativity with quantum physics. These were wonderful embellishments, but I didn't get the core concept mathematically until you explained it very well! Thank you! I should post this on the entropy video, but I need to get some Zzzzz. Goodnight!

    • @statquest
      @statquest  Před 2 lety +1

      Thank you very much! I'm glad the videos are helpful. :)

  • @rajkamalingle9144
    @rajkamalingle9144 Před rokem

    The total number of distinct samples that we can take is 23C12 = 1352078. Out of these, we are asked to find the sample means of around 10,000 samples. Now we can define the confidence interval like this: a 95% confidence interval is just an interval that covers 95% of the sample means calculated above.
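
    The count and the "interval that covers 95% of the means" idea can both be sketched in a few lines. The 12 weights below are invented for illustration, `bootstrap_ci` is a hypothetical helper, and the percentile cut-offs are one simple way to pick the middle 95%, not necessarily the exact method used in the video.

```python
import math
import random
import statistics

n = 12
# Distinct resamples of size n drawn with replacement from n values,
# ignoring order: C(2n - 1, n) = C(23, 12).
distinct_resamples = math.comb(2 * n - 1, n)
print(distinct_resamples)  # 1352078

def bootstrap_ci(sample, resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean."""
    rng = random.Random(seed)
    means = sorted(
        statistics.fmean(rng.choices(sample, k=len(sample)))  # resample with replacement
        for _ in range(resamples)
    )
    tail = (1 - level) / 2
    lo = means[int(tail * resamples)]            # cut off the lowest 2.5% of means
    hi = means[int((1 - tail) * resamples) - 1]  # cut off the highest 2.5% of means
    return lo, hi

# Hypothetical weights for 12 mice (made-up numbers).
weights = [18.2, 21.5, 22.1, 23.4, 24.0, 25.3, 26.8, 27.1, 28.9, 30.2, 31.7, 34.5]
low, high = bootstrap_ci(weights)
print(f"95% bootstrap CI for the mean: {low:.2f} to {high:.2f}")
```

    Drawing 10,000 random resamples is a practical shortcut for enumerating all 1,352,078 distinct ones.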