False Discovery Rates, FDR, clearly explained
- added 9 Jan 2017
- One of the best ways to prevent p-hacking is to adjust p-values for multiple testing. This StatQuest explains how the Benjamini-Hochberg method corrects for multiple testing and controls the False Discovery Rate (FDR).
For a complete index of all the StatQuest videos, check out:
statquest.org/video-index/
If you'd like to support StatQuest, please consider...
Buying The StatQuest Illustrated Guide to Machine Learning!!!
PDF - statquest.gumroad.com/l/wvtmc
Paperback - www.amazon.com/dp/B09ZCKR4H6
Kindle eBook - www.amazon.com/dp/B09ZG79HXC
Patreon: / statquest
...or...
CZcams Membership: / @statquest
...a cool StatQuest t-shirt or sweatshirt:
shop.spreadshirt.com/statques...
...buying one or two of my songs (or go large and get a whole album!)
joshuastarmer.bandcamp.com/
...or just donating to StatQuest!
www.paypal.me/statquest
Lastly, if you want to keep up with me as I research and create new StatQuests, follow me on twitter:
/ joshuastarmer
#statistics #pvalue #fdr
My PhD dissertation relies heavily on bioinformatics and biostatistics, although my background is neuroscience. Naturally, I had a lot of learning to do, and your videos have helped me immensely. Every time I want to learn about a stats concept, I always type in my Google search, "[name of concept] statquest." Seriously, this is almost too good to be true, and I just wanted to thank you for providing this absolute gold mine.
Wow! Thank you very much and good luck with your dissertation.
You make without a doubt the best videos about statistics on CZcams: funny, clear, intuitive, visual. Thank you so much.
Thank you! :)
Totally second this...
God bless you, I made screenshots of this video to explain this concept to my lab. This isn't the first time you've helped me with RNA-seq procedures. I have bumbled through a differential expression analysis. Trying to understand the statistical methods and knowing which option amongst several is the most logical is a mental hurdle. I am the only student in my lab currently undertaking bioinformatics and I am essentially trying to teach myself. There is a huge vacuum of knowledge in this realm amongst biologists and it's daunting. We all can generate data until we're blue in the face, but it doesn't do anyone any good until someone knows how to analyze it.
Awesome! Good luck learning Bioinformatics.
Can't thank you enough!! Your methods are truly amazing. Being able to deliver them to us so cleverly is a true indication of how much effort you must have put into understanding these concepts .
Wow, thank you!
BAM BAM BAM, thanks a lot man...Your 20 minutes most likely saved hours of trying to understand from wikipedia...
Sweet!!! Glad I could help you out. :)
Fantastic video, thank you for taking the time to put this together.
Wow wow wow how intuitive and visual. Can’t thank you enough for saving me from spending hours struggling to understand this concept🙏
You're very welcome!
Awesome explanation !! Thanks for taking the time to make these videos and also answering questions from viewers so well. Going through them already answered some queries that I had :)
I'm from China and I watched your channel on Bilibili, but I couldn't get enough, so I followed you all the way here, a paradise of data science! Thank you Josh, wish you the best!
Wow, thank you!!!!
I love you ❤️. I was so afraid of FDR adjustment because I thought the math behind was empirical and worked like magic but you made it surprisingly intuitive.
Thank you! :)
I'm currently learning to do RNA-seq data analysis; these videos are extremely helpful.
OMG!! This is the most beautiful explanation I've ever experienced...... Thank you so much professor.
Awesome!!! Thanks so much.
Simple, informative, and to the point. Absolutely perfect.
Glad you liked it!
Great tutorial on FDR. The adjusted p-value is a p-value for the results that remain after cutting off the results you can tell are not significant just from the distribution. It would be even better if you could say something about the q-value and how the q-value reflects the quality of an experiment.
I have always hated math and you just make it clear and interesting! Can't thank you enough
Hooray!!! I'm glad the video is helpful. :)
Thank you! This was really helpful and made me smile during my intense evening revision :)
Glad it helped!
Cool, thanks for posting this, very intuitive! An equivalent method for eyeballing the # of true null hypotheses is to plot the ranked 1 - p-value on the x-axis and the hypothesis test rank on the y-axis, then fit a line to the scatter plot, starting at the origin. Where the line hits the y-axis is your estimate of the # of true null hypotheses. I'd also like to see an intuitive explanation of the Benjamini-Yekutieli procedure, used in studies where the tests are not completely independent!
Thanks for the awesome explanation! Really informative and easy to follow. And the DOUBLE BAM in the end actually made me laugh out loud :D
Awesome! :)
The clearest explanation of BH correction so far. Quadruple BAM!
This is simply great!!! Thanks for sharing Joshua.
This is the best video that explains FDR. Thank you,
Great video, your example was clear and very well illustrated.
This is amazing. Very well explained and easy to understand!
Glad it was helpful!
First and foremost, I extend my heartfelt gratitude for providing such a series that elucidates concepts in an easily comprehensible manner. Bam !☺
Thank you!
As always, by far the best explanation on the web!
Thanks!
Wow, I was seriously struggling with my research since I don't know the first thing about statistics, and I love this so so so much. So instructional I had to like.
BAM! :)
Thank you very much indeed for the perfect explanations and examples of the FDR concept. I really got my answer.
Thanks!
Great explanation, thanks! Clear, with an amazing balance between theory and examples.
Thank you!
Just wow!! Thank you for this.
Thank you so much for this great movie!! Great explanation.
From the way my university teachers (didn't) explain to me Benjamini-Hochberg, and after watching this video, I can claim I now understand Benjamini-Hochberg better than them, at a 99.7% confidence level!
BAM! :)
Thanks Josh, you did a great job. A short and useful clip.
Thank you!
Nice video, simple and fast.
Thanks!
Josh is a genius. Really appreciate your work statquest.
Thank you! :)
As always, it is a great explanation. Thank you Josh 👏
Thank you!
Dude thanks so much, this video is AWESOME!!!
Very nice explanation!
I have to keep saying that I love this channel so much
Hooray!!! Thank you so much!!! :)
This was SUPER helpful, thank you!
Thank you! :)
I love you StatQuest. Thank you for never letting me down. You were always present to answer my deepest and most shameful doubts. You never abandoned me during the darkest hours of my PhD.
I'm so happy to hear my videos helped you. BAM! :)
Thanks for your effort and simplified explanation!!! Life saver :)
Glad it helped!
Thanks a lot. Mr. Joshua
This is my first time fully understanding FDR ...
bam!
I just love your videos. Thank you so much!
Thank you! :)
Nicely explained.
Thank you!
Good explanation
Great explanations!
Thanks!
Thank you sir, was very useful 🙏
Glad it helped
Nice explanation!
Thanks!
Thank you very much for the explanation, very very clear!!
Thank you very much!
BAM!!! Finally I understand it, after it confused me for half a year!!
BAM! :)
the second half is hard to understand, but I know I will come back later and watch it again, and again, and again until I finally understand it
Let me know if you have any specific questions.
This is a great video. Also, could you help me understand how the intuitive picture (the histograms of p-values coming from two distributions) connects to the mathematical steps of the B-H procedure? Thank you!
1 thumb down is a case of FDR :)
So true! :)
Thanks, that was precious (and spared me hours of frustration)
Thanks! :)
Thank you, bro!
It's so good, I want to give it more than one thumb up!
Double BAM! :)
Hey Josh, love your videos on stats, specifically the ones centered around hypothesis testing. Can you do more videos on the different techniques of hypothesis testing, like (group) sequential testing and multi-armed bandits?
I'll keep that in mind.
This video is so beautiful.. Thank you so much
I'm glad you like it!
Thank you, nicely explained
You are welcome!
Thank you Sir🌹
Thank you!
One part I don't quite understand is how the intuitive eyeball method translates into the B-H p-value adjustments you explain starting at ~15:00. To me, plotting a line along the H0 = True p-values sounds like you would be fitting a linear regression & identifying the outliers < .05.
I love the explanation!
Thank you! :)
I don't understand one thing. If samples are taken from the same population, wouldn't the p-value bins NOT be evenly distributed, but rather skewed toward p = 1, because the data are normally distributed and samples close to the average are the most likely to be picked?
By definition, p-values are uniformly distributed under the null hypothesis. By definition, a p-value = 0.05 means that 5% of random tests will give results equal to or more extreme, a p-value = 0.1 means 10%, etc. etc. etc.
Thanks a lot!
I'd like to know why when samples come from the same distribution, the p values are uniformly distributed? Thank you!
This is amazing. thank youu.
Thank you! :)
Hi Josh! Great stuff here. Could you please make a video on "Significance Analysis of Microarrays", mainly how it differs from the t-test/ANOVA? I really appreciate all the videos.
I'll keep it in mind, but I can't promise I'll get to it soon.
Would love a video about the target decoy approach
OK. I've added it to the to-do list. :)
Very nice video, and I learned a lot from it. The only thing is, when you give examples and tell us that if you calculate p-values 10,000 times the distribution will look like this or like that, I don't know whether that's true or not. So I'm wondering, can you explain a little bit more, or is there any further reading I can do about p-values and adjusted p-values?
Great channel and fantastic content! I am wondering if you could make an episode about IDR, Irreproducible discovery rate. It is difficult to find a good explanation or usage guide on it.
I'll keep that in mind.
This is really great!!!
Thank you!
AWESOME! Thank you!
:)
thanks josh!
You are welcome!!! I'm glad you like the video! :)
Hey, thanks for the video. Just a question: don't you have a higher chance of getting samples from the middle of the distribution than from the tails, resulting in more large p-values than small ones? I don't get why p-values are uniformly distributed. Thanks :)
You know, I found this puzzling as well. However, imagine we are taking two different samples from a single normal distribution. If we did a t-test on those samples, 5% of the time the p-value would be less than 0.05. Now imagine we created 100 random sets of samples and did 100 t-tests. 5 of those p-values will be less than 0.05. 10 will be less than 0.1, 15 will be less than 0.15.... 50 will be less than 0.5.... 90 will be less than 0.90, etc. This isn't a mathematical proof, but it makes sense - the whole idea of having any p-value threshold, x, is that we are only expecting, x percent of the tests with random noise to be below that threshold. Thus, we have a uniform distribution of p-values.
Also keep in mind that when computing p-values for the difference between two sample means, p-values of .05 or less cover a wider range of x values than say p-values between .50 and .55.
@@statquest Wow, I had the same question as Ken. Thanks for giving this super intuitive explanation!
@@Tbxy1 me too! been struggling to understand that part and thank god Ken asked 😅
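The explanation above is easy to check with a quick simulation (my own sketch in Python with numpy/scipy, not code from the video): draw two samples from the same normal distribution, t-test them, repeat 10,000 times, and see that the p-values spread evenly across the bins.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_tests = 10_000

# Draw two samples from the SAME normal distribution and t-test them,
# so every null hypothesis is true.
pvals = np.array([
    stats.ttest_ind(rng.normal(0, 1, 20), rng.normal(0, 1, 20)).pvalue
    for _ in range(n_tests)
])

# Under the null, p-values are uniform: each 10%-wide bin holds ~10% of them,
# and ~5% of the p-values fall below 0.05.
counts, _ = np.histogram(pvals, bins=10, range=(0, 1))
print(counts / n_tests)
print((pvals < 0.05).mean())
```

Every histogram bin comes out close to 0.10, matching the flat null histogram shown in the video.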
This is crystal clear about FDR and the BH method, much better than what my professor said.
Awesome, this may be too niche but could you do a video on local FDR please?
Great video! Congratulations. I've read the Benjamini and Hochberg (1995) paper, but (guided by my very limited knowledge of math) I was not able to find the formula in the form you explained. Please, could you give some clarification on this issue, perhaps as some kind of transformation of the mathematical procedure? Thank you very much. Best wishes.
I'll keep that in mind.
I have the same questions. Did you figure out the logic behind the mathematical procedure? Thank you!
BAMMMM! Thank you!
Hooray! I'm glad you like the video. :)
I think you previously talked about how to calculate a p-value for one sample set, which tells us how likely it is that the sample set belongs to a distribution. But here we are calculating the p-value for two sample sets and trying to tell whether they belong to the same distribution. How is that calculated? Or do we simply compare each sample set to the distribution, and if they both likely belong to the same distribution, we say we fail to reject the null hypothesis?
In this video I believe I'm using t-tests. To learn about those, first learn about linear regression (don't worry, it's not a big deal): czcams.com/video/nk2CQITm_eo/video.html and then learn how to use linear regression to compare two samples to each other with a t-test: czcams.com/video/NF5_btOaCig/video.html
Thanks for these videos! They are great!!
Can you help me understand the intuition behind why the p-values are uniformly distributed in the samples from the same distribution?
Think about how p-values are defined. If there is no difference, the probability of getting a p-value between 0 and 0.05 is... 0.05. And the probability of getting a p-value between 0.05 and 0.1 is also 0.05, etc.
Question on the application of the B-H method: I have a distribution of p-values and KS D-values from comparing two distributions: 1) a distribution of transcriptional changes (observed), and 2) a distribution of transcriptional changes formed from random shuffling (null). I wish to adjust the p-values to weed out any false positives. When I rank the p-values, can I simply choose all p-values in the "< 0.05 bin" of the observed distribution? That kind of mimics what you did in the first example starting @ 14:47. But in the second example @ 17:07, how did you actually compute the adjusted p-values? Did you just repeat the method on the blue boxes (observed) and on the red boxes (null) separately? Thanks, and keep up the great videos!
That makes sense. Your approach eliminates p-value adjustment: just select a cutoff where no more than 5% of the combined (and sorted) p-values come from the permuted set. Then for any p-value from that combined set I can say "this p-value has an FDR of
Joshua Starmer I'll try all three and see which samples get eliminated. Thanks again for your feedback, you're more helpful than most of my professors!
Thank you so much.
Hooray! I'm glad you like the video! :)
I've been reading publications for an hour and you solved my problem in 10 minutes.
Awesome!!! This is definitely one of those things that's easier to "see" than to read about. Glad I could help. :)
This is awesome. Imma save it for later reference hah
Hi Dr. Josh, I'm curious to get your thoughts on a simulation I'm running. It's very similar to the simulation in this video where you calculate 10,000 p-values by sampling from the same distribution.
When I run my simulation using a Welch t-test and n=3, only ~3.5% of p-values are less than 0.05. The percentage converges on 5% when I increase the sample size or use the Student's t-test.
It seems as though forgoing the equal variances assumption sacrifices some power, especially at low sample sizes. But I'm still trying to grasp why that is and what the implications are for using the Welch t-test with low sample size in real-life situations. For example, if the null hypothesis is that both samples come from the same population, then why not just assume equal variances and use Student's t-test all the time? (I know that last question is probably conflating some concepts that should be separate, but I'm having a hard time keeping track of it all, and I'm really interested to hear how you would respond to that question).
You seem to have a great way of explaining things like this intuitively. I'm curious to hear your thoughts.
Thanks so much! I've benefited greatly from your videos.
It makes sense to me that Welch's t-test has less power at low sample sizes because it makes fewer assumptions, and thus has to squeeze more out of the data by estimating more parameters.
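The drop in rejections described above is easy to reproduce. Here is a rough simulation of my own (not Josh's code) running both tests on the same null data at n = 3:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_tests, n = 10_000, 3

student, welch = [], []
for _ in range(n_tests):
    a = rng.normal(0, 1, n)
    b = rng.normal(0, 1, n)  # same population, so the null is true
    student.append(stats.ttest_ind(a, b, equal_var=True).pvalue)
    welch.append(stats.ttest_ind(a, b, equal_var=False).pvalue)

# Student's t-test hits the nominal 5% false-positive rate; Welch's comes in
# lower at n = 3 because the Welch-Satterthwaite degrees of freedom are
# smaller than the pooled degrees of freedom.
print(np.mean(np.array(student) < 0.05))  # close to the nominal 0.05
print(np.mean(np.array(welch) < 0.05))    # noticeably lower, as described above
```

With equal group sizes the two t statistics are identical, so the entire difference comes from the degrees of freedom, which is one way to see why Welch's test is conservative (not inflated) here.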
@statquest: Josh, Thank you. I have a follow-up though. Sure, we could adjust the p-values to reduce the False positives, but could this adjustment cause an increase in False negatives? Is there a way to quantify that? Apologies if I am missing something obvious.
There are different methods to control the number of false positives, some do a better job than others at keeping the number of false negatives small. FDR is one of the best methods for limiting both types of errors. In contrast, the Bonferroni correction is one of the worst.
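One way to see the trade-off in false negatives is a small simulation (my own sketch with made-up settings, using the step-up rule from Benjamini and Hochberg 1995): mix 100 real effects into 900 null tests and count how many real effects survive each correction.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
m_true, m_null = 100, 900            # 100 real effects among 1,000 tests
alpha = 0.05

# p-values for real differences (means 1 vs 0) and for true nulls
p_true = [stats.ttest_ind(rng.normal(1, 1, 20), rng.normal(0, 1, 20)).pvalue
          for _ in range(m_true)]
p_null = [stats.ttest_ind(rng.normal(0, 1, 20), rng.normal(0, 1, 20)).pvalue
          for _ in range(m_null)]
p = np.array(p_true + p_null)
m = len(p)

# Bonferroni: reject only if p < alpha / m
bonf_hits = (p[:m_true] < alpha / m).sum()

# Benjamini-Hochberg step-up: find the largest k with p(k) <= alpha * k / m
# and reject the k smallest p-values.
order = np.argsort(p)
below = p[order] <= alpha * np.arange(1, m + 1) / m
k = below.nonzero()[0].max() + 1 if below.any() else 0
rejected = np.zeros(m, dtype=bool)
rejected[order[:k]] = True
bh_hits = rejected[:m_true].sum()

# BH keeps far more of the real effects (fewer false negatives)
print(bonf_hits, bh_hits)
```

Because BH's threshold adapts to how many small p-values there are, it recovers many more of the 100 real effects than Bonferroni's fixed alpha/m cutoff.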
great video
Thank you!
Great video! I have a question on distribution of p-values: I am doing a likelihood ratio test and calculating significance p-values from Chi-square test. I see that the distribution of my uncorrected p-values is not uniform near p-value 1. It has a large peak at p-value 1 i.e. most of my data-points has p-value of 1. Do you have any insights on how that might happen? And what can be the best way to correct for multiple hypothesis testing in this case. Because, using BH, I lose all the significance :( Thanks!
Thanks Joshua!
Thank you for the intuitive video. I am awfully new to statistics, so I have three questions. Suppose it is a classification problem: 1. Do "samples" refer to "classes" (types of genes) or to samples of genes? 2. Would the null hypothesis be that there is no dependency between the gene and the samples? 3. Why 10,000 times? (I am a bit confused about the relationship between 10,000 genes and 10,000 tests, as I understand that for each test the distribution plot is based on values of genes.)
1) I'm not sure I understand the question because we are trying to classify the expression as being "the same" or "different" between two groups of mice or humans.
2) The null hypothesis is that all of the measurements come from the same population.
3) When we do this sort of experiment, we test between 10,000 and 20,000 genes to see if they are expressed the same or different between two groups of mice or humans or whatever. So, for each gene in the genome, we do a test to see if it is the same or different. This allows us to identify genes that play a role in cancer or some other disease.
I am glad to see this video as I am doing some FDR tests in my project. I have a question: what if false positives remain after the adjustment? Is it still acceptable if the FDR is < 0.05?
You cannot eliminate false positives, but you can use FDR to control how many there are. So typically people call all tests with FDR < 0.05 "significant".
Thanks
I truly love you...
Thank you! :)
At 6:45, when you mentioned the p-value of 3 technical samples - how do you combine 3 technical samples into one p-value? Do you average them before calculating the p-value? Sum them up? Or average the p-values of each of the 3 technical samples?
I'm not sure I understand your question. However, essentially what I'm saying at that point is that we start with a single normal distribution and randomly select values from it (for details, see: czcams.com/video/XLCWeSVzHUU/video.html ), then we perform a statistical test (for example, a t-test) to calculate the p-value. We then repeat this process 10,000 times and create a histogram of the p-values. This will create a histogram of p-values for when the null hypothesis is true (for details, see: czcams.com/video/0oc49DyA3hU/video.html )
Holy freaking nuts!! Thank you haha...
Yes! :)
Could you kindly explain the post hoc tests for ANOVA?
Thank you for your very helpful video. I have one question: what I have understood from the calculation of the FDR is that only the smaller p-values will still be significant after the correction, am I right? (You suggested this at 12:09.) Nevertheless, I got distracted at 17:20 because there are smaller values in the red area that, based on this, would not be "false positives" if I got your explanation right. Could you clarify this? Thank you :)
The numbers in the blue boxes are p-values that were created from two separate distributions. Some of those p-values are below the standard threshold of 0.05 and some are not. The ones that are not are "false negatives". The numbers in the red boxes are p-values that were created from a single distribution. Some of those p-values are below the standard threshold of 0.05 and some are not. The ones below the threshold are false positives. However, in this specific example, after we apply the BH procedure (at 18:02 ), all of the false positives end up with p-values > 0.05 and are no longer considered statistically significant, so the false positives are eliminated.
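The adjustment Josh applies at 18:02 can be written out explicitly. This is a sketch of the standard BH adjusted-p-value computation (equivalent to R's p.adjust(p, method = "BH")), not the exact code behind the video:

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values.

    Each sorted p-value is multiplied by (number of tests / its rank), then a
    cumulative minimum taken from the largest rank downward keeps the adjusted
    values monotone, just like the rank-by-rank comparison in the video.
    """
    p = np.asarray(pvals, dtype=float)
    m = len(p)
    order = np.argsort(p)                    # ranks from smallest to largest
    adjusted = p[order] * m / np.arange(1, m + 1)
    adjusted = np.minimum.accumulate(adjusted[::-1])[::-1]  # enforce monotone
    adjusted = np.clip(adjusted, 0, 1)
    out = np.empty(m)
    out[order] = adjusted                    # restore the original order
    return out

print(bh_adjust([0.01, 0.02, 0.03, 0.40, 0.90]))
```

For that made-up example, the three smallest p-values are all pulled to 0.05 and the large ones barely move, which mirrors how the video's adjusted values behave.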
i love how he made that joke about wild type with monotone lol
:)
Hey sorry to bother you (or anyone else who reads this comment) but I am currently trying to understand the connection between FDR and p-hacking. I am not sure if I understood this right but:
Can an inflated FDR appear when researchers try to get a significant result through multiple comparisons, by running more than one independent test on the same data set?
Or have I misunderstood FDR completely?
Awesome!!
Thanks!
1000 Thanks!
One naive question: why is the distribution of p-values flat when testing samples taken from the same distribution? I'd rather have expected a distribution skewed toward high p-values (non-significant).
Thanks again!
Thanks!
In fact, we just made a simulation (in R language) and we obtained the described behaviour (flat distribution). And it is true for any number of replicates. Your explanation is crystal clear to me. Thanks again! Nice channel!
Bump, I have the exact same question