FDR, q-values vs p-values: multiple testing simply explained!

  • Added on 23 November 2022
  • Why is multiple testing a big issue in biostatistics? In this video, we will explain why multiple testing is so dangerous when analysing large datasets, and how to correct for it. We will cover some of the most common methods: Bonferroni correction, Benjamini-Hochberg (BH) and q-values.
    Don't let the monster of multiple testing eat your data!
    --------------------------------------------------------------------------------------------------------------------
    Watched it already? If you liked this video or found it useful, please let me know! Your comments and feedback are very much appreciated 😊 If you have questions, don't hesitate to leave me a comment down below; I will answer as soon as I can :)
    --------------------------------------------------------------------------------------------------------------------
    For more biostatistics tools and resources, visit biostatsquid.com/ for:
    • simple and clear explanations of biostatistics methods
    • computational biology tools
    • easy step-by-step tutorials in R and Python
    to analyse and visualise your biological data!
    Or follow me on Instagram at @biostatsquid
    Don't forget to subscribe if you don't want to miss another video from me!
    --------------------------------------------------------------------------------------------------------------------
    More multiple testing resources:
    Check the difference between different multiple testing corrections in R:
    www.stat.berkeley.edu/~mgoldm...
    A really cool explanation of the FDR from Statquest!
    • False Discovery Rates,...
    ------------------------------------------------------------------------------------------------------------
    Trash FM by Alexander Nakarada | www.serpentsoundstudios.com
    Music promoted by www.free-stock-music.com
    Attribution 4.0 International (CC BY 4.0)
    creativecommons.org/licenses/...
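The description's claim that multiple testing is dangerous in large datasets can be illustrated with a small simulation. This is only a sketch under stated assumptions, not something taken from the video: the number of tests (1000), the threshold (0.05) and the seed are arbitrary choices, and every test is assumed to have no real effect, so each p-value is drawn uniformly from [0, 1].

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible (arbitrary choice)

n_tests = 1000  # assumed number of hypotheses, e.g. genes tested
alpha = 0.05    # conventional significance threshold

# Assumption: no test has a real effect, so p-values are uniform on [0, 1].
p_values = [random.random() for _ in range(n_tests)]

# Without correction, roughly alpha * n_tests tests look "significant" by chance.
naive_hits = sum(p < alpha for p in p_values)

# Bonferroni correction: compare each p-value against alpha / n_tests instead.
bonferroni_hits = sum(p < alpha / n_tests for p in p_values)

print(f"Uncorrected false positives: {naive_hits} of {n_tests}")
print(f"False positives after Bonferroni: {bonferroni_hits} of {n_tests}")
```

With 1000 true null hypotheses, about 50 uncorrected hits are expected by chance alone, which is the problem the corrections discussed in the video are meant to address.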

Comments • 30

  • @jorgea.servert9490 1 year ago +5

    This video is brilliant! You are a natural at explaining statistics. Thank you so much!

    • @biostatsquid 1 year ago

      Thanks for your kind words Jorge! Glad it was useful:)

  • @mihacerne7313 1 year ago +2

    Multiple thanks for the video!

  • @ankushjamthikar9780 1 year ago +1

    This video is very good! You explained it in a nice way. Thank you for the video. Keep posting more videos on biostatistics.

    • @biostatsquid 1 year ago

      Thanks Ankush! I am glad you found it helpful:)

  • @svetlanavasileva8961 1 year ago

    It's gorgeous!!! Please do more about biostatistics!!!!!

  • @user-fw2kc6iv1f 7 months ago

    Thanks, you made me truly understand the q-value

  • @sanjaisrao484 1 year ago

    Thanks, I finally understood something about p value and FDR

  • @biotales371 1 year ago +1

    simply brilliant...

  • @mocabeentrill 1 year ago +1

    Thank you Biosquidee!

  • @anmolpardeshi3138 1 year ago

    This is an awesome video! I applaud the simple and fun explanation. Just two things: (a) knowing that "coffee" is NOT associated (among the significant outcomes) comes from prior knowledge, but we might not always have this prior knowledge. What do we do then? (b) It's not shown how the adjusted p-values were calculated; could you please clarify that? Otherwise this is a good video! Thanks.

  • @enzolong9085 10 months ago

    Thank you so much!!!

  • @ZLYang 11 months ago +2

    At 1:29, if you find a link, why is p still larger than 0.05?

    • @biostatsquid 10 months ago

      Oh nicely spotted! That's a typo, sorry for the confusion! It's p < 0.05. Thanks for noticing and commenting about it!

    • @ZLYang 10 months ago

      @biostatsquid 😁

  • @anphan7526 1 year ago +2

    Shouldn't it be 1/16 at 7:07, since we have 16 objects being marked as significant?

    • @tgc7053 9 months ago

      I think so.
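For readers following this thread: the proportion of false discoveries is the number of false positives divided by the total number of results called significant. A minimal sketch of that arithmetic, assuming the counts implied by the comment above (1 false positive among 16 significant results); these numbers come from the comment and have not been checked against the video, and the function name is hypothetical.

```python
from fractions import Fraction


def false_discovery_proportion(false_positives, total_significant):
    """Share of the results called significant that are actually false positives."""
    if total_significant == 0:
        # No discoveries at all, so no false discoveries either.
        return Fraction(0)
    return Fraction(false_positives, total_significant)


# Counts assumed from the comment: 1 false positive out of 16 significant results.
fdp = false_discovery_proportion(1, 16)
print(fdp, float(fdp))  # 1/16 0.0625
```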

  • @artarz9542 3 months ago

    How do you determine the number of false positives? What are the criteria?

  • @carlosdomingues3551 3 months ago

    Thank you for this great concise video, you can tell you put a lot of work into it =] Any follow-up on red smarties linked to baldness??

  • @emotaph5709 1 year ago +1

    Thank you for this video and the effort that must've gone into this. Everything you explained was very easy to understand.
    I had a question:
    You spoke about "correlations" in the video but what about relations one way to the other such as regressions where we speak in terms of "dependent" and "independent" variables. In the examples you shared, the genes would be independent variables and we want to see their relation with the "dependent" variable of being a morning person. Now if we were to check if 1 gene in particular (independent variable) affects different things (different dependent variables)- blindness, baldness, wakefulness, color blindness, etc. would the same logic of q values hold?
    It would be lovely if you get the time to get back to this. If not-thanks anyway for the great video!

    • @biostatsquid 1 year ago +1

      Hi! Thank you so much for your comment! That was a great question. My answer is... yes and no:)
      So, in general, q-values are not typically used for linear regression. Let's see why.
      As we saw in the video, q-values are commonly used in the context of multiple hypothesis testing, specifically in controlling the false discovery rate (FDR). They are used to adjust p-values for multiple comparisons when conducting hypothesis tests on a large number of variables or features simultaneously (for example, gene expression studies).
      Linear regression, on the other hand, is a statistical method used to model the relationship between a dependent variable (let's take one of the ones you mentioned, for example, blindness) and one or more independent variables (genes). It aims to estimate the coefficients of the independent variables to predict the value of the dependent variable. We then see how good our model is by evaluating the overall goodness of fit (e.g., using R-squared or the RMSE).
      However, this is where the 'yes' comes in. We usually assess the statistical significance of the coefficients through p-values. In the context of linear regression, if you are performing multiple hypothesis tests (for example, when evaluating the significance of multiple coefficients because you have multiple genes, or when conducting variable selection), it may be appropriate to use q-values to adjust the p-values associated with each coefficient.
      Hope this cleared things up a bit. However, I recommend consulting a statistician or reading the literature to ensure you're applying the q-values correctly in the specific context of your analysis:)

    • @emotaph5709 10 months ago +1

      @biostatsquid Thank you for the swift reply and the detailed explanation. And yes, good idea to keep reading the literature before making a decision!

  • @dome1844 7 months ago

    I did not get whether the q-value is more stringent than the FDR. I ran an analysis in which I used FDR for gene expression, but I think the results are too stringent compared with the differences I observed in experiments, and too stringent to get a good GO (Gene Ontology) analysis that represents the biological process going on. What should I do in this case?

    • @biostatsquid 7 months ago

      Hi, thanks for your comment! Not sure I understand your question - could you rephrase? Perhaps this answer helps answer it? www.biostars.org/p/462897/
      In any case, when you are doing GO analysis it's good practice to correct for multiple testing and use adjusted p-values, even when there may not be many significant results.

  • @SmiladaXD 9 months ago

    Thank you so much for this video. Could you please just clarify how you calculated the P-adjusted values/Q-values? I've been looking everywhere for that and would truly appreciate if you can explain that to me.

    • @biostatsquid 9 months ago

      Hi Claire, thanks for your feedback! I don't think I've ever calculated p-adjusted values myself, usually when you get the output of a statistical test you get p-values and p-adjusted values. But I did a quick search and found this article: Why, When and How to Adjust Your P Values? It explains how to calculate p-adjusted values from your p-values. Hope it helps!
      www.ncbi.nlm.nih.gov/pmc/articles/PMC6099145/
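Since two commenters asked how adjusted p-values are calculated, here is a minimal sketch of the two standard adjustments named in the video description, Bonferroni and Benjamini-Hochberg (BH). It is not the video author's code: the function names and the four example p-values are made up for illustration. BH-adjusted p-values are often loosely called q-values, although Storey's q-value method additionally estimates the proportion of true null hypotheses, which this sketch does not do.

```python
def bonferroni_adjust(p_values):
    """Bonferroni: multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        # Raw BH value is p * m / rank; the running minimum keeps the
        # adjusted values monotone in the original p-value ordering.
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values from four tests (illustrative only).
example = [0.01, 0.04, 0.03, 0.005]
print("Bonferroni:", bonferroni_adjust(example))
print("BH:        ", bh_adjust(example))
```

For these four values, BH gives approximately 0.02, 0.04, 0.04 and 0.02, while Bonferroni gives approximately 0.04, 0.16, 0.12 and 0.02, which shows why Bonferroni is the more conservative of the two. In practice most people rely on a library routine such as R's `p.adjust` rather than writing this by hand.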

  • @Nikolaj-qz9kw 1 year ago +1

    The person in red was asking if smarties cause *blindness, not *baldness :)

  • @willychrosnik1925 7 months ago

    Now i want smarties

  • @shiyiyin3403 2 months ago

    start watching at 7:00 intro is too long