Redundancy Analysis (RDA) in R

  • Added 6 Sep 2024
  • Follow along to learn how to do redundancy analysis, or RDA, in R! A redundancy analysis looks at both relationships amongst response variables and explanatory variables. This tutorial looks at fauna mortality data and environmental cover data. You can find the R Script on my GitHub: github.com/nik.... Happy coding! Comment if you have any questions!

Comments • 40

  • @ismaelemilioaguilarbocaneg6915

    Thank u so much, everything was 100% clear! If someone wants to read more about the RDA or other ordination methods, you can refer to Numerical Ecology with R (Borcard). Greetings from Peru!

  • @MichaDavidova 2 months ago

    Amazingly clear explanation! I love that you explain even the basics of each step, it's so useful for someone who's just starting out with statistical analyses and struggles to understand high-level explanations common in other sources

  • @carloswillian90 8 months ago +1

    Thanks a lot for your clear and simple explanation! You made rda way more clear to me after this video!

  • @gushan0805 10 months ago +1

    Thank you for sharing, it's so clear and really useful. I struggled with so many reading materials until your video.

  • @slimestage 1 year ago +3

    This was super helpful!! Great explanations! Thanks for guiding me to this video from your comment response on the PCA video. Much appreciated!

  • @liamtaylor7710 1 year ago +3

    As usual, awesome explanations 🎉

  • @germanperez4934 1 year ago +3

    Hi Nikki, many thanks for creating this tutorial and for all the thorough explanations :). I have a question about the anova step (anova(spec.rda2, by="margin", perm.max=1000) #tells you if the order of the terms is significant). I don't get what you mean by the "order of the terms". Biologically speaking, what does it mean? I am also struggling with customizing the rda plot with ggord... do you have any experience with that? Many thanks in advance! Cheers from Uruguay :)

    • @justonebirdsopinion 11 months ago +2

      Hi! So sorry it took me so long to get to this comment! I just came back from 5 months of field work so am just getting back to this channel. I'm sure it's too late for my answer to be helpful, but I will reply anyway. The 'order of terms' is related to how the anova function is handling the data. Generally, it will work out how much variance is explained by the first term, and then only consider the remaining variance when working out how much the second term might account for, and so on. This leaves less variance available for terms that come later in the model. This becomes a problem if there's correlation between variables. If variables 1 and 2 are correlated, most of the variance will be attributed to variable 1 and very little to variable 2, even if in theory they should be relatively equal. So assessing whether the order of terms is significant lets you know if the order in which you fed the variables into the model has a significant effect, with different (perhaps only slightly different) results depending on the order of terms. This is helpful information as you go forward in interpreting the model and determining whether you are running into Type I error.
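
A minimal sketch of the order effect described in this reply. It assumes vegan's built-in dune and dune.env example data rather than the data from the video, and the predictors A1 and Moisture are picked only for illustration. In current vegan, by = "terms" runs the sequential tests, while by = "margin" tests each term after all the others, so the marginal results should not depend on the order of terms. Current versions also take permutations instead of the older perm.max argument.

```r
library(vegan)

data(dune)      # example species matrix shipped with vegan (assumption: not the video's data)
data(dune.env)  # matching environmental variables

# The same two predictors, entered in opposite orders
mod_ab <- rda(dune ~ A1 + Moisture, data = dune.env)
mod_ba <- rda(dune ~ Moisture + A1, data = dune.env)

set.seed(1)

# Sequential tests: each term only gets the variance left over
# by the terms listed before it
seq_ab <- anova(mod_ab, by = "terms", permutations = 199)
seq_ba <- anova(mod_ba, by = "terms", permutations = 199)

# Marginal tests: each term is tested after all other terms in the model
mar_ab <- anova(mod_ab, by = "margin", permutations = 199)
mar_ba <- anova(mod_ba, by = "margin", permutations = 199)

print(seq_ab)
print(seq_ba)
print(mar_ab)
print(mar_ba)
```

Comparing the two sequential tables shows how much the variance credited to each term shifts with order; the two marginal tables should agree with each other.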

    • @germanperez4934 10 months ago

      @justonebirdsopinion Many thanks again 🙂

  • @farmIntegral 1 year ago +2

    Wonderful video and explanation. Could you please provide the script used and the data set as well?

    • @justonebirdsopinion 1 year ago +2

      Hi! You can find the R Script and data on my GitHub at this link: github.com/nikkireg1/Redundancy-Analysis. Happy coding! Let me know if you have any questions!

    • @farmIntegral 1 year ago +1

      @justonebirdsopinion Thanks so much for your videos. I would like to ask if you could make a video on doing a Mantel test and also canonical correspondence analysis. One question: what is the main underlying assumption for choosing a redundancy analysis over a canonical correspondence analysis? Thanks for your time.

    • @farmIntegral 1 year ago +1

      @justonebirdsopinion I have different categories of environmental variables such as the biotic factor (temperature, windspeed), spatial factor (longitude and latitude), and soil properties (soil pH) and I would like to perform canonical correspondence analysis. I successfully performed the analysis but I don't know how to customize the plot. I want to give the different categories of environmental variables different colours to differentiate spatial from biotic. In addition to this, how can I group the species based on a variable such as elevation? I would also appreciate it if you could give me your email. Thanks

    • @justonebirdsopinion 1 year ago +1

      @farmIntegral Hi! I will work on making a CCA video and Mantel Test video soon and let you know when it's posted (hopefully the first one within a week)!

    • @justonebirdsopinion 1 year ago +1

      @farmIntegral Hi! Here is a link that might be relevant to helping you group environmental variables by color: stackoverflow.com/questions/61348422/how-to-create-ordination-plots-with-different-species-groups. If this isn't what you're looking for I'm happy to help problem solve more than this! You can reach me via email at nicole.regimbal@mail.utoronto.ca. Looking forward to hearing back from you and happy coding!

  • @lkc7253 9 months ago +2

    Are there more assumptions we have to meet or is it enough to do what you did, e.g. with the collinearity?

    • @justonebirdsopinion 9 months ago

      Hi! The main RDA assumption is linearity between the predictor and response. Besides that, checking for high variance inflation factors like I do in my video is really what's important. Hope this is helpful!
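
A short sketch of the variance inflation factor check mentioned in this reply, using vegan's vif.cca on a fitted rda object. The dune data, the chosen predictors, and the cutoff of 10 are assumptions for illustration only; the cutoff is a common rule of thumb, not a value taken from the video.

```r
library(vegan)

data(dune)      # example data shipped with vegan (assumption: not the video's data)
data(dune.env)

# Fit the RDA with all candidate predictors first
mod <- rda(dune ~ A1 + Moisture + Manure, data = dune.env)

# One VIF per constraint; factors contribute one value per contrast
vifs <- vif.cca(mod)
print(vifs)

# Flag constraints above the illustrative cutoff of 10
high <- names(vifs)[!is.na(vifs) & vifs > 10]
print(high)
```

The function works on the original full model, so it can be run before any model selection with step().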

  • @nabinsharma6459 11 months ago +1

    Thanks, wonderful

  • @kariiamba7324 10 months ago +1

    Informative video. Thank you

  • @user-cn8jm5ft9c 10 months ago +1

    Amazing work as always!! I also wanted to ask how you could add ordinal variables into this. Is it possible? For example, vegetation growth habit (e.g. herbaceous, shrub, tree). Is it possible to mix it with numerical data as your environmental variables? On the other hand, could you do a video using a PCoA? I have been reading and it seems to be the best way to represent beta diversity in terms of similarity among the groups because of the distance approach! I would appreciate your comments about this. Thank you so much!!

    • @justonebirdsopinion 9 months ago +1

      Thank you! It's definitely possible to add in categorical data like that - but you'll likely have to do what's called dummy coding. With dummy coding all the categorical factors would become their own columns and be assigned a 0 or a 1. So with the example you gave, rather than vegetation growth habit being the column name, there would be a separate column for herbaceous, shrub, and tree. If observation 1 was in a tree habitat, then herbaceous and shrub would be denoted with a 0 and tree with a 1. You can think of it like a presence-absence matrix for the category. I hope this is helpful! I will definitely add a PCoA video to my list - I am hoping I can be more active with the semester winding down. :)
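
The dummy coding described in this reply can be sketched in base R with model.matrix(). The data frame, its values, and the column names (canopy_cover, growth_habit) are made up for illustration.

```r
# Hypothetical environmental table: one numeric and one categorical variable
env <- data.frame(
  canopy_cover = c(80, 35, 10, 60),
  growth_habit = factor(c("tree", "shrub", "herbaceous", "tree"))
)

# One 0/1 column per category; "- 1" removes the intercept so that
# no category is absorbed as a baseline level
dummies <- model.matrix(~ growth_habit - 1, data = env)
colnames(dummies) <- levels(env$growth_habit)

# Combine the numeric variable with the presence-absence style columns
env_dummy <- cbind(env["canopy_cover"], dummies)
print(env_dummy)
```

Note that vegan's formula interface also accepts factor columns directly, in which case the coding is done internally. With hand-made dummy columns, one of the category columns is redundant once the others are known.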

  • @mattounou 1 year ago +1

    Nice explanation and nice code ty

  • @ofirshorshy8281 1 year ago +1

    In my case the ANOVA that I ran was not significant whatsoever. But when I checked only a few chosen variables that were interesting for me to plot, the ANOVA turned significant for each test.
    I just don't know why my plot is not fully visible on the screen.
    And how do I make sure the final text in the plot doesn't overlap?

    • @justonebirdsopinion 1 year ago

      Hi! Are you referring to the text on the rda biplot? If so, there is some ggplot code to repel overlapping text. But to be honest, I use a much easier, perhaps less 'correct', method to fix my plots when stuff like this happens. I save the plot as a metafile and then insert it in PowerPoint. Then if you right click and choose 'ungroup' you will be able to modify individual aesthetic elements of the plot. So you would be able to shift around any overlapping text boxes and even change the variable labels if you want. Does this answer your question or am I misinterpreting?
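
For the ggplot route mentioned in this reply, one possible sketch uses the ggrepel package. That package is an assumption here, since the reply does not name it. The example again uses vegan's dune data rather than the video's, and pulls species scores out of the rda object to plot them with non-overlapping labels.

```r
library(vegan)
library(ggplot2)
library(ggrepel)

data(dune)      # example data shipped with vegan (assumption: not the video's data)
data(dune.env)

mod <- rda(dune ~ A1 + Moisture, data = dune.env)

# Species scores on the first two constrained axes, as a data frame
sp <- as.data.frame(scores(mod, display = "species", choices = 1:2, scaling = 2))
sp$label <- rownames(sp)

# geom_text_repel() nudges labels apart so they do not overlap
p <- ggplot(sp, aes(x = RDA1, y = RDA2, label = label)) +
  geom_point() +
  geom_text_repel(max.overlaps = Inf) +
  coord_equal()

print(p)
```

Site scores or biplot arrows can be added the same way by extracting them with display = "sites" or display = "bp".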

    • @ofirshorshy8281 1 year ago +1

      @justonebirdsopinion Exactly! I had no idea about this. It is amazing! Thank you. Now my graph is way more readable for my professor.
      Thank you so much! It is a really good explanation. 😇

  • @danielcarvalho8933 4 months ago

    Thank you very much!! This helped me a lot!!!

  • @niloofarkh4779 1 year ago +1

    Thank you, you explained it beautifully

  • @ibrahimhalilyildirim9619 8 months ago

    Error in step(spec.rda, scope = formula(spec.rda), test = "perm") :
    AIC is -infinity for this model, so 'step' cannot proceed. I got this error. What is the problem?

  • @Souloether373 1 year ago +1

    Hi, first of all... ty for your video!
    I have a problem with my RDA analysis: the analysis is taking the same number of environmental variables as species, even though my environmental dataset has more columns than species. Do you know why this happens?

    • @justonebirdsopinion 1 year ago

      Hi Miguel! When you are going through the step that is equivalent to my env_stand, are you including all the variables for the analysis? Make sure to include any variables that may have been transformed; cbind() them with the non-standardized values. When you do your first rda, all the variables should be there. But after the step() function some should be omitted. I guess my first suggestion would be to ensure that env_stand is including all the variables you want before putting it in the first rda analysis. Are you doing that? If so, let's chat some more. Happy coding!
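
A small base R sketch of the standardize-then-cbind() step described in this reply. The table, its values, and the column names are hypothetical, and scale() is used as a stand-in because the standardization function in the original script is not shown here.

```r
# Hypothetical environmental table: two continuous variables and one 0/1 variable
env <- data.frame(
  elevation = c(120, 340, 560, 210, 480),
  canopy    = c(80, 35, 10, 60, 25),
  burned    = c(0, 1, 1, 0, 1)
)

# Standardize only the continuous columns (mean 0, standard deviation 1)
to_scale <- c("elevation", "canopy")
scaled   <- as.data.frame(scale(env[to_scale]))

# Bind the standardized columns back together with the untouched ones
env_stand <- cbind(scaled, env["burned"])

# Guard: every original variable must still be present before fitting the first RDA
stopifnot(identical(sort(names(env_stand)), sort(names(env))))
print(env_stand)
```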

  • @lauraviviani7472 1 year ago +1

    Hi! Great video :)
    I'm stuck in a situation which is similar to your script: I'm performing an rda and obtaining a model with only RDA1 and PC1 as output, but the step function won't work on it. I passed the model the same way as you did (rda_step1

    • @justonebirdsopinion 1 year ago +1

      Hi Laura! Thanks for your comment! The error you are getting is indicating an issue with your explanatory (x) variables. Since your output is only showing RDA1 and PC1, I am thinking that the explanatory variables you are using are highly correlated - which is causing the crash with the step function. The RDA1 output is essentially saying that all the variance explained can be attributed to this one axis, and PC1 is indicating that there is residual noise that cannot be explained by a RDA axis and is likely due to high correlation as well.
      My suggestion would be to start by assessing collinearity using the vif.cca function. This will tell you if any of your explanatory variables have high variance inflation factors that are messing with the model output. You can also check whether your explanatory variables are highly correlated using Pearson's r. The R script for this would be something like "cor(df[, vars], method = 'pearson')", where vars holds the names of your numeric explanatory columns. Omit variables that are highly correlated.
      If everything looks good on that front, you may be missing explanatory variables. So if there's anything you omitted from the beginning, perhaps consider adding it back in. Or it is possible that with the data you have a RDA may not be the best method to use.
      If you want to chat about this in more detail or have me look over your data - feel free to email me as well! If you go to the About section in my channel profile you will find a link to my website which has my contact information. I hope this at least helps you get started with your problem-solving! Good luck!
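
The Pearson correlation check suggested in this reply can be sketched in base R as follows. The simulated variables (temp, wind, soil_temp) and the 0.8 cutoff are assumptions for illustration; the reply itself does not give a threshold.

```r
set.seed(42)
n <- 30

# Simulated explanatory variables; soil_temp is built to track temp closely
env <- data.frame(temp = rnorm(n), wind = rnorm(n))
env$soil_temp <- env$temp + rnorm(n, sd = 0.05)

# Pairwise Pearson correlations between all explanatory variables
cmat <- cor(env, method = "pearson")

# Keep only the lower triangle so each pair appears once
cmat[upper.tri(cmat, diag = TRUE)] <- NA

# List pairs whose absolute correlation exceeds the illustrative cutoff
hits <- which(abs(cmat) > 0.8, arr.ind = TRUE)
high_pairs <- data.frame(
  var1 = rownames(cmat)[hits[, "row"]],
  var2 = colnames(cmat)[hits[, "col"]],
  r    = cmat[hits]
)
print(high_pairs)
```

One variable from each listed pair is a candidate to drop before refitting the RDA.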

    • @lauraviviani7472 1 year ago

      @justonebirdsopinion Thank you very much, this was really helpful.
      Unfortunately I can't use vif.cca because the step function stops before having concluded, due to the error I mentioned in the first comment. I've also found the function redun() that seems to work on my data; anyway I can't find a lot of literature using this one, whereas rda() seems to be much more used. What do you think about it?

    • @justonebirdsopinion 1 year ago

      @lauraviviani7472 Hi Laura! You can still use the vif.cca function before doing the step function. You would just put the name of your original RDA in the brackets rather than the new rda name for your step function. Did you try the Pearson's r as well? Hopefully both help you omit highly correlated variables and variables with high variance inflation factors that might be causing the problem.
      I've never used the redun() function before and have only used the rda() function in vegan. I think rda() is more typically used, but since I've never used redun() I can't really speak to whether it's better or worse. But since your output is suggesting very little variance in the data, changing the function likely won't fix the issue.
      Start with trying the vif.cca and let me know if you are able to run it! If you are having issues you can email me and I will send screenshots and maybe some mock code. If you do get rid of VIFs and correlated variables and are still having issues, maybe RDA isn't the best analysis for what you want to do. You can tell me a bit more about your data and the question you want to ask and I can think about whether a different analysis might be more suitable. I hope this can be a bit helpful! Let me know how it goes!

    • @justonebirdsopinion 1 year ago

      @lauraviviani7472 Hi Laura! I just wanted to check in and see if the issue was resolved!

  • @haraldurkarlsson1147 1 year ago

    Great, but your code text is tiny, even at the highest res.

    • @justonebirdsopinion 1 year ago

      Hi! Yeah, sorry about that, it was due to a monitor issue. You can find the code on my GitHub as well if that's helpful. github.com/nikkireg1/Redundancy-Analysis