Running non-metric multidimensional scaling (NMDS) in R with vegan and ggplot2 (CC187)

  • Added 15 Aug 2024
  • Non-metric multidimensional scaling (NMDS) is an alternative to principal coordinates analysis (PCoA) and its relative, principal component analysis (PCA). All of these are popular ordination techniques that you can use to reduce the dimensions of data in R. In this episode, Pat Schloss will show how to perform NMDS in R and visualize the ordination in ggplot2. We'll use the metaMDS function from vegan and tools from ggplot2 and the tidyverse packages.
    You can find my blog post for this episode at www.riffomonas.... The data were generated in our Kozich et al. 2013 paper (doi.org/10.1128...) using samples from the Schloss et al. 2012 paper (doi.org/10.4161...).
    #metaMDS #vegan #ggplot2 #R #Rstudio #Rstats
    Want more practice on the concepts covered in Code Club? You can sign up for my weekly newsletter at shop.riffomona... to get practice problems, tips, and insights.
    If you're interested in taking an upcoming 3 day R workshop be sure to check out our schedule at riffomonas.org...
    You can also find complete tutorials for learning R with the tidyverse using...
    Microbial ecology data: www.riffomonas...
    General data: www.riffomonas...
    0:00 Performing NMDS analysis in R
    4:08 Using vegan's metaMDS to perform NMDS
    7:26 Plotting ordination data with ggplot2
    8:23 Bringing metadata into ordination
    11:07 Comparing to PCoA ordination
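
The workflow outlined in the description can be sketched roughly as follows. This is a minimal sketch, not the episode's actual code: it assumes vegan's built-in `dune` and `dune.env` example data stand in for the Kozich et al. data, picks Bray-Curtis as the distance, and uses an arbitrary seed. The column names `axis1`, `axis2`, `sample`, and `management` are made up for illustration.

```r
library(vegan)
library(ggplot2)

# Assumption: vegan's example data replace the episode's data set
data(dune)      # abundance table, 20 sites x 30 species
data(dune.env)  # metadata for the same 20 sites, in the same row order

set.seed(1)  # arbitrary seed; NMDS relies on random starts

# Build a square distance matrix, then run a 2-dimensional NMDS
dist_mat <- vegdist(dune, method = "bray")
nmds <- metaMDS(dist_mat, k = 2, trace = 0)

# Collect the site coordinates and attach one metadata column
nmds_df <- data.frame(
  axis1 = nmds$points[, 1],
  axis2 = nmds$points[, 2],
  sample = rownames(dune),
  management = dune.env$Management
)

# Plot the ordination, colored by the metadata column
p <- ggplot(nmds_df, aes(x = axis1, y = axis2, color = management)) +
  geom_point() +
  labs(x = "NMDS axis 1", y = "NMDS axis 2")
print(p)
```

With real data, the metadata would more likely be joined to the coordinates by a shared sample identifier than assigned by row position.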

Comments • 59

  • @justinerenaud5256 2 years ago +5

    Happy birthday Pat! You saved my life! Trapped in a statistically rich internship, with almost no skills, your videos were lifesaving for me :)

    • @Riffomonas 2 years ago +1

      Oh you’re too kind - thanks for remembering my favorite random number generator seed 🤓🎂

  • @jouberc 2 years ago +3

    I just started watching your awesome videos the other day, and today I found out you're presenting at my Department this Thursday. What a cool coincidence!

    • @Riffomonas 2 years ago +1

      Fantastic- please be sure to say hello!

  • @11mgarrard 2 years ago +2

    Excellent - thank you for sharing. I am usually driven towards nMDS over PCoA for better representation of most of my ecological data.

    • @Riffomonas 2 years ago

      Hi Mary - thanks for watching and writing in!

  • @luizacervenka 1 year ago +2

    Hi, thank you for your video! My matrix has 3272 rows and 31 columns. R doesn't allow me to use a tibble for a non-square matrix. What can I do?

    • @Riffomonas 1 year ago +1

      Hi - I'm not totally sure what problem you are running into, but you should generate a square distance matrix from that tibble and use that as input to build the NMDS data.
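
The advice in the reply above could look something like the sketch below. It is only an illustration under stated assumptions: `counts` is a hypothetical simulated table (50 samples by 31 taxa, far smaller than the 3272 rows mentioned), and Bray-Curtis is chosen arbitrarily as the distance.

```r
library(vegan)

set.seed(1)  # arbitrary seed for the simulated data and the NMDS starts

# Hypothetical non-square table: 50 samples (rows) by 31 taxa (columns)
counts <- matrix(
  rpois(50 * 31, lambda = 5),
  nrow = 50,
  dimnames = list(paste0("sample", 1:50), paste0("taxon", 1:31))
)

# vegdist compares every pair of rows, so the result is square
dist_mat <- vegdist(counts, method = "bray")
dim(as.matrix(dist_mat))  # 50 x 50

# The distance object can be handed straight to metaMDS
nmds <- metaMDS(dist_mat, k = 2, trace = 0)
```

If the table lives in a tibble, the sample names would first need to move from a column into the row names of a numeric matrix before calling `vegdist`.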

  • @djangoworldwide7925 1 year ago +1

    You are such a great data scientist

  • @Dispatern 1 year ago

    Thank you for your Code Club content. It's very useful for the data analysis part of my thesis. :)

  • @DanBioGarcia 1 year ago

    Your video just helped me out a lot!

  • @laurentgapin689 2 years ago +4

    Great videos!! Still learning R but you make it look simple ;-)
    You showed that the samples were statistically different from one another. Is there a way to determine what the main drivers of the differences are?

    • @Riffomonas 2 years ago +1

      I’ll come back to that. Thanks for watching!

  • @sudiptatalukder714 1 year ago +1

    Thank you for all your efforts. I have 10 samples but in my NMDS plot, there are only 7 points. Do you know why that might happen?

  • @SOADisLegendary 2 years ago +3

    Is it necessary to use a distance matrix as an input to metaMDS? If so why? I have seen protocols using the raw OTU abundance tables as an input as well.

    • @Riffomonas 2 years ago +1

      Hey Kate - thanks for watching! I suspect those protocols either calculate a default distance like Bray-Curtis, or they're doing PCA, which effectively calculates a correlation-based distance. The latter is problematic because of how it treats double zeroes. I'd suggest reaching out to the developers to find out what they're doing.
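
For vegan's own `metaMDS`, the two routes can be compared directly. In the sketch below, `metaMDS` is given a raw abundance table in one call and a precomputed distance matrix in the other; when it receives a table, it computes the distances itself using its `distance` argument, which defaults to `"bray"`. The `dune` example data, the seed, and `autotransform = FALSE` (so no transformation is applied before the distances are calculated) are choices made for this illustration, and other tools or protocols may behave differently.

```r
library(vegan)

data(dune)  # vegan's example abundance table, used here as a stand-in

# Route 1: raw abundance table; metaMDS calculates the distances itself
set.seed(1)
nmds_raw <- metaMDS(dune, distance = "bray", autotransform = FALSE, trace = 0)

# Route 2: explicit distance matrix calculated beforehand
set.seed(1)
nmds_dist <- metaMDS(vegdist(dune, method = "bray"), trace = 0)

# Compare the stress values of the two fits
c(raw_table = nmds_raw$stress, distance_matrix = nmds_dist$stress)
```

Passing an explicit distance matrix keeps the choice of distance visible in the script rather than hidden in a default.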

  • @kathydunn2432 2 years ago +2

    Really enjoying these!!! Quick question on metaMDS, what if after adding a random seed you still don't get convergence?

    • @Riffomonas 2 years ago

      Thanks for watching Kathy! How many seeds have you tried? If you’ve tried a few are the samples really similar or different?

    • @kathydunn2432 2 years ago

      @Riffomonas I've tried a number of different ones hoping one might lead to convergence. The distances suggest they are very different.

  • @Student2-ro7zc 1 year ago

    Thank you for all your videos! I am a beginner in R but I have already learned so much from them. Can you please point me to the video where you coded the "sample_lookup" variable? I am doing NMDS and that's the only one I can't figure out. Thanks!

  • @ynattirb3 7 months ago

    Hi there. Thanks for the great video! Are you able to explain the sample_lookup object by any chance?

  • @efratsharon1294 2 years ago +2

    Thank you. How do I detect the "No convergence" result in code and set the seed dynamically?

    • @Riffomonas 2 years ago +1

      You could set up a loop starting with the seed at one and continue until it converges
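
The loop suggested in the reply above might be sketched as follows. This assumes the object returned by `metaMDS` carries a `converged` element that is `FALSE` or zero when no solution was repeated, which is worth confirming with `str()` on your installed version of vegan. The `dune` data and the cap `max_seed` are hypothetical choices for the illustration.

```r
library(vegan)

data(dune)  # vegan's example data, used here as a stand-in
dist_mat <- vegdist(dune, method = "bray")

max_seed <- 20  # hypothetical cap so the loop always terminates
fit <- NULL

for (seed in seq_len(max_seed)) {
  set.seed(seed)
  fit <- metaMDS(dist_mat, k = 2, trace = 0)

  # Assumption: `converged` is FALSE/0 without convergence, TRUE or a count otherwise
  if (isTRUE(as.logical(fit$converged))) break
}

message("Stopped at seed ", seed, "; converged: ", as.logical(fit$converged))
```

Raising the `trymax` argument of `metaMDS` is another option, since it increases the number of random starts attempted within a single call.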

  •  2 years ago +1

    Great!

  • @meseretmuche6984 1 year ago

    Dear Dr, thank you for your unlimited help and remarkable videos.
    Q. I have vegetation abundance data across four sites. Is non-metric multidimensional scaling an ideal statistical tool for analyzing it? Basically, my interest is to see the species distribution across the four sites.

  • @nataliasalamanca1843 10 months ago

    Great video! Do you have any other video/tutorial about weighted MDS (INDSCAL)?

  • @hosseinkarimi1981 1 year ago

    Thank you for all your fantastic work. I wonder if you can help me apply PERMDISP to determine differences in the dispersion of samples. I appreciate your help in advance.

  • @benjaminleyton8545 2 years ago +3

    How can I cite this content?

    • @Riffomonas 2 years ago +3

      Argh. I’ve been sitting on writing something. For now you could probably safely cite the JOSE paper on my reproducible research materials

  • @orishapira4946 1 year ago

    Does anybody know what this "sample_lookup" function is?

  • @mikhaeldito 2 years ago +1

    How important is it to ensure low stress value in NMDS analysis?

    • @Riffomonas 2 years ago +2

      I don’t think there’s much you can do about high values. 3D is a non starter. I think it’s a lot like if PCoA only covers 30% of the variation what does it mean. If you see clusters even with poor stress they’re still likely to be real

  • @briannagarcia3233 2 years ago +1

    Have you had any issues with the scores() function? I tried it with my own data as well as your data, and it always gives an error due to nmds$species being NA (from the video it looked like yours was also NA, but you had no problem at the time of this video).

    • @Riffomonas 2 years ago +1

      I haven’t. You could always use str() to find how to directly access the scores from the output without using the scores function

    • @briannagarcia3233 2 years ago +1

      @Riffomonas I have been using nmds$points to plot the NMDS, but I just wanted to check and make sure that was an OK alternative to using the scores(nmds) function after running metaMDS.

    • @Riffomonas 2 years ago +1

      @briannagarcia3233 That should be perfectly fine

    • @briannagarcia3233 2 years ago

      @Riffomonas Thank you so much for the help, and for the extremely helpful videos!

    • @Rydaholic 2 years ago +1

      Hey, just experienced the same problem after returning to the code after some months. I found scores(nmds, display= c("sites")) gave the same results as nmds$points
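
The workaround described in this thread can be checked with a short sketch. It assumes an NMDS built from a distance matrix (so there are no species scores), uses vegan's `dune` example data as a stand-in, and compares only the numeric values, because the two objects may carry different column names and attributes.

```r
library(vegan)

data(dune)  # vegan's example data, used here as a stand-in
set.seed(1)  # arbitrary seed
nmds <- metaMDS(vegdist(dune, method = "bray"), k = 2, trace = 0)

# Ask only for the site (sample) scores, skipping the missing species scores
site_scores <- scores(nmds, display = "sites")

# Direct access to the same coordinates
direct <- nmds$points

# Compare the values only, ignoring names and other attributes
all.equal(
  unname(as.matrix(site_scores)),
  unname(as.matrix(direct)),
  check.attributes = FALSE
)
```

Either object can then be converted to a data frame for plotting.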

  • @pankajsingh-xl8jr 1 year ago +1

    Hi Pat, any tutorial on RDA plot?

    • @Riffomonas 1 year ago

      Thanks - I’ll add it to the list of things to get to in the future. It’s not something I use so it might be a while

  • @devikamenon2964 2 years ago +1

    hi, how bring "sites" in nmds plot

    • @devikamenon2964 2 years ago +1

      ^to

    • @Riffomonas 2 years ago

      You’d need to map site to color or shape. Additionally you could draw ellipses with stat_ellipse
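
A minimal sketch of that suggestion follows. It assumes vegan's `dune` and `dune.env` example data, with the `Use` column standing in for a hypothetical "site" grouping. The column names in `plot_df` are made up for the illustration.

```r
library(vegan)
library(ggplot2)

data(dune)      # example abundance table
data(dune.env)  # example metadata in the same row order

set.seed(1)  # arbitrary seed
nmds <- metaMDS(vegdist(dune, method = "bray"), k = 2, trace = 0)

# Assumption: the `Use` column plays the role of "site" here
plot_df <- data.frame(
  axis1 = nmds$points[, 1],
  axis2 = nmds$points[, 2],
  site = dune.env$Use
)

# Map site to color and shape, and draw one ellipse per site
p <- ggplot(plot_df, aes(x = axis1, y = axis2, color = site)) +
  geom_point(aes(shape = site)) +
  stat_ellipse()
print(p)
```

`stat_ellipse` needs several points per group to draw anything, so very small groups may end up without an ellipse.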

  • @josenicolasperez-garcia8119

    ANOSIM test please!

    • @Riffomonas 2 years ago +1

      I’ll try to get to it. Why anosim rather than Adonis/permanova?

  • @shohannie 1 year ago

    Could you go through this but without using ggplot2?

    • @Riffomonas 5 months ago

      Sorry, I only use ggplot2 at this point

  • @SairaKhan-uc4rq 2 years ago

    Thanks for these informative videos! I got stuck on "scores(nmds)": when I run this command, it returns "Error in x$species[, choices, drop = FALSE] : incorrect number of dimensions". Can you please help me understand this bug?

    • @Riffomonas 2 years ago

      Thanks! Are you sure you converged to a solution? If you run str(nmds) do you see the axes data?

    • @jessrobson3029 2 years ago

      @Riffomonas I've been having this same error since I recently updated RStudio. When I run this same code on my old laptop (not updated since 2020) everything runs smoothly.

    • @jessrobson3029 2 years ago

      @Riffomonas Also, happy (very belated) birthday.

    • @laurabachmaier3354 1 year ago

      Currently stuck on the same problem, but it's definitely converged. Has anyone figured out how to fix it?

  • @eugeniadegano2799 2 years ago +1

    👏

  • @KiamatChange-ke8nd 11 months ago

    1:36 MD to 2D then 3D. It doesn't make sense. 😂 // It might do in Mars. 😂

  • @Unmasking_Viandalisme 2 years ago +1

    Yawn! 3D isn't that difficult. I'm going to talk to my cats.. they don't yap BS.

    • @Riffomonas 2 years ago +3

      But have you used the ggcats package to do it? That would really impress your cats