Find markers and cluster identification in single-cell RNA-Seq using Seurat | Workflow tutorial

  • Added 1 July 2024
  • A detailed walk-through of the steps to find canonical markers (markers conserved across conditions) and to find differentially expressed markers in a particular cell type between conditions, using Seurat's FindMarkers(), FindAllMarkers() and FindConservedMarkers() functions in R. I hope you find the video informative. I look forward to your comments in the comments section!
    1) Data:
    drive.google.com/file/d/13I22...
    2) Link to code:
    github.com/kpatel427/CZcamsT...
    3) Vignettes:
    ▸ satijalab.org/seurat/articles...
    ▸ satijalab.org/seurat/articles...
    ▸ hbctraining.github.io/In-dept...
    4) Marker databases:
    1. SCSig: www.gsea-msigdb.org/gsea/msigd...
    2. PanglaoDB: panglaodb.se/
    3. CellMarker: bio-bigdata.hrbmu.edu.cn/CellM...
    Chapters:
    0:00 Intro
    0:36 FindMarkers(), FindAllMarkers(), FindConservedMarkers()
    4:09 Study design
    4:57 Load data
    6:20 Visualize by clusters and condition
    9:15 FindAllMarkers()
    12:51 DefaultAssay 'RNA'
    14:10 FindConservedMarkers() for cluster 3
    17:12 Visualize canonical markers in a FeaturePlot
    20:15 RenameIdents
    21:55 Annotating clusters and marker databases
    23:56 Annotating rest of the clusters
    26:21 Perform differential expression in CD16 Monocytes between conditions (FindMarkers())
    30:52 Visualize markers identified by FindConservedMarkers() vs FindMarkers()
    Show your support and encouragement by buying me a coffee:
    www.buymeacoffee.com/bioinfor...
    To get in touch:
    Website: bioinformagician.org/
    Github: github.com/kpatel427
    Email: khushbu_p@hotmail.com
    #bioinformagician #bioinformatics #findmarkers #findallmarkers #findconservedmarkers #deg #seurat #integration #cca #R #genomics #beginners #tutorial #howto #omics #research #biology #ncbi #GEO #rnaseq #ngs
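
The first two functions in the chapter list answer different questions: FindAllMarkers() compares every cluster against the rest, while FindConservedMarkers() looks for markers of one cluster that hold up within each condition. A minimal sketch, assuming a clustered Seurat object and a condition column called `stim` (the wrapper name `marker_overview`, the object and the column name are illustrative assumptions, not taken from the video's code):

```r
# Sketch only: `obj` is assumed to be a clustered Seurat object whose metadata
# has a condition column; "stim" is a hypothetical name for that column.
marker_overview <- function(obj, cluster = 3, condition_col = "stim") {
  # Markers for every cluster versus all remaining cells.
  all_markers <- Seurat::FindAllMarkers(obj, only.pos = TRUE)

  # Markers for one cluster that are consistent within each condition
  # (this step needs the 'metap' and 'multtest' packages installed).
  conserved <- Seurat::FindConservedMarkers(
    obj,
    ident.1 = cluster,
    grouping.var = condition_col
  )

  list(all = all_markers, conserved = conserved)
}
```

Calling `marker_overview(obj)` would return both tables in one list; nothing runs until the function is called with a real object.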

Comments • 65

  • @poojasavla6240 · 3 months ago

    honestly as a computational biologist who just started working in this industry, you are so awesome

  • @yusufali5812 · 2 years ago

    Thanks for this simplified and super informative video!

  • @nayeemanushrat3174 · 2 years ago

    Thank you! Looking forward to your next video tutorials!☺

  • @georgegavriil8951 · 2 years ago · +3

    Thank you for this very detailed and informative video, can't wait for the next scRNA-seq videos!

  • @linus8490 · 2 years ago

    Very informative! Thank you! Looking forward to your next video tutorials!

  • @lukesimpson1507 · 1 year ago · +2

    Really amazing content. I could have saved myself months if I had found this channel earlier! Keep up the good work!

  • @janicexu1548 · 1 year ago

    Thanks for these tutorials :-) I wish I had found them earlier! Thanks also for keeping the errors in, as it helps with learning how to troubleshoot.

  • @pragnyarishika5661 · 1 year ago · +2

    Most useful channel for single cell RNA seq. Thank you so much for excellent explanation. Please make videos on building neural network models for single cell RNA seq data.

  • @aigerimk692 · 1 month ago

    God bless you and your videos! Thanks a lot!

  • @user-dp7sg7kd7f · 1 year ago

    Really helpful tutorial. Thanks for your effort!!

  • @demetronix · 2 years ago

    thank you for these videos. Very helpful!!

  • @tushardhyani3931 · 2 years ago

    Thank you for this video !!

  • @user-mb5ld7re8m · 2 years ago

    brilliant work!

  • @user-gg1js5kg1p · 9 months ago

    Thanks a lot. I've been following your tutorials for the last 8 or 9 months. They helped a lot with my M.S. thesis and my bioinformatics knowledge.
    I appreciate your time, and it would be helpful if you could make a tutorial on cell-cell communication analysis for scRNA-seq data in R.

  • @siankangchong3617 · 2 years ago · +1

    Thanks for the video! It is very helpful, I'm looking forward to seeing a video explaining the steps of performing GO enrichment analysis, appreciate your hard work!!

    • @Bioinformagician · 2 years ago

      I shall make a video on GO enrichment analysis soon :) Thanks!

  • @abdou-samadkone6397 · 11 months ago

    THANK you very much. You are amazing 🤩🤩🤩🤩🤩🤩🤩🤩

  • @sunghyoukpark7423 · 1 year ago · +1

    Your videos are just awesome! I am looking forward to the cell type identification video. Without cell type identification, all the painstaking previous steps do not have much meaning, I guess.

    • @Bioinformagician · 1 year ago

      Absolutely, working on it. Hopefully should be able to come out with it soon.

  • @kitdordkhar4964 · 2 years ago · +6

    New commands learned today, q10. It would be great if you could show some datasets from mouse models. Without a mouse atlas, annotating the cells is a long road. I believe you will find something easy for us to do. I will be waiting for the pipeline. Thanks again! Great video as always!

    • @Bioinformagician · 2 years ago · +1

      I shall consider using data from mouse models for some of my upcoming single-cell videos. Thanks for the suggestion! :)

  • @user-ck3ki9hq9t · 9 months ago

    Your tutorials make me feel like a first year grad student getting schooled by a 5th year. Nothing better than that! Thank you. Did you ever work in the Satija lab?

  • @yukaizhang2675 · 1 year ago

    Nice content. It really helps me start from the beginning. Thank you! May I ask how to fetch the relative expression of given genes for each animal or condition?

  • @xiaoliu6964 · 1 year ago · +1

    Your videos are super helpful and informative! Could you make a tutorial for how to integrate and analyze single-cell ATAC-seq and RNA-seq data? Thank you!!!!

    • @Bioinformagician · 1 year ago · +1

      That’s definitely in the pipeline. Please stay tuned :)

    • @xiaoliu6964 · 1 year ago

      @@Bioinformagician You are awesome 🤩!

  • @fabiohbcosta · 10 months ago

    Thanks for the amazing tutorials!
    One question: how do I perform this exact analysis starting from my filtered matrix.h5 files?
    I have two files, for two conditions, and wanted to do the same thing you did here.
    Thanks !

  • @chriskuo · 1 year ago · +1

    This is extremely helpful. If I want to see whether there are cells that express both CD163 and CD45, how do I do that?

  • @domenicoalessandrosilvestr7829

    Hi, what is, in your opinion, the best test to use in the FindMarkers or FindAllMarkers function when comparing two cell populations with very different cell numbers?

  • @efstratioskirtsios298 · 11 months ago

    Lovely video! Many thanks. Do you prefer using the DESeq2 option for test.use instead of the default in the DEG analysis? Is edgeR also compatible with Seurat? Sorry, I am new to this.

  • @marcelohurtadocastillo3982

    Great video, thank you so much for making it! Sorry if I missed something, but I didn't fully understand why you chose FindConservedMarkers() to find markers differentially expressed between one cluster and all the others. As far as I know, that is what FindAllMarkers() does, and FindConservedMarkers() gives you the markers that are conserved between two groups. Is the reason that you are calculating the markers differentially expressed in one cluster versus the other clusters, but with similar expression across the two conditions (treated and untreated)? If so, shouldn't you get the same result using FindAllMarkers()? Thanks again and hope you can help me :)

  • @abdou-samadkone6397 · 11 months ago

    This is extremely useful. What about using the FindConservedMarkers function with a different grouping of our cells, i.e. high/low PD1 expression rather than control/treatment? Is it the same method? Thanks

  • @junxiao7009 · 2 years ago · +3

    Thanks for your informative video! I have a question. Your last video mentioned that the batch correction method 'harmony' does not change the original expression data (the 'counts' or 'data' slot in the Seurat object) but adds a dimensionality reduction. However, when we use 'FindAllMarkers' to identify the differentially expressed genes between 'STIM' and 'CTRL', this function uses the 'counts' or 'data' slot in our Seurat object. Does it mean we are actually comparing the expression between 'STIM' and 'CTRL' across data without batch correction?

    • @Bioinformagician · 2 years ago · +6

      Great question! Whichever integration method you use, whether it returns a corrected expression matrix or not (like Harmony), we always perform the differential expression test on 'unintegrated' data. That is why we make sure the default assay is set to 'RNA' (the assay that stores the unintegrated data) before performing this analysis.
      The integration procedure inherently introduces dependencies between data points, which violates the assumptions of the statistical tests used for differential expression.
      So the 'counts' or 'data' slot comes from the RNA assay, which stores the unintegrated data.
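
The point about testing on the unintegrated assay can be sketched as a small wrapper. This is an illustration under assumptions, not the code from the video: the function name `de_between_conditions`, the metadata column `stim` and the identity label "CD16 Mono" are hypothetical and should be replaced with whatever your object uses.

```r
# Sketch only: "stim", "STIM", "CTRL" and "CD16 Mono" are assumed labels;
# adjust them to the metadata of your own Seurat object.
de_between_conditions <- function(obj, cell_type = "CD16 Mono",
                                  condition_col = "stim") {
  # Run the test on the unintegrated assay, not on integration-corrected values.
  Seurat::DefaultAssay(obj) <- "RNA"

  # Restrict to one annotated cell type, then compare STIM against CTRL.
  Seurat::FindMarkers(
    obj,
    ident.1 = "STIM",
    ident.2 = "CTRL",
    group.by = condition_col,
    subset.ident = cell_type
  )
}
```

Setting the default assay inside the function changes only the local copy of the object, so the caller's object keeps whatever default assay it had.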

  • @abhilashdasari · 6 months ago

    0:18: 🔍 The video discusses finding differentially expressed features and cluster identification in single-cell RNA-seq data using the Seurat package.
    4:34: 🧬 The video discusses identifying gene expression changes in samples treated with interferon beta and the control group in a particular cell type.
    9:59: ⚙ The video discusses the parameters for testing genes in clusters and populations.
    14:35: 🔍 The video discusses the process of identifying cell clusters based on gene expression and grouping variables.
    19:23: 📊 The video explains how to use quantiles to divide data and rename cell identities in a biological dataset.
    24:20: ⚙ The video demonstrates how to perform cluster identification and find differential gene expression using pre-annotated cell data.
    29:14: 🔬 The presentation discusses comparing gene expression in CD16 monocytes between stimulated and control groups.
    Recapped using Tammy AI

  • @sumankundu762 · 1 year ago

    Great presentation: simple, clear and to the point. The application and interpretation of many functions in the Seurat package are now clear to me. Did you make a video on how the processed dataset ifnb_harmony.rds was constructed from the source data? I ask only to understand the R code better, as I am relatively new to this space. Thank you.

    • @Bioinformagician · 1 year ago

      This is the video - czcams.com/video/zEuqhiu341I/video.html where I explain how ifnb_harmony.rds was generated.

  • @KellyBlust · 1 year ago

    Hi, Thanks for your great videos! You mentioned during this video that you want to make a new video about using the automatic cell annotation tools. Has this already been done?

    • @Bioinformagician · 1 year ago

      No it hasn't been done yet, however it is very much on my list of videos to make, and hopefully I should be able to create one soon. Thanks for following up!

  • @patrickmellors8445 · 1 year ago

    When we have used SCT to normalize the data, I assume we should use the SCT assay for FindMarkers?

  • @luiseduardogoncalves2228 · 2 years ago · +1

    Hi, thank you so much for your videos and for this topic specifically. I was trying to run it myself and I came across this error:
    Error in findconservedmarkers(seurat_loom, ident.1 = 3, grouping.var = "Patient") :
    could not find function "findconservedmarkers"
    So it suggested to install these packages
    install.packages('BiocManager')
    BiocManager::install('multtest')
    install.packages('metap')
    After installing these packages, the same error keeps popping up. Do you have any suggestions for what I should do?

    • @Bioinformagician · 2 years ago

      Did you load the libraries after installing these packages?
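
Besides loading the library, note that the error text quoted above shows the function typed in lowercase. R names are case-sensitive, so `findconservedmarkers` will not resolve even with Seurat attached; the exported name is `FindConservedMarkers`. A small check, where the helper `name_resolves` is a hypothetical name introduced here for illustration:

```r
# Report whether a function name exists in a package's namespace.
# Returns NA if the package is not installed.
name_resolves <- function(fun_name, pkg = "Seurat") {
  if (!requireNamespace(pkg, quietly = TRUE)) return(NA)
  exists(fun_name, envir = asNamespace(pkg), inherits = FALSE)
}

name_resolves("FindConservedMarkers")  # exact, case-sensitive name
name_resolves("findconservedmarkers")  # lowercase spelling from the error message
```

If the first call is TRUE and the second is FALSE, the fix is the spelling in the call rather than another package install.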

  • @chrisdoan3210 · 1 year ago

    Hi Bioinformagician,
    I tried to run FindConservedMarkers() but I got this message:
    Warning: Identity: 8 not present in group B. Skipping VV
    Warning: Identity: 8 not present in group A. Skipping NC
    Error in marker.test[[i]] : subscript out of bounds.
    This error appears for many of the clusters I chose. Would you have a suggestion for troubleshooting it? Thank you so much!

  • @stacygenovese1761 · 2 years ago · +1

    These videos have saved me! I have three conditions: KO/WT/DBLKO. How do I run FindMarkers() on the integrated data? I can only specify ident.1 and ident.2. There is no ident.3. Any ideas???

    • @Bioinformagician · 2 years ago

      One way I can think of is to make pairwise comparisons and then intersect the DE genes from both comparisons.
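
The intersect step suggested here can be sketched in base R. The example assumes each comparison produced a FindMarkers-style table with gene names as row names and a `p_val_adj` column; the helper name `shared_de_genes`, the 0.05 cutoff and the toy tables are illustrative assumptions, not real results.

```r
# Genes significant in both of two FindMarkers-style result tables.
# Each table is assumed to have gene names as row names and a p_val_adj column.
shared_de_genes <- function(de_a, de_b, padj_cutoff = 0.05) {
  sig_a <- rownames(de_a)[de_a$p_val_adj < padj_cutoff]
  sig_b <- rownames(de_b)[de_b$p_val_adj < padj_cutoff]
  intersect(sig_a, sig_b)
}

# Toy tables standing in for, e.g., KO-vs-WT and DBLKO-vs-WT results.
ko_vs_wt <- data.frame(p_val_adj = c(0.001, 0.20, 0.01),
                       row.names = c("GeneA", "GeneB", "GeneC"))
dblko_vs_wt <- data.frame(p_val_adj = c(0.03, 0.002, 0.04),
                          row.names = c("GeneA", "GeneB", "GeneC"))

shared_de_genes(ko_vs_wt, dblko_vs_wt)
```

In the toy data GeneB passes the cutoff in only one comparison, so it is dropped from the shared set.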

  • @singhh5050 · 2 years ago · +2

    Hi! At what stage of the analysis workflow can you utilize GSEA?

    • @Bioinformagician · 2 years ago · +2

      GSEA gives you an idea of which pathways are differentially enriched. It could come after you identify markers for each cluster, if you are trying to understand the biological mechanism of certain cells, or it could be used to help with cluster identification. If it is the latter, it would be used after you cluster your cells. So it really depends on what your goal is.

    • @singhh5050 · 2 years ago

      @@Bioinformagician Thank you for your response! So does this mean that you can use GSEA to find enriched gene sets between different clusters in the same dataset/condition? Like can you compare different cell type clusters in one graph using GSEA? I’m used to thinking about it as something that you can only utilize when you have a specific control dataset and another experimental dataset and you compare similar cell types between the two conditions. I’m really new to the field of scRNA-seq analysis so any thoughts would be super helpful :)

  • @wasima4463 · 1 year ago · +1

    @Bioinformagician did you already make a video on automatic cell annotation tools (23:15)?

    • @sayantidey6368 · 1 year ago

      @Bioinformagician ...Yes please if you have then that would be great for people who are struggling with the unbiased annotation using packages like SingleR. Thanks in advance.

    • @Bioinformagician · 1 year ago

      That's next on my list. Hopefully should be able to come up with a video soon. Please stay tuned :)

    • @c.p.8689 · 1 year ago

      @@Bioinformagician SingleR is a mess. I just used the scType. Not that great either.

  • @user-pn5gn8lw2o · 6 months ago

    How do you annotate species other than mouse, like ferret?

  • @menu1006 · 1 year ago

    Hello thx for making such informative videos plz create video on automated cell annotation using different packages in R.. this will be a great help thx

    • @Bioinformagician · 1 year ago

      That will hopefully be published on my channel soon. Please stay tuned :)

  • @ifeoluwaemmanuel5093 · 26 days ago

    What if the seurat ident has not been given?

  • @shubhrajitbarman3006 · 2 years ago · +1

    How do I find the number of cells present in each cluster? Please help me

  • @derejejima9420 · 1 year ago

    How do you integrate and find markers for more than two conditions?

    • @Bioinformagician · 1 year ago

      You could integrate the data and follow the pseudo-bulking approach, aggregating counts across all cells to the sample level.
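
One way the aggregation step might look is sketched below with Seurat's AggregateExpression(). The wrapper name `pseudobulk_counts` and the metadata columns `sample_id` and `condition` are hypothetical; which values get summed (raw counts or another slot) differs between Seurat versions, so check the documentation for the version you have installed.

```r
# Sketch only: "sample_id" and "condition" are hypothetical metadata columns.
pseudobulk_counts <- function(obj, group_cols = c("sample_id", "condition")) {
  # Aggregate expression over all cells sharing the same sample/condition labels.
  agg <- Seurat::AggregateExpression(
    obj,
    assays = "RNA",
    group.by = group_cols
  )
  # With return.seurat left at its default, the result is a list of
  # genes x groups matrices, one per assay.
  agg$RNA
}
```

The resulting matrix has one column per sample and condition combination, which is the shape bulk differential expression tools expect as input.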

  • @bioinfo3 · 1 year ago

    You call RenameIdents() on the Idents of the Seurat object and then, instead of renaming the remaining clusters, just set the identities to the existing annotations in the Seurat object. This makes it very unclear how someone would manually change the name of each cluster.
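
For readers with the same question, manual renaming with RenameIdents() takes one `old = new` pair per cluster. A minimal sketch, in which the wrapper name `rename_clusters` and the cluster-to-label mapping are hypothetical and should come from your own marker inspection:

```r
# Sketch only: the cluster-to-label mapping below is hypothetical.
rename_clusters <- function(obj) {
  Seurat::RenameIdents(
    obj,
    `0` = "CD14 Mono",
    `3` = "CD16 Mono"
  )
}
```

Clusters not listed in the call keep their current identity, so the mapping can be extended one cluster at a time as each one is annotated.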

  • @HahaHub-gd4nz · 8 months ago

    Please talk slower

  • @chrisdoan3210 · 1 year ago

    Thank you for your video! Would you please tell me why you chose only the top gene in b.interferon.response? Could we choose more genes, such as the top 5 genes in this list of 1273 genes?