Bioinformagician
  • 59
  • 1 324 608
Comprehensive Guide to Downstream Analysis for Single-Cell ATAC-Seq in R | scATAC-Seq Workflow
A detailed walk-through of the downstream analysis steps for single-cell ATAC-Seq data: annotating cells by integrating with a single-cell RNA-Seq dataset, performing differential accessibility analysis, and visualizing genomic regions of interest. In this demonstration, we continue in R with our pre-processed single-cell ATAC sequencing dataset from 10X Genomics, using the #Signac package. I hope you liked the video. I look forward to your comments in the comments section!
▸ Link to Data:
Pre-processed scRNA-Seq data: signac-objects.s3.amazonaws.com/pbmc_10k_v3.rds
▸ How to pre-process scATAC-Seq data (Part 1 of this video):
czcams.com/video/yEKZJVjc5DY/video.html
▸ Link to Vignette:
stuartlab.org/signac/articles/pbmc_vignette#non-linear-dimension-reduction-and-clustering
▸ Link to code:
github.com/kpatel427/CZcamsTutorials/blob/main/scATACSeq_downstream_workflow.R
▸ Other Useful Resources:
1. Strategies to annotate scATAC-Seq data: cdn.10xgenomics.com/image/upload/v1660261285/support-documents/CG000234_TechnicalNote_CellTypeAnnotationUsingATAC_RevB.pdf
2. How to pre-process scRNA-Seq? czcams.com/video/5HBzgsz8qyk/video.html
3. Find markers and cluster identification in single-cell RNA-Seq using Seurat: czcams.com/video/1i6T9hpvwg0/video.html
4. Single-cell playlist: czcams.com/play/PLJefJsd1yfhagnkss5B1YCsHaH0GWQfFT.html
Chapters:
0:00 Intro
0:44 scATAC-Seq Analysis Workflow
1:52 Strategies to annotate scATAC-Seq cells
4:30 Dataset and requirements for demonstration
6:06 Starting with pre-processed scATAC-Seq
6:30 What is a gene activity matrix?
8:29 Creating a gene activity matrix
10:52 Visualizing gene activity of canonical markers
13:38 Visualizing cell annotations in scRNA-Seq
17:02 Integrating scATAC-Seq with scRNA-Seq
17:58 Transfer labels from scRNA-Seq to scATAC-Seq
19:54 Visualizing scATAC-Seq after integration
22:00 Performing differential accessibility analysis
26:19 Extracting fold changes for differentially accessible regions
27:40 Visualizing genomic regions of interest
30:36 Create interactive shiny genomic browser to visualize regions of interest
Show your support and encouragement by buying me a coffee:
www.buymeacoffee.com/bioinformagic
To get in touch:
Website: bioinformagician.org/
Github: github.com/kpatel427
Email: khushbu_p@hotmail.com
Views: 1,449

Video

How to analyze single-cell ATAC-Seq data in R | Detailed Signac Workflow Tutorial
3.2K views · 2 months ago
A detailed walk-through of standard preprocessing steps to analyze a single-cell ATAC sequencing dataset from 10X Genomics in R using the #Signac package. I hope you liked the video. I look forward to your comments under the comments section! ▸ Link to Vignette: stuartlab.org/signac/articles/pbmc_vignette#non-linear-dimension-reduction-and-clustering ▸ Link to code: github.com/kpatel427/CZcamsT...
Bioinformatics Q&A - PART 2 | In collaboration with @chatomics
883 views · 4 months ago
Welcome to our exclusive 2 part series of Bioinformatics Collaboration Q&A session with the expert Ming "Tommy" Tang! In this video, we dive deep into the world of bioinformatics, tackling your burning questions and unraveling the mysteries of this fascinating field. Have a burning bioinformatics question for our next Q&A? Drop it in the comments below, and Tommy might answer it in the next ses...
Bioinformatics Q&A - PART 1 | In collaboration with @chatomics
1.7K views · 4 months ago
Welcome to our exclusive 2 part series of Bioinformatics Collaboration Q&A session with the expert Ming "Tommy" Tang! In this video, we dive deep into the world of bioinformatics, tackling your burning questions and unraveling the mysteries of this fascinating field. Have a burning bioinformatics question for our next Q&A? Drop it in the comments below, and Tommy might answer it in the next ses...
A Guide to Next Generation Sequencing Basics and Terminologies | Bioinformatics 101
6K views · 4 months ago
In this video, I delve into the intricacies of a standard workflow for next-generation sequencing (NGS). We'll explore essential terminologies and concepts, including sequencing library preparation, multiplexing, sample pooling, as well as the roles of adaptors and barcodes. Furthermore, I provide an overview of different generations of sequencing technologies, spanning from the traditional San...
From Motifs to Pathways and Master-regulators (RNA-Seq + ChIP-Seq) using Genome Enhancer - PART 1
961 views · 5 months ago
In this comprehensive video, I delve into the intricate realm of multi-omics data to identify master-regulators using geneXplain's automated pipeline, Genome Enhancer. With its user-friendly point-and-click interface, GenomeEnhancer stands out as the premier tool for uncovering transcription factor binding sites (TFBS), master regulators, potential drug targets, and the corresponding treatments...
From Motifs to Pathways and Master-regulators (RNA-Seq + ChIP-Seq) using Genome Enhancer - PART 2
655 views · 5 months ago
Welcome back to the continuation of this series! In this video, I pick up where I left off in part 1, exploring the detailed report generated by Genome Enhancer. Join me as I delve into the comprehensive analysis of multi-omics data, identifying master-regulators using geneXplain's automated pipeline. Genome Enhancer's user-friendly point-and-click interface remains the go-to tool for uncoverin...
Metagenomics Taxonomic Classification using Kraken 2 in BioBam's OmicsBox
3.6K views · 5 months ago
In this comprehensive video, I demonstrate how to analyze metagenomics WGS shotgun sequencing data and perform Taxonomic Classification using Kraken 2 in BioBam's OmicsBox tool. I broadly discuss Targeted Sequencing (16S/18S/ITS) and Shotgun WGS, then go over the metagenomics pipeline and workflow steps and the research questions that metagenomics analysis helps to answer. Furthermore, I'l...
Demystifying Conda (Anaconda, Miniconda and Bioconda) and Virtual Environments
6K views · 7 months ago
Today, we're diving into a topic that every developer, data scientist, or bioinformatician has experienced at some point: the dreaded installation errors caused by package version conflicts and dependency nightmares. In this video, I'll walk you through the basics of setting up Conda and Renv. I talk about what's with the different condas - Anaconda, Miniconda and Bioconda. I'll show you how...
Motif discovery and enrichment in genomic regions (ChIP-Seq) using TRANSFAC
2.7K views · 8 months ago
In this comprehensive video, we perform motif discovery and find enriched transcription factor binding sites (TFBS) using TRANSFAC in genomic regions of interest, i.e., ChIP-Seq peaks. geneXplain's point-and-click interface requires no programming to run the workflows and generate meaningful insights from high-throughput data. In this demonstration, we run the workflows using geneXplain's platform...
Introduction to Motif Discovery and Transcription Factor Binding Site Analysis
9K views · 9 months ago
In this comprehensive video, I cover basics of motifs, transcription factors, regulatory regions (promoters, enhancers, silencers, insulators), how motifs are represented and questions that Transcription Factor Binding Site (TFBS) analysis can help to answer. Furthermore, I list down various available tools to perform motif discovery and motif databases followed by demonstrating how to find enr...
Understanding File Formats in Bioinformatics: ChIP-Seq files - BigWig (Wiggle) and BED/bigBed
5K views · 11 months ago
In this comprehensive guide to ChIP-seq file formats, I delve into the details of ChIP-sequencing, the experimental and computational workflow, and the types of controls used. Further, I briefly discuss Irreproducible Discovery Rate (IDR) and some commonly used ENCODE terminologies. Lastly, I go over the details of .bigwig, .bigBed and BED (BED6+4 and BED6+3) files and discuss various ways one...
Understanding set.seed in R: Ensuring Reproducibility in Data Analysis
5K views · 11 months ago
Have you ever wondered what set.seed() does in R? What is it used for? Ever wondered why your UMAP or t-SNE appears different every time you run it? In this video I demonstrate 2 quick examples to explain how set.seed() can be used to make your code reproducible. I hope you find this video helpful! Leave your thoughts in the comment section below! ▸ Link to Data: www.10xgenomics.com/resour...
Automatic cell-annotation for single-cell RNA-Seq data: A detailed SingleR tutorial (PART 2)
7K views · 11 months ago
Continuing the discussion of cell type annotation from the previous video, in this video I walk through various strategies to perform cell type annotation using multiple reference datasets. Furthermore, I talk about the strengths and pitfalls of each strategy and demonstrate how to annotate cell types in #SingleR using multiple reference datasets from the celldex package. I hope you find this video help...
Automatic cell-annotation for single-cell RNA-Seq data: A detailed SingleR tutorial (PART 1)
15K views · 1 year ago
One of the most challenging tasks in processing single-cell RNA-Seq data is annotating cell types. In this video I walk through a typical cell annotation workflow and discuss various annotation strategies and their strengths and pitfalls. Further, I explain how #SingleR works and demonstrate how to annotate cell types in SingleR with a single reference dataset. Lastly, I discuss variou...
What is Strandedness in RNA-Seq data? | RNA-Seq Stranded Library Construction Methods
5K views · 1 year ago
Survival analysis with TCGA data in R | Create Kaplan-Meier Curves
16K views · 1 year ago
Elucidata's Bulk RNA-Seq OmixAtlas: The Effortless Dataset Discovery and Retrieval Platform
2.7K views · 1 year ago
Download data from GDC Portal using TCGAbiolinks R Package
16K views · 1 year ago
How to install packages in R? What is CRAN? What is Bioconductor? | Bioinformatics 101
12K views · 1 year ago
WGS Variant Calling: Variant Filtering and Annotation - Part 2 | Detailed NGS Analysis Workflow
12K views · 1 year ago
DESeq2 Error Fix: DESeqDataSetFromMatrix ncol(countData) == nrow(colData) is not TRUE
3.9K views · 1 year ago
Add to PATH in Linux - Fix "command not found error" | Linux Basics
10K views · 1 year ago
WGS Variant Calling: Variant calling with GATK - Part 1 | Detailed NGS Analysis Workflow
36K views · 1 year ago
Understanding File Formats in Bioinformatics: VCF and gVCF
11K views · 1 year ago
Types of High Throughput Data in Bioinformatics | Bioinformatics 101
6K views · 1 year ago
Weighted Gene Co-expression Network Analysis (WGCNA) Step-by-step Tutorial - Part 1
37K views · 1 year ago
Weighted Gene Co-expression Network Analysis (WGCNA) Step-by-step Tutorial - Part 2
19K views · 1 year ago
Weighted Gene Co-expression Network Analysis (WGCNA) Detailed Workflow Steps | Bioinformatics 101
16K views · 1 year ago
Introduction to Weighted Gene Co-expression Network Analysis (WGCNA) | Bioinformatics 101
22K views · 1 year ago

Comments

  • @melinaguillon2449 · 3 hours ago

    Hello :) I got an error message in my terminal when I try to unzip the file : gunzip: can't stat: GSE183947_fpkm.csv.gz (GSE183947_fpkm.csv.gz.gz): No such file or directory as well as "/Desktop/Demo/ zsh: permission denied"

  • @melinaguillon2449 · 4 hours ago

    Hi! I can't install GEOquery, I get this error message: Warning in install.packages : package ‘GEOquery’ is not available for this version of R

  • @faezedarbaniyan1787

    Thank you so much for elaborating this. I can't relate the definition of Allele Frequency that you mentioned here for rows 2 and 3 in your sample (at 23:44 minutes). Can you please explain it for those?

  • @fp2551 · 1 day ago

    Thank you for this video, it is really helpful. I see a huge inflation in counts after the pseudobulk step. How do I normalise back down to counts that can align with bulk-RNA data for matched samples? I am thinking of taking the median count value per cell type and dividing my dataset by this

  • @sjwu571 · 1 day ago

    Hi loved your videos. Just want to point out, at 6:28, I think your labeling for Gene 1 is wrong. Please double check I'm not crazy. Thanks!

  • @kyounokuma · 1 day ago

    What a fantastic overview of Conda. Thank you so much. You've answered all of my questions.

  • @miguelcuevas2976 · 1 day ago

    What is the benefit of monocle over other trajectory analysis tools such as slingshot?

  • @ryanwelch2831 · 3 days ago

    How long did the gatk BaseRecalibrator algorithm take to make the table? Thank you!

  • @hourirazavi873 · 3 days ago

    That was great. Thank you so much

  • @Ice84letters · 3 days ago

    thank you very much for this amazing video!! you help me a lot to understand how conda works!

  • @user-sl9wi7tl4f · 4 days ago

    Hello Khusbu, when I run "> sweep.res.list <- paramSweep_v3(merged_seurat.filtered, PCs = 1:20, sct = FALSE)" It shows : Error in paramSweep_v3(merged_seurat.filtered, PCs = 1:20, sct = FALSE) : no slot of name "counts" for this object of class "Assay5" could you please help me to fix this problem, thank you.

  • @aishaa812 · 5 days ago

    Thank you. It's extremely helpful for me since I am a beginner in RStudio and I am trying to apply data analysis in RStudio.

  • @pariaalipour61 · 6 days ago

    Thanks for the helpful video! I was wondering how come batch effect correction is useful when we use the expression matrix as input?

  • @sanjanaghosh7 · 7 days ago

    Can you make a video elaborating about the data analysis? It would be of great help. Thank you.

  • @Subhash_mahamkali · 8 days ago

    Is the base quality recalibration step very important? I ask because I am using this pipeline on a WGS sorghum dataset. I have now called the variants without this step (filtering is done). I used the same VCF file for the BQSR step and, for some odd reason, there are no SNPs at all in my .g.vcf file.....

  • @ezra47986 · 8 days ago

    Thank you for your video! I just have a question: why did you extract the unstranded counts, and not any other count type?

  • @SamipSapkota-zg8hy · 8 days ago

    what about tpm datasets?? i followed your video. you used same dataset as this video and i used the tpm data set then i saw there is no matching things happening help me out please sister

  • @shivanirai3626 · 8 days ago

    Best channel for any bioinformatician ❤❤

  • @vetlove4056 · 9 days ago

    Sister, can you make a video on raw data processing from NCBI?

  • @vetlove4056 · 9 days ago

    Video was good

  • @SamipSapkota-zg8hy · 9 days ago

    yo sister can you make a video on raw data processing after we download from ncbi??????

  • @RaviKumar-cb7tw · 12 days ago

    Can you please tell me how to install HISAT2 on my computer? I am using a MacBook Air with an M2 chip and macOS Sonoma 14.5.

  • @naveennaveenkumar7127

    Hello, I am trying to install packages in R. I have installed the Bioconductor package, but when I try to install the DESeq2 package it shows me this: Warning in install.packages : package ‘DESeq2’ is not available for this version of R A version of this package for your version of R might be available elsewhere, see the ideas. Can you tell me how to solve this problem?

  • @faezedarbaniyan1787 · 14 days ago

    Hi, thank you for all the great videos you make. I have been delighted to find your channel and have used it for several types of analysis. I have Illumina FASTQ samples for WES. I followed your pipeline and it was going well until HaplotypeCaller. In fact, I get zero variants for all 4 of my samples, even though they were all great in terms of duplicates/quality. Is there anything I am missing or need to check?

  • @vinayakkawale804 · 14 days ago

    Thank you so much. I have one doubt: I am getting this error: Error in gse[[1]] : this S4 class is not subsettable. What should I do for the data?

  • @davidmartins7104 · 15 days ago

    Is there any problem applying TMP normalization in metagenomic paired-end sequencing data?

  • @nancyanderson5413 · 15 days ago

    Is there a chance of a tutorial for CUT&RUN or chip-seq soon?

  • @nancyanderson5413 · 15 days ago

    Why not to use EnsDb.Hsapiens.v86?

  • @cibelebandeira2625 · 16 days ago

    Thank you for this amazing video, slides and explanation!!

  • @shilpam2028 · 16 days ago

    Can you please tell how to convert the peak txt file to bed file?? please help??

  • @shilpam2028 · 16 days ago

    Thank you

  • @juanete69 · 17 days ago

    How do you know it's not the opposite direction? How do you know that gene2 is not causing higher gene expression of gene1?

  • @Ice84letters · 17 days ago

    Hello! Thank you so much for the videos!! Is there a video on how to use liftOver from the command line to go from hg19 to hg38? Thanks a lot!!!

  • @juanete69 · 17 days ago

    Why is it important the direction of the read?

  • @grace-426 · 17 days ago

    Thank you, ma'am. I want to know whether it is essential to have phenotypic data for using this on my transcriptomics data?

  • @juanete69 · 17 days ago

    What happens with reads aligned multiple times?

  • @kedar_ghimire · 19 days ago

    this was really helpful for a non-bioinformatic researcher like me to understand whats going on. Especially the parts where you showed command by command to create two environments. Most of the youtube videos overestimate the programming skills of non-bioinformaticians and forget that it is them who watch these videos. Also, i like how you explain everything without missing even the minor stuff :D thanks

  • @juanete69 · 20 days ago

    Before getting the counts... do we need to align our reads?

  • @willianabrahamdasilveira5390

    Thank you very much, your tutorial saved me after hours trying to figure out how to do it only with the Package documentation.

  • @juanete69 · 20 days ago

    Do I need to do alignment before counting?

  • @muhammadanees2008 · 20 days ago

    (ERR): "SL3.0.dna_sm.toplevel.fa" does not exist Exiting now. Hi any solution for this error. I am getting this error in Hisat2 alignment

  • @naVn1111 · 20 days ago

    I could not find the link to the video for QC, could you please put that in description. Thanks.

  • @marinafernandez6778 · 21 days ago

    Hi, great video, thanks! I got this error; any idea how to solve it?
    seu.obj$mitopercent <- PercentageFeatureSet(seu.obj, pattern = '^MT-')
    Error in validObject(object = x) :
      invalid class “Seurat” object: 1: all cells in assays must be in the same order as the Seurat object
      invalid class “Seurat” object: 2: 'active.idents' must be named with cell names

    • @xinyuqu4407 · 7 days ago

      I have the same issue. I use the AddMetaData function from Seurat to solve the problem:
      rownames(metadata) <- metadata$cell_id
      seu.obj <- AddMetaData(seu.obj, metadata = metadata)
      # Calculate mitochondrial percentage
      seu.obj[["percent.mt"]] <- PercentageFeatureSet(seu.obj, pattern = "^MT-")

  • @dennisscheper1 · 21 days ago

    Excellent. Thank you!

  • @johncruise6989 · 21 days ago

    Relevant question: if you do not have the `.h5` file, and have data formatted as three files (a counts file (`.mtx`), a cell barcodes file, and a peaks file), how should I load the data? And what about the fragments file?

  • @ifeoluwaemmanuel5093 · 22 days ago

    What if the seurat ident has not been given?

  • @oumarsako8201 · 23 days ago

    Thank you a lot.

  • @gargiagravanshi355 · 23 days ago

    Hello ma’am ! I funckin need your help I’m stuck with a project and my mentor is very toxic please let me know how can I contact you.

    • @Bioinformagician · 23 days ago

      My contact details can be found in the video description :)

  • @andreslavore3928 · 23 days ago

    Excellent video!! How do you run GATK for polyploid data whose variant calling was done using FreeBayes? How do you extract several different haplotypes from this kind of data? Thanks

  • @nikitamaurya4518 · 24 days ago

    Thank you so much!