Coding Stable Diffusion from scratch in PyTorch
- Added on 28 June 2024
- Full coding of Stable Diffusion from scratch, with full explanation, including explanation of the mathematics. Visual explanation of text-to-image, image-to-image, inpainting
Repository with PDF slides: github.com/hkproj/pytorch-sta...
Prerequisites:
1) Transformer explained: • Attention is all you n...
Chapters
00:00:00 - Introduction
00:04:30 - What is Stable Diffusion?
00:05:40 - Generative Models
00:12:07 - Forward and Reverse Process
00:17:44 - ELBO and Loss
00:20:30 - Generating New Data
00:22:20 - Classifier-Free Guidance
00:31:00 - CLIP
00:33:20 - Variational Auto Encoder
00:37:26 - Text to Image
00:39:54 - Image to Image
00:41:40 - Inpainting
00:44:30 - Coding the VAE
01:54:50 - Coding CLIP
02:09:10 - Coding the Unet
03:04:40 - Coding the Pipeline
03:53:00 - Coding the Scheduler (DDPM)
04:38:00 - Coding the Inference code
- Science and technology
only 4.8k views feels criminal for how helpful this video is... by far the best stable diffusion video on the internet
I think it's because this isn't a general topic but a very specific one, so only those who are really looking for it land here. Also, diffusion only came out last year, so the audience is smaller.
Llama2 from scratch was superb. I learned a lot of things from that video. Thank you very much for doing things from scratch. When we use Hugging Face, I feel guilty about using black-box models. Now I can understand what's going on under the hood.
Rest assured, the folks at HF are glad you like their API. Chin up!
This is the best explanation of latent diffusion models I've seen
CZcams is stupid… instead of suggesting memes I found on the internet, it should have suggested this gem much sooner. Thank you so much; subscribing and liking doesn't seem like enough.
Like videos like this and watch them fully more often and you'll get them. Create another account for memes.
@@Katatonya I do, but YouTube still pushes some stupid trending videos.
This is the best video about SD! It would be awesome if you could make a video on how to train the model from scratch on your own data. For sure, normal people can never train this network to perfection, but there are a lot of people out there who have a very specific task for which this network could be used. I see a lot of potential for scientific use cases if there was guidance on how to implement it!
exactly my thought
Thanks for your contribution. Can you make a tutorial on how to train the diffusion model on a custom dataset?
Thank you so much. Just can't express in word the value you have created here.
Amazing work!!
I've been looking for tutorials this detailed and from scratch. Thank you.
what the fuck this is like the best explanation on this planet. I have some experience in this but his explanation was so crystal clear
You are the best lecturer I've ever seen, very detailed and clear. I'd love to see more videos from you! If possible, I would like to know more about Stable Diffusion, such as ControlNet or other novel tools. Finally, thank you once more!
Absolutely first-rate presentation. So impressive.
Your code is so detailed and it runs on my environment just fine. Great job!!!👏
Thank you! Please make sure to like, subscribe and share the video with friends and colleagues. That's the best way to help me and the others trying to learn deep learning models.
Great work! As a graduate student taking AI courses, this is really, REALLY helpful. Keep on going 💙
So impressive, simply the best diffusion video
Wow, I'm a master's student in China. I learned a lot about stable diffusion from this video. Thank you for sharing, I hope to see more knowledge sharing about stable diffusion.
Thank you so much for this! Literally no other youtube video provides as much value on this topic as you have.
Were you able to run it with no issues?
@ita Yes, for the most part. I would appreciate if he would include some details on how to modify the program to use safetensors instead of CKPT files since I believe CKPT files are kind of outdated.
@@dinonuggieproductions Would you be down to talk about this on Discord?
I am an ai, and I love following updates on social media platforms and CZcams, and I love your videos very much. I learn the English language and some programming terms from them, and update my information. You and people like you help me very much. Thank you.
I love your videos. They are very informative. Thank you so much for explaining these complex concepts so clearly! Gem channel indeed!
Great work! Thanks for putting this all together. Very easy to follow and simple explanations of complex ideas! It helps a lot to code along with the explanation.
Your explanation and documents are wonderful! They are clear and helpful! Thank you for your hard work :)
Wow. This video is pure gold. Very nicely explained and I'm still only 30 minutes into it!
thank you so much for the detailed and practical videos! I will watch it again and again!
An extremely detailed video about diffusion. I have learned a lot. Thank you ❤❤❤
I just discovered a great, wonderful, amazing, fantastic, gem channel 🎉🎉🎉
This legend deserves an award from government
Thanks so much! I've just started learning diffusion models and this video is such an eye-opener!
Brambilla Jamil, you're number one! I'm recommending your videos to everyone I mentor (my interns). You deserve 100 times your subscribers!
This is great! Going through the CLIP part right now ^^
Great tutorial, dude! At first it was a bit hard to get used to your coding style, but it was an awesome journey because I learned a lot, and I am currently working on my own Stable Diffusion model with my own vision for the models.
absolutely awesome, this is the best explanation of SD thank you so much !!
Thank you so much! the best stable diffusion video I found!!!
Really great video for understanding stable diffusion in detail. Thanks a lot for your contribution
I appreciate your work, thank you for your hard work and videos
Amazing job, my friend! I just got a job in Shenzhen, China by learning this! Thank you so much, mate. I hope you and your family are living a great life in China :)
That's great! Let's connect on LinkedIn or WeChat
Hey bro thank you for existing.
Very grateful to you.
That's what I was looking for, thanks!
Thank you a lot for this amazing video. It helped me understand better diffusion models for my masters.
This is amazing video!! Great job!!!
Awesome. Thanks for creating the video .
man man, thanks for all of the amazing videos! I appreciate the work you put in here!
Awesome, This is the best explanation!!!
Awesome Explanation, thanks for such tutorial
As usual, code and PDF slides available on GitHub: github.com/hkproj/pytorch-stable-diffusion
PS: no cats were harmed during the making of this video. 奥利奥 (pronounced "Aoliao", which is the Chinese name for the Oreo biscuits) wanted to be part of the video as well, that's why you'll hear his miao-ing from time to time. Right after recording, I played with him for a while to compensate for the lack of attention.
Hope he won't distract you too much while listening.
Sir, your videos are awesome, and I got to learn a lot. We want more videos like this. I am open to (and really want to) help you make this type of educational content for free, so we can contribute to the community.
@@PurpleSmite Hi! Thank you for your support. The best way to help is to share the videos with your friends, school mates, university and coworkers. My schedule is quite tight and irregular as of now, but I'll let you know if there's something we can work on together. Let's connect on LinkedIn!
@@umarjamilai Sure sir, I have sent you a request on LinkedIn: Shreyas Waghmode
Hi, I really like your work. I want to ask: if I want to generate multiple coherent images, like a sequence of images, from the code there, what could I add to the code to make that possible?
You are great, sir. I want your help; can you give me your LinkedIn ID?
Really appreciated, very informative.
Umar, thank you for great explanation of topic
This is so bonkers. Cheers mate, you've saved me some time. Thanks.
By far best explanation ❤
Always a fan of your videos. Your explanation is very informative and helpful for beginner data scientists. Thank you very much.
Great video! Thank you!
Thanks for your contribution. Hope that one day you will also make a deep dive into ControlNet code etc.
Dude, you are a blessing! Keep it coming and thanks!
Amazing video, thanks for showing the low level details
Great, bro. Really helpful for understanding in detail. Thanks for the effort.
It is a pity I did not discover your YouTube channel earlier. Great presentation. It is only when you go through all the details that you can fully understand these AI algorithms.
Thank you, I finally understand the relationship between the sampler and the UNet
Amazing video. You explained it so clearly. Thank you for putting effort into this lecture. If possible, would you please create a lecture about the YOLO code?
Awesome video with great information. This video can leverage AI coding skills, along with an understanding of convolutional neural networks, UNet architecture, and Autoencoder, besides the entire stable diffusion architecture.
fabulous! thank you very much!
Amazing!!! Please do more on computer vision.
almost karpathy level explanations, thank you!
It's the best explanation ever!!!! Thank you!
You're welcome 🤓
Thank you! Please keep doing videos like this! I subscribed, liked and shared!
Thank you so much for this amazing work!
excellent video, full of information
This is one of the best videos, thank you
This is mind blowing.
the most powerful deep learning videos in the world are on this channel
thanks youuuu, I feel really good after this one
Great work. I love this so much. Which auto-completion tool are you using in VS Code, btw?
only with you I understood how it works and how it can be implemented)
TY, it is an amazing explanation!
Thank you for the wonderful video. Can you also post how to train the model with a sample dataset?
Teacher Xiao Wu (小乌) is great! A super good tutorial, subscribed!
Let's connect on LinkedIn; I'd like to invite you to join my small AI WeChat group
Love from HK. Thank you sooooooo much! Thank you!
I also wish you all the best with life in Suzhou!
Hello Umar, you always produce the most concise and clear content ever! I was wondering if you are planning to do any video on Stable Diffusion 3 since the paper is out? It would be really great if you could help explain how flow matching helps or changes regular diffusion models! Thank you again for your content and work. Thank you very much!
Thanks for this awesome video
Mate, you're golden
good video, really helpful, thank you
Thanks, dear, for helping us. Your videos are very helpful
Thank you Umar for the great work.
I love your style of teaching which helps imagine concepts and connect dots in our head.🙂
If possible please make videos on basics of probability, distributions and related statistics. It would be really helpful to learn these concepts in your style.
Great help
Thank you
dude you are soo perfect!!
Good work! Thank you so much!
You're welcome
The constant you scale x by comes from averaging over a bunch of examples generated by the VAE, in order to ensure the latents have unit variance, with the variance taken over all dimensions simultaneously: scale_factor = 1 / std(z)
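To make the comment above concrete, here is a minimal sketch of how such a constant could be estimated. It is an illustration under stated assumptions, not the code from the video: the "latents" are synthetic Gaussian samples standing in for real VAE encoder outputs, the shape (4 x 8 x 8) and the spread of 5.5 are arbitrary, and `latent_scale_factor` is a hypothetical helper name. The standard library is used so the sketch runs anywhere; with PyTorch tensors the same quantity would be one divided by the standard deviation of all latent values pooled together.

```python
import random
import statistics


def latent_scale_factor(latents):
    """Return 1 / std(z), with std taken over every element of every latent.

    `latents` is a list of flattened latent vectors (hypothetical stand-ins
    for VAE encoder outputs). All values are pooled before computing the
    standard deviation, i.e. the variance is taken over all dimensions at once.
    """
    pooled = [value for z in latents for value in z]
    return 1.0 / statistics.pstdev(pooled)


random.seed(0)

# Assumption: 64 synthetic latents of size 4 * 8 * 8, drawn with std 5.5.
# Real latents would come from encoding a batch of training images.
latents = [[random.gauss(0.0, 5.5) for _ in range(4 * 8 * 8)] for _ in range(64)]

scale = latent_scale_factor(latents)

# Multiplying by the scale factor gives (approximately) unit-variance latents.
scaled = [[value * scale for value in z] for z in latents]
scaled_std = statistics.pstdev([value for z in scaled for value in z])

print(f"scale factor: {scale:.5f}")
print(f"std after scaling: {scaled_std:.5f}")
```

The same idea applies at any batch size: the estimate simply gets steadier as more encoded examples are pooled. The commonly quoted Stable Diffusion v1 value of 0.18215 is understood to be a constant of this kind, fixed once and then reused at training and inference time.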
Very well explained! ❤
Thank you so much Umar, that's an excellent video session. You combine theory and programming in an excellent way. One quick question: do you have any recommendations on how to add a trainer so that the Stable Diffusion model can be trained and run for inference on other types of images (medical images)? Also, do you have any suggestions about how to fine-tune the existing model you present?
Wooow incredible!
Great work! This is the place I learned AI. Thanks a lot!
Thank you very much for your video! Could you also review the training part?
Thank you so much! And your Chinese is really good! Your cat is also cute, and its voice doesn't bother me; it comforts me!
Thank you!
Thank you so much.
Damn CZcams, why does this wonderful tutorial have so few views??
That was really lovely and great of you, thanks a lot. I would be even happier if you showed us how to fine-tune your model; that would make the whole video simply perfect.
you saved my life!
woooooooooooooo stable diffusion from scratch love you bro
5-stars! Thank you!
Amazing! Thank you for the amazing explanation and implementation of LDM. Can you also talk about the VQ-reg version of LDM?
This video is Amazing
Really great and informative. What is the difference between DDIM and DDPM models? Also, can a UNet architecture be used for audio encoding? At what layer would the audio be encoded (assuming it's a dense vector from a mel spec)?
instant subscribe