ELLA - A Powerful Adapter for Complex Stable Diffusion Prompts

  • Added on 10 April 2024
  • Diffusion models have made incredible strides in text-to-image generation, but they still struggle with dense, complex prompts that involve multiple objects, detailed attributes, and intricate relationships.
    Enter ELLA - the Efficient Large Language Model Adapter that's poised to revolutionize how diffusion models handle sophisticated prompts. This ingenious adapter equips text-to-image models with the power of large language models, without the need to train either the U-Net or the language model itself.
    In this video, I dive into using ELLA in ComfyUI, and explore how it tackles the limitations of current text encoders like CLIP. Prepare to be amazed as I showcase the superior performance of ELLA compared to CLIP conditioning, unlocking a new level of sophistication in text-to-image generation. If you've ever struggled to craft the perfect prompt, this video is a must-watch!
    (* oops! Was playing with scheduling in the video and so forgot to switch back to sgm uniform, but it really doesn’t make much difference. It also turns out they are no longer going to release the SDXL weights.)
    However! For SDXL, check this video out!
    Pixart Sigma - Like ELLA but for SDXL+ Resolutions in ComfyUI!
    • Pixart Sigma - Like EL...
    Want to support the channel?
    / nerdyrodent
    Links:
    ella-diffusion.github.io/
    github.com/TencentQQGYLab/ELLA
    github.com/ExponentialML/Comf...
    huggingface.co/QQGYLab/ELLA/b...
    huggingface.co/google/flan-t5...
    huggingface.co/Kijai/flan-t5-...
    See also: github.com/TencentQQGYLab/Com...
    == More Stable Diffusion Related Stuff! ==
    * Installing Anaconda for MS Windows Beginners - • Anaconda - Python Inst...
    * Installing ComfyUI - • How to Install ComfyUI...
    * ComfyUI Workflow Creation Essentials For Beginners - • ComfyUI Workflow Creat...
    * Faster Stable Diffusions with the LCM LoRA - • LCM LoRA = Speedy Stab...
    * Make an Animated, Talking Avatar - • Create your own animat...
    * One Image Gets You a Consistent Character in ANY pose - • Reposer = Consistent S...
  • Science & Technology

Comments • 69

  • @Cingku
    @Cingku 21 days ago +14

    Finally...I don't need SD3 anymore if this is the case. Who needs to download more SD3 models when this is doing prompt adherence that well...because my disk space is suffering these days.. :D

    • @Cingku
      @Cingku 20 days ago

      I take that back. It seems that I need SD3 after all. I cannot get the style I want because of the limitations in prompting (like Nerdy said in the video), and the prompt adherence is only good with the raw prompt, without any styling prompt. So it is basically useless.

    • @nowshinnur
      @nowshinnur 19 days ago

      @Cingku plus they aren't releasing the SDXL version

    • @hphector6
      @hphector6 19 days ago

      SD3 has a wayyy better VAE too. That's the main thing I'm looking forward to

    • @user-cz3io5tg5l
      @user-cz3io5tg5l 18 days ago

      I see small improvements with ELLA, but it is not even close to the SDXL examples on their GitHub page, which they won't release :(
      edit: oh wait, it seems I used an outdated extension. Will try again.
      edit: still a very tiny improvement

  • @kofteburger
    @kofteburger 21 days ago +7

    I've been looking forward to this.

  • @MrSporf
    @MrSporf 21 days ago +6

    That is quite the improvement!

  • @Pauluz_The_Web_Gnome
    @Pauluz_The_Web_Gnome 21 days ago +3

    Thanks man for the flow, I am one of your patrons now! 😀

  • @LIMBICNATIONARTIST
    @LIMBICNATIONARTIST 21 days ago

    Absolutely amazing!

  • @Pending22
    @Pending22 21 days ago

    Epic!! Brilliant tutorial as always, thanks Nerdy :)

  • @Remianr
    @Remianr 21 days ago +4

    6:38 Love your sense of humor, Nerdy Rodent :D (pretty sure I would have made a similar joke too lol).
    Also, it's amazing that such a simple additional tool on top of the already known one can make such a huge difference, and it shows that these models are actually more powerful than we thought they were. Really impressive new nerdy tech news from the last 3-6 months, to me :)

    • @NerdyRodent
      @NerdyRodent 21 days ago +1

      I can’t wait to see what we get two papers down the line!

  • @godpunisher
    @godpunisher 20 days ago +2

    Nerdy's always on time 👍

  • @marschantescorcio1778
    @marschantescorcio1778 21 days ago

    Thank goodness I can write my mini-novel prompts! This is quite a game changer.

  • @GyattGPT
    @GyattGPT 21 days ago +1

    This really seems to be getting closer to the ability of the private imagegen models to follow the prompt better.

  • @USBEN.
    @USBEN. 21 days ago +7

    FINALLY! Stable Diffusion is at the same level of natural-language prompt adherence as DALL-E!!

  • @kenmillionx
    @kenmillionx 21 days ago

    Cool video. Much love ❤❤❤❤. Cool video. Am waiting for next video 😊😊😊😊

  • @southcoastinventors6583
    @southcoastinventors6583 21 days ago

    How well does it output text, and does it work with artist names, like "art by Picasso"? Also, can it run with 8GB of VRAM like normal SDXL models can? Thanks for the great video, it's like getting a sneak peek at SD3 capability.

  • @testales
    @testales 20 days ago

    Very impressive, I need to try this out as soon as possible! :) What's the sigma node doing and what kind of impact does it have? Should it use the same sampler and scheduler as the KSampler?

  • @suffolkcountysheriff
    @suffolkcountysheriff 21 days ago +1

    Would love to see this with a control net depth map

  • @DrMattPhillips
    @DrMattPhillips 21 days ago +3

    I installed all the T5 files linked but keep getting a "T5Tokenizer requires the SentencePiece library but it was not found in your environment." error. I re-downloaded the spiece.model file which I assume is the file in question but still get the error. It could be something I'm doing since I'm new to comfy (still learning how to navigate it).
    Edit: managed to fix it somehow, installed and reinstalled ella, also pip installed sentencepiece and installed the other ELLA add-on in comfy, so if anyone has a similar issue one of those might help (can't be more specific as I genuinely have no idea why it happened or how it was fixed)

    • @stevenkosin6965
      @stevenkosin6965 19 days ago +1

      I wonder if the SentencePiece library has to be installed prior to installing the custom node? I am having the same issue but didn't try removing ELLA's node and reinstalling it, so I will give that a go now.
      What worked for me was uninstalling it and then using the experimental pip installer to install it from the ComfyUI Manager. No clue why that worked.
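One possible explanation for the fixes reported above, offered here as an assumption rather than something confirmed in the thread, is that `sentencepiece` was installed into a different Python environment than the one ComfyUI actually runs. A minimal sketch for checking this from the same interpreter that launches ComfyUI follows. The helper names `has_module` and `install_hint` are hypothetical and are not part of ComfyUI or the ELLA nodes.

```python
import importlib.util
import sys


def has_module(name: str) -> bool:
    """Return True if `name` can be imported by the interpreter running this script."""
    return importlib.util.find_spec(name) is not None


def install_hint(name: str) -> str:
    """Build a pip command that targets this exact interpreter, not some other Python on PATH."""
    return f'"{sys.executable}" -m pip install {name}'


if __name__ == "__main__":
    if has_module("sentencepiece"):
        print("sentencepiece is importable from", sys.executable)
    else:
        print("sentencepiece is missing; try:", install_hint("sentencepiece"))
```

Running this with the Python that starts ComfyUI (rather than a system-wide one) shows whether the tokenizer dependency is visible to it, and the printed command installs into that same environment.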

  • @SLAMINGKICKS
    @SLAMINGKICKS 17 days ago

    Perfect, just became your patron too

  • @TUSHARGOPALKA-nj7jx
    @TUSHARGOPALKA-nj7jx 19 days ago

    How does it do at low step counts if we want to combine it with the LCM LoRA?

  • @97BuckeyeGuy
    @97BuckeyeGuy 21 days ago +3

    Unfortunately, this group has stated definitively that they are NOT releasing their SDXL weights. I'm waiting to see what the RPG Dungeon Master group comes up with.

    • @NerdyRodent
      @NerdyRodent 21 days ago +3

      We will also have to see what license they have on that 🫤

  • @michalgonda7301
    @michalgonda7301 16 days ago

    Hey, thanks for your videos ;) you are awesome! ...
    Did you try it with AnimateDiff? :) Do you think it would work? :) Maybe even some ControlNets would help it with understanding so it can color them correctly :)

  • @harshitpruthi4022
    @harshitpruthi4022 6 days ago

    The T5 model is showing as unidentified even after putting it in the right folder. Any help? I also placed it in the ella-embedd folder.

  • @dkracingfan7206
    @dkracingfan7206 21 days ago

    Is there a huggingface space where I can try this out?

  • @DemShion
    @DemShion 21 days ago +1

    A shame it was Tencent who came up with this. They stated that the SDXL version won't be made publicly available, and they also didn't release the training process for 1.5. There is a community effort to reverse engineer the training process; hopefully they'll pull it off.

  • @jibcot8541
    @jibcot8541 21 days ago +3

    Shame they probably aren't ever going to release the SDXL weights (only the SD 1.5 version)

    • @97BuckeyeGuy
      @97BuckeyeGuy 21 days ago

      They have stated definitively that they are NOT releasing the SDXL weights.

    • @NerdyRodent
      @NerdyRodent 21 days ago

      Ah yes, I see they said after I’d made the video 😞 Still, maybe someone will do it in the future!

    • @prettyawesomeperson2188
      @prettyawesomeperson2188 21 days ago +2

      Any reason for not releasing the weights for SDXL?

    • @97BuckeyeGuy
      @97BuckeyeGuy 21 days ago +1

      @prettyawesomeperson2188 Business reasons... i.e. they want to make money with the good stuff.

    • @lalayblog
      @lalayblog 21 days ago +1

      I can see why it might not be so profitable to release an SDXL version of this technique.
      1. On its own it produces an anime or semi-realistic style (flan-t5-encoder-only gives anime). You need to combine its output conditioning with CLIP conditioning.
      2. SDXL prompt adherence is comparable with ELLA on SD 1.5. I can't expect ELLA adherence to be significantly better for SDXL (I haven't seen claims on that yet).

  • @jacket8818
    @jacket8818 18 days ago

    Ok cool
    But, how good is it with cartoonish or anime style?

  • @shirwan
    @shirwan 21 days ago +1

    I'm getting this error "RuntimeError: "LayerNormKernelImpl" not implemented for 'Half'" on a GTX 1660S, could the limited VRAM be the cause?

    • @NerdyRodent
      @NerdyRodent 21 days ago +1

      My guess would be due to the older GPU 😕

    • @shirwan
      @shirwan 21 days ago

      @NerdyRodent I figured that would be it. Thank you anyway, I'm sure the requirements will be halved in the future

    • @rebeldeconcausa9227
      @rebeldeconcausa9227 3 days ago +1

      @shirwan The GTX 1660S is obsolete for AI because of its scarce 6GB. I recommend the RTX 3060 12GB, which is very economical now thanks to the launch of the RTX 4060
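For context on the error quoted in this thread: PyTorch reports messages of the form `"..." not implemented for 'Half'` when an operation has no float16 kernel on the device where it is running, and this particular one has commonly been reported when half-precision tensors end up on the CPU rather than the GPU. Whether that is what happened here is an assumption, since the thread does not confirm it. The sketch below shows the usual workaround of falling back to full precision on the CPU. `pick_dtype` is a hypothetical helper, not part of ComfyUI, PyTorch, or the ELLA nodes, and it returns dtype names as strings so it runs without PyTorch installed.

```python
def pick_dtype(device_type: str, prefer_half: bool = True) -> str:
    """Return the name of a dtype that is safe to use on the given device type.

    Assumption for illustration: half precision is only requested on
    non-CPU devices; on the CPU we always fall back to float32.
    """
    if device_type == "cpu":
        return "float32"
    return "float16" if prefer_half else "float32"


if __name__ == "__main__":
    for device in ("cpu", "cuda"):
        print(device, "->", pick_dtype(device))
```

With PyTorch available, the returned name could be resolved with `getattr(torch, pick_dtype(device.type))` before loading the text encoder.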

  • @jeffkilgore2693
    @jeffkilgore2693 21 days ago +1

    now I want chips

  • @MarcSpctr
    @MarcSpctr 21 days ago +2

    NO SDXL release and NO TRAINING CODE either.
    I just don't even bother working with such things, as it is just a dead end.
    edit: feels like they just released it to create hype for something else that they wanna monetize

    • @lalayblog
      @lalayblog 21 days ago

      Flan-t5-encoder-only-bf16 is useless alone because it produces anime style only.
      So I agree that without fine-tuning for realism it is useless.

  • @prolamer7
    @prolamer7 21 days ago

    I still don't understand HOW ELLA does what it does.

  • @jonmichaelgalindo
    @jonmichaelgalindo 21 days ago +2

    Just tried it. The prompt adherence is good (not as good as SD3), but the quality of SD1.5 is terrible. The SDXL version would probably be a lot better.

    • @itycagameplays
      @itycagameplays 21 days ago +2

      How about using it as a first pass and then an SDXL model as a second pass? The prompt adherence would help a lot.

    • @jonmichaelgalindo
      @jonmichaelgalindo 20 days ago

      @itycagameplays Might be worthwhile.

    • @rebeldeconcausa9227
      @rebeldeconcausa9227 3 days ago +1

      @itycagameplays I've been doing AI images for a long time, and the idea of using the SD 1.5 model to create hundreds of quick images and then improving the one I like in SDXL never occurred to me. Thanks man 🤣

  • @kariannecrysler640
    @kariannecrysler640 21 days ago +2

    🐀 ❤🤘

  • @animetechs2191
    @animetechs2191 19 days ago +1

    Will we be seeing this anywhere in Automatic1111 or SD Forge? I don't want to switch to ComfyUI as I am already comfortable with Forge

  • @thevoid6756
    @thevoid6756 21 days ago

    combine this with FreeU and Self-Attention Guidance

  • @virtualalias
    @virtualalias 21 days ago

    Doesn't OpenAI already do this with DALL-E 3?

  • @xXxPRxXx
    @xXxPRxXx 21 days ago +1

    CHISP!

  • @pragmaticcrystal
    @pragmaticcrystal 21 days ago +1

    🫶

  • @Mika43344
    @Mika43344 21 days ago

    YOU USED 2 DIFFERENT SCHEDULERS, now you have to redo everything dude))))

  • @therookiesplaybook
    @therookiesplaybook 14 days ago +1

    Could you please give clearer instructions on the paths? ComfyUI is complicated enough as it is without leaving out how you set all this spaghetti up.

  • @blahblahdrugs
    @blahblahdrugs 21 days ago

    So this only works with sd 1.5 models?

    • @lalayblog
      @lalayblog 21 days ago +1

      So, this thing is mostly beneficial for SD 1.5. The only advantage for SDXL would be support for different languages, but at the price of huge space consumption on the SSD.

    • @blahblahdrugs
      @blahblahdrugs 20 days ago

      @lalayblog It's working for me with SD 1.5, but I prefer SDXL and it just crashes with SDXL.