AnimateDiff perfect scenes. Any background with conditional masking. ComfyUI Animation

  • Uploaded 27 Jun 2024
  • Consistent animations with perfect blending of foreground and background in ComfyUI and AnimateDiff. Do you want to know how?
    #animatediff #comfyui #stablediffusion
    ============================================================
    💪 Support this channel with a Super Thanks or a ko-fi! ko-fi.com/koalanation
    🚨 Use Runpod and I will get credits! tinyurl.com/58x2bpp5 🤗
    ☕ Amazing ComfyUI workflows here: tinyurl.com/y9v2776r
    🔥 Run ComfyUI on the cloud (no install) 🔥
    👉 RunDiffusion: tinyurl.com/ypp84xjp 👉15% off first month with code 'koala15'
    👉 ThinkDiffusion: tinyurl.com/4nh2yyen
    🔥 Do you want to use your own local ComfyUI? 🔥
    👾 Upgrade your setup with an RTX 4090 24 GB: tinyurl.com/2s3ajcv3
    🖥️ Desktop with RTX 4090 64GB: amzn.to/3V5RXEA
    💻 Laptop with RTX 4090 24GB: amzn.to/4am5OLj
    🤑🤑🤑 FREE! Check my runnable workflows in OpenArt.ai: tinyurl.com/2twcmvya
    ============================================================
    This method blends a foreground character with any background you want while avoiding the hard edges that appear when frames are masked after rendering. Masking is done BEFORE rendering the images, and conditional masking is used to control the rendering of foreground and background with ControlNet (a conceptual sketch of this idea follows further below).
    Please, support this channel with a ko-fi!
    ko-fi.com/koalanation
    The workflow and frames to test this workflow are found in this Civit.AI article: tinyurl.com/w2xr3hj4
    The tutorial where I show the basic workflow using the InstantLora method + AnimateDiff:
    • AnimateDiff + Instant ...
    Basic requirements:
    ComfyUI: tinyurl.com/24srsvb3
    ComfyUI Manager: tinyurl.com/ycvm4e29
    Vast.ai: tinyurl.com/5n972ran
    Runpod: tinyurl.com/mvbh46hk
    Custom nodes:
    AnimateDiff Evolved: tinyurl.com/yrwz576p
    Advanced ControlNet custom node: tinyurl.com/yc3szuuf
    VideoHelper Suite: tinyurl.com/47hka2nn
    ControlNet Auxiliary preprocessors: tinyurl.com/3j3p6bjw
    IP Adapter: tinyurl.com/3x3f2rfw
    ComfyUI Impact pack: tinyurl.com/4jsmf8va
    ComfyUI Inspire pack: tinyurl.com/2wkzezxm
    KJ Nodes: github.com/kijai/ComfyUI-KJNodes
    WAS node Suite: tinyurl.com/2ajuh2mx
    Models:
    DreamShaper v8 (SD1.5): tinyurl.com/3rka67pa
    ControlNet v1.1: tinyurl.com/je85785u
    vae-ft-mse-840000-ema-pruned VAE: tinyurl.com/c9t6wntc
    ClipVision model for IP-Adapter: tinyurl.com/2wrtvnx4
    IP Adapter plus SD1.5: tinyurl.com/2p8ykxf6
    Motion Lora mm-stabilized-mid: tinyurl.com/mr42m5hp
    Upscale RealESRGANx2: tinyurl.com/2frvcyca
    Tracklist:
    00:00 Intro
    00:13 Method: approach and explanation
    00:32 Basic requirements
    00:41 Downloading and copying assets from Civit AI
    01:16 Base AnimateDiff and Instant Lora: installing missing custom nodes and updating ComfyUI with Manager
    01:59 Models used in the workflow
    02:17 Testing the basic workflow
    02:40 Creating a Lora reference image with a new, different background (workflow test)
    05:13 ControlNet for the background
    06:10 Creating masks for the foreground (hero) and background
    07:38 Conditional masking (blending masks and controlnets of foreground + background)
    09:34 Running the workflow for all frames to create your full animation - including face detailing and frame interpolation
    10:18 Outro
    My other tutorials:
    AnimateDiff and Instant Lora: • AnimateDiff + Instant ...
    ComfyUI animation tutorial: • Stable Diffusion Comfy...
    Vast.ai: • ComfyUI - Vast.ai: tut...
    TrackAnything: • ComfyUI animation with...
    Videos: Pexels
    Music: YouTube Music Library
    Edited with Canva, Runway.ml and ClipChamp
    Subscribe to Koala Nation Channel: cutt.ly/OZF0UhT
    © 2023 Koala Nation
    #comfyui #animatediff #stablediffusion
  • Short & animated films
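
The following is a conceptual toy sketch of the conditional-masking idea described in the description above: one conditioning signal drives the foreground character and another drives the background, and a soft (feathered) mask blends the two guidance signals while the image is being generated, instead of pasting a cut-out character over a background afterwards. The NumPy function and values below are illustrative assumptions only, not the actual ComfyUI ConditioningSetMask/ControlNet implementation.

```python
# Conceptual sketch only -- NOT the ComfyUI/AnimateDiff implementation.
# It illustrates "conditional masking": each region of the image is driven
# by its own conditioning, and a soft mask blends the two guidance signals
# DURING generation rather than compositing a cut-out afterwards.
import numpy as np

def blend_guidance(fg_guidance: np.ndarray,
                   bg_guidance: np.ndarray,
                   fg_mask: np.ndarray) -> np.ndarray:
    """Per-pixel blend of two guidance maps using a soft foreground mask.

    fg_guidance, bg_guidance: H x W guidance values for the two prompts.
    fg_mask: H x W soft mask in [0, 1]; 1 = foreground (hero), 0 = background.
    """
    return fg_mask * fg_guidance + (1.0 - fg_mask) * bg_guidance

# Toy example: a feathered mask gives a smooth transition zone, which is why
# the seams disappear compared to hard post-render compositing.
h, w = 4, 4
fg = np.full((h, w), 1.0)    # stands in for "render the character here"
bg = np.full((h, w), -1.0)   # stands in for "render the landscape here"
soft_mask = np.linspace(0, 1, w).reshape(1, w).repeat(h, axis=0)
print(blend_guidance(fg, bg, soft_mask))
```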

Comments • 42

  • @Foolsjoker
    @Foolsjoker 7 months ago +1

    I had been trying to do this workflow for almost a month, but I could never get the foreground and background to merge correctly. Obviously, mine had some major missing components compared to this. So glad you posted. Thank you!

    • @koalanation
      @koalanation 7 months ago

      Glad I could help! I tried several tricks: masking, inpainting, trying to add correction layers in the video editor... so it also took me a while to find a way to do it the way I wanted.

  • @dkamhaji
    @dkamhaji 7 months ago

    Hey yo! Super great video and many interesting techniques going on here. I will definitely be integrating this into my workflow. I do have a question, though: I get that you are moving the character with the OpenPose animation, but how are the background (and camera) moving? Are you using some video input to drive that, or something else like a motion LoRA?

    • @koalanation
      @koalanation 7 months ago

      For the background, I used this Pexels video as a base: tinyurl.com/yn4y8bdf
      I reversed the video in ezgif first.
      In the workflow, I tested a few preprocessors to see which ones work best and adjusted how many frames per second match the foreground better. In this case, Zoe depth maps and MLSD work well. I set the frame sampling to one out of every 3 frames, starting from frame 90 (in a VHS Load Video node).
      To avoid running the preprocessors all the time during my tests, I just extracted the same number of frames as in the OpenPose sequence, saved them as images, and used them in the final workflow (see the sketch below this thread).

    • @dkamhaji
      @dkamhaji 5 months ago

      Hello @koalanation! I'm building a workflow with similar intentions to yours here, but with a different slant. I'm just using seg masks to separate the BG from the character and applying separate masks to each IP adapter to influence the character and the background separately. Everything works great, except that I'm trying to apply the motion from the original input video to the new background created by the attention-masked IP adapter. Is there a way we could discuss this further to try to find some possible solutions? I would love to share this with you.
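
A minimal sketch of the frame pre-extraction step @koalanation describes earlier in this thread, assuming OpenCV and a local copy of the reversed background clip (file names, paths and the frame count are placeholders): it keeps one frame out of every three, starting at frame 90, and writes them as images so the preprocessors do not need to run on the full video during testing.

```python
# Sketch of pre-extracting background frames (assumes OpenCV; paths are placeholders).
import cv2
import os

VIDEO_PATH = "background.mp4"   # reversed Pexels clip
OUT_DIR = "bg_frames"
START_FRAME = 90                # skip the first 90 frames
STEP = 3                        # keep one frame out of every three
NUM_FRAMES = 64                 # match the number of OpenPose frames

os.makedirs(OUT_DIR, exist_ok=True)
cap = cv2.VideoCapture(VIDEO_PATH)
saved = 0
index = 0
while saved < NUM_FRAMES:
    ok, frame = cap.read()
    if not ok:
        break                   # ran out of video
    if index >= START_FRAME and (index - START_FRAME) % STEP == 0:
        cv2.imwrite(os.path.join(OUT_DIR, f"frame_{saved:04d}.png"), frame)
        saved += 1
    index += 1
cap.release()
print(f"Saved {saved} frames to {OUT_DIR}")
```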

  • @skaramicke
    @skaramicke 3 months ago +2

    Couldn't you just reuse the mask from the compositing step when isolating background from foreground in the later stages?

    • @koalanation
      @koalanation 3 months ago +1

      The mask in the compositing step is only applied to one image. For the background/foreground, we are creating an individual mask for each of the video frames. The first is static, the second 'dynamic', so to say. I hope this resolves your doubts!

    • @skaramicke
      @skaramicke 3 months ago +1

      @koalanation Yes, of course! I didn't think of that.

  • @lukaso2258
    @lukaso2258 4 months ago

    Hey Koala, thank you very much for this guide, exactly what I needed. I have one question: this works very well for merging two conditionings into one scene, but what if I'm using a separate IPAdapter for the background and the character? I found a way to merge the two IPAdapter outputs, but I can't find a way to mask each IPA model for the character and the background. Do you see any solution for this? (In my workflow I'm doing a low-res output of the character and background first, then upscaling both, and now I'm figuring out how to run it through another sampler and properly merge them together.)
    Thanks again for your work

    • @koalanation
      @koalanation 4 months ago

      Nowadays, in the Apply IPAdapter node, it is possible to use 'attn_mask', so you can use the two separate images (foreground/background). This gives you more flexibility regarding the type of IP adapter, strength, use of batches...
      When I was preparing the video, that was still not possible. You can also use ControlNet with masks. So having different layers is possible in several ways.
      Results will be slightly different, though, depending on how you do it.
      Good luck!

    • @lukaso2258
      @lukaso2258 4 months ago

      @koalanation You are a legend, it works :) Thank you!

  • @eyesta
    @eyesta 3 months ago

    Good video.
    Slightly different question:
    I made vid2vid in ComfyUI and my background changes, but I have a static background to replace. How do I render the model/character on a green background like you have in this video?

    • @koalanation
      @koalanation 3 months ago +1

      There are several custom nodes that do that. Check rembg, for example: github.com/Jcd1230/rembg-comfyui-node, or the WAS Node Suite. However, these go frame by frame and you will need to review the results (see the sketch below this thread).
      I made a video using segmentation with Track-Anything, but no one has developed a ComfyUI node/tool for it. It used to work very nicely, but I have not used it for a while: czcams.com/video/HoTnTxlwdEw/video.htmlsi=Pnlr-YUo-YmRz8UL
      In the end, I think it is easier and faster to use video editing software with rotoscoping features: Adobe Premiere, DaVinci Resolve or Runway.ml. I personally use Runway.ml, but choose what you prefer.

    • @eyesta
      @eyesta 3 months ago

      @koalanation Thank you!
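
A minimal sketch of the frame-by-frame background removal mentioned above, using the rembg Python package rather than the ComfyUI node (folder names and the flat green color are placeholder assumptions): each frame's subject is cut out and composited over a green plate, and the results still need a manual review.

```python
# Sketch: put each frame's subject on a green background with rembg.
# Assumes the rembg package and a folder of extracted frames (paths are placeholders).
import os
from PIL import Image
from rembg import remove

IN_DIR, OUT_DIR = "frames", "frames_greenscreen"
os.makedirs(OUT_DIR, exist_ok=True)

for name in sorted(os.listdir(IN_DIR)):
    frame = Image.open(os.path.join(IN_DIR, name)).convert("RGB")
    cutout = remove(frame)                       # RGBA image, background removed
    green = Image.new("RGBA", frame.size, (0, 255, 0, 255))
    green.paste(cutout, mask=cutout.split()[3])  # use the alpha channel as paste mask
    green.convert("RGB").save(os.path.join(OUT_DIR, name))
# The output frames still need a frame-by-frame review, as noted above.
```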

  • @Disco_Tek
    @Disco_Tek 7 months ago +1

    Any idea how to keep colors consistent for items like clothing in vid2vid? Also... you can rotoscope in ComfyUI now?

    • @koalanation
      @koalanation 7 months ago

      For clothing consistency, I think that by playing with masks and a SAM detector (with, for example, DeepFashion2) it should be possible. But personally I have struggled to get the masks right for all frames (with other animations). I did a video using Track-Anything, which I think can track clothes nicely. I believe that with the right workflow it should be possible to do nicer things, but playing with the masks is not straightforward, so I did not elaborate further.
      Regarding rotoscoping: yes, with SAM it is possible, but I find it easier and faster to use video editors (Premiere, DaVinci...). When rotoscoping, you may eventually need to correct some frames, and in ComfyUI that becomes a very tedious task. Track-Anything is more user-friendly for adjustments, but it is a pity it is not really maintained or integrated into ComfyUI (that I am aware of).

    • @Disco_Tek
      @Disco_Tek 7 months ago

      @koalanation Thanks for the reply. Yeah, there has to be a way to get consistent clothing; some LoRAs have helped, but things like colors constantly want to shift. As for rotoscoping, I didn't know that was possible and normally just stick with RunwayML if I need to do it.

    • @aivideos322
      @aivideos322 7 months ago +1

      @Disco_Tek With AnimateDiff, use a 24-frame context length and a context stride of 8; that works for 48 frames. Keep any text prompt short and don't repeat yourself: (wool scarf, red scarf) is not good, while 'wool red scarf' works, and do not mention the scarf again in the prompt. If you want to describe it better, reword the original, e.g. 'wool textured red long scarf'. Prompting is very important, as is the model you choose.

    • @Disco_Tek
      @Disco_Tek 7 months ago

      @aivideos322 I've been using a context length of 16 and an overlap of 12 lately with pretty good results. I will mess with prompts, though, the next time I try running without a LoRA. I'm usually then just using the upscaler to get me home. Any suggestions for color bleed when I add color to a clothing item, to prevent it from polluting the rest of the image?

    • @aivideos322
      @aivideos322 7 months ago +1

      @Disco_Tek Colour bleed is a problem even for still images, and I have found no real solution to that.
      For upscaling, you can use tile/lineart/TemporalNet ControlNets to upscale reliably and fully denoise the video for double or triple sizes. You can even colour the videos with a different model at this step, and you can give more details in this prompt, which tends to work better for colouring things. This step does not use the AnimateDiff model; it uses whatever model you want plus ControlNet, so it has more freedom to colour what your prompt says. I use Impact Pack nodes to turn batches into lists before the upscale to lower the memory used and allow larger upscales. This does each frame one by one.
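
As a rough illustration of the batch-versus-list point above (a conceptual sketch only, not the Impact Pack node code; the fake_upscale helper is a stand-in): processing frames as one batch keeps every upscaled frame in memory at once, while iterating over a list holds only one upscaled frame at a time.

```python
# Conceptual sketch: why turning a batch into a list lowers peak memory.
import numpy as np

def fake_upscale(frame: np.ndarray, factor: int = 2) -> np.ndarray:
    """Placeholder nearest-neighbour upscale standing in for the real model pass."""
    return frame.repeat(factor, axis=0).repeat(factor, axis=1)

frames = [np.zeros((256, 256, 3), dtype=np.float32) for _ in range(8)]

# Batched: the whole upscaled stack lives in memory at once (8 x 512 x 512 x 3).
upscaled_batch = np.stack([fake_upscale(f) for f in frames])

# As a list, frame by frame: only one upscaled frame is held at a time,
# because each result is written out before the next one is computed.
for i, frame in enumerate(frames):
    out = fake_upscale(frame)
    np.save(f"upscaled_{i:04d}.npy", out)   # placeholder for saving the frame
```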

  • @kamilkolmasiak9878
    @kamilkolmasiak9878 7 months ago

    Do you think this workflow will work with 8 GB of VRAM?

    • @koalanation
      @koalanation 7 months ago

      I understand that with AnimateDiff you need 10 GB, but I have read that people can also do it with less. In this workflow, though, we are in reality doing one render for the foreground and another for the background, so it actually takes longer... However, with LCM you can decrease the render time quite a lot. I just made a video about it and I am really happy with how LCM works (with the right settings).

    • @kamilkolmasiak9878
      @kamilkolmasiak9878 7 months ago +1

      @koalanation Cool, right now I am using AnimateDiff with 2 ControlNets on only 8 GB 😅 I will try your workflow, though.

  • @matthewma7886
    @matthewma7886 5 months ago

    Great workflow! That's what I'm looking for.
    But I run the workflow and get this error:
    Error occurred when executing ConditioningSetMaskAndCombine:
    too many values to unpack (expected 3)
    Does anyone know how to fix it? Thanks a lot :)

    • @matthewma7886
      @matthewma7886 5 months ago +1

      I tried several ways and finally got the reason: the bug comes from the GrowMaskWithBlur node and the blur_radius. If blur_radius is not 0, the error happens. I think there is a bug in this version of GrowMaskWithBlur. You can use a Gaussian Blur Mask or Mask Blur node instead of the blur function until the next version.

    • @koalanation
      @koalanation 5 months ago

      Hi! Thanks for checking it out! I had some time to look at it. As you say, it seems the error comes from the GrowMaskWithBlur node.
      I checked the workflow and it seems this node has changed: the numbers are swapped, the blur radius of 20 appears in the lerp alpha field, and there is no sigma parameter anymore...
      I have changed the values according to what is shown in the video (blur radius 20, lerp alpha 1 and decay factor 1, no sigma), and the workflow works for me.

    • @matthewma7886
      @matthewma7886 5 months ago

      @koalanation All right, bro. Greatly appreciate your work :-)

  • @TheNewOption
    @TheNewOption 7 months ago

    Damn, I'm behind on AI stuff. I haven't seen this UI before; is this a new version of SD?

    • @koalanation
      @koalanation 7 months ago +1

      Yep, everything moves quickly lately... but you will catch up, no worries.

    • @happytoilet1
      @happytoilet1 7 months ago

      @koalanation Good stuff, many thanks. If the scene is not generated by SD, say it's a real photo taken by a camera, can SD still merge the character and the scene? Thank you.

    • @koalanation
      @koalanation 7 months ago

      Hi, thank you! I think so... but take into account that the output is also affected by the model and the prompt you use. I am more of a fan of cartoon and anime animations, but if you use realistic models (such as Realistic Vision), I think you will get what you are aiming for.
      In the end, there is quite a bit of experimentation here. Play also with the weights of the adapters.

    • @happytoilet1
      @happytoilet1 7 months ago

      @koalanation Thank you for your advice. Really appreciate it.

  • @StyleofPI
    @StyleofPI 1 month ago

    If I change Load Image to Load Video, does it work? (Video to video)

    • @koalanation
      @koalanation 1 month ago

      You can use the Load Video node too for the ControlNet reference images.

  • @user-ts2fq1gp8b
    @user-ts2fq1gp8b 6 months ago

    I run the workflow and get this error: Error occurred when executing IPAdapterApply:
    Error(s) in loading state_dict for Resampler:
    size mismatch for proj_in.weight: copying a param with shape torch.Size([768, 1280]) from checkpoint, the shape in current model is torch.Size([768, 1664]).
    File "/root/ComfyUI/execution.py", line 153, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
    File "/root/ComfyUI/execution.py", line 83, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
    File "/root/ComfyUI/execution.py", line 76, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
    File "/root/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/IPAdapterPlus.py", line 426, in apply_ipadapter
    self.ipadapter = IPAdapter(
    File "/root/ComfyUI/custom_nodes/ComfyUI_IPAdapter_plus/IPAdapterPlus.py", line 175, in __init__
    self.image_proj_model.load_state_dict(ipadapter_model["image_proj"])
    File "/root/miniconda3/lib/python3.10/site-packages/torch/nn/modules/module.py", line 2152, in load_state_dict
    raise RuntimeError('Error(s) in loading state_dict for {}:
    \t{}'.format(

    • @user-ts2fq1gp8b
      @user-ts2fq1gp8b 6 months ago

      Thanks if you have time to help solve this problem

    • @koalanation
      @koalanation 6 months ago

      Check which model version and which ClipVision and IP Adapter models you are using. I think this error occurs because you may be using an SDXL model. Change the checkpoint, the IP Adapter model and/or the ClipVision model.

  • @SparkFlowAAA
    @SparkFlowAAA 5 months ago

    Great tutorial and method!! I have an issue with ConditioningSetMask: 'Error occurred when executing ConditioningSetMask: too many values to unpack (expected 3)'. If you can help, that would be awesome. Thank you
    Error log:
    `Error occurred when executing ConditioningSetMask:
    too many values to unpack (expected 3)
    File "/workspace/ComfyUI/execution.py", line 154, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
    File "/workspace/ComfyUI/execution.py", line 84, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
    File "/workspace/ComfyUI/execution.py", line 77, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
    File "/workspace/ComfyUI/nodes.py", line 209, in append
    _, h, w = mask.shape`

    • @koalanation
      @koalanation 5 months ago

      Hi! It seems there were some changes in the GrowMaskWithBlur node. In that node (at the bottom, in the Mask Foreground group), can you make sure the values are: blur radius = 20, lerp alpha = 1.0 and decay factor = 1?
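
For readers hitting this 'too many values to unpack (expected 3)' error (here and in the earlier ConditioningSetMaskAndCombine comment): the traceback above fails at the line `_, h, w = mask.shape`, which expects a 3-D mask tensor. The snippet below illustrates how an extra mask dimension (an assumption about what the misconfigured GrowMaskWithBlur output might look like, not taken from the node's source) triggers exactly that error, and how collapsing the extra dimension avoids it; fixing the node's values as described above resolves it at the source.

```python
import torch

# A mask batch shaped [frames, H, W] unpacks cleanly, as the failing line expects:
mask = torch.ones(16, 512, 512)
_, h, w = mask.shape                      # OK

# If the mask arrives with an extra dimension, e.g. [frames, 1, H, W]
# (an assumption about the misbehaving node output), the same line fails:
mask4d = torch.ones(16, 1, 512, 512)
try:
    _, h, w = mask4d.shape
except ValueError as err:
    print(err)                            # too many values to unpack (expected 3)

# Squeezing the extra channel dimension restores the expected shape:
_, h, w = mask4d.squeeze(1).shape
print(h, w)                               # 512 512
```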