ComfyUI: Face Detailer (Workflow Tutorial)
- Published 9 Jun 2024
- This tutorial includes 4 ComfyUI workflows using Face Detailer. The workflow tutorial focuses on Face Restore using Base SDXL & Refiner, Face Enhancement (graphic to photorealistic and vice versa), and Facial Details (including hair styling). Unique techniques are used to automate the workflow for auto-detection, selection, and masking.
------------------------
JSON File (YouTube Membership): www.youtube.com/@controlaltai...
Dr. L.T. Data GitHub: github.com/ltdrdata
------------------------
TimeStamps:
0:00 Intro.
0:46 Requirements.
2:08 Face Restore.
10:46 Face Restore Extra.
12:26 Face Enhancement.
15:14 Face Details.
26:08 Upscaling.
This tutorial video is the most detailed among the ones I've watched.
Thank you for making these nodes, the comfy manager. Your contribution to the community is greatly appreciated. 🙏
Excellent video! Really liked the detailed explanations of what the different settings do.
OMG, this was an insane learning experience. Thank you for sharing!!
Beautiful, so much info, thanks and your voice is soothing
Thank you very much for the information about what the detailer settings do! Great vid!
Great guide. Very detailed and step-by-step. Thanks.
That was intense! Thank you
Perfect tutorial. Can't wait for your next video.
Absolutely Loved it, Subscribed.
wow this video is so full of information. Thanks a lot !
Super helpful. Awesome video!
I don't usually comment, but wow, this tutorial explained a lot. Thank you!
you've given us so much info here. Thank you so much! I learned so much
Thank you! With this I was able to add it to my AnimateDiff batch list, omg, thanks.
Superb Tutorial.
nicely done.
great video!
Absolutely solid videos all around on this channel. Something I am having the utmost difficulty finding anywhere online is taking art and converting it to photorealistic images. There are too many tutorials that show how to do photo-to-art style using ControlNets and IP adapters, but nothing for the inverted form of it. I've also asked around among more experienced people and they don't know how to change art styles to photos either. If it is possible and you can figure that out, a video on that would be awesome.
Thank you! I will need some samples of the art. Will definitely have a look at it. (mail @ controlaltai . com - without spaces).
Search for Latent Vision here on YT. Matteo is the dev for IPAdapter and has a video where he recreates realistic portraits from game assets. It's easy to follow and professional like the videos here.
I felt stupid watching this video. Thanks for sharing your knowledge.
awesome
The use of the word "why" so many times is so important. It gives not only the how but also the why. The how doesn't teach; the why teaches.
I wanted to see the upscaled result :p
Hey, I like the workflow. However, the program is reporting an error: TypeError: Media_Pipe_Face_Mesh_Preprocessor.detect() got an unexpected keyword argument 'resolution'
Great video! How on earth did you learn to do this?
Thank you! Reading GitHub notes, then adding my own technique to get the desired results.
Hi Mrs Melihe, I really thank you for this detailed video, it is very helpful. I want to understand the FaceDetailer so that I can master it. I want to control it and not let it control me.
I have some questions:
Which {Max_Size} is best? Because using different sizes gives very different pictures.
I see in the cmd that the crop region is being multiplied by numbers between 2 and 4, depending on what I write in the max size, and I get a resolution at the end like (1024, 1412).
Do you know what that is for?
If you know any information that can help me, I am ready to read anything. I would appreciate it 🙏.
Hello, the max size is the resolution to be used. There are too many variables to tell you which resolution to use. Take one image and try face detailer with res 512, 768 and 1024. Save the outputs and compare. In some cases a lower res will give better output, and in others a higher res. This depends on the image source, checkpoint, other settings and the level of face fixing needed.
@@controlaltai Ah, alright, I was waiting for your message.
Thank you for teaching us :)
i really like your channel, continue
Thank you for this awesome tutorial! With the MediaPipe face mesh I'm running into issues where it errors very often when the face is too small in the image. I'm not talking about tiny faces either, still relatively large ones, but it still errors.
If the face is not being detected, it's a MediaPipe limitation. You need to use a different face detector, like the YOLO face BBox model. This, however, won't give separate facial features, but it can detect faces at a lower confidence threshold (when the BBox setting is set to 0.01).
@@controlaltai Thank you for the reply! I will give that a try. And one more question. I've noticed that the denoising settings for close-up faces don't work for smaller faces in the image and vice versa. Is there a solution to mitigate this with the SEGS FaceDetailer? Before, I used face crop and uncrop nodes, but none of them use the Ultralytics models, just some different ones that would always error when the face was smaller or even medium-sized in the image, so I had to go back to using the SEGS FaceDetailer. The great thing about face crop and uncrop was that no matter what size the face was in the original image, I could just resize the face to a specific resolution and the denoising would work well in all cases. The FaceDetailer from SEGS has an option for max size but not minimum size, so smaller faces stay at that lower resolution and the denoising won't be a good fit for them.
I have worked on images with small faces. BBox is needed, and you can denoise up to 0.9 - 1. These were faces that were completely blob-like or highly distorted. Using a proper checkpoint here helps. In the tutorial video, check out the first two workflows; the faces in some of the fashion photos are small. I don't use MediaPipe there, just Ultralytics and BBox.
For up-close faces, the whole setting would be different, as you mentioned. You can use the SEGS FaceDetailer; the only difference between that and MediaPipe is the further segmentation. For example, MediaPipe has full face and other elements, whereas Ultralytics will only select the face. Unless you want something very specific on the face, just go with Ultralytics. It will denoise the whole face. There is no one denoise setting, as this is highly dependent on the source image and checkpoint.
About the max size thing, I am not 100% sure, but I think it takes the max size and resizes the crop to fit that resolution. The crop factor plays a crucial role in Face Detailer. The model works if it has some surrounding pixels to get info from and draw the details in, so increase the crop factor by 0.5 at a time if the results are undesired. For smaller faces the crop factor will change; it should be lowered. However, a 768 size should be fine for 1024 and for very small faces.
Unfortunately, I have no experience with face crop; I have to try it. So I cannot give you any feedback regarding that.
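The crop factor and max_size interaction discussed above can be sketched roughly as follows. This is a guess at the behaviour described in the thread, not the actual Impact Pack implementation; both helper functions (`crop_region`, `upscale_for_detailing`) are hypothetical names.

```python
# Hypothetical sketch: how Face Detailer's crop_factor and max_size
# appear to interact, per the discussion above (not Impact Pack code).

def crop_region(bbox, crop_factor, image_size):
    """Expand a face bbox around its center by crop_factor,
    clamped to the image bounds."""
    x1, y1, x2, y2 = bbox
    w, h = x2 - x1, y2 - y1
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    new_w, new_h = w * crop_factor, h * crop_factor
    nx1 = max(0, cx - new_w / 2)
    ny1 = max(0, cy - new_h / 2)
    nx2 = min(image_size[0], cx + new_w / 2)
    ny2 = min(image_size[1], cy + new_h / 2)
    return nx1, ny1, nx2, ny2

def upscale_for_detailing(region, max_size):
    """Resize the cropped region so its longer side equals max_size;
    this would explain the (1024, 1412)-style resolutions in the log."""
    x1, y1, x2, y2 = region
    w, h = x2 - x1, y2 - y1
    scale = max_size / max(w, h)
    return round(w * scale), round(h * scale)

# A 200x275 face crop upscaled for a max_size of 1024:
region = crop_region((400, 300, 600, 575), 1.0, (2048, 2048))
print(upscale_for_detailing(region, 1024))  # -> (745, 1024)
```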
Thank you for the detailed explanation. Yes I’m looking into a solution to automate the denoising so it will automatically change it based on bbox size. I’ll share it with you all once I figure it out. I think that is one of the things that is currently disrupting the flow.
For zero shot segmentation, did you see massive difference between SAM+Grounding DINO vs ClipSeg ?
I did not get an opportunity to compare. However, beyond the scope of the tutorial it would be very interesting to compare them.
wow
Hi, great tutorial and got it to work with the ClipTextEncodeSDXL for the positive, when I replaced that with the Blip Analyse Image you showed afterwards, I get this error: "Error occurred when executing BLIP Analyze Image: The size of tensor a (6) must match the size of tensor b (36) at non-singleton dimension 0". Very new to ComfyUI and Stable Diffusion in general, any ideas where I'm going wrong? From what I can see online it's expecting 2 values to be the same but they're not and that's throwing up the error. No idea what those values are though.
Hi, thank you! First go to the WAS suite in custom nodes. Click on install.bat. That should solve it. Make sure Comfy is closed, then start it again. If it doesn't solve it, let me know.
@@controlaltai Thank you for such a quick reply. Got to head out but I'll let you know how I get on.
Lol, stayed back to try it and it worked. Thanks again, definitely subscribing and looking at your other tutorials as this one is excellent.
This is an amazing tutorial. I did have one question. You mentioned that "inpainting" is a pain, and you're right, but how do you actually do it? Could you describe how to use ComfyUI to inpaint at a higher resolution? I really like inpainting, creating bits of the image one at a time, but I'm really struggling to do it in ComfyUI. Is there any kind of workflow you could point me towards?
Thank you! Basically inpainting is manual and has limitations. I mean, you cannot inpaint full clothes and change clothes properly; that doesn't work with the current tech, for example. For making small changes it's great. The reason for going with auto segmentation over manual inpainting is that it's easier to do a lot of images. For inpainting manually, it is advisable to select an inpainting checkpoint. These checkpoints are trained to feather in the mask area so the inpainting seems blended. For SD 1.5 there are plenty out there. However, for SDXL it's a different story. You can convert any checkpoint into an inpainting checkpoint using the model merge and subtract method in Comfy. The workflow needs a tutorial and we have yet to cover it on the channel.
The basic method, if the checkpoint is already an inpainting one, is simple. Add a Load Image node, connect it to an InpaintModelConditioning node, and connect that to the KSampler. The rest of the workflow should be the same. Right-click the image and open it in the mask editor, make your mask, edit the positive conditioning and run the queue prompt.
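In ComfyUI's API "prompt" format, that wiring might look roughly like the sketch below. LoadImage, InpaintModelConditioning and KSampler are core ComfyUI nodes, but the node IDs, file name, and sampler settings here are illustrative, and the checkpoint loader and text-encode nodes ("3", "4", "5") are omitted for brevity; verify the exact inputs against your own install.

```python
# Sketch of the inpainting wiring described above, in ComfyUI's API
# prompt format. Node IDs and the image file name are placeholders.
prompt = {
    "1": {"class_type": "LoadImage",
          # Image saved with a mask drawn in the mask editor;
          # output 0 is the pixels, output 1 is the mask.
          "inputs": {"image": "masked_input.png"}},
    "2": {"class_type": "InpaintModelConditioning",
          # Nodes "3" (checkpoint loader), "4"/"5" (CLIP text encodes)
          # are assumed to exist elsewhere in the graph.
          "inputs": {"positive": ["4", 0], "negative": ["5", 0],
                     "vae": ["3", 2],
                     "pixels": ["1", 0], "mask": ["1", 1]}},
    "6": {"class_type": "KSampler",
          # InpaintModelConditioning outputs: positive, negative, latent.
          "inputs": {"model": ["3", 0],
                     "positive": ["2", 0], "negative": ["2", 1],
                     "latent_image": ["2", 2],
                     "seed": 0, "steps": 20, "cfg": 7.0,
                     "sampler_name": "euler", "scheduler": "normal",
                     "denoise": 1.0}},
}
print(prompt["6"]["inputs"]["latent_image"])
```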
@@controlaltai Wow, thank you for such a detailed explanation! I'll try out your advice. Thank you again!
Hey, I like the workflow. However, could you share a link to the repo where we can dl the bbox/segm models? I can't find all of those on the original github repo. Thanks :)
Just go to Manager - Install Models - Search for "Ultralytics". Install whatever you need.
I am getting out of memory errors as soon as it goes to the refiner checkpoint. I am using Think Diffusion 32gb RAM/24gb VRAM machine also. Any ideas?
Your image must be too big. The image has to be downscaled a little. That system can handle 2K, maybe 3K.
I got the error: "NotImplementedError: Could not run 'torchvision::nms' with arguments from the 'CUDA' backend. This could be because the operator doesn't exist for this backend, or was omitted during the selective/custom build process" even though I installed CUDA 12.1 and torchvision 0.17.0+cu121. Could you recommend something, please?
Try to uninstall and reinstall torchvision.
I got an error when I ran BLIP: Error occurred when executing BLIP Analyze Image:
The size of tensor a (6) must match the size of tensor b (36) at non-singleton dimension 0. How do I fix it? @@controlaltai
This can be fixed by going into custom nodes, the WAS Node Suite folder, and clicking install.bat.
Error occurred when executing FaceDetailerPipe:
type object 'VAEEncode' has no attribute 'vae_encode_crop_pixels'
I'm Getting this Error.
Hi, go to comfy manager - fetch updates - if it says "There is an updated extension available." - click on "Please update from Install custom nodes" - select all and click update on any one - then it should give you "To apply the installed/updated/disabled/enabled custom node, please restart", click on restart - close and re-open browser. Try that. if after fetch updates it gives you no updates, then let me know. I will need to see what images you have as input and if you made any changes to the workflow setting.
Also, when I try running the workflow, I get the following error:
Error occurred when executing BLIP Analyze Image:
The size of tensor a (6) must match the size of tensor b (36) at non-singleton dimension 0
Do you know what this means?
Yes. Try this, please: close everything. Go to the custom nodes folder. Find the WAS suite folder. Inside, click on install.bat. Let it run, open Comfy and then see if BLIP works. If not, send me the command prompt error from executing install.bat.
Thank you that fixed it! @@controlaltai
Great video! I'm trying to get this to work and I'm getting an error 'Error occurred when executing FaceDetailerPipe:
The size of tensor a (0) must match the size of tensor b (256) at non-singleton dimension 1'
I tried the install.bat in the Was node folder since I saw you recommended that for a lot of other people with problems. It said 'Requirement already satisfied' for everything except one item: WARNING: opencv-python-headless 4.7.0.72 does not provide the extra 'ffmpeg'
I'm running Comfy through Pinokio on an AMD Gpu if that makes any difference. An help would be greatly appreciated.
Hi, thank you. I don't have experience with Pinokio, but I think I know what the issue is. BLIP in WAS Node Suite requires "transformers==4.26.1", whereas in Pinokio you may have the latest or a higher version of transformers. If the transformers version is correct, then probably some other dependency is a higher or lower version. You can check the requirements.txt file in the custom node. 90% of the time it's a transformers version or a missing dependency. If there is no way to alter the transformers version in Pinokio, then I suggest bypassing BLIP and using a custom prompt. BLIP just automates the text prompts. If you can input a custom prompt describing the image, a simple description should work. It won't affect the results as long as the prompt somewhat matches the description. For example, a male/female with race (like American, African American, etc.) and age (like young adult, teenager, adult, etc.) would be sufficient for face detailer.
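As the reply above suggests, BLIP only automates the prompt, and a hand-written description is enough for the face detailer. A tiny helper like the one below (the function name and its exact wording are hypothetical, just to show the level of detail that suffices):

```python
# Hypothetical stand-in for BLIP: build a short face-detailer prompt
# from a few attributes, per the advice in the thread.
def face_prompt(gender, ethnicity, age_group, extras=""):
    prompt = f"a photo of a {age_group} {ethnicity} {gender}"
    if extras:
        prompt += f", {extras}"
    return prompt

print(face_prompt("male", "African American", "young adult", "smiling"))
# -> a photo of a young adult African American male, smiling
```

Connect the resulting string to the positive CLIP Text Encode input in place of the BLIP output.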
This error "opencv-python-headless 4.7.0.72 does not provide the extra 'ffmpeg'" is basically a warning and not an error. Also, it has nothing to do with the image workflow. FFMPEG is required for some video nodes in WAS suit.
@@controlaltai Thanks very much for your reply. After some reinstalling and troubleshooting and still getting the error, I finally figured it out. It was an AMD GPU problem. If I changed the SAMLoader (Impact) 'device mode' setting from 'AUTO' to 'CPU', it fixed it. So it's working now, if slowly, so I can finish your video and at least understand everything. By the time I'm done AMD may have updated their drivers and I might even be able to use it regularly without waiting forever. Cheers!
Great video! However, I am getting the following error message when using the BLIP Analyze Image: Error occurred when executing BLIP Analyze Image: The size of tensor a (6) must match the size of tensor b (36) at non-singleton dimension 0. Do you have any idea why this is happening?
Thanks! Go to custom nodes - WAS Node Suite - click on install.bat. Probably some dependencies are missing for the BLIP Analyzer.
@@controlaltai, that's what it was. Works perfectly now! Thanks!
Hello, a question: how can I replace faces in an image with another face in ComfyUI? Thank you.
Yes you can, using the InsightFace model with IPAdapter FaceID. I cannot make a YouTube tutorial, as the company has restricted me from doing so for copyright reasons. Search for "Latent Vision FaceID IP adapter" on YouTube; the guy who implemented IPAdapter for ComfyUI has a video on it.
Doctor Data :)
Hi, do you have a workaround? When I try, I get an error: RuntimeError: The size of tensor a (6) must match the size of tensor b (36) at non-singleton dimension 0. That's not the whole error code, but after looking into it, it seems to be because the WAS node has become incompatible with up-to-date ComfyUI, as new Comfy seems to use transformers 4.36 and the WAS node seems to only go up to 4.26.
WASasquatch himself commented on a post on his GitHub page yesterday about this, replying to someone: "The code I am using is not compatible. If you know how to PR an upgrade of the code to use high transformers that would be fine. But I have no plans to upgrade WAS-NS at this time. As noted on the readme it's unmanaged at this time. PRs are welcome though."
So I would like to know if you know how to make this work (I mean the part of the video on face detail where you can change the eyes, hair, and mouth of a character in the photo) with an up-to-date ComfyUI installation. As I'm fairly new to ComfyUI, I'm not sure how to adapt your video to make it work.
Hi, you need to downgrade transformers to use BLIP. There is no issue with Comfy; some nodes may just require a higher version. I have other solutions, but they are very heavy. The easiest solution is to search for a custom node called WD Tagger. You can connect the image there and it will give you some tags; you can use those and put in your own prompt. The only reason to use BLIP is to get a description of the image, and it can be completely bypassed. Any simple prompt would suffice.
@@controlaltai Hi, thanks for the answer, I will try that. I've got other troubles as I tried to follow the tutorial. The first is not as bad as the second: my mask detection doesn't seem to be as precise as yours. Yours seems to really go for the eyes in particular, with clean edges around the eye; in my case it's like it just takes a mask of the eye and its surroundings. I can get a better mask by lowering the BBox dilation, but even then it seems like it just wants to take a crop of the eye area and not the eye in particular as yours does, even though I applied the same settings. Maybe because your tutorial image already has a high resolution, which makes it easier to capture the eye correctly? Also, I noticed you use an XL model but a ToDetailerPipe and not the XL one; I wondered why.
And the real trouble I get next causes an error as I queue: when it comes to CLIPSeg I get this:
Error occurred when executing CLIPSeg:
OpenCV(4.7.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\resize.cpp:3699: error: (-215:Assertion failed) !dsize.empty() in function 'cv::hal::resize'
File "C:\ComfyUI_windows_portable\ComfyUI\execution.py", line 151, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable\ComfyUI\execution.py", line 81, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable\ComfyUI\execution.py", line 74, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable\ComfyUI\custom_nodes\clipseg.py", line 154, in segment_image
heatmap_resized = resize_image(heatmap, dimensions)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\ComfyUI_windows_portable\ComfyUI\custom_nodes\clipseg.py", line 44, in resize_image
return cv2.resize(image, dimensions, interpolation=cv2.INTER_LINEAR)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
I'm not sure why this happens. Is it because of the image I use (format, size, or something else), the checkpoint, or anything else? If you could help me figure this out I'd be grateful. I really like AI generation, but troubleshooting errors when you start to learn can be a bit fearsome, especially when using Comfy, hehe.
The eye detection is most likely because of the good-quality image I used. The MediaPipe detection has some limitations; that's why you must be facing the issue with the detection. About the error, I am not sure. Mostly, if it fails to detect anything, it may give that error. Without checking your workflow, it's a bit hard to tell.
Hi, wise girl! Thanks for sharing your experience with us! I can't find this Face Detailer you are using; I just found a very simple node called DZ_Face_Detailer... Where can I download the FaceDetailer you are using? Please...
Hello, search for impact pack. The face detailer node is there.
@@controlaltai Thanks a lot, found it there. Blessings from the Pope :)))
Thanks for the excellent video. However, I wonder why my BLIP Analyze Image is different from the one in the video. Also, in my BLIP Loader there is no model named "caption".
I already downloaded everything in the Requirements section.
BLIP was recently updated. Just input the new BLIP node and model and use whatever it shows there. These things are normal. Ensure Comfy is updated to the latest version along with all custom nodes. Only "caption" will not work any more.
@@controlaltai So I already applied your workflow JSON. But when I click the queue prompt, I get an "allocate on device" error. However, if I check it and then click the queue prompt again, it works fine without any errors.
So I searched for the "allocate on device" error related to ComfyUI, but my error log was different from the search results. My error log only mentions "allocate on device" without any mention of insufficient memory, and below that it shows the code. However, other people's error logs mention insufficient memory. Despite this, could my error also be a memory issue?
The allocate-on-device error means running out of VRAM or system RAM. If you can tell me your VRAM and system RAM, the size of the image you are trying to fix the face for, and your bbox settings or anything else you are trying to do, I can guide you on how to optimize the settings as per your system specs.
@@controlaltai This is my system when i run comfy ui server.
Total VRAM 6140 MB, total RAM 16024 MB
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 4050 Laptop GPU : cudaMallocAsync
VAE dtype: torch.bfloat16
I'm using the Fix_Faces_Extra workflow JSON, and my images are JPG files under 1 MB.
The process stops at the FaceDetailer node. I think I should optimize the FaceDetailer settings. Thanks.
Image size does not matter; resolution does. Change the face detailer from the 1024 or 768 setting to 256. That should work for you. Try that.
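A rough rule of thumb distilled from this thread: lower FaceDetailer's max_size on smaller GPUs. The thresholds below are guesses based on the replies above, not benchmarks, and the function is purely illustrative.

```python
# Illustrative only: suggested FaceDetailer max_size by VRAM, guessed
# from the advice in this thread (6 GB card -> drop to 512 or 256).
def suggest_max_size(vram_gb):
    if vram_gb >= 16:
        return 1024
    if vram_gb >= 8:
        return 768
    if vram_gb >= 6:
        return 512   # or 256 if you still run out of memory
    return 256

print(suggest_max_size(6))   # -> 512
```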
Node CLIPSeg: Error occurred when executing CLIPSeg:
OpenCV(4.9.0) D:\a\opencv-python\opencv-python\opencv\modules\imgproc\src\resize.cpp:3789: error: (-215:Assertion failed) !dsize.empty() in function 'cv::hal::resize'
How do I fix it?
Are you running Comfy in GPU or CPU mode? What is your GPU?
@@controlaltai I have encountered the same issue. I am running in GPU mode with a 3060 6 GB graphics card. Is there any way to solve this problem?
Hi, where can I download the workflow file?
Hi, The workflow is made available only for paying channel members.
I have a problem: AttributeError: The size of tensor a (3) must match the size of tensor b (9) at non-singleton dimension 0.
Can you give me some help?
Thanks.
Error is on which node?
@@controlaltai BLIP Analyze image
Go to the WAS folder in custom nodes and click on install.bat. Try after that; it should download the required dependencies. If you still get the error, let me know.
@@controlaltai I'll try.
It's still an error: RuntimeError: The size of tensor a (3) must match the size of tensor b (9) at non-singleton dimension 0
I try to install the Inspire pack but I get an error.
Only one node is used from that pack, the MediaPipe mesh, in the last workflow. Let me know what error you are getting; maybe I can have a look at it.
Please make a video on how to install and use wildcards.
I will look into it, however I am not sure if an entire dedicated tutorial can be made for it.
Is there a way to change the text in the BLIP Analyze Image node or change that node to something that can change it? For instance, BLIP thinks there's a woman in my image when it is actually a man and the face detailer then gives me a woman's face instead of a fixed man's face.
When that happens, manually enter the prompt. Keep the BLIP output disconnected and manually enter the prompt via the CLIP Text Encode node.
Email me your workflow, hard to understand this way. Will check and reply there, mail @ controlaltai. com (without spaces)
Thanks! This worked. But now I have to fix the hand and could use your help!
@@controlaltai Thank you. Just sent!
@@controlaltai Let me know if you didn't receive it. Thanks!
Guys, why do I always get the same face? I change the prompt, like hair, eyes, age, etc., and I always get the same face but with other details like hair etc... I tried DreamShaper XL, DreamShaper 8, RealisticVision and Juggernaut XL.
What part of the tutorial are you referring to? In the part where it starts from the eye, the face is not selected via the MediaPipe node. The face has to be detected and masked as a whole for it to change.
@@controlaltai Sorry, I'm talking about face generation in general. I'm using AI-generated images as sources for learning from tutorials etc., but it's boring and frustrating when I always get the same "person" just with different hair or eyes :(. I started with AI image generation a few days ago, so it's a surprise for me that I can generate totally random things but this problem is so persistent. It's very important to me.
@@goliat2606 Ohh, ok. So basically the face will depend on the training set of the model. You can get some variation by defining race, age, country or a combination of them in the prompt. Say, for example, "American male, 30 years old." Try changing the seed as well. If you go to Civitai, you can find some useful LoRAs which help enhance the eyes, details, skin texture, add emotion, etc. Also try Realism Engine (XL) from Civitai for faces.
@@controlaltai Thanks. So there is no model for more random faces? I tried Realism Engine SDXL and the face was different than on other models, but again the same despite different seeds. I set batch_size to 10 and generated 10 images. Every one looks like the same person but in a different position/scenery. The prompt was very simple: "photo 25 years old woman". Another prompt generates another face, but again very similar across seeds. As I understand it, LoRAs can only change the image in a specific way? For example, for a specific hairstyle?
Try defining "Ethnicity and character origin country/place." See if that helps. You will have to keep altering the prompt for complete randomness. You can also try emotions (or use a LoRA for that).
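One way to act on the advice above is to randomize ethnicity, age and emotion in the prompt between generations. The attribute lists and function below are illustrative, not from the video:

```python
# Illustrative prompt randomizer for more face variety, per the advice
# above: vary ethnicity/origin, age, and emotion on each generation.
import random

ethnicities = ["Japanese", "Nigerian", "Brazilian", "Norwegian", "Indian"]
ages = ["teenage", "25 year old", "40 year old", "elderly"]
emotions = ["smiling", "serious", "laughing", "thoughtful"]

def random_face_prompt(rng=random):
    return (f"photo of a {rng.choice(ages)} {rng.choice(ethnicities)} "
            f"woman, {rng.choice(emotions)}")

print(random_face_prompt())
```

Feed a fresh prompt (and a fresh seed) into each queued generation instead of reusing one prompt with batch_size 10.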
Error occurred when executing BLIP Analyze Image
solved the problem by installing `transformers==4.25.1`.
I have an error when using the BLIP model. Is there a solution?
Try this: go to the custom nodes folder, in the WAS suite, click install.bat, start Comfy and see if BLIP works or not.
@@controlaltai Yup, solved. But... when I use the node containing face, mouth, left eye, right eye, etc. (like in your tutorial), I get an error because of the VRAM limitation. Mine is an RTX 3050 Ti with 4 GB VRAM.
Try using a non-SDXL checkpoint and reduce the max resolution in the face detailer from 768 to 512 or even lower.
Some nodes don't appear for me :\
I have mentioned it in the requirements. You need to have Comfy Manager installed, then install the nodes from there.
6 out of 10 usable? The resulting face was an ugly greyish color. I cannot find a way around these grey faces I always get with the face detailer.
The result will vary depending upon the original image, settings and checkpoint. Checkpoint matters the most. You can share the image and workflow via email or here, I can have a look at it. And probably if it works on my side I can share the result and settings.
The 6 out of 10 was with a completely deformed image purposely taken to test the workflow. Practical fixes are less distorted and technically should work.
@@controlaltai - I think my mistake was connecting the VAE of the base checkpoint to the face detailer instead of the VAE of the refiner. Seems to work now. VAEs seem to be different and important. Thanks for your answer!
@@Minotaurus007 This comment saved me from crying lol, thank you.
There is a reason SDXL is not used here... It just doesn't work... DEAD. LOL
CLIPSeg stopped working and now this video has become useless.
Okay, I just reloaded the workflow and double-checked; CLIPSeg is working for me. What issue do you have? If CLIPSeg does not work, the same can be done with Grounding DINO, but you should get CLIPSeg working. Elaborate on the error or issue.
It's all terribly complicated, a thousand times more complicated than in Automatic1111. I don't understand why you should do such unnecessarily complicated things.
Automatic1111, although good, cannot do some of the things ComfyUI can. There are clear advantages of ComfyUI over A1111. However, depending on the use case, go with what you find best suited for your work. Not everyone needs complicated fine-tuned controls, but there are companies and users who do, hence they use ComfyUI.
It's not that complicated if you're used to Comfy, which is much better.