Stable Diffusion Consistent Character Animation Technique - Tutorial
- Added May 31, 2024
- GitHub: github.com/tobias17/sd-anim-u...
Turntable LoRA: civitai.com/models/3036?model...
WARNING: if you have limited VRAM, use smaller pose images, like 384x384. Good results come from lining up 3 images next to one another, so make sure your graphics card can handle an image 3 times wider than your base pose image, plus a ControlNet model.
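To make the sizing in the warning concrete, here is a minimal sketch of the arithmetic. The function names and the 512x512 comparison baseline are assumptions for illustration, and pixel count is only a rough proxy for VRAM use, which also depends on the model and the ControlNet weights.

```python
def composite_size(pose_w, pose_h, panels=3):
    """Return (width, height) of a canvas holding `panels` pose images side by side."""
    return pose_w * panels, pose_h


def pixel_ratio(pose_w, pose_h, baseline=(512, 512), panels=3):
    """Pixel count of the composite relative to one built from `baseline`-sized poses.

    The 512x512 baseline is an assumed comparison point, not a value from the video.
    """
    w, h = composite_size(pose_w, pose_h, panels)
    bw, bh = composite_size(baseline[0], baseline[1], panels)
    return (w * h) / (bw * bh)


if __name__ == "__main__":
    for side in (384, 512):
        w, h = composite_size(side, side)
        ratio = pixel_ratio(side, side)
        print(f"{side}x{side} poses -> {w}x{h} canvas ({ratio:.2f}x the baseline pixels)")
```

With 384x384 poses the three-panel canvas is 1152x384, a little over half the pixels of the 512x512 case.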
In the video I go through my latest workflow for creating multiple images of the same character using Stable Diffusion without a character embedding. I use ControlNet to create a sequence of images that can be composed into a spritesheet for an animated character in a video game. I also provide tools to do local frame interpolation to increase your frame count for the animation.
0:00 - Intro
1:32 - Cloning the Git
2:50 - Setting up the Workspace
4:54 - Creating a Reference Image
11:13 - Iteration 0
20:20 - Iteration 1
23:44 - Iteration 2
26:34 - Extracting and Cleaning
27:53 - Frame Interpolation
31:51 - Sprite Sheet Generation
33:28 - Results
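The description mentions composing the generated frames into a spritesheet. The sketch below shows only the layout arithmetic for that step. It is not the repository's actual tool: the function name, the grid shape, and the frame size in the usage line are assumptions.

```python
import math


def sheet_layout(frame_count, frame_w, frame_h, columns):
    """Compute the sheet size and per-frame boxes for a row-major spritesheet.

    Returns ((sheet_w, sheet_h), boxes), where each box is
    (left, top, right, bottom) in pixels for the frame at that index.
    """
    if columns <= 0:
        raise ValueError("columns must be positive")
    rows = math.ceil(frame_count / columns)
    boxes = []
    for i in range(frame_count):
        col, row = i % columns, i // columns  # fill left to right, then top to bottom
        x, y = col * frame_w, row * frame_h
        boxes.append((x, y, x + frame_w, y + frame_h))
    return (columns * frame_w, rows * frame_h), boxes


if __name__ == "__main__":
    size, boxes = sheet_layout(frame_count=6, frame_w=384, frame_h=384, columns=4)
    print(size)      # overall sheet dimensions
    print(boxes[4])  # first frame of the second row
```

A game engine can then slice the sheet with the same frame size and column count, so both sides only need to agree on those two values.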
This is incredible. Thanks for sharing.
incredible for placeholder assets at least! thanks a lot very very helpful
looks really comfortably smooth
dude.....DUDE! This is wild.
Smart workflow
Soon there will be an AI that does this, and everyone will make games and movies. The next Renaissance is almost here, only the spark of AGI is left to ignite. Let us all embrace the singularity.
😂 amen
Artists tremble.
Renaissance? Dude, what is done without effort becomes worthless. You will have a sea of trash. On top of that, people will become useless.
L
Pffft you wish 😁. The steps he did can't be reproduced yet by AI. At least not on current gen dumb AI
Wow! This is amazing!
beautiful
I tried following your video. I did learn something. But the pose resources, even for the initial turntable, are not there.
It would sometimes generate more / less poses than expected for turntable if no controlnet is used. I guess that part can be found by running the example images on civitai turntable page
Great Tutorial!
Great work, this is awesome.
wild! amazing job
absolutely brilliant, wow
Whenever I use the images for the turntable poses, I get two back facing characters instead of one in three directions like in the turn table. Any idea why that might be happening?
bro ngl this stuff is so impressive but it looks so goofy like from a mobile ad xD
Thanks
In the end section did you use the latest Godot 4 or 3 ?
Great video, would love if you create more like this and integrate into Godot 4 workflow
He's using Godot 3.5.2, the latest one. You can see it near the bottom right
Godness bless U, mate.
Hey this is awesome. Can it be done having only one image as a starting point? I am looking for a way to change pose of my character that I have only front facing - I need it to turn to profile view, 3/4 view and back. Do you think it's possible?
It's most likely doable, might just have slightly more finicky results
Excellent.
1:12 "Kevin your hotpockets are read---" *gasp* *Closes door*
I've been doing, or attempting, something similar in Midjourney using prompts and image influences like sprite sheet, Muybridge photo set, etc. It's a bit janky and hit or miss but your end results look pretty darn good!
I will say though that Midjourney V5 is frighteningly good at retaining shape and form of a subject at different angles if there are multiple instances of the same subject in the same output image. (if you can get it to do that) There's definitely potential for photo realistic quality turnarounds or animation already there. We just need to learn how to extract and output the proper information.
Stop using midjourney and start using the webui. You have infinitely more control.
@@Sammysapphira What is Webui
Thanks for sharing!
Dude, there are ways to animate this faster and have more control. This is more cumbersome than standard animation routines.
yeah I used to watch a 3D animator just straight ahead his animations in Maya straight down the timeline. He didn't do any preplanning at all, and yet this dude was able to work on movie after movie like Frozen 2, etc. Just taking a quick glance through this video, I would have way more fun animating straight ahead myself.
ha ha exactly. normies do not get it. at least the nerds get an a for effort. using typing and clicking instead of drawing. same.
@@dallassegno just wait till you find out that all recent disney movies that were praised for their art style were created by clicking and typing...
10/ 10
pretty cool dude :-) share the GODOT game when it's baked!
I’d be curious to see this mixed with something like Cascadeur
0 brain cells detected
22:58 You can resize the inpaint window by hovering over its bottom right corner. Close the output image on the right if it's in the way.
nice tutorial sir. have a good day.
as much as I truly marvel at what AI is accomplishing, it really has an insane knack for taking any fun out of the creative process and replacing it with simple yet confusing logic. Funny how things work, also very soulless.
souls dont exist
@@moosesgalore apparently nihilism is at an all time high
@@untitled795 nihilism is a belief that meaning does not exist. i think believing in the soul is truly a burdensome and crippling delusion that will destroy one's ability to think clearly, and thus eradicate any desire to struggle and create a new world where suffering can be remedied. the world is very bad, and it has been since we arose from the darkness of nature... we must embrace our fullest potential and reject falsities and religious notions that do not elevate us. no matter what we must change. you, to me, are the nihilist, uninterested in the material world but in petty distractions.
@@moosesgalore "soulless" is just a word people who believe in such a thing use for "unnatural"
@@cyberdoge1857 it is unnatural to continually reinforce artificial religions created by rulers from ancient times to enslave their servants easier
The Lora linked is actually a textual inversion embedding... I ended up having to move it to the correct folder... :D
Damn THicc
Great tutorial!
Are you planning on "automating" this process? So that the game creates spritesheets for newly generated enemies?
I have experimented with the following concept:
Generating an image that has the character's limbs and head/torso detached and laid out flat (like some sort of t-pose). Currently using img2img for that. Then a script to remove the background and slice each limb up into a separate texture. Then I use those limbs to position them into a spritesheet, then I img2img the sprite sheet with high enough denoising so the "seams" between the limbs get "patched" but low enough so the contents of the image remain somewhat what they were. There's some post processing that's possible but I haven't bothered yet (I could blend some parts of the automatically-built sprite sheet to get more consistency in between frames, like the face etc). I had not interpolated the frames like you did, which would probably make it look even better. The only issue I had with my method is that the movement was very "stocky" like Diablo 2 sprite sheets or something.
I wonder when this will come to automatic1111 in a more user-friendly form. (not that complicated but a lot of settings)
That is an interesting technique, I will definitely investigate. I found in the past that it was difficult to automate things without either 1) having things look too similar or 2) having a high probability of deformities. But all of that investigation was before controlnet, so I definitely want to give it another go.
I think a perfect automation middle ground would be supplying an algorithm a turntable and it produces a spritesheet from that.
You have a script that can slice each limb up automatically? I've been trying to find one/create one but no luck.
@@matteogarza2211 wish there was a bit automation on the move pictures to this and that. it should be able to allow contextual information to auto save picked/clicked pictures. it is easier/faster and also more intuitive that way. at least i think it would.
I personally use Unity a lot and I know it has a sprite rigging system (haven't used that myself, though). I imagine there could be a similar workflow using something like that on the engine side and instead of generating frame interpolation sprites one could try to figure out the 2D rigging information from the stable diffusion generation process. (the control net poses of course are a start, maybe stable diff depth output could help as well, maybe one needs an extra workflow step for outpainting "separated" limbs..) Something like that should result in smooth interpolation and it almost sounds like it is possible, too. ;-)
Good work, but when I use ControlNet, my inpaint mask does not work: the whole picture changes, as if it were plain img2img. When I turn ControlNet off, the mask works. It puzzles me... Could you please tell me how to fix it if you have an idea?
Amazing
Hello, this is very interesting. If you want a specific face, generate just the face, copy the seed, and then paste the seed when you draw the face. As I understand it, when you reuse the seed from the whole character, the individual details (face, arms, legs, etc.) can still come out different, especially in an image that is not large. But a close-up of just the face or muzzle :), for example, will always generate the same thing. The only drawback is that you cannot (or can you?) throw in 10-100 frames from a video in which someone is walking, running, lying, somersaulting, and so on, and immediately pull out a folder of poses. A depth map can be created, and some masks can be created from many frames at once. But for pose or "control_canny" together with the "canny" model, I don't know whether you can process an entire folder at once (batch); I haven't tried it yet. It would be super. And add to this "interrogate derpibooru", which automatically generates cues for the image.
The main issue with that is for the face, you have to normalize a crop to the center of the face. Things like img2img on video work fine when you are going from human to human with similar body shapes, but I am exploring the creation of animations for video games, where you have a wide variety of shapes and animations, and for which you cannot get good source video.
@@tobiasfischer1879 couldn't you simply use an inpaint mask?
@@ElieSanhDucos0 the issue is that even with an inpaint mask that's pretty narrow, a denoising strength that's too high will cause deformities. It will have limbs the wrong way and then the mask just cuts them off. If you have too low of a denoising strength then it does not change enough. I am investigating some automatic techniques to mitigate this, so hopefully I can make a decent automatic process.
Please can you make a full in-depth tutorial about how to install from the beginning? I followed the video but it's not working, please help
Can you share the walk, stand and turntable pose pngs?
Thanks
No matter what I try, I can't manage to get a proper result. It might be because your link doesn't send me to a LoRA but to a textual inversion, but nothing I do works. Using the textual inversion with the keyword 'charturnerv2' alone doesn't work. I managed to get 3 poses for my turntable by guiding the AI, but when I try to get a 4th, it has nothing to do with the first 3. Downloading and using didn't work either: it tries to add 3 more poses for my character, and while it feels like it COULD give me something related to my poses, the problem is that it completely ignores my ControlNet when doing so
Try playing with the hyper parameters like denoising, only masked vs whole picture, cfg scale. For me I needed to use different settings than the vid. Guess it depends on your prompt, etc.
@@wackcrewpeeps I kinda managed by guiding it a bit more with a bit of prepaint on gimp. That being said with the latest update of controlNet I'm not sure we'll have to keep doing that I'll have to do some testing
@@alias9932 yeah having to do this for each frame is a bit tedious but just give it a month or two
You skipped how to create the poses that are in your pose folder, is there a link, or is there a way to generate these?
He skipped it because that's the cheat part. He actually had to animate those poses, or take a running animation video, strip it down to frames, and openpose each frame. So if you want to do this for a game, you need to animate a dummy anyway, or rip someone else's animation movement
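As a hedged sketch of the "strip it down to frames" step described in the reply above: before running pose detection, you would usually keep only a handful of evenly spaced frames from one cycle of the source animation. The function name and the frame counts are assumptions, and actually decoding the video and running OpenPose is left out.

```python
def sample_frame_indices(total_frames, wanted):
    """Pick `wanted` evenly spaced frame indices from a clip of `total_frames`.

    The last frame is deliberately not forced into the result: for a looping
    walk or run cycle it often duplicates the first frame.
    """
    if total_frames <= 0 or wanted <= 0:
        return []
    if wanted >= total_frames:
        return list(range(total_frames))
    step = total_frames / wanted
    return [int(i * step) for i in range(wanted)]


if __name__ == "__main__":
    # e.g. one 30-frame run cycle reduced to 8 key poses
    print(sample_frame_indices(30, 8))
```

Each selected frame would then go through a pose detector to produce one pose image per sprite frame.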
you familiar with how to train models by chance? ive been looking for someone who I can pay to help me figure out how to train a music model on songs I own or have made and get it to generate stuff from it
I wonder if Corridor Crew helped contribute greatly to this.
this is 10 months old and it seems like you're still the only one with a legit technique. i came to jump into RPG maker because i perceived that spritesheets would be as simple as uploading a few pictures and thinking it would spit out a sweet sprite sheet however i programmed it. but no! you still have to sit there and process each frame manually. using a script that spits out all these directories and images. not really there yet, however still a massive time save compared to hand animating frame-by-frame. i wonder if you've changed your methodology 10 months later.
How do you create the character sheet?
Could you share the link to the poses(files) you used for the idle, walking and other animations? @tobiasfischer1879
Bump
Anyone can suggest a good starting point?
Do you think it could be used for 3D creations?
I was thinking about something along those lines... If you create a turntable, some existing software could potentially turn that into a rough 3d model. The model itself would not look the best, but you could rig it to a skeleton, animate it, render out frames of the animation, and then run the render through SD to clean up the "rough" 3d model.
@@tobiasfischer1879 yes exactly
God I wish I had an excellent computer, these AI tools look really helpful. Well, at least I'll save this video for later when I get a good computer whenever. 4 GB VRAM is *really* not cutting it OOF
Nice, I think pose fixes can be completed in photoshop with SD plugin :)
Very good work, but I got lost in the process
In the amount of time spent on guessing, trial and error, and testing, you could create the same thing without any issue.
I didn't know stable diffusion was a hood irony user
to be honest im pretty sure i could get a better result by breaking the image apart into layers and animating them by hand in the same time.
This isn't impressive to me for the result but for the potential.
this is only going to get better and better. cant wait to see what we will have in a year or so
Are both models trained with the same technique?
I am not using a custom model, these are both generated from a generic model I got online. The openpose with turntable technique allows me to generate consistent characters without having to train a custom model.
@@tobiasfischer1879 Sorry, I meant to say "created" (was talking about the character models, not the stable diffusion models, I've used the wrong word by mistake).
The consistency and quality is incredible. You got an amazing finding in here
@@fdimb yes, both were made with the same technique
still a lil bit weird animation, i'll wait until its perfect.
This looks like these mobile ads lmao funni
This did not work for me at all, I keep getting a random character.
Hi dude, how are you? I can't install Stable Diffusion on my PC, it gives an error about torch not accepting the GPU... Do you know how to fix this? Or is there another alternative to solve this? Thank you very much in advance and success with your channel!
This video isn't about installing stable diffusion, but there's a very good one about it in Aitrepreneur's youtube channel
2 months later... DragGAN
What's your PC specs?
I'm running an RTX 3080 (with an AMD 5950x CPU)
@@tobiasfischer1879 Thanks, I have a 2660 Super and I'm trying to get into AI. I think I'll try for a 4060gpu before diving too deep.
@@emanuelrouse7076 for things like stable diffusion you can get a lot done with cards like a 2060, most of the things I'm doing are overkill for my card. The best way to get in is to just start creating and experimenting
@@tobiasfischer1879 didn’t know that! I guess I’ll dive in and see where it takes me. 🙏 thanks.
Cool
Things move so fucking fast.
Does it save time? guess no. but great try!!!
This is a Huge advance in AI art.
But.. right now it seems much more difficult than just generating concept art and doing the sprites by hand..
That requires you to either be artistic enough to make the concept art and sprites by hand, or rich enough to hire an artist to do it for you.
@@IceMetalPunk i mean, generate a concept using AI and retouch, modify to make a spritesheet.
Of course one will need some Photoshop skill.. but not much.
@@Adnegoo Ah, yes. Just "modify to make a spritesheet". So simple, requires almost no artistic talent at all, even a child could do it! /s Some of us have trouble with any art beyond stick figure people, my friend.
Forms and I do *not* get along.
@@IceMetalPunk but the technique used in this video also requires some use of photoshop. The point is that someone who can reproduce this workflow can do it manually way faster.
@@Adnegoo Using Photoshop is one thing; redrawing concept art in sprite form is something entirely different that requires talent and skills that aren't required for this AI-based approach. It's not about the speed, it's about the difficulty.
he never hits you...
This is amazing technological advancement. but for real application it has "made by AI" written all over the place. still a long way to go till that part is really unrecognizable.
Looks like this could be used for games 😃
sounds like it takes as long as it would without ai.
what good is it when everything generated is not your property? it's all watermarked, you'll just end up getting sued to death if you use it. [Edit: I stand corrected. leaving this for educational purpose]
@@AlexUsername it's not at all the same. the company Stability AI created stable diffusion, just like the company OpenAI created chatGPT. it's their product. if you generate something with it it's still their intellectual property, hence the invisible watermark. a dictionary does not generate art or other content, it just provides a vocabulary. your brain does the work when you write a novel, so it's your intellectual property. anyways, good luck.
@@user-dx1no8ht2c that is absolute bullshit check your facts dude.
You're talking bs. SD is free for commercial use. The copyright is in their license model, it doesn't matter if it's watermarked, they can't sue you.
@@rbscli can you show me some proof of that? where does it say so?
where is the reply from @baza0, why was it removed? it was quite intriguing
laughable results.
For now... give it a few months
Hear that? The voices of thousands of artists trembling at AI.
While they attempt to restrict us through the slow Court system. By the time judges realized what happened. Either AGI prototype will be here or AI animation is a thing.
my god that looks terrible
Bruhh I thought this is a serious animation study not a futuristic AI bs 💀
I would love to make my own little CRPG game with models and animations like that...
SEND ME SOME UC TOO
wow dude.. Are you showing your workflow for this? I might have to subscribe to you..