Stable diffusion face refiner online reddit

Suppose we want a bar scene from Dungeons and Dragons; we might prompt for something like. In my experience, LMS has similar quality to 2M (both Karras and non, compared to their 2M versions), but LMS samplers are more artifact-prone, esp. Same with SDXL, you can use any two SDXL models as the base Hopefully Adetailer gets updated soon so you can choose the hands inpaint model instead of inpaint global harmonious. At that moment, I was able to just download a zip, type something in the webui, and then click generate. Kohya Deepshrink is based on Scalecrafter research. The ControlNet SoftEdge is used to preserve the elements and shape, you can also use Lineart) 3) Setup Animate Diff Refiner Hey, a lot of thanks for this! I had a pretty good face upscaling routine going for 1. Then I do multiple img2img passes with a higher resolution, more Not too sure exactly how to do all this; others who are up to date will know better. I'll then be wondering why the image was so bad 😂. This brings back memories of the first time that I used Stable Diffusion myself. 2) and used the following negative - Negative prompt: blurry, low quality, worst quality, low resolution, artifacts, oversaturated, text, watermark, logo, signature, out SDXL 1. So the website shows all the images SD was trained on and more. Anyway, while I was writing this post, there has been a new update and it now looks like this: Here we go. 78. 3) Jul 22, 2023 · After Detailer (adetailer) is a Stable Diffusion Automatic1111 web-UI extension that automates inpainting and more. The prompts: (simple background:1. The refiner was trained in tandem with the base, so it will not work without it.
I've tried this article, but the result does not give me what I wanted. All DreamBooth models require a special keyword to condition the image generation, but traditional fine-tuning (continuing training with a narrow dataset) doesn't. 85, although producing some weird paws on some of the steps. 5 can be seen in the style of the obvious changes in 0. Trained information is represented in an alternate form (using CLIP for text and VAE for image embeddings). I have been using Automatic1111, don't know much about ComfyUI. 7 in the Refiner Upscale to give a little room in the image to add details. Then play with the refiner steps and strength (30/50 Activate the Face Swapper via the auxiliary switch in the Functions section of the workflow. 0, VAE hash example of workflow: Prompt: full body photo of beautiful age 18 girl, elf ears, blonde hair, freckles, sexy, beautiful, BREAK hiding behind a tree in the forest /r/StableDiffusion is back open after the protest of Reddit killing open API access, which will bankrupt app developers, hamper moderation, and exclude blind users from the site. text_l & refiner: "(pale skin:1. But if I run the base model (creating some images with it) without activating that extension, or simply forget to select the refiner model, and activate it LATER, it very likely runs out of memory (OOM) when generating images. I guess an important thing for the quality (when you mention without finetunes) is that this time, the base model is finetuned by Lykon, the number 1 model creator on Civitai. img2img API with inpainting. 0 base model and HiresFix x2. But they have different CFG values (because of the lightning). Question - Help.
Technical details regarding Stable Diffusion samplers, confirmed by Katherine: - DDIM and PLMS are originally from the Latent Diffusion repo. DDIM was implemented by the CompVis group and was the default (slightly different update rule than the samplers below, eqn 15 in DDIM paper is the update rule vs solving eqn 14's ODE directly) Tried a bunch of images, and none of them had the hands detected…. If you use ComfyUI you can instead use the Ksampler Regarding the "switching" there's a problem right now with the 1. It depends on: what model you are using for the refiner (hint, you don't HAVE to use Stability's refiner model, you can use any model that is in the same family as the base generation model - so for example a SD1. Structured Stable Diffusion courses. When SD tries to generate an image that is too different in size and aspect ratio from what it is trained on, you end up getting elongated or multiple features such as two heads, two torsos, and 4 legs. Example: the base model is in a photo-realistic style, and the model used in the refiner is in an anime style. You'll need two checkpoints "Real Pony" checkpoint (there are three, you want the one with the most upvotes on Civitai to start): this will be the primary checkpoint. So they put some stuff with the . I created this ComfyUI workflow to use the new SDXL Refiner with old models: json here. 5. 0 Refine. Then install the SDXL Demo extension . 7 in the Denoise for Best results. Is there currently a way to adapt a ComfyUI workflow to avoid the refiner touching any human faces? It's removing details that I want kept there: it makes all faces smooth, de-ages them (I don't want that!) and evens them out, which deletes all of the characters' personalities, age, and uniqueness as a result.
As per the SD super stage event, the refiner is an optional second pass that can improve some generations. It's called Family pack, get with the times old man! Dude is corn. Not sure if it’s the quality of the image or something but the colors become horrible and the art style becomes much more stylized instead of realistic, even when I try higher resolution images. It is not a reasonable approximation, it is the actual data it was trained on. The soft inpainting feature is also handy, it tends to blend the seams very well on the inpainted area. If you have a powerful GPU, 32GB of RAM, and plenty of disk space - install ComfyUI - snag the workflow - just an image that looks like this one that was made with Comfy - drop it in the UI - and write your prompt - but the setup is a bit involved - and things don't always go smoothly - you will need the toon model as well - Civitai/HuggingFace I can't get Outpainting to work in Stable Diffusion. Do not use the high res fix section (can select none, 0 steps in the high res section), go to the refiner section instead that will be new with all your other extensions (like ControlNet or whatever other extensions you Using refiner with different settings. In this post, you will learn how it works, how to use it, and some common use cases. These sample images were created locally using Automatic1111's web UI, but you can also achieve similar results by entering prompts one at a time into your distribution/website of choice. 45 denoise it fails to actually refine it. No Automatic1111 or ComfyUI node as of yet. We'll see about the actual quality, flexibility, prompt adherence and optimization, if/when SD3 comes out fully. Workflow Overview: txt2Img API. After an entire weekend reviewing the material, I think (I hope!) I got the implementation right: As the title says, I included ControlNet XL OpenPose and FaceDefiner models. a close up of a woman with a butterfly on her head, a photorealistic painting, by Anna Dittmann Flexibility.
I haven't played with DreamBooth myself so just going by other people's experience. As you can see the difference is an improvement but the image retains nearly everything. E. 6. fix while using the refiner you will see a huge difference. Step two - upscale: Change the model from the SDXL base to the refiner and process the raw picture in img2img using the Ultimate SD upscale extension with the following settings: (same prompt) Steps: 30, Sampler: DPM++ 2M Karras, CFG scale: 7, Seed: 1799556987, Size: 2304x1792, Model hash: 7440042bbd, Model: sd_xl_refiner_1. 5 models and LoRA are so fine tuned that while SDXL gives me a much wider range of control, getting the 'perfect' finish seems to only be reliable with 2) Set Refiner Upscale Value and Denoise value. Color problems when using face reference. There is no such thing as an SD 1. Imagine if you can do your model photoshoot with your new watch, skin care product, or line of overpriced handbags in a studio, and seamlessly put the model in the streets of Milan, on the beaches of the Maldives, or wherever else Instagram and TikTok say your target demo wants UNet does precisely this, working on different levels of detail as it downscales and upscales. Automatic1111 can’t use the refiner correctly. 0 Base vs Base+refiner comparison using different Samplers. Unrealistic body standards having a 24 pack. 4 - 0. 70 Prompt Comparison: SD3 API vs SD3 Medium. I did run the prompts on the SD3 API again to make sure they haven't changed it, and the results were the same, so it's still good. I downscale my SD pictures before using them in dalle-2, then do img2img again and work with cfg and init strength till they just retouch the dalle2 hands. 9 safetensors file. Looking for a tutorial to train your own face using Automatic1111. All prompts share the same seed. The first is PixArt Sigma with no refinement and the second is after a .
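The "Step two - upscale" settings quoted above can be written down as a request body for an img2img call. The following is only a rough sketch: the field names are the ones I understand the AUTOMATIC1111 web UI to accept on its `/sdapi/v1/img2img` endpoint, the function name `build_img2img_payload` is made up for illustration, and the denoising strength is NOT given in the quoted settings, so it is left as a required argument. It also describes a plain img2img pass, not the Ultimate SD upscale script the comment actually used. Nothing is sent over the network here.

```python
import base64
import json


def build_img2img_payload(image_bytes, prompt, denoising_strength):
    """Build a JSON-serialisable img2img request body (hypothetical helper).

    Steps, sampler, CFG scale, seed and size are copied from the quoted
    "Step two" settings. denoising_strength is an assumption supplied by
    the caller, because the quoted settings do not state it.
    """
    return {
        # The web UI API expects init images as base64-encoded strings.
        "init_images": [base64.b64encode(image_bytes).decode("ascii")],
        "prompt": prompt,
        "steps": 30,
        "sampler_name": "DPM++ 2M Karras",
        "cfg_scale": 7,
        "seed": 1799556987,
        "width": 2304,
        "height": 1792,
        "denoising_strength": denoising_strength,
    }


if __name__ == "__main__":
    # Placeholder bytes stand in for a real PNG file read from disk.
    payload = build_img2img_payload(b"not-a-real-image", "same prompt", 0.3)
    print(json.dumps({k: v for k, v in payload.items() if k != "init_images"}, indent=2))
```

If you do post this to a local web UI started with its API enabled, remember that the refiner checkpoint still has to be selected separately; the payload above does not switch models.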
We all know SD web UI and ComfyUI - those are great tools for people who want to make a deep dive into details, customize workflows, use advanced extensions, and so on. If you don't use hires. 6 or too many steps and it becomes a more fully SD1. In summary, it's crucial to make valid comparisons when evaluating the SDXL with and without the refiner. It detects hands greater than 60x60 pixels in a 512x512 image, fits a mesh model and then generates SDXL vs SDXL Refiner - Img2Img Denoising Plot. 8. A person face changes after On the other hand, in my experience the SD3 renders don't mix very well with refiners, so what you obtain is almost a dead end. I also automated the split of the diffusion steps between the Base and the Actually the normal XL BASE model is better than the refiner in some points (face for instance) but I think that the refiner can bring some interesting details. For prompt use something like face, eye color, hair color, hair style, expression. 9 workflow, the one in the Olivio Sarikas video works just fine) just replace the models with 1. It is the curve of rolling hills. I fix all my hands in dalle-2. Basically a bunch of junk so that I can perfect the image. SDXL vs DreamshaperXL Alpha, +/- Refiner. Uncharacteristically, it's not as tidy as I'd like, mainly due to a challenge I have with passing the checkpoint/model name through reroute nodes. Steps: (some of the settings I used you can see in the slides) Generate first pass with txt2img The checkboxes (face fix, hires fix) disappeared. fix @Dr__Macabre. Use SDXL Refiner with old models. It will just produce distorted, incoherent images. First of all, sorry if this doesn't make sense, I'm French so English isn't my native language and I'm self-taught when it comes to English. Even the Comfy workflows aren’t necessarily ideal, but they’re at least closer. face recognition API. next version as it should have the newest diffusers and should be LoRA compatible for the first time.
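One of the comments above mentions automating the split of the diffusion steps between the base and the refiner. A minimal sketch of that arithmetic, assuming the split is expressed as a "switch at" fraction of the total steps (the function name `split_steps` and that parameterisation are my own, not taken from any of the workflows mentioned):

```python
def split_steps(total_steps, switch_at):
    """Return (base_steps, refiner_steps) for a base + refiner run.

    switch_at is the fraction of the schedule handled by the base model,
    e.g. 0.8 means the base does the first 80% of the steps and the
    refiner finishes the rest. This parameterisation is an assumption.
    """
    if total_steps < 1:
        raise ValueError("total_steps must be at least 1")
    if not 0.0 <= switch_at <= 1.0:
        raise ValueError("switch_at must be between 0 and 1")
    base_steps = round(total_steps * switch_at)
    return base_steps, total_steps - base_steps


if __name__ == "__main__":
    base, refiner = split_steps(40, 0.8)
    print(f"base: steps 0-{base}, refiner: steps {base}-{base + refiner}")
```

In ComfyUI the two numbers would typically end up as the end step of the base sampler and the start step of the refiner sampler, so that the refiner continues the same noise schedule rather than starting a fresh img2img pass.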
realvisXL is great and currently probably better than Juggernaut; however, it is not a Pony model so it can't do what Pony can (but can do a few things that Pony struggles with, like working with ControlNet). 5, we're starting small and I'll take you along the entire journey. AP Workflow v5. fix. You just can't change the conditioning mask strength like you can with a proper inpainting model, but most people don't even know what that is. And the SDE++ 2M versions are also fast per step. Like there's Embeddings, which there are quite a few Me too! realvisXL is awesome at photorealism. This simple thing also made that friend of mine a fan of Stable Diffusion. ComfyUI with SDXL (Base+Refiner) + ControlNet XL OpenPose + FaceDefiner (2x) ComfyUI is hard. This seemed to add more detail all the way up to 0. They also have an SDXL LoRA that kinda adds some contrast. I will first try out the newest sd. I've searched but found nothing that seems to use Automatic1111. For example, if you wanted a great image of a person in a firefighter outfit, you could add a specific extra ‘embedding’ model trained on images of firefighter outfits. Can take a while, on average I need two or three dalle2 inpainting prompts to get them fixed. 0 model. Ah also, death to the false emperor, blood for the blood god. I am going to experiment a bit more but if it doesn't work out, I may just use PixArt for the global compositional coherence latent base for SDXL and SD 1. High detail RAW color Photo of a strong man, hands in the face, urban city in the background, (full body view:1. [Cross-Post] I feel the original one is better, high denoise refiner destroys the lighting consistency, especially the hands become flat and even changed the skin color in the third image. Yes. 9 vae in there. 9 (just search on YouTube for sdxl 0. Comparison. For today's tutorial I will be using Stable Diffusion XL (SDXL) with the 0.
This series of images is made to see if the color depth in SD3 can be carried over in the refiner pass. I was surprised by how nicely the SDXL Refiner can work even with Dreamshaper as long as you keep the steps really low. Switch the timing point from 0. What they have is a marketable product. Hi. 9 vae, along with the refiner model. 5 secs More than 0. I don't understand what you need. 9 refiner node. This one feels like it starts to have problems before the effect can Is there an explanation for how to use the refiner in ComfyUI? You can just use someone else's workflow of 0. 5 version, losing most of the XL elements. Medium is using the base workflow on the huggingface Using Refiner -> Base or just CrystalClearXL or other model from the start -> VAEDecode->VAEEncode (SD 1. DreamshaperXL is really new so this is just for fun. Pony is an anime and 2D model. Hires fix is still there, you just need to click to expand but face restore has indeed been removed from the main page. 5 model in highresfix with denoise set in the . 9 and Stable Diffusion 1. Ideally the refiner should be applied at the generation phase, not the upscaling phase. I had the same idea of retraining it with the refiner model and then loading the LoRA for the refiner model with the refiner-trained-lora. I will see that when I click on the wrong model and use it instead of the base. 3~0. But if you use both together it will make very little difference. Describe the character and add to the end of the prompt: illustration by (Studio ghibli style, Art by Hayao Miyazaki:1. But it's reasonably clean to be used as a learning tool, which is and will It's amazing - I can get 1024x1024 SDXL images in ~40 seconds at 40 iterations Euler a with base/refiner with the medvram-sdxl flag enabled now. Stable Diffusion 3 Medium is Stability AI’s most advanced text-to-image open model yet, comprising two billion parameters. It is suitably sized to become the next standard in text-to-image models.
I have tried uninstalling stable diffusion (deleting Taking a good image with a poor face, then cropping into the face at an enlarged resolution of its own, generating a new face with more detail, then using an image editor to layer the new face on the old photo and using img2img again to combine them is a very common and powerful practice. 2. A style can be slightly changed in the refining step, but a concept that doesn't exist in the standard dataset is usually lost or turned into another thing (I. , that is more conspicuous than the number of fingers RTX 3060 12GB VRAM, and 32GB system RAM here. The refiner should definitely NOT be used as the starting point model for text2img. The VAE that was originally baked into SDXL created visual artifacts when it tried to do its "invisible watermarking". I see a lot of people complaining about the new hires. From L to R, this is SDXL Base -- SDXL + Refiner -- Dreamshaper -- Dreamshaper + SDXL Refiner. 74 votes, 16 comments. AP Workflow 5. Experimental Functions. (as mentioned Used Automatic1111, SDXL 1. The chart above evaluates user preference for SDXL (with and without refinement) over SDXL 0. It adds detail and cleans up artifacts. I'm using Automatic1111 and I run the initial prompt with SDXL but the LoRA I made with sd1. 5 for final detail refinement seems to give me the ultimate control. The main reason why I chose to do this is a selfish one. 5 model as the "refiner"). fix with SDXL is broken. Last, I also performed the same test with a resize by scale of 2: SDXL vs SDXL Refiner - 2x Img2Img Denoising Plot. In my opinion the renders of PixArt tend to be more interesting and beautiful than the SD3 renders, but they need a second pass with a refiner. 30ish range and it fits her face LoRA to the image without Very nice. Stable Diffusion creates images out of pure noise. 0 and upscalers. Yep! I've tried and refiner degrades (or changes) the results.
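The crop, regenerate and paste-back face workflow described above starts with choosing the crop region and the enlargement factor. A small sketch of that bookkeeping, assuming you already have a face bounding box from some detector; the function names, the 25% padding and the 512 pixel target are illustrative choices of mine, not values from the comment:

```python
def padded_crop_box(face_box, image_size, padding=0.25):
    """Grow a (left, top, right, bottom) face box by `padding` per side.

    The result is clamped to the image bounds so it can be used directly
    as a crop region. The padding gives img2img some context around the
    face so the regenerated crop blends back in more easily.
    """
    left, top, right, bottom = face_box
    width, height = image_size
    pad_x = int((right - left) * padding)
    pad_y = int((bottom - top) * padding)
    return (
        max(0, left - pad_x),
        max(0, top - pad_y),
        min(width, right + pad_x),
        min(height, bottom + pad_y),
    )


def upscale_factor(crop_box, target_side=512):
    """Scale factor that makes the longer side of the crop `target_side` px."""
    left, top, right, bottom = crop_box
    return target_side / max(right - left, bottom - top)


if __name__ == "__main__":
    box = padded_crop_box((100, 100, 200, 200), (512, 512))
    print(box, round(upscale_factor(box), 2))
```

After regenerating the face at the enlarged size, the crop is scaled back down by the same factor and pasted at the top-left corner of the crop box before the final low-denoise img2img pass.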
Install the SDXL auto1111 branch and get both models from Stability AI (base and refiner). It's often not required. Edit: I realized that the workflow loads just fine, but the prompts are sometimes not as expected. This simple thing made me a fan of Stable Diffusion. Stable Diffusion is trained on a subset of those images, around 600 million of those, supposedly. Thanks tons! Accidentally used the refiner model to generate images. Misconfiguring nodes can lead to erroneous conclusions, and it's essential to understand the correct settings for a fair assessment. If the problem still persists I will do the refiner-retraining. 0 includes the following experimental functions: Free Lunch (v1 and v2) AI researchers have discovered an optimization for Stable Diffusion models that improves the quality of the generated images. But stable diffusion is faster and I can load the workflow I like. So, I'm mostly getting really good results in automatic1111 Yes, it's human faces only, probably best prompting on your dog's photo using img2img or ControlNet. The model doesn’t seem to work for anime images…. It'll be perfect if it includes upscale too (though I can upscale it in an extra step in the Extras tab of Automatic1111). Whenever you generate images that have a lot of detail and different topics in them, SD struggles to not mix those details into every "space" it's filling in running through the denoising step. Forcing LoRA weights higher breaks the ability for generalising pose, costume, colors, settings etc. Here are the solutions: ***Basically, install the refiner extension (sd-webui-refiner). This is not my code, I'm simply posting it. Generate your images through Automatic1111 as always and then go to the SDXL Demo extension tab, turn on the 'Refine' checkbox and drag your image onto the square. SD 1. 0 of my AP Workflow for ComfyUI. In any case, we could compare the picture obtained with the correct workflow and the refiner.
We need laws that mark images like this as AI generated so we don't get low self-esteem. Opening the image in stable-diffusion-webui's PNG-info I can see that there are indeed two different sets of prompts in that file and for some reason the wrong one is being chosen. that extension really helps. 0 Base, moved it to img2img, removed the LoRA and changed the checkpoint to SDXL 1. If you install locally you can add in your own additions to the main model. 5 refiner node. This accuracy allows much more to be done to get the perfect image directly from text, even before using the more advanced features or fine-tuning that Stable Diffusion is famous for. 5 in A1111, but now with SDXL in Comfy I'm struggling to get good results by simply sending an upscaled output to a new pair of base+refiner samplers Code Posted for Hand Refiner. Legal and PR issues. Karras in general are superb at low step counts (though LMS Karras gets lots of artifacts at high step counts, so never do that). Normal Hires. 3), detailed face, So in order to get some answers I'm comparing SDXL1. Key Takeaways. I really liked Stable Cascade mixed with a 1. I just released version 4. Just like Juggernaut started with Stable Diffusion 1. Create a Load Checkpoint node, in that node select the sd_xl_refiner_0. Interesting, gonna try this tomorrow.
1), crowded, alluring eyes, detailed skin, highly detailed, hyperdetailed, intricate, soft lighting, deep focus, photographed on a Canon 5D, 24mm macro lens, F/8 aperture, film still [after]{zoom_enhance mask="face" replacement 1. Choose two different styles of models, one as a base, one as a refinement of the model. You can inpaint with SDXL like you can with any model. 5 VAE) -> SD 1. 5 of my wife's face works much better than the ones I've made with SDXL so I enabled independent prompting (for highresfix and refiner) and use the 1. Thanks. LewdGarlic. I had to use clip interrogator on Replicate because it gives me errors when using it locally. 01 ~ 1, each increase of 0. I'd like to share Fooocus-MRE (MoonRide Edition), my variant of the original Fooocus (developed by lllyasviel), new UI for SDXL models. 5 model as your base model, and a second SD1. It saves you time and is great for quickly fixing common issues like garbled faces. I also automated the split of the diffusion steps between the Base and the I thought my gaming would be at least a lot better than my 2070 super 8gb. 0. 40 denoise using Zavy's excellent ZavyChromaXL v7. Use img2img to refine details. To get it back, go to settings --> user interface and add it back. If you've seen this post before, you know what to expect. It's not, it's just barely better. 0 and some of the current available custom models on Civitai with and without the refiner. Thanks for this - newbs coming from A1111 can be overwhelmed by the ComfyUI when trying to locate nodes. 5 LCM refiner sampler pass. 1 sdxl model and 1 sd1. 2), (isometric 3d art of floating rock citadel:1), cobblestone, flowers, verdant, stone, moss, fish pool, (waterfall:1. These comparisons are useless without knowing your workflow.
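Several of the prompts quoted above use the `(text:weight)` emphasis syntax from the Automatic1111 web UI, where the number scales how strongly that phrase is attended to. As a rough illustration of how such a prompt breaks down, here is a small parser sketch. It only handles flat, non-nested `(text:number)` groups, the name `parse_weights` is my own, and the `1.2` weight in the sample prompt is made up for the example rather than taken from the truncated prompts above.

```python
import re

# Matches "(some text:1.2)" with no nested parentheses or extra colons.
WEIGHTED = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def parse_weights(prompt):
    """Return a list of (phrase, weight) pairs found in the prompt.

    Unweighted phrases are ignored. This is a simplification and not the
    web UI's own parser, which also handles nesting and escapes.
    """
    return [
        (match.group(1).strip(), float(match.group(2)))
        for match in WEIGHTED.finditer(prompt)
    ]


if __name__ == "__main__":
    sample = "(isometric 3d art of floating rock citadel:1), cobblestone, (detailed face:1.2)"
    for phrase, weight in parse_weights(sample):
        print(f"{weight:>4} x {phrase}")
```

A helper like this is mostly useful for checking a long prompt for typos, for example a weight that was meant to be 1.2 but was pasted in truncated.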
Second, you'll need the photo style SDXL checkpoint of your preference. Stable Diffusion looks too complicated”. They mostly use Python to train Stable Diffusion. Try: rear view shot or just rear shot. 0. Note: I used a 4x upscaling model which produces a 2048x2048, using a 2x model should get better times, probably with the same effect. Use 0. SD is a big thing with a lot going on, don't be afraid The truth about hires. Make sure you have: Settings -> Stable Diffusion -> "Maximum number of checkpoints loaded at the same time" set to 2 so it won't unload and reload the model for each pass. After some testing I think the degradation is more noticeable with concepts than styles. and I have to close the terminal and restart A1111 again to clear that OOM effect. I came across the "Refiner extension" in the comments here described as "the correct way to use refiner with SDXL" but I am getting the exact same image between checking it on and off and generating the same image seed a few times as a test. Consistent character faces, designs, outfits, and the like are very difficult for Stable Diffusion, and those are open problems. Refiner extension not doing anything. Edit: RTX 3080 10gb example with a shitty prompt just for demonstration purposes: Without --medvram-sdxl enabled, base SDXL + refiner took 5 mins 6. Below 0. Basically it just creates a 512x512 as usual, then upscales it, then feeds it to the refiner. I've found very good results doing 15-20 steps with SDXL which produces a somewhat rough image, then 20 steps at 0. However, this also means that the beginning might be a bit rough ;) NSFW (Nude for example) is possible, but it's not yet recommended and can be prone to errors. The negative prompt sounds like frustration to me. A TON of the budget of commercial shoots is location-based.
No surprises, Medium is much worse. But I'm not sure what I'm doing wrong, in the ControlNet area I find the hand depth model and can use it, I would also like to use it in the adetailer (as described in Git) but can't find or select the depth model (control_v11f1p_sd15_depth) there. Can I use a different CFG value for the refiner in Comfy? I'm currently using Forge, should I switch to Comfy or StableSwarm? 2. It is the delicate interplay of shadow and light. Here is an example of two images. If you aren't using that GUI, the best option is to bring it into GIMP Fooocus-MRE v2. The SDXL base model performs significantly better than the previous variants, and the model combined with the refinement module achieves the best overall performance. The dataset I linked above contains 5 billion images, it's called LAION-5B. 5 denoise with SD1. 3 - 1. I tried "camera from behind" or "camera shot from behind", I can't really think of other prompts to use, but I've only been able to get like 1 out of 20 images to be from behind with this. Here's how to get the benefits of Pony XL, without the drawbacks of art-style. 0-RC. 2), cottage. The smaller size of this model makes it perfect for running on consumer PCs and laptops as well as enterprise-tier GPUs. For now, I have to manually copy the right prompts. Technically DreamBooth is also a fine-tuning technique. (viewed from behind:1. 5 and 2. Simply ran the prompt in txt2img with SDXL 1. I know there is the ComfyAnonymous workflow but it's lacking. 55 and go from there. 0 for ComfyUI - Now with Face Swapper, Prompt Enricher (via OpenAI), Image2Image (single images and batches), FreeU v2, XY Plot, ControlNet and ControlLoRAs, SDXL Base + Refiner, Hand Detailer, Face Detailer, Upscalers, ReVision, etc. The title tells everything. F222 is a traditional fine-tuned model that does not require a special keyword.
People using utilities like Textual Inversion and DreamBooth have been able to solve the problem in narrow use cases, but to the best of my knowledge there isn't yet a reliable solution to make on-model characters without just straight up hand-holding the AI. Inpainting is almost always needed to fix the face consistency. Every time I use a face reference for my stable diffusion model I get really weird artifacts. 7 and then close to the base model. 5 model. I just started learning about Stable Diffusion recently, I downloaded the safetensors directly from huggingface for Base and Refiner model, I found…. It can do this because it was trained on a lot of images paired with their text captions, with various amounts of noise added to the image. Can someone guide me to the best all-in-one workflow that includes base model, refiner model, hi-res fix, and one LoRA? Code for automatically detecting and correcting hands in Stable Diffusion using models of hands, ControlNet, and inpainting. Start with a denoise around . It'll be at the top though, not where it used to be. I want to use Pony as a base model and Juggernaut Lightning as a refiner for more realistic images. 📷 All of the flexibility of Stable Diffusion: SDXL is primed for complex image design workflows that include generation for text or base image, inpainting . There is an SDXL 0. First get a photo of the head in the same direction as the result in mind. It works to a degree but maybe not enough. If you're using Automatic's GUI there should be an option for full res inpainting so you can mask off the face and generate a new one using a prompt referencing the face and it will generate the face at the full resolution of the image and then scale it down to fit the mask. I'm using the recommended settings; Sampling Steps: 80-100, Sampler: Euler a, Denoising strength: 0. Use a value around 1. Setup.
I've heard you get better results with full body shots if the source images used for the training were also full body shots, and also by keeping the dimensions to no more than 512x512 during generation.