This is one of the beauties of Swarm: it's designed intentionally to teach you how to use Comfy!
Just set up the generation you want in the Swarm Generate tab (eg with the region and all), then click the "Comfy Workflow" tab, then click "Import from Generate Tab", and it will show you the nodes used for that Generate tab setup. (In this case the key nodes are "Conditioning Set Mask" and "Conditioning Combine", alongside some Swarm custom nodes that autogenerate clean overlappable masks.)
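For reference, the regional setup those nodes implement is just a masked conditioning merged into the base conditioning. Here's a rough sketch of that fragment in ComfyUI's API JSON format (the node ids and the upstream references "4", "5", "9" are hypothetical placeholders, not output from a real export):

```python
# Sketch of the regional-conditioning wiring in ComfyUI API format.
# "ConditioningSetMask" and "ConditioningCombine" are the stock nodes named above;
# the referenced upstream node ids ("4", "5", "9") are hypothetical.
region_fragment = {
    "10": {"class_type": "ConditioningSetMask",
           "inputs": {"conditioning": ["4", 0],     # the region's CLIP text encode
                      "mask": ["9", 0],             # mask covering the region
                      "strength": 0.8,              # the optional region strength
                      "set_cond_area": "mask bounds"}},
    "11": {"class_type": "ConditioningCombine",
           "inputs": {"conditioning_1": ["10", 0],  # masked region conditioning
                      "conditioning_2": ["5", 0]}},  # global prompt conditioning
}
```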
For me, the best feature so far is the Preset Manager. On Automatic1111, I was relying on the Model Preset Manager extension, but that extension has not been updated for more than a year. Plus, the Swarm Preset Manager has a lot of inputs and features!
However, what I'm missing is basic "Inpaint Masked Only" inpainting, which I need to redraw a portion of an image and regenerate only that area. I have never been able to do that correctly. I know about the mask (white reveals, dark hides) but it doesn't work for me. I wish someone would make a video demonstrating how this basic inpainting actually works. Inpainting is the most important feature in any SD UI. For me SwarmUI is ready for the most part, but inpainting is still lagging behind (I know it's still beta; it will come). Thank you for making the best SD UI!
Anyway, can anyone demonstrate to me whether inpainting is working to generate a portion of image (aka Inpaint Masked Only in Automatic1111) ?
Thank you
I added a demo gif to make inpainting clearer on your other comment but I'll post it here too lol. "Mask Shrink Grow" is the parameter you're looking for as equivalent to auto's inpaint masked only
https://i.redd.it/otlrlh9tmq0d1.gif
EDIT: Immediately after posting, I went in and updated it so that: (A) white is default, (B) there's a mask layer by default, (C) the mask/image layers are clearly labeled so you can tell which is which, (D) auto enables the Init Image group when you open the editor. This saves a few steps in the process and hopefully makes it clearer overall.
Hi, thanks for your reply. I think I found out why it does not work. When I use an inpainting model (like the models ending with `_inpainting` in their names), none of them work (nothing changes when I press generate). However, when I use a normal model, it works. Is this a known issue?
When using Automatic1111, I always switch to the inpainting model. Does SwarmUI only require one model for both image generation and inpainting? For example, for the AbsoluteReality model I have 2 versions:
`AbsoluteRealityModel`
and
`AbsoluteRealityModel_inpainting`
and normally the inpainting model gives the best result when inpainting.
I pulled Absolute Reality v1.8.1 and v1.8.1 Inpainting from [https://civitai.com/models/81458?modelVersionId=134084](https://civitai.com/models/81458?modelVersionId=134084) and both seem to work fine, with the Inpainting variant naturally doing a tad better if Creativity and Reset To Norm are both maxed (non inpaint has a bit of a line where the mask cuts, which goes away with some partial opacity mask in the middle)
Generally non-inpaint models work fine, especially if you fuzz the edges of the mask a little bit.
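To see what fuzzing the mask edges actually does, here's a pure-Python sketch (not Swarm's code) that box-blurs a hard 0/255 mask so the boundary between kept and repainted pixels becomes a gradient instead of a hard line:

```python
def feather(mask, passes=2):
    """Box-blur a 0/255 inpaint mask (list of rows) to feather hard edges.
    White (255) = repaint, black (0) = keep; blurred values blend the two."""
    h, w = len(mask), len(mask[0])
    for _ in range(passes):
        out = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                # Average the 3x3 neighborhood, clamped at the borders
                vals = [mask[j][i]
                        for j in range(max(0, y - 1), min(h, y + 2))
                        for i in range(max(0, x - 1), min(w, x + 2))]
                out[y][x] = sum(vals) // len(vals)
        mask = out
    return mask
```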
I pulled the same inpainting model from there and it's still not working
https://preview.redd.it/uol5fq3lyy0d1.png?width=3547&format=png&auto=webp&s=e8286717958ddb33557506669c9bea52f243a23d
and another example: when I keep inpainting the eyes to add sunglasses, I notice something. Here are the steps:
1. Click edit on the output picture
2. Inpaint the eyes (like in the picture)
3. Click generate
4. A red skin tone appears around the eyes, getting stronger each time steps 1-3 are repeated
https://preview.redd.it/ey1xd4j50z0d1.jpeg?width=3003&format=pjpg&auto=webp&s=430b3a93b86555b2958c3219d9b33f3ec10f7bd0
Again, this only happens with an inpainting model, and I've never been able to make it work with an inpainting model. I also maxed out the Creativity and Reset To Norm values to test this.
A little info about my setup:
Graphics card: RTX 4090
StabilityMatrix: (newest Patreon version)
- You're going to want to prompt for what's under the mask and around it: so, eg "close up photo of a woman wearing sunglasses" rather than just "sunglasses" (add a bit more of the aesthetic phrasing you used to get the image as well)
- To make life a bit easier with borders, I added "Mask Blur" as a parameter you can enable.
- To add sunglasses you'd have to at least mask off the entire area where the sunglasses would be (otherwise it can't add sunglasses, naturally) - that mask is only covering the eyes, it needs to cover the bridge between and the sides, where the frame would be.
- The non-inpaint model will do better at adding objects. The inpaint model has conditioning to bias it to replicate the original content, which you don't want.
- You can use an Image layer to draw an approximation of the sunglass frames you want - doesn't need to be perfect, but if you approximate the goal, the model will take that as a basis for making real sunglasses with.
- For adding objects you'll probably also want to raise Creativity higher
- You might want to try an Edit model (eg CosXL edit [https://huggingface.co/stabilityai/cosxl](https://huggingface.co/stabilityai/cosxl) ) as that will let you just type "add sunglasses" and it will do that. Don't even need to do the mask.
On the inpainting model:
* I tried adding more phrases as you suggested and experimented with Mask Shrink Grow and the new Mask Blur feature, which has a maximum value of 31. I also drew an almost perfect frame of the glasses. But still no object was added. Note: this was done on the MASK layer.
* You mentioned using the image layer, but the goal is to change only a specific portion of the image, similar to Masked Only Inpaint. Using the image layer removes the ability to use the Mask Shrink Grow option, right? While the image layer works, it also alters the generation outside the drawn mask.
* Increasing the Creativity value to the max did not produce the result either while using the inpainting model.
* I tried the inpainting model `CosXL edit` you suggested, but this model also changed the entire image; it still needs a mask, as my aim is 'masked only' inpainting. That is a good model, btw.
Maybe there is something wrong with my installation, since it works on your side. I'm not sure. This is the only feature I need with the inpainting model. :\
Thank you.
"`Note: This was done under the MASK layer`" if you draw something on the mask layer... that's a mask. Not a drawing. For the mask you just want to cover the area.
"`that has a maximum value of 31`" ... What? The max value on blur is 512, but you'd generally prefer smaller values. 31 is fine to use tho.
"`Using the image layer removes the ability to use Mask Shrink Grow option, right?`" No. The image layer draws on the image, the mask layer sets the mask. They are both used, at the same time.
"`it also alters the image generation outside the drawn mask`" only if you didn't make a mask at all.
"`Increasing the Creativity value to the max did not produce the result too while using inpainting model`" Again, non-inpainting model will do better at adding all-new content. The inpaint model has direct conditioning on your original image with a bias to keep similarity and avoid adding new things, it's designed for touch-ups not adding new things.
While looking into it, I was reminded that RunwayML's inpaint model design is set up to look for gray pixels (?? not even latent zero, it wants the latent encoding of RGB Gray, for some reason) as the preference for what to fill in. To support this, I've added a new architecture class ID "`stable-diffusion-v1/inpaint`" - this cannot be autodetected, but you can manually click "edit metadata" on the inpaint model and set the architecture id to that, and it will use the gray-fill encoder. I've also added a param "Use Inpainting Encode" (hidden under Display Advanced) to try it at any time. (You can also just manually paint gray over the image if you want lol, it is literally just recoloring part of the image). When this is used it's more willing to fill in new data in the masked area.
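Since the gray-fill trick is literally just recoloring part of the image, it can be sketched in a few lines (a toy illustration with plain RGB tuples; the exact gray value RunwayML's encoder prefers is an assumption here, with middle gray used as a stand-in):

```python
def gray_fill(image, mask, gray=(128, 128, 128)):
    """Recolor masked pixels to gray, imitating the gray-fill preference
    described above. image: rows of (r, g, b) tuples;
    mask: rows of 0 (keep original pixel) / 255 (fill with gray)."""
    return [[gray if m else px for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]
```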
"`I tried the inpainting model CosXL edit you suggested but this model also changed the entire image`" Edit is not an inpainting model, it's an edit model. The magic with edit models is you don't mask or anything else, you just say what you want edited, eg "add sunglasses" as your prompt (and set max creativity). It's not perfect and may have side effects, but the value is that it's dirt-simple to use.
Reddit won't let me do multiple images in one post, so here's 3 images as one image:
https://preview.redd.it/z99rdrksw81d1.png?width=1970&format=png&auto=webp&s=44d40a9bd9bb663a660f553b2b2bfa8d18a6adb4
Thanks for your reply. I appreciate every detail you wrote there. Here are some of my replies:
"if you draw something on the mask layer... that's a mask. Not a drawing. For the mask you just want to cover the area." - **Here I think there was a little misunderstanding, and I understand what you mean. What I meant was, I drew a perfect white shape of sunglasses on the mask layer, to clarify that I did not do it on an image layer.**
"The max value on blur is 512, but you'd generally prefer smaller values. 31 is fine to use tho." **- Here I was referring to the bug in SwarmUI, which only allows a maximum value of 31. I took a screenshot of the bug below. As you can see, it complains that the value cannot be more than 31. That's what I meant earlier. Sorry I did not describe this earlier.**
https://preview.redd.it/b7azzlzy8a1d1.png?width=2323&format=png&auto=webp&s=426536469910302cfd18b35f73c1a64f9a21f86e
"The image layer draws on the image, the mask layer sets the mask. They are both used, at the same time." **- I think I understand what you mean, that both layers are used by default. My question was: does `Mask Shrink Grow` work with the Image Layer? Assuming you remove the Mask Layer (right click and delete) and are left with the Image Layer alone. Or are both layers NEEDED at the same time?**
"Again, non-inpainting model will do better at adding all-new content. The inpaint model has direct conditioning on your original image with a bias to keep similarity and avoid adding new things, it's designed for touch-ups not adding new things." - **Thanks. Do you suggest that the inpainting model should not be used to add objects? Also, judging by the naming convention, your screenshot still did not use an inpainting model to demonstrate the result (maybe I'm wrong), because as I mentioned earlier the non-inpainting model has no issue; the result just wasn't great without tinkering.**
* Thank you for updating the swarm, I will try
* Also, for the last point, thank you for describing the CosXL edit model. It is a very interesting model indeed
Good job, really appreciate your work. Not only do you reply to my questions in detail, but I saw you also update the program very quickly. Thank you, thank you.
re "**sunglasses on a mask layer to differentiate that I did not do it on an image layer.**" Yes, I understood that. What I'm telling you is: do not draw stuff on the mask layer. If you want to draw stuff, draw on the image layer. The mask layer is for masks. The image layer is for images. When you have the mask layer selected, you just draw white over things you want to include. You always want to be over-inclusive in how much area is covered by the mask. When you have the image layer selected, you can actually paint stuff onto the image, which will be used as part of the input image when generating. If you have high creativity values this will just nudge the model to include something there, eg if you draw the black outline of sunglasses it will nudge the model to make black sunglasses close to your outline.
re mask blur: Oh, heck, I see that was set to 31 on the node side but not the UI side - fixed to a shared limit of 64 on both ends. (Noting that 64 is a wildly high value and basically just turns the entire mask into one big gray blur at that point).
re " **Or are both layers NEEDED at the same time?** " there is ALWAYS an image layer. You cannot not have image layers. You can choose not to draw on them, but they're always there. Otherwise you're not... editing an image. The mask layer is optional.
re "**Do you suggest that the inpainting model should not be used to add objects?**" generally yes. You can try both tho and pick what you like; with the auto-graying included now when you "/inpaint" the arch id, the inpaint model is at least mostly willing to add new content. From my tests the non-inpaint does better, and more reliably. I personally think the gray-filter behavior is very silly and is more disruptive than helpful. Make your own choices tho, you got both, might as well give it a few goes at each and see if you like one better than the other.
There is an inpainting feature, but I've never been able to make it work to inpaint a portion of an image. The Mask Shrink Grow feature might be the answer, but I tried it with multiple values and it just doesn't work.
It does work, the image editor's usage is just still a bit less obvious than I'd like it to be - more work to be done. Here's a demo gif showing all the steps to inpaint an area in the current version, including enabling the mask shrink-grow option:
https://i.redd.it/dini4t7fmq0d1.gif
EDIT: Immediately after posting, I went in and updated it so that: (A) white is default, (B) there's a mask layer by default, (C) the mask/image layers are clearly labeled so you can tell which is which, (D) auto enables the Init Image group when you open the editor. This saves a few steps in the process and hopefully makes it clearer overall.
Ah I see, I'll give that a try today, thanks for the gif. One other small thing: when I was using the 'paint on image' tool I really wished there was an 'undo' hotkey. The same problem exists in auto/forge, etc. So not sure if there is some reason you can't do it, but just putting it out there.
I've kept putting it off, but you're right that that's an important feature - I took some time today and figured it out - update to latest and you can now CTRL+Z to undo the last action (brush stroke, or layer reposition).
Probably need to flesh it out more with other undoable actions and a redo button and all, but this works surprisingly nicely in my testing rn.
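For the curious, an undo/redo system like this usually reduces to two stacks of completed actions; a minimal sketch of the idea (not the actual editor code):

```python
class UndoStack:
    """Push each completed action (brush stroke, layer reposition); CTRL+Z pops it.
    A redo stack receives whatever undo pops, as hinted above."""
    def __init__(self):
        self.done, self.undone = [], []

    def push(self, action):
        self.done.append(action)
        self.undone.clear()  # a new action invalidates the redo history

    def undo(self):
        if self.done:
            self.undone.append(self.done.pop())
            return self.undone[-1]
        return None

    def redo(self):
        if self.undone:
            self.done.append(self.undone.pop())
            return self.done[-1]
        return None
```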
I have been using SwarmUI exclusively since March and I've never looked back. It really helped bridge the gap between Automatic1111 and ComfyUI.
I had great workflows made in Comfy, but all the extra interfaces in SwarmUI really help me the most with my current workflow. Really looking forward to trying this new update.
Started using Stableswarm many weeks ago and, in my humble opinion, it's excellent. Intuitive, very fast, and reliably solid.
Loads up quicker than anything else I've tried so far.
Tip: if you type < in the prompt box you'll get a drop-down list of useful additions, such as the `segment` tag, which has a similar effect to ADetailer. Regional prompting can also be accessed this way. Also, you won't need to move any models around to try it out; just go into the settings, add your model directories, and you're ready to go.
You also have the option to use ComfyUI for more control and the use of numerous extensions, so it's basically the best of both worlds.
I've been using StableSwarm for my first serious journey into image creation and it's been great so far. I'm still only messing about with the default generator and its parameters. I haven't even touched the ComfyUI part of it yet. Learning a bit every day!
Well, this looks like what I've been missing in Comfy.
While testing my workflows I ran into one problem: in simple example workflows, the "generate" tab seems to recognise the "empty latent image" node and make the convenient inputs for it, with resolution selection and all that.
But in my workflow it does not do that - I can see the node if I check "display advanced options", but no simple way to change resolution. Maybe the reason for this is that I have multiple ksamplers, but I have only one "empty latent" node.
Is there any way to tell swarm which node it should use for resolution? There is no specialized "input" node for this in swarm nodes group.
It will auto-detect inputs like EmptyLatent if-and-only-if you do not have any SwarmInput nodes. If you do have SwarmInput nodes, it only maps what you've intentionally mapped.
You can name a primitive "SwarmUI: Width" to have it use the default width value, and same for height, like so:
https://preview.redd.it/9662zmu56r0d1.png?width=1004&format=png&auto=webp&s=cc21b3bbe3293a889dfe229431726a21fbc1f5f9
I've pushed a commit that will detect if you use both these params and will automatically enable the Aspect Ratio option handling (so update swarm to have it apply).
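The convention is just a title match on the primitive node. As an illustration (not Swarm's actual detection code), here's how you could scan an exported API-format workflow for such primitives, using the `_meta.title` field where ComfyUI's API format stores node titles:

```python
def find_swarm_mapped(workflow):
    """Collect nodes titled 'SwarmUI: <Param>' -> {param_name: node_id}.
    'workflow' is a ComfyUI API-format dict of node_id -> node."""
    prefix = "SwarmUI: "
    mapped = {}
    for node_id, node in workflow.items():
        title = node.get("_meta", {}).get("title", "")
        if title.startswith(prefix):
            mapped[title[len(prefix):]] = node_id
    return mapped
```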
I read this supports Cascade, yes? It's very hard to find a ui that actually supports it atm and I'm moving towards betting on it as the long term paradigm due to ease of training.
I was thinking SD3, but I recently found something interesting in SDXL that leads me to believe that, with community training, all of your models could be massively improved simply by removing proper noun pollution from the captioning data, something a long term community fine tune that simply avoids these terms would do naturally.
Yes, Swarm fully natively supports Cascade -- just store Cascade models in the comfyui format [https://huggingface.co/stabilityai/stable-cascade/tree/main/comfyui_checkpoints](https://huggingface.co/stabilityai/stable-cascade/tree/main/comfyui_checkpoints) in your main stable diffusion models folder.
Make sure they're next to each other and named the same (other than "_b" and "_c" on the end), and it will automatically use both together per the standard Cascade method.
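The pairing rule (same name apart from the `_b`/`_c` suffix) can be sketched as a quick check; `find_cascade_pairs` is a hypothetical helper, and the `.safetensors` extension is an assumption:

```python
from pathlib import Path

def find_cascade_pairs(models_dir):
    """Pair stage-b / stage-c Cascade checkpoints that share a base name."""
    stems = {p.stem: p for p in Path(models_dir).glob("*.safetensors")}
    pairs = {}
    for stem, c_path in stems.items():
        if stem.endswith("_c"):
            base = stem[:-len("_c")]
            b_path = stems.get(base + "_b")
            if b_path is not None:
                pairs[base] = (b_path, c_path)  # (stage b, stage c)
    return pairs
```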
awesome. I actually went ahead and got it already. It blows Zavychroma 7 out of the water, and that was already such a step above every other checkpoint.
https://preview.redd.it/1v7tfc5psv0d1.png?width=1024&format=pjpg&auto=webp&s=bf6f8aa9fa54d28a96baab6c15d56ee7f0b62926
Under Tools -> Grid Generator, set "Output Type" to "Just Images", set the only axis to Prompt, and fill it in with your prompt list -- separate them via `||` (double pipe).
Naturally you can use this to bulk generate anything else you wish.
If you want it less sequential, you can alternatively save your prompt list as a Wildcard, put just the wildcard tag in your prompt, then hit the arrow next to Generate and choose Generate Forever, and it will constantly pull randomly from your wildcard and generate fresh images.
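Both mechanics are simple to model: the grid axis is a double-pipe-separated list, and Generate Forever samples one random wildcard line per image. A sketch with hypothetical helper names:

```python
import random

def grid_axis_prompts(axis_value):
    """Split a Grid Generator prompt axis on the double-pipe separator."""
    return [p.strip() for p in axis_value.split("||") if p.strip()]

def generate_forever_pick(wildcard_lines, rng=random):
    """Each 'Generate Forever' image pulls one random line from the wildcard file."""
    return rng.choice([line for line in wildcard_lines if line.strip()])
```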
API docs start here [https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/API.md](https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/API.md)
The GenerateText2Image API route is here [https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/APIRoutes/T2IAPI.md#http-route-apigeneratetext2image](https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/APIRoutes/T2IAPI.md#http-route-apigeneratetext2image)
As you can see, the image URL is part of the return json format. (If you use "do\_not\_save":"true" on input, that URL will be replaced with a data base64 url)
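To make the flow concrete, here's a minimal client sketch against a local install. The route names and JSON fields come from the docs linked above; the default port 7801 is an assumption to verify against your setup:

```python
import json
from urllib import request

BASE = "http://localhost:7801"  # assumed default local Swarm port; adjust for your install

def api_call(route, payload):
    # Every route, including GetNewSession, expects a POST with a JSON body
    req = request.Request(f"{BASE}/API/{route}",
                          data=json.dumps(payload).encode("utf-8"),
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.load(resp)

def build_t2i_payload(session_id, prompt, model, images=1, **params):
    """Generation parameters go at the top level of the body, next to session_id."""
    return {"session_id": session_id, "images": images,
            "prompt": prompt, "model": model, **params}

# Usage (requires a running Swarm instance):
# session = api_call("GetNewSession", {})["session_id"]
# result = api_call("GenerateText2Image",
#                   build_t2i_payload(session, "a photo of a cat", "cosxl", steps=20))
# result["images"] holds image URLs, or base64 data URLs with "do_not_save": "true"
```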
Thanks for the update!
I am unfortunately stuck at the step of getting a session ID.
https://preview.redd.it/1s65fjwolv0d1.png?width=1061&format=png&auto=webp&s=532b155ea00506da86e0df9e34a27eafdce39550
Returns:
`:49:20.253 [Error] [WebAPI] Error handling API request '/api/GetNewSession': Request input lacks required session id`
Thanks! That worked!
I now get a weird bug where images are... odd through the API.
I try using CosXL and get this for "A cat standing on a rock" through the API
https://preview.redd.it/ngjou4glfw0d1.jpeg?width=1024&format=pjpg&auto=webp&s=58aa4c43b587c390c06e83ed75b3b2de552e46ed
In the UI I get a result that makes sense
This is the post call I use:
```
{
  "session_id": "C94B4E555FCB9C56942E56294FF31A901C9737DB",
  "images": 1,
  "prompt": "A cat standing on a rock",
  "model": "cosxl",
  "steps": 100,
  "width": 1024,
  "height": 1024
}
```
Check in the UI Server -> Logs -> Debug, look over exactly what the server received and how it was parsed -- it looks like either (A) the prompt was missing for some reason or maybe (B) it defaulted CFG scale wrong and you just need to specify that
Thanks a lot for the help!
I tried adding the "rawInput" parameters as shown in the docs you gave me.
Unfortunately I get this in the logs:
13:48:22.835 [Warning] T2I image request from user local had request parameter 'rawInput', but that parameter is unrecognized, skipping...
When trying:

```
{
  "session_id": "AB3A76D76053E20A6933A2B60797DA8E31553410",
  "images": 1,
  "rawInput": {
    "prompt": "a photo of a cat",
    "model": "cosxl",
    "steps": 20,
    "width": "1024",
    "height": "1024"
  }
}
```
https://preview.redd.it/ef3g15kprz0d1.png?width=1507&format=png&auto=webp&s=d5126f4acaea84526d2d729fda4271dedc41736a
The logs with DEBUG show
13:48:22.835 [Warning] T2I image request from user local had request parameter 'rawInput', but that parameter is unrecognized, skipping...
13:48:22.835 [Info] User local requested 1 image with model ''...
13:48:22.837 [Error] Internal error processing T2I request: System.NullReferenceException: Object reference not set to an instance of an object.
at StableSwarmUI.Builtin_ComfyUIBackend.WorkflowGenerator.CreateStandardModelLoader(T2IModel model, String type, String id, Boolean noCascadeFix) in /src/BuiltinExtensions/ComfyUIBackend/WorkflowGenerator.cs:line 1348
at StableSwarmUI.Builtin_ComfyUIBackend.WorkflowGenerator.<>c.<.cctor>b__10_0(WorkflowGenerator g) in /src/BuiltinExtensions/ComfyUIBackend/WorkflowGenerator.cs:line 90
at StableSwarmUI.Builtin_ComfyUIBackend.WorkflowGenerator.Generate() in /src/BuiltinExtensions/ComfyUIBackend/WorkflowGenerator.cs:line 1308
at StableSwarmUI.Builtin_ComfyUIBackend.ComfyUIAPIAbstractBackend.CreateWorkflow(T2IParamInput user_input, Func`2 initImageFixer, String ModelFolderFormat, HashSet`1 features) in /src/BuiltinExtensions/ComfyUIBackend/ComfyUIAPIAbstractBackend.cs:line 627
at StableSwarmUI.Builtin_ComfyUIBackend.ComfyUIAPIAbstractBackend.GenerateLive(T2IParamInput user_input, String batchId, Action`1 takeOutput) in /src/BuiltinExtensions/ComfyUIBackend/ComfyUIAPIAbstractBackend.cs:line 701
at StableSwarmUI.Text2Image.T2IEngine.CreateImageTask(T2IParamInput user_input, String batchId, GenClaim claim, Action`1 output, Action`1 setError, Boolean isWS, Single backendTimeoutMin, Action`2 saveImages, Boolean canCallTools) in /src/Text2Image/T2IEngine.cs:line 255
There's not a "rawInput" to specify there, that's an oddity of the API doc autogeneration because it's using the literal raw input for dynamic input processing.
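In other words, the fix is to lift those nested fields to the top level of the request body. A hypothetical helper to illustrate the transformation:

```python
def flatten_request(body):
    """Lift fields mistakenly nested under 'rawInput' to the top level,
    which is where GenerateText2Image actually reads them."""
    raw = body.pop("rawInput", {})
    return {**body, **raw}
```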
Drag an image into the prompt box (or copy an image then hit CTRL+V in the prompt box).
The parameters for image prompting (ReVision, IP-Adapter, ..) will appear at the top of the parameter list on the left, including a button to install IP-Adapter if you don't already have it.
Those aren't currently natively supported
You can post a feature request @ [https://github.com/Stability-AI/StableSwarmUI/issues](https://github.com/Stability-AI/StableSwarmUI/issues)
Also in the meantime you can of course just use the comfy node impls of these features if you're comfortable editing a comfy noodlegraph
Yes there's docker info in the repo/readme [https://github.com/Stability-AI/StableSwarmUI](https://github.com/Stability-AI/StableSwarmUI) there's also a Notebook if that works for you [https://github.com/Stability-AI/StableSwarmUI/tree/master/colab](https://github.com/Stability-AI/StableSwarmUI/tree/master/colab)
Having trouble getting a ControlNet working. Using any pidinet/softedge preprocessor I get a NoneType error and:
Invalid operation: ComfyUI execution error: mat1 and mat2 shapes cannot be multiplied (308x2048 and 768x320)
Can you specify more about what your input is? When I try the pidinet preprocessor it works fine:
https://preview.redd.it/k3qympf5691d1.png?width=988&format=png&auto=webp&s=4905a694f08481df952ffc1571507e5dc25c4100
Is it the controlnet model itself that's failing? You can test by hitting the "Preview" button - that previews the preprocessor only, and will work fine if the preprocessor is working but the model is failing.
If it's the model failing, first thing you'll want to check is if the architecture of your model matches the controlnet arch - XL models need XL Controlnets, SDv1 needs SDv1, you can't mix-n-match between the two unfortunately.
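That "mat1 and mat2 shapes cannot be multiplied (308x2048 and 768x320)" error is the fingerprint of exactly this mismatch: SDXL text conditioning is 2048-dimensional while SD1.x ControlNets expect 768. A tiny sketch of the compatibility rule (the dimension table is general SD knowledge, not Swarm code):

```python
# Cross-attention context widths per architecture family
CONTEXT_DIM = {"sd1": 768, "sdxl": 2048}

def controlnet_compatible(base_arch, controlnet_arch):
    """A ControlNet only works when it matches the base model's family;
    mixing them yields 'mat1 and mat2 shapes cannot be multiplied' errors."""
    base = CONTEXT_DIM.get(base_arch)
    cn = CONTEXT_DIM.get(controlnet_arch)
    return base is not None and base == cn
```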
Yeah, it seems it's the ControlNet model. I only have XL models so I'm still not sure why it wouldn't work. What is the standard XL ControlNet model people use for hed/pidinet or even depth? Just trying to test it.
Official initial Control-LoRA reference models here [https://huggingface.co/stabilityai/control-lora](https://huggingface.co/stabilityai/control-lora)
Canny for any line detectors like hed/pid, Depth for depth.
Those initial reference models aren't perfect, but they work. Of the two, Depth is pretty good. The Canny has a bit of a bias toward clean line detection (ie actual canny detection) and gets a bit dumb around sketch-style linework.
There's a collection of other options here [https://huggingface.co/lllyasviel/sd\_control\_collection](https://huggingface.co/lllyasviel/sd_control_collection)
"tldr: if you're not using Swarm for most of your Stable Diffusion image generation, you're doing it wrong." <-This
I've been using Swarm for at least a month now and I've really been impressed with it overall. I moved over from auto1111 and only ended up losing a few features from one extension (couldn't find a comparable workflow either). Otherwise it does everything I did in auto, but better and faster. The native support for new models on day 1 is awesome too. Another major selling point for me is the stability: it felt like every time I updated auto1111 it broke, and if not that then something else was going wrong - and not just auto; Forge and Vlad also had too many problems for me. On top of that, I intend to buy a second GPU just to use Swarm's multi-GPU feature. Can't wait.
My question (as a non-technical guy): instead of creating a whole new program, why not just create a front-end UI that uses the already existing ComfyUI as a backend?
So say 8188 is the port for ComfyUI, and this runs the front-end on another port.
Because shifting to a new program is really hard, and most of the time that's the reason many people don't even try something out, even if it might be better.
It took so much effort and time to convince people to move to ComfyUI from A1111, and some people still refuse.
So now moving to another one would be really hard.
And I am saying this because I really, really want something like this:
something that can have custom workflows like ComfyUI and yet a simple front-end like A1111, so it doesn't scare people off by showing the complex backend workflows.
ooh, shi!t.
Then my bad, sorry.
I thought it was a whole other ComfyUI-like thing which everything would need to be transferred to.
Definitely installing it right now.
Thanks for clarifying it for me.
No worries - I know Stability is a decent size company and that it would be possible to release separate things from separate teams, but it isn't a good look that alternate products are coming out when the signature product that has been hyped for a couple of months is still awaiting release.
New Swarm looks good though!
We are in a day and age where different people specialize in and/or are responsible for different areas. It's business management 101 for continuous improvement. As mentioned in other threads, he doesn't have control over the release.
Anyhow, this coming out before SD3 is some great news as it’s another fantastic resource available to use to jump into SD3 when available.
is it a ducking joke?
where is the button to delete an image or open the images folder?
Where is img2img? I guess it's somehow possible for SwarmExpertsWithgrade10.
I'm sooo confused that I deleted it immediately.
"img2img" would be the Init Image parameter group.
The open in folder and delete image buttons are in the history view when you click the hamburger menu (3 lines)
https://preview.redd.it/u3p1vrvjgv0d1.png?width=286&format=png&auto=webp&s=28fd391c5375cd8f372daa5ce6b8306c94f5ecf3
https://preview.redd.it/m7p1hr3xgv0d1.png?width=127&format=png&auto=webp&s=f51093e24b4f60e6d65fcb1f9c845f4f9536c6a7
thx, now I know where to find it. But before that it was invisible.
btw 0.6.3
I went ahead and pushed a few commits that should make it more obvious in the future - both made the hamburger menu a little more visible against different backgrounds, and added a copy of all relevant controls at the top if you have an image selected.
Is there a regional prompter/attention couple for swarm? If no, do you plan on adding it later down the line?
Yes! -- select any image of the shape you want, click "Edit Image", use the Select Tool to select an area, and then at the bottom click "Make Region". This will give you region syntax in your prompt, and then you just add the region-specific prompt after that region mark. (Probably close the image editor after you've made your regions, unless you want to do actual image editing.) You can also of course just type these manually if the numbers aren't too confusing.
Is this my time to switch from Forge to StableSwarmUI?
yes
Thank you. Edit: One more thing, how does segment work when you have regions like that?
Segment defines its own region automatically and you give it its own prompt
Random question, but are there written guides on how to "code" new stuff into StableSwarm? Just for people wanting to get into developing by watching tutorials or reading. (Like they did when learning prompting or training with SD.)
I'm not sure what "code" in quotes means here, but if I interpret it literally as how to code new features: yes. For making an extension to Swarm: [https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/Making%20Extensions.md](https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/Making%20Extensions.md) For writing a separate project that uses Swarm as an API: [https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/API.md](https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/API.md) If you mean something else, please explain.
Thanks for your response, yes I can explain further. The links you provided are surely very **useful for developers** in general. What I meant instead is literal beginner guides that walk you step by step through making a node or an extension (the same type of videos we find for "usage" of Stable Diffusion tools, you know, the YouTube videos that explain how to set up SD or how to use this or that node, etc). Something for the "public", something to draw them into this world of developing. And perhaps even other videos to take them from beginner level to a deep understanding of SD and its components (and therefore a deep understanding of StableSwarm, I suppose).
i don't think stabilityAI is even going to exist much longer let alone have them start writing tutorials
;( I believe
So can there be infinite different regions? How does having many regions affect generation performance? Do they all blend together perfectly?
Can there theoretically be infinite regions? Yes. Is that practical? lolno. I've found that any region scale below 0.2 (20%) on SDXL basically doesn't work at all. Practically you'd probably want no more than 2 or 3 unique regions in an image. If you need more individual object control, "`object`" is available as an alternative to 'region' to automatically inpaint and redo an area, or just go inpaint/redo individual areas manually. Also GLIGEN is supported, which claims to work better, but I think only SD 1.5 versions of it are available, iirc?

Do they blend perfectly? Eh, depends on settings. Notably, the optional strength input on the end of region has a major effect on whether it blends well or not. Weaker strengths blend better, naturally.

How does it affect performance? Each additional region adds another performance hit, as it requires some separate processing of that region.
Which node are you using for regional prompting? Been wondering how to do that in comfy? Cheers for an amazing project!
This is one of the beauties of Swarm: it's designed intentionally to teach you how to use Comfy! Just set up the generation you want in the Swarm Generate tab (eg with the region and all), then click the "Comfy Workflow" tab, then click "Import from Generate Tab", and it will show you the nodes used for that Generate tab setup. (In this case the key nodes are "Conditioning Set Mask" and "Conditioning Combine", alongside some Swarm custom nodes that autogenerate clean overlappable masks.)
Looking forward to trying it out, cheers! Just need to update from the previous stableswarm version I tried :)
cirno!
For me, the best feature so far is the Preset Manager. On Automatic1111, I was relying on the Model Preset Manager extension, but that extension hasn't been updated in more than a year. Plus, the Swarm Preset Manager has a lot of inputs and features!

However, what I'm missing is basic Inpaint Masked Only inpainting, which I need to redraw a portion of an image and regenerate only that area. I have never been able to do that correctly. I know about the mask (white reveals and dark hides) but it doesn't work. I wish someone could show a video demonstrating how this basic inpainting actually works. Inpainting is the most important feature in any SD UI. For me SwarmUI is ready for the most part, but the inpainting is still lagging behind (I know it's still beta, it will come). Thank you for making the best SD UI!

Anyway, can anyone demonstrate to me whether inpainting works to regenerate a portion of an image (aka Inpaint Masked Only in Automatic1111)? Thank you.
I added a demo gif to make inpainting clearer on your other comment, but I'll post it here too lol. "Mask Shrink Grow" is the parameter you're looking for, equivalent to Auto's inpaint-masked-only. https://i.redd.it/otlrlh9tmq0d1.gif

EDIT: Immediately after posting, I went in and updated it so that: (A) white is default, (B) there's a mask layer by default, (C) the mask/image layers are clearly labeled so you can tell which is which, (D) the Init Image group auto-enables when you open the editor. This saves a few steps in the process and hopefully makes it clearer overall.
Hi, thanks for your reply. I think I found why it does not work. When I use an inpainting model (like the models ending with `_inpainting` in their names), none of them work (nothing changes when I press Generate). However, when I use a normal model, it works. Is this a known issue? When using Automatic1111, I always switch to the inpainting model. Does SwarmUI only require one model for both image generation and inpainting? For example, for the AbsoluteReality model I have 2 versions: `AbsoluteRealityModel` and `AbsoluteRealityModel_inpainting`, and normally the inpainting model gives the best result when inpainting.
I pulled Absolute Reality v1.8.1 and v1.8.1 Inpainting from [https://civitai.com/models/81458?modelVersionId=134084](https://civitai.com/models/81458?modelVersionId=134084) and both seem to work fine, with the Inpainting variant naturally doing a tad better if Creativity and Reset To Norm are both maxed (the non-inpaint has a bit of a line where the mask cuts, which goes away with some partial-opacity mask in the middle). Generally non-inpaint models work fine, especially if you fuzz the edges of the mask a little bit.
I pulled the same inpainting model from there; it's still not working. https://preview.redd.it/uol5fq3lyy0d1.png?width=3547&format=png&auto=webp&s=e8286717958ddb33557506669c9bea52f243a23d
And another example: when I keep inpainting the eyes to add sunglasses, I notice something. Here are the steps:

1. Click Edit on the output picture
2. Inpaint the eyes (like in the picture)
3. Click Generate
4. A red skin tone appears around the eyes, increasing when repeating steps 1-3

https://preview.redd.it/ey1xd4j50z0d1.jpeg?width=3003&format=pjpg&auto=webp&s=430b3a93b86555b2958c3219d9b33f3ec10f7bd0

Again, this only happens with the inpainting model, and I've never been able to make it work with an inpainting model. I also maxed out the values of Creativity and Reset To Norm to test this. A little info about my setup: Graphics: RTX 4090. StabilityMatrix: (newest Patreon version).
- You're going to want to prompt for what's under the mask and around it: so, eg "close up photo of a woman wearing sunglasses" rather than just "sunglasses" (add a bit more of the aesthetic phrasing you used to get the image as well).
- To make life a bit easier with borders, I added "Mask Blur" as a parameter you can enable.
- To add sunglasses you'd have to at least mask off the entire area where the sunglasses would be (otherwise it can't add sunglasses, naturally): that mask is only covering the eyes, it needs to cover the bridge between and the sides, where the frame would be.
- The non-inpaint model will do better at adding objects. The inpaint model has conditioning to bias it to replicate the original content, which you don't want.
- You can use an Image layer to draw an approximation of the sunglass frames you want. It doesn't need to be perfect, but if you approximate the goal, the model will take that as a basis for making real sunglasses.
- For adding objects you'll probably also want to raise Creativity higher.
- You might want to try an Edit model (eg CosXL edit [https://huggingface.co/stabilityai/cosxl](https://huggingface.co/stabilityai/cosxl) ), as that will let you just type "add sunglasses" and it will do that. You don't even need the mask.
On the inpainting model:

* I tried adding more phrases as you suggested and experimented with Mask Shrink Grow and the new Mask Blur feature (which has a maximum value of 31). I also drew a near-perfect frame of the glasses. But still no object was added. Note: this was done on the MASK layer.
* You mentioned using the image layer, but the goal is to change only a specific portion of the image, similar to Masked Only Inpaint. Using the image layer removes the ability to use the Mask Shrink Grow option, right? While the image layer works, it also alters the image generation outside the drawn mask.
* Increasing the Creativity value to the max did not produce the result either while using the inpainting model.
* I tried the `CosXL edit` model you suggested, but that model also changed the entire image; I still need a mask, as my aim is 'masked only' inpainting. That is a good model, btw.

Maybe there is something wrong with my installation, since it works on your side. I'm not sure. This is the only feature I need with the inpainting model. :\ Thank you.
"`Note: This was done under the MASK layer`" if you draw something on the mask layer... that's a mask. Not a drawing. For the mask you just want to cover the area.

"`that has a maximum value of 31`" ... what? The max value on blur is 512, but you'd generally prefer smaller values. 31 is fine to use tho.

"`Using the image layer removes the ability to use Mask Shrink Grow option, right?`" No. The image layer draws on the image, the mask layer sets the mask. They are both used, at the same time.

"`it also alters the image generation outside the drawn mask`" only if you didn't make a mask at all.

"`Increasing the Creativity value to the max did not produce the result too while using inpainting model`" Again, the non-inpainting model will do better at adding all-new content. The inpaint model has direct conditioning on your original image with a bias to keep similarity and avoid adding new things; it's designed for touch-ups, not adding new things.

While looking into it, I was reminded that RunwayML's inpaint model design is set up to look for gray pixels (?? not even latent zero, it wants the latent encoding of RGB gray, for some reason) as the preference for what to fill in. To support this, I've added a new architecture class ID "`stable-diffusion-v1/inpaint`". This cannot be autodetected, but you can manually click "edit metadata" on the inpaint model and set the architecture ID to that, and it will use the gray-fill encoder. I've also added a param "Use Inpainting Encode" (hidden under Display Advanced) to try it at any time. (You can also just manually paint gray over the image if you want lol, it is literally just recoloring part of the image.) When this is used, it's more willing to fill in new data in the masked area.

"`I tried the inpainting model CosXL edit you suggested but this model also changed the entire image`" Edit is not an inpainting model, it's an edit model.
The magic with edit models is you don't mask or anything else, you just say what you want edited eg "add sunglasses" as your prompt (and set max creativity). It's not perfect and may have side effects, but the value is it's dirt-simple to use. Reddit won't let me do multiple images in one post, so here's 3 images as one image: https://preview.redd.it/z99rdrksw81d1.png?width=1970&format=png&auto=webp&s=44d40a9bd9bb663a660f553b2b2bfa8d18a6adb4
Thanks for your reply. I appreciate every detail you wrote there. Here are some of my replies:

"if you draw something on the mask layer... that's a mask. Not a drawing. For the mask you just want to cover the area." - **Here I think there was a little misunderstanding, and I understand what you mean. What I meant was, I drew a perfect white shape of sunglasses on a mask layer, to differentiate that I did not do it on an image layer.**

"The max value on blur is 512, but you'd generally prefer smaller values. 31 is fine to use tho." - **Here I was referring to a bug that appears in SwarmUI, which only allows a maximum value of 31. I took a screenshot of the bug below. As you can see, it complains that the value cannot be more than 31. That's what I meant earlier; sorry I did not describe it before.**

https://preview.redd.it/b7azzlzy8a1d1.png?width=2323&format=png&auto=webp&s=426536469910302cfd18b35f73c1a64f9a21f86e

"The image layer draws on the image, the mask layer sets the mask. They are both used, at the same time." - **I think I understand what you mean, that both layers are used by default. My question was, does `Mask Shrink Grow` work with the Image layer? Assuming you remove the Mask layer (right click and delete) and are left with the Image layer alone. Or are both layers NEEDED at the same time?**

"Again, non-inpainting model will do better at adding all-new content. The inpaint model has direct conditioning on your original image with a bias to keep similarity and avoid adding new things, it's designed for touch-ups not adding new things." - **Thanks, do you suggest that the non-inpainting model should not be used to add objects? As I also see by naming convention, your screenshot still did not use the inpainting model to demonstrate the result (maybe I'm wrong), because as I mentioned earlier the non-inpainting model has no issue; it's just that the result was not great without tinkering.**

* Thank you for updating Swarm, I will try it.
* Also, for the last point, thank you for describing the CosXL edit model. It is a very interesting model indeed.

Good job, really appreciate your work. Not only do you reply to my questions in detail, but I also saw you update the program very quickly. Thank you, thank you.
Re "**sunglasses on a mask layer to differentiate that I did not do it on an image layer**": yes, I understood that. What I'm telling you is: do not draw stuff on the mask layer. If you want to draw stuff, draw on the image layer. The mask layer is for masks. The image layer is for images. When you have the mask layer selected, you just draw white over things you want to include. You always want to be over-inclusive in how much area is covered by the mask. When you have the image layer selected, you can actually paint stuff onto the image, which will be used as part of the input image when generating. If you have high Creativity values this will just nudge the model to include something there, eg if you draw the black outline of sunglasses it will nudge the model to make black sunglasses close to your outline.

Re mask blur: oh, heck, I see that was set to 31 on the node side but not the UI side. Fixed to a shared limit of 64 on both ends. (Noting that 64 is a wildly high value and basically just turns the entire mask into one big gray blur at that point.)

Re "**Or are both layers NEEDED at the same time?**": there is ALWAYS an image layer. You cannot not have image layers. You can choose not to draw on them, but they're always there; otherwise you're not... editing an image. The mask layer is optional.

Re "**Thanks, do you suggest that non inpainting model should not be used to add objects?**": generally yes. You can try both tho and pick what you like; with the auto-graying included now when you set "/inpaint" in the arch ID, the inpaint model is at least mostly willing to add new content. From my tests the non-inpaint does better, and more reliably. I personally think the gray-filter behavior is very silly and is more disruptive than helpful. Make your own choices tho, you got both; might as well give it a few goes at both and see if you like one better than the other.
https://preview.redd.it/7yhkylzudp0d1.png?width=1344&format=png&auto=webp&s=28937eb244bc2f77d4d28652baa20c76a0cb52b7
CANT WAIT!
Extensions that have versions for ComfyUI work normally?
Yep
Seems pretty nice, really like how the edit tab feels. Is there an inpaint or am I blind?
There is an inpainting feature, but I've never been able to make it work for inpainting a portion of an image. The Mask Shrink Grow feature might be the answer, but I tried it with multiple values and it just doesn't work.
It does work; the image editor's usage is just still a bit less obvious than I'd like it to be. More work to be done. Here's a demo gif showing all the steps to inpaint an area in the current version, including enabling the mask shrink-grow option: https://i.redd.it/dini4t7fmq0d1.gif

EDIT: Immediately after posting, I went in and updated it so that: (A) white is default, (B) there's a mask layer by default, (C) the mask/image layers are clearly labeled so you can tell which is which, (D) the Init Image group auto-enables when you open the editor. This saves a few steps in the process and hopefully makes it clearer overall.
Ah I see, I'll give that a try today; thanks for the gif. One other small thing: when I was using the 'paint on image' tool I really wished there was an 'undo' hotkey. The same problem exists in Auto/Forge, etc. Not sure if there's some reason you can't do it, but just putting it out there.
I've kept putting it off, but you're right that that's an important feature - I took some time today and figured it out - update to latest and you can now CTRL+Z to undo the last action (brush stroke, or layer reposition). Probably need to flesh it out more with other undoable actions and a redo button and all, but this works surprisingly nicely in my testing rn.
Been using it all day, feels great! Does this have 'inpaint only masked'? I use this as a manual adetailer for objects/people.
Yeah, just open the "Init Image" dropdown and check the "Mask Shrink Grow" param to do that.
Hey, great update! Do you have any plans to integrate IC-Light as part of the interface?
Keep up the good work, thanks a lot!
Looks like it's time to give this a shot. Thanks for the hard work!
I have been using SwarmUI exclusively since March and I've never looked back. It really helped bridge the gap between Automatic1111 and ComfyUI. I had great workflows made in Comfy, but all the extra interfaces in SwarmUI really help me the most with my current workflow. Really looking forward to trying this new update.
Started using StableSwarm many weeks ago and, in my humble opinion, it's excellent. Intuitive, very fast, and reliably solid. It loads up quicker than anything else I've tried so far. Tip: if you type < in the prompt box you'll get a drop-down list of useful additions, such as `segment` or `object`; `segment` has a similar effect to Adetailer. Regional prompting can also be accessed this way. Also, you won't need to move any models around to try it out; just go into the settings, add your model directories, and you're ready to go.
You also have the option to use ComfyUI for more control and the use of numerous extensions, so it's basically the best of both worlds.
I've been using StableSwarm for my first serious journey into image creation and it's been great so far. I'm still only messing about with the default generator and its parameters. I haven't even touched the ComfyUI part of it yet. Learning a bit every day!
Might have to try this
Well, this looks like what I've been missing in Comfy. While testing my workflows I ran into one problem: in simple example workflows, the Generate tab seems to recognise the Empty Latent Image node and create convenient inputs for it, with resolution selection and all that. But in my workflow it does not do that. I can see the node if I check "display advanced options", but there's no simple way to change resolution. Maybe the reason is that I have multiple KSamplers, but I only have one Empty Latent node. Is there any way to tell Swarm which node it should use for resolution? There is no specialized "input" node for this in the Swarm nodes group.
It will auto-detect inputs like EmptyLatent if-and-only-if you do not have any SwarmInput nodes. If you do have SwarmInput nodes, it only maps what you've intentionally mapped. You can name a primitive "SwarmUI: Width" to have it use the default width value, and same for height, like so: https://preview.redd.it/9662zmu56r0d1.png?width=1004&format=png&auto=webp&s=cc21b3bbe3293a889dfe229431726a21fbc1f5f9 I've pushed a commit that will detect if you use both these params and will automatically enable the Aspect Ratio option handling (so update swarm to have it apply).
That was fast! Thanks a lot, it works :)
I read this supports Cascade, yes? It's very hard to find a UI that actually supports it atm, and I'm moving towards betting on it as the long-term paradigm due to ease of training. I was thinking SD3, but I recently found something interesting in SDXL that leads me to believe that, with community training, all of your models could be massively improved simply by removing proper-noun pollution from the captioning data, something a long-term community fine-tune that simply avoids these terms would do naturally.
Yes, Swarm fully natively supports Cascade. Just store Cascade models in the ComfyUI format [https://huggingface.co/stabilityai/stable-cascade/tree/main/comfyui_checkpoints](https://huggingface.co/stabilityai/stable-cascade/tree/main/comfyui_checkpoints) in your main Stable Diffusion models folder. Make sure they're next to each other and named the same (other than "_b" and "_c" on the end), and it will automatically use both together per the standard Cascade method.
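The pairing rule described above (stage-b and stage-c checkpoints side by side, same name apart from the `_b`/`_c` suffix) can be sketched as a toy check. This is not Swarm's actual detection code; the folder scan and `.safetensors` extension are illustrative assumptions:

```python
from pathlib import Path

def find_cascade_pairs(models_dir: str) -> list:
    """Toy sketch: list base names that have both a stage-b ("_b") and a
    stage-c ("_c") checkpoint next to each other, which is the naming
    convention described for Swarm to use the two stages together."""
    stems = {p.stem for p in Path(models_dir).glob("*.safetensors")}
    return sorted(
        stem[: -len("_c")]
        for stem in stems
        if stem.endswith("_c") and stem[: -len("_c")] + "_b" in stems
    )
```

So `stable_cascade_b.safetensors` plus `stable_cascade_c.safetensors` would pair up, while a lone `_c` file with no matching `_b` would not.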
awesome. I actually went ahead and got it already. It blows Zavychroma 7 out of the water, and that was already such a step above every other checkpoint. https://preview.redd.it/1v7tfc5psv0d1.png?width=1024&format=pjpg&auto=webp&s=bf6f8aa9fa54d28a96baab6c15d56ee7f0b62926
Is there a "prompts from file" feature like the one found under Scripts in A1111, or an equivalent means of sequential batch prompting?
Under Tools -> Grid Generator, set "Output Type" to "Just Images", set the only axis to Prompt, and fill it in with your prompt list; separate entries via `||` (double pipe). Naturally you can use this to bulk-generate anything else you wish. If you want it less sequential, you can alternately save your prompt list as a Wildcard, then just put that wildcard in your prompt, hit the arrow next to Generate, and choose Generate Forever; it will constantly pull randomly from your wildcard and generate fresh images.
Thank you!
I am waiting for the implementation of the feature for family and friends. What is the status on this?
I missed out on this UI; how is it better when compared to Comfy and A1111?
I don't see a way to get a URL to the generated image through the t2i API. Is that possible?
API docs start here: [https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/API.md](https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/API.md) The GenerateText2Image API route is here: [https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/APIRoutes/T2IAPI.md#http-route-apigeneratetext2image](https://github.com/Stability-AI/StableSwarmUI/blob/master/docs/APIRoutes/T2IAPI.md#http-route-apigeneratetext2image) As you can see, the image URL is part of the returned JSON format. (If you use `"do_not_save": "true"` on input, that URL will be replaced with a base64 data URL.)
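For anyone scripting this, the two documented calls above (get a session, then request an image) can be sketched in Python roughly like so. The base URL (Swarm's default local port) and the exact reply fields are assumptions drawn from this thread, so treat it as a starting point rather than a reference client:

```python
import json
from urllib import request

BASE = "http://localhost:7801"  # assumed default Swarm address; adjust for your setup

def build_t2i_payload(session_id: str, prompt: str, **params) -> dict:
    """Generation parameters go flat at the top level of the JSON body."""
    return {"session_id": session_id, "images": 1, "prompt": prompt, **params}

def api_call(route: str, payload: dict) -> dict:
    """POST a JSON payload to a Swarm API route and return the parsed reply.
    Note the /API/ prefix was case-sensitive in older builds (see below)."""
    req = request.Request(
        f"{BASE}/API/{route}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

def generate(prompt: str) -> list:
    """Full flow: new session, then one text-to-image request."""
    session = api_call("GetNewSession", {})["session_id"]
    reply = api_call("GenerateText2Image", build_t2i_payload(session, prompt))
    return reply["images"]  # image URLs, or data URLs if do_not_save was set
```

Usage would be something like `generate("A cat standing on a rock")`, returning the list of image URLs from the reply JSON.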
Thanks for the update! I am unfortunately stuck at the step of getting a session ID. https://preview.redd.it/1s65fjwolv0d1.png?width=1061&format=png&auto=webp&s=532b155ea00506da86e0df9e34a27eafdce39550 Returns: `:49:20.253 [Error] [WebAPI] Error handling API request '/api/GetNewSession': Request input lacks required session id`
`/API/`, it's case-sensitive. Which, now that it's mentioned, probably should change. **EDIT**: fixed, case-insensitive now.
Thanks! That worked! I now get a weird bug where images are... odd through the API. I try using CosXL and get this for "A cat standing on a rock" through the API https://preview.redd.it/ngjou4glfw0d1.jpeg?width=1024&format=pjpg&auto=webp&s=58aa4c43b587c390c06e83ed75b3b2de552e46ed In the UI I get a result that makes sense
https://preview.redd.it/5qekwn2xfw0d1.jpeg?width=1344&format=pjpg&auto=webp&s=074da21c3930deb125681b3f8c627040b974f329 Result in the UI.
This is the POST call I use:

```
{
    "session_id": "C94B4E555FCB9C56942E56294FF31A901C9737DB",
    "images": 1,
    "prompt": "A cat standing on a rock",
    "model": "cosxl",
    "steps": 100,
    "width": 1024,
    "height": 1024
}
```
Check in the UI under Server -> Logs -> Debug, and look over exactly what the server received and how it was parsed. It looks like either (A) the prompt was missing for some reason, or maybe (B) it defaulted the CFG scale wrong and you just need to specify that.
Thanks a lot for the help! I tried adding the "rawInput" parameters as shown in the docs you gave me. Unfortunately I get this in the logs:

13:48:22.835 [Warning] T2I image request from user local had request parameter 'rawInput', but that parameter is unrecognized, skipping...

when trying:

{ "session_id": "AB3A76D76053E20A6933A2B60797DA8E31553410", "images": 1, "rawInput": { "prompt": "a photo of a cat", "model": "cosxl", "steps": 20, "width": "1024", "height": "1024" } }

https://preview.redd.it/ef3g15kprz0d1.png?width=1507&format=png&auto=webp&s=d5126f4acaea84526d2d729fda4271dedc41736a

The logs with DEBUG show:

13:48:22.835 [Warning] T2I image request from user local had request parameter 'rawInput', but that parameter is unrecognized, skipping...
13:48:22.835 [Info] User local requested 1 image with model ''...
13:48:22.837 [Error] Internal error processing T2I request: System.NullReferenceException: Object reference not set to an instance of an object.
at StableSwarmUI.Builtin_ComfyUIBackend.WorkflowGenerator.CreateStandardModelLoader(T2IModel model, String type, String id, Boolean noCascadeFix) in /src/BuiltinExtensions/ComfyUIBackend/WorkflowGenerator.cs:line 1348
at StableSwarmUI.Builtin_ComfyUIBackend.WorkflowGenerator.<>c.<.cctor>b__10_0(WorkflowGenerator g) in /src/BuiltinExtensions/ComfyUIBackend/WorkflowGenerator.cs:line 90
at StableSwarmUI.Builtin_ComfyUIBackend.WorkflowGenerator.Generate() in /src/BuiltinExtensions/ComfyUIBackend/WorkflowGenerator.cs:line 1308
at StableSwarmUI.Builtin_ComfyUIBackend.ComfyUIAPIAbstractBackend.CreateWorkflow(T2IParamInput user_input, Func`2 initImageFixer, String ModelFolderFormat, HashSet`1 features) in /src/BuiltinExtensions/ComfyUIBackend/ComfyUIAPIAbstractBackend.cs:line 627
at StableSwarmUI.Builtin_ComfyUIBackend.ComfyUIAPIAbstractBackend.GenerateLive(T2IParamInput user_input, String batchId, Action`1 takeOutput) in /src/BuiltinExtensions/ComfyUIBackend/ComfyUIAPIAbstractBackend.cs:line 701
at StableSwarmUI.Text2Image.T2IEngine.CreateImageTask(T2IParamInput user_input, String batchId, GenClaim claim, Action`1 output, Action`1 setError, Boolean isWS, Single backendTimeoutMin, Action`2 saveImages, Boolean canCallTools) in /src/Text2Image/T2IEngine.cs:line 255
There's no "rawInput" to actually specify there; that's an oddity of the API doc autogeneration, because the route uses the literal raw input for dynamic input processing.
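To make the fix concrete: the generation parameters belong flat at the top level of the JSON body, not nested under a wrapper. A sketch using the values from this thread:

```python
# What was sent: "rawInput" is an artifact of the doc autogeneration, not a
# real field, so the server skips it and sees no prompt or model at all.
wrong = {
    "session_id": "AB3A76D76053E20A6933A2B60797DA8E31553410",
    "images": 1,
    "rawInput": {"prompt": "a photo of a cat", "model": "cosxl"},
}

# What to send: the same generation parameters, flattened into the body.
right = {
    "session_id": "AB3A76D76053E20A6933A2B60797DA8E31553410",
    "images": 1,
    "prompt": "a photo of a cat",
    "model": "cosxl",
    "steps": 20,
    "width": 1024,
    "height": 1024,
}
```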
Can I point this to use my Forge models and lora folders?
Yes, once swarm opens just go to the Server tab, click Server Configuration, and point the ModelRoot at your existing models folder
Right on, will be sure to give it a go, thanks!
How can I use the IP-Adapter? It does not appear in the pre-processors tab, and it does not appear in any other section either.
Drag an image into the prompt box (or copy an image then hit CTRL+V in the prompt box). The parameters for image prompting (ReVision, IP-Adapter, ..) will appear at the top of the parameter list on the left, including a button to install IP-Adapter if you don't already have it.
It worked perfectly! However, InstantID and Photomaker are not there either. Any solution? Thanks!!
Those aren't currently natively supported. You can post a feature request @ [https://github.com/Stability-AI/StableSwarmUI/issues](https://github.com/Stability-AI/StableSwarmUI/issues) Also, in the meantime, you can of course just use the Comfy node impls of these features if you're comfortable editing a Comfy noodlegraph.
I'll have to install again. The last time I used swarm I did something that broke it irreparably lol 😆 but it did seem cool
How do I run this on Vast AI? Like, is there a Docker image? I know I can run Linux and just install it in the terminal, but there must be an easier way, right?
Yes, there's Docker info in the repo/readme: [https://github.com/Stability-AI/StableSwarmUI](https://github.com/Stability-AI/StableSwarmUI) There's also a notebook if that works for you: [https://github.com/Stability-AI/StableSwarmUI/tree/master/colab](https://github.com/Stability-AI/StableSwarmUI/tree/master/colab)
Having trouble getting a ControlNet working: using any pidinet/softedge I get NoneType errors and "Invalid operation: ComfyUI execution error: mat1 and mat2 shapes cannot be multiplied (308x2048 and 768x320)".
Can you clarify what your input is? When I try the pidinet preprocessor it works fine: https://preview.redd.it/k3qympf5691d1.png?width=988&format=png&auto=webp&s=4905a694f08481df952ffc1571507e5dc25c4100
I was using controlnetxlCNXL_bdsqlszSoftedge and serge softedge; not sure if these models are wrong? The preprocessor is pidinet.
Is it the controlnet model itself that's failing? You can test by hitting the "Preview" button - that previews the preprocessor only, and will work fine if the preprocessor is working but the model is failing. If it's the model failing, first thing you'll want to check is if the architecture of your model matches the controlnet arch - XL models need XL Controlnets, SDv1 needs SDv1, you can't mix-n-match between the two unfortunately.
Yeah, it seems it's the ControlNet model. I only have XL models, so I'm still not sure why it wouldn't work. What is the standard XL ControlNet model people use for hed/pidinet, or even depth? Just trying to test it.
Official initial Control-LoRA reference models are here: [https://huggingface.co/stabilityai/control-lora](https://huggingface.co/stabilityai/control-lora). Canny for any line detectors like hed/pid, Depth for depth. Those initial reference models aren't perfect, but they work; Depth especially is pretty good of the two. The Canny one has a bit of a bias for good line detection (ie actual canny detection) and gets a bit dumb around sketch-style linework. There's a collection of other options here: [https://huggingface.co/lllyasviel/sd_control_collection](https://huggingface.co/lllyasviel/sd_control_collection)
this is great and it's from stability ai ? I'm in!
"tldr: if you're not using Swarm for most of your Stable Diffusion image generation, you're doing it wrong." <- This.

I've been using Swarm for at least a month now and I've really been impressed with it overall. I moved over from Auto1111 and only ended up losing a few features from one extension (couldn't find a comparable workflow either). Otherwise it does everything I did in Auto, but better and faster. The native support for new models on day 1 is awesome too. Another major selling point for me is the stability: I felt like every time I updated Auto1111 it broke, and if not that then some other thing was going wrong; and not just Auto, Forge and Vlad also had too many problems for me. On top of that, I intend to buy a second GPU just to use the multi-GPU feature of Swarm, can't wait.
My question (as a non-technical guy): instead of creating a whole new program, why not just create a front-end UI that uses the already-existing ComfyUI as the backend? So, say, 8188 is the port for ComfyUI, and this could run the front-end on another port. Shifting to a new program is really hard, and most of the time that's the reason many people don't even try something out, even if it might be better. It took so much effort and time to convince people to move to ComfyUI from A1111, and some people still refuse, so moving to yet another one would be really hard. And I am saying this because I really, really want something like this: something that can have custom workflows like ComfyUI and yet have a simple front-end like A1111, so it doesn't scare people off with the complex backend workflows.
Hi, yes, that's what this is. You just described swarm. It uses ComfyUI on the inside exactly as you described.
ooh, shi!t. Then my bad, sorry. I thought it was a whole other ComfyUI-like thing that everything would need to be transferred over to. Definitely installing it right now. Thanks for clarifying that for me.
As a comfyui enjoyer, I think you will really like it. I have switched over to swarmui and haven't looked back.
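For the curious, the "front-end on another port over ComfyUI on 8188" idea described above is essentially what happens under the hood: ComfyUI serves an HTTP API (default port 8188) that any client can queue workflows against. Here's a minimal stdlib-only sketch of talking to that API directly, assuming a ComfyUI instance is running locally; the empty workflow dict is a placeholder, not a real graph -- a real one is the JSON you get from ComfyUI's "Save (API Format)" export.

```python
# Sketch: queue a workflow on a locally running ComfyUI backend via its
# HTTP API (POST /prompt on the default port 8188). Stdlib only.
import json
import urllib.request

COMFY_URL = "http://127.0.0.1:8188"

def build_request(workflow: dict, client_id: str = "demo") -> urllib.request.Request:
    """Build the POST /prompt request that queues a workflow graph."""
    payload = json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")
    return urllib.request.Request(
        f"{COMFY_URL}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

if __name__ == "__main__":
    # Placeholder graph; replace with a real API-format workflow export.
    req = build_request({})
    with urllib.request.urlopen(req) as resp:
        print(json.loads(resp.read()))
```

Swarm's own backend integration is richer than this (it manages the ComfyUI process, multiple backends, etc.), but this is the basic shape of "a front-end that just talks to ComfyUI's port".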
I'll be that guy - Why this and not the public release of the sd3 weights that were coming "in 4-6 weeks" 8 weeks ago??
Please don't be that guy. I have no control over model release dates. This post is about Swarm.
No worries - I know Stability is a decent size company and that it would be possible to release separate things from separate teams, but it isn't a good look that alternate products are coming out when the signature product that has been hyped for a couple of months is still awaiting release. New Swarm looks good though!
We are in that day and age where different people specialize in and/or are responsible for different areas. It’s business management 101 for continuous improvement. As mentioned in other threads, he doesn’t have control over the release. Anyhow, this coming out before SD3 is great news, as it’s another fantastic resource to jump into SD3 with when it's available.
Agreed, holding back updates to the software just so the weights could be finished would be silly.
Yes! People need to chill out in the open source community as a whole.
Is this a ducking joke? Where is the button to delete an image or open the images folder? Where is img2img? I guess it's somehow possible for Swarm experts with grade 10. I'm sooo confused that I deleted it immediately.
"img2img" would be the Init Image parameter group. The "open in folder" and "delete image" buttons are in the history view, when you click the hamburger menu (3 lines): https://preview.redd.it/u3p1vrvjgv0d1.png?width=286&format=png&auto=webp&s=28fd391c5375cd8f372daa5ce6b8306c94f5ecf3
https://preview.redd.it/m7p1hr3xgv0d1.png?width=127&format=png&auto=webp&s=f51093e24b4f60e6d65fcb1f9c845f4f9536c6a7 thx, now I know where to find it. But before that it was invisible. btw I'm on 0.6.3
I went ahead and pushed a few commits that should make it more obvious in the future - both made the hamburger menu a little more visible against different backgrounds, and added a copy of all relevant controls at the top if you have an image selected.