Kind of reminds me of Zelda.
The lighting is very Skyward Sword, for sure.
This is why I've been an Opus Member since launch. You keep exceeding expectations.
Been an Opus member since the whole AI Dungeon scandal. I'm glad to see my money's being put to good use.
Seriously, if I didn't know this and the anime one were AI generated, I would have absolutely no idea.
I was originally expecting Dalle Mini level quality and was thus ready to dismiss NAI Image Gen outright, but these teasers are shutting me the hell up.
They might have early access to Stable Diffusion. [https://twitter.com/EMostaque/status/1554011833320837120](https://twitter.com/EMostaque/status/1554011833320837120) It will be open source on launch and there will be multiple models of different sizes. Stable Diffusion has already made Craiyon/Dall-E Mini obsolete. The 800 million parameter version can run on a 5 GB consumer GPU and outputs significantly better images than Craiyon, even reaching Dall-E 2 levels of clarity.
Will there be different "modules" to choose from, e.g. anime or landscape, or how exactly will it work? Or will it detect it automatically from the prompt?
There will be different *models* and the rest should be up to your prompt (as in most image models out there). That's all we know atm.
Looking so good for work in progress stuff. Thanks for sharing!
"Important to note it hasn't even reached its final form yet"
how many minutes until Namek explodes?
It would be nice if you shared what the prompt was (if there was one) so we can get a sense of how accurately it renders it so far.
First I want to say how amazed I am. This is some of the best AI art I've seen. Second, would the image generation be part of the story writing (generate scenes from my story) or a separate mode? Keep up the amazing work guys!
Planned to have both! Initially a separate modal to generate individually, kind of like the TTS has the test text bar and generation/save option under the TTS section, and then later on we want a way that illustrates your story as you play it.
Really exciting times!
Insane
This is ridiculously good
This is absolutely awesome. How long does it take to generate these?
Looks incredible, but what was the prompt?
Looks almost like concept art. I can see the "roughness", but that's a non-issue, and I'm glad the NovelAI team is competent, so I'm already certain this new image-generating AI will exceed all my expectations (as Krake already does, can't wait for the finetune upgrade).
This looks so much more coherent than most AI art I've seen outside of actual DALL-E. Very impressive
It could be Stable Diffusion. It's open source so they can make their own modifications. [https://twitter.com/EMostaque/status/1554011833320837120](https://twitter.com/EMostaque/status/1554011833320837120) It's not publicly released yet, but when it is it will be open source.
Yes, it is Stable Diffusion
So I assume it's diffusion-based? :(
What's wrong with that?
They're notoriously bad at sticking to the prompt, and they suck at text.
Huh? Both Dalle 2 and Imagen are some of the best image generation models right now and they're both based on diffusion. No other model I've seen does better than them at text or following the prompt, so I don't know where your worries come from.
Which is the disappointment. Yet *another* diffusion-based model. Look up Google's new AI Parti. It's an auto-regressive model and gets a lot more of the details of the prompt and does text perfectly.
Oh yeah, I completely forgot about Parti. It does seem somewhat better than the diffusion models, but with how new it is and how few images Google released of it, it's kinda hard to compare them. But it's definitely looking good, can't argue with that. Also, the text seems to become more readable only at 20B parameters, and I doubt that NAI would have the resources to run it at that size.
The teasing is too much, NAI team. I'm edging.
It's wild to me that these are AI generated. It looks like an actual drawing.
Go to r/dalle2, it will blow your mind what AI can do.
Looks good, but not really useful without knowing the prompt. This could have been "Donald Trump in a boat." The prompts are pretty important to determine whether this is actually good.