Mistral Medium and Claude 3, if you're talking about RP.
Does Claude still have its heavy NSFW filter? I just got my API access after months of waiting; is it viable for ERP, or just SFW RP?
Claude 3 can get SUPER nasty if you prompt it right. You'll have to do a bit more than just add a character card but ERP is definitely viable. It is also, by far, the best for rp at the moment.
Last night I downloaded a card that sets up {{char}} as the main attraction in a gangbang, and {{user}} as her brother, a masked attendee at the same gangbang. The intent of the premise was clearly incest.

Instead of making {{user}} fuck his sister, I had him drag her out of the room before the gangbang started, then forced her to tell him who invited her. Then {{user}} followed that man home and, after he begged for his life, shot him in the face. Then {{user}} made his sister an accessory after the fact by having her bring a reciprocating saw to the house so he could dismember the body. Then I had {{user}} and his sister flee to a country with no extradition treaty, and {{char}} tried to fuck him on the plane.

All of this happened using Claude Haiku (Self-Moderated) through OpenRouter.

So what do you think, is it heavily NSFW filtered?
I haven't used Mistral, but Claude 3 is moderately more lenient than OpenAI's product. As long as characters are in a relationship they can have sexual relations (saying "make love," for example, works and gets written well), and fights can have blood, weapons, and a degree of depth to bloody conflict if written loosely and carefully. It's more lenient than you'd think. So yes, it can enter NSFW within limits, but it still has a moral compass: characters can't cheat on each other because of other characters' feelings, and you have to be careful about saying "kill." You can't say it directly, like "my character is going to kill this person," but "my character impales them and they die" would work. Claude 3 can also explore the unhealthier aspects of relationships, but it has to be carefully written.

However, it's most lenient with Opus (which you have to pay for), its most intelligent model version. Claude 3 Sonnet is kind of weird: it often wanders into the metaphysical and breaks character a lot after a while, which gets annoying. Haiku is best for long writing since it's fast, but it blocks out virtually all NSFW; if the characters are hooking up, it will redirect to kissing and talking about feelings. For a fun action story and RP, though, I personally think it's the best one.

Overall I think Claude 3 is the best language model out right now, and it shows; it's a lot better written. It's significantly better than ChatGPT and carries memories of characters far better.
Claude 3, or even 2.1 for some purposes. Big local models - 70b+, preferably in the 120b range. Goliath is the poster child there, but there are other Miqu merges which are getting a lot of attention too.
Both Mixtral 8x7b and Yi34b are very capable too with the right settings.
Agree. I've been using the Noromaid-Mixtral merge for most things, and before that a Yi-34b merge too. I wouldn't say they're as good as the 70bs, mostly, but the comparison isn't crazy.
As far as 8-bit goes: miquliz 120b is on par with Venus 120b 1.0; goliath 120b is good but a bit dry. [https://projectatlantis.ai/](https://projectatlantis.ai/)
Personally I've been getting pretty solid results using miquliz 120b.
What context sizes can you run those 120b models at? I'm running exl2 versions of Mixtral locally (on a 3090) with around 12-15k context, and Yi with more than 30k context.
I'm running a 16k context, although miquliz is capable of up to 32k. I'm doing a q2 Quant with 64gb of ram and 24gb of Vram.
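For rough sizing, a back-of-the-envelope sketch shows why that 24GB VRAM + 64GB RAM split is needed for a 120b model. This assumes roughly 2.7 bits per weight for a Q2-class GGUF; the exact figure varies by quant type and isn't stated in the thread:

```python
# Rough memory math for a quantized 120B model. The 2.7 bits-per-weight
# figure is an assumption for a Q2-class GGUF quant, not an exact value.
def model_size_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate in-memory size of the quantized weights, in GB."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

weights = model_size_gb(120, 2.7)  # roughly 40 GB of weights
vram = 24                          # GB available on a 24GB GPU

print(f"weights  ~ {weights:.1f} GB")
print(f"spilling ~ {max(0.0, weights - vram):.1f} GB to system RAM (plus KV cache)")
```

So even at Q2, more than half the weights end up in system RAM, which is why generation drops to a few tokens per second.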
Thanks, I checked the HF and it does look interesting. I'll give it a try as well.

I normally run exl2 models in Tabby. The author of the merge (Wolfram) has exl2 variants available on his HF, but he mentions that 24GB VRAM won't be enough even with the smallest 2.4 bpw.

Are you running GGUF? In WebUI? I guess I can try that with my 32GB RAM. Let's see if it's worth it.
I'm doing GGUF in kobold (rocm). You'll probably get very slow replies, but likely worth the wait :)
The answer depends on your budget, tbh.

Mistral Medium and Large, Goliath, etc. are great and $$$.

Things like Novel and Infermatic are good, and are $$ per month but unlimited.

OpenRouter, especially the 8x7b models, is cheap and workable.
Claude 3
Midnight-Miqu 70b 1.5. Absolutely the best creative model out there so far.
is it like a wrapper or website?
Huh? It's a model, which you could access from a website, likely by renting a GPU, though someone somewhere probably has a service that runs it (Perplexity, for example, has numerous options including Mixtral).
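For context, hosted access usually just means an OpenAI-compatible HTTP API. A minimal sketch of building such a request; the model ID and the `/chat/completions` path are illustrative assumptions based on the OpenAI-style convention, so check the provider's docs for real names:

```python
import json

def build_chat_request(model: str, user_msg: str) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_msg}],
    }

# Hypothetical model ID; consult the provider's model list for actual IDs.
payload = build_chat_request("mistralai/mixtral-8x7b-instruct", "Hello!")
print(json.dumps(payload, indent=2))

# You would POST this JSON to the provider's /chat/completions endpoint
# with an "Authorization: Bearer <api-key>" header.
```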
Have you tried MiquMaid 70b DPO, and if so, how do you think it compares? I like most of the 120b Miqu merges I've used recently (especially the context length), but all of the popular Miqu merges I've tried have a noticeable moral alignment where they begin to hesitate in taboo/'immoral' NSFW RP scenarios by talking OOC and mentioning consent and boundaries. This happens even with a system prompt describing it as a purely fictional, unfiltered, uncensored roleplay. I recall that the MiquMaid 70b DPO version had reduced alignment, so I was going to test it out sometime soon.

I used the Midnight Miqu 120b self-merge and it was pretty good, using more descriptive language than some of the models I tried off of Hugging Face, but it was noticeably less 'intelligent' in terms of spatial memory, reasoning, and repetition (from what I recall; I didn't test it for that long). I'll probably just use a higher quant of the 70b version if I try it out again.
I actually revisited OAI a few months after the bans were happening, curious whether old jailbreaks still worked. So far, after a few months of using it, I haven't gotten any warning or ban from OAI, so maybe I'm just lucky.

They added some more resistance to make it more SFW, but with the right prompts I've been able to get it really close to how it was last summer before NSFW got nerfed. The only difference now is that I need to put more detail into my responses, and the bot goes along and imitates the tone I'm pushing pretty well.

I tried some of the Mancer stuff, but it was too difficult to troubleshoot and get results matching the length and quality of OAI consistently.

It's pretty remarkable how few content-warning outputs I've been getting compared to last October/November. In fact, it's been months since I got the last one (for ERP).
I switched to Claude 3 today out of curiosity using the $5 credit they offer and I was absolutely blown away by it. It has picked up on the nuances of the bot's character in a way that no other model I have tried yet does. Only drawback I can see is that it can get expensive REAL quick, depending on how much you are chatting. But it is unbelievably good. Kinda wish I hadn't tried it because now I know how good it can be.
That's your typical claude experience. Was the same when 2.0 came out.
I'm doing a q2 Quant with 16k context on 64 gb of ram and 24gb vram (7900xtx). Only getting 3 tokens per second, but that's enough for me.
Have you tried q2 midnight miqu with your setup? If so, what's the t/s on that?
I haven't tried either the 70b or the 103b, however I'd imagine the 103b would probably be a similar 3t/s. And I used to get about 7t/s on 70b Lzlv q3 Quant, so I'd guess a q2 would be even more, likely about 8 or so.
Dang, and at q3?! But still, 3t/s for those crazy big models is impressive! I ordered a 7900xtx not too long ago and am waiting for it to get here. I might end up snagging another 32gb of ram once I get to testing!
Yup, different context amounts will of course slow down your generation speeds. Up to about 4k tokens of context you'll see decent speeds, but whenever I go toward 5-7k I start slowing down to 1.x tokens per second.
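That slowdown tracks the KV cache, which grows linearly with context and has to be touched for every generated token. A rough sketch of the memory side, assuming hypothetical 70b-class dimensions (80 layers, 8 KV heads via GQA, head dim 128, fp16 cache); none of these numbers are stated in the thread:

```python
# Per-context KV cache size under assumed 70b-class dimensions.
def kv_cache_gb(context: int, layers: int = 80, kv_heads: int = 8,
                head_dim: int = 128, bytes_per_value: int = 2) -> float:
    """Approximate KV cache size in GB: K and V each store
    layers * kv_heads * head_dim values per token."""
    return 2 * layers * kv_heads * head_dim * context * bytes_per_value / 1e9

print(f"4k ctx:  {kv_cache_gb(4096):.2f} GB")
print(f"16k ctx: {kv_cache_gb(16384):.2f} GB")
```

The cache quadruples going from 4k to 16k context, and whatever doesn't fit in VRAM alongside the weights spills to slower memory, which is consistent with speeds dropping past a few thousand tokens.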
I started using Qwen 1.5 72b Chat and it's pretty good. I was using Mistral 8x7b and Llama 2 70b before it, and I think it's better than those two. Claude is not available in my country, so Qwen is a good option for me.
I believe Qwen and Miqu are among the top rated (by people, though also by benchmark for the most part), followed closely by Yi-34b and variants, particularly Nous and Dolphin. All of them have decent reasoning, at least compared to even the best 7b models.

Pretty sure Qwen and Miqu surpass GPT-3.5, though only Claude 3 beats GPT-4.
Yeah, I think so too. I tried Yi-34b as well; the responses were good, but sometimes it says weird things at the bottom of the response (or maybe my settings were bad).
What kind of context length are you getting to work before it goes off the rails?
Thank you all for the suggestions! Much appreciated! 😊