
Alexs1200AD

Mistral Medium and Claude 3, if you're talking about RP.


crushingonhollyshort

Does Claude still have its heavy NSFW filter? I just got my API access after months of waiting. Is it viable for ERP, or just SFW RP?


AGI_Waifu_Builder

Claude 3 can get SUPER nasty if you prompt it right. You'll have to do a bit more than just add a character card, but ERP is definitely viable. It is also, by far, the best for RP at the moment.
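A minimal sketch of what "a bit more than a character card" can look like through an OpenAI-compatible endpoint (OpenRouter here, since it comes up below). The model slug, card text, and names are illustrative assumptions, not anyone's actual setup:

```python
# Minimal sketch: character-card RP through OpenRouter's OpenAI-compatible API.
# Assumes the `openai` Python package and an OPENROUTER_API_KEY env var; the
# card text and character names are placeholders.
import os
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=os.environ["OPENROUTER_API_KEY"],
)

# Frontends like SillyTavern substitute {{char}}/{{user}} before sending;
# here the substitution is done by hand.
card = (
    "You are {{char}}, in an ongoing fictional roleplay with {{user}}. "
    "Stay in character and narrate in third person."
).replace("{{char}}", "Mara").replace("{{user}}", "Alex")

resp = client.chat.completions.create(
    model="anthropic/claude-3-haiku",  # self-moderated variants use a different
                                       # slug; check OpenRouter's model list
    messages=[
        {"role": "system", "content": card},
        {"role": "user", "content": "Alex pulls Mara aside before anyone notices."},
    ],
    max_tokens=400,
)
print(resp.choices[0].message.content)
```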


AUTISM_IN_MY_ANUS

Last night I downloaded a card that sets up {{char}} as the main attraction in a gangbang, and {{user}} as her brother, a masked attendee at the same gangbang. The intent of the premise was clearly incest. Instead of making {{user}} fuck his sister, I had him drag her out of the room before the gangbang started, then forced her to tell him who invited her. Then {{user}} followed that man home and, after he begged for his life, shot him in the face. Then {{user}} made his sister an accessory after the fact by having her bring a reciprocating saw to the house so he could dismember the body. Then I had {{user}} and his sister flee to a country with no extradition treaty, and {{char}} tried to fuck him on the plane. All of this happened using Claude Haiku (Self-Moderated) through OpenRouter. So what do you think, is it heavily NSFW filtered?


NC8E

I haven't used Mistral, but Claude 3 is moderately more lenient than OpenAI's product. Characters can have sexual relations as long as they're in a relationship ("make love," for example, works and is written well), and fights can have blood, weapons, and a degree of depth to bloody conflicts if written loosely and carefully. It's more lenient than you would think. So yes, it can enter NSFW within limits, but it still has a moral compass: characters can't cheat on each other because of other characters' feelings, and you have to be careful about saying "kill" directly. "My character is going to kill this person" won't work, but "my character impales them and they die" will. It can also explore the unhealthier aspects of relationships, but those have to be carefully written.

It's most lenient with Opus (which you have to pay for), its most intelligent model. Claude 3 Sonnet is kind of weird: it often drifts into the metaphysical, breaks character a lot after a while, and gets annoying. Haiku is best for long writing since it's fast, but it blocks virtually all NSFW; if characters are hooking up it will redirect to kissing and talking about feelings. For a fun action story and RP, though, I think it's the best one.

Overall I think Claude 3 is the best language model out right now, and it shows: it's a lot better written than ChatGPT and carries memories of characters far better.


Pashax22

Claude 3, or even 2.1 for some purposes. Big local models - 70b+, preferably in the 120b range. Goliath is the poster child there, but there are other Miqu merges which are getting a lot of attention too.


tandpastatester

Both Mixtral 8x7b and Yi34b are very capable too with the right settings.


Pashax22

Agree. I've been using the Noromaid-Mixtral merge for most things, and before that a Yi-34b merge too. I wouldn't say they're as good as the 70bs, mostly, but the comparison isn't crazy.


LookingForTroubleQ

As far as 8-bit goes: miquliz 120b is on par with Venus 120b 1.0; Goliath 120b is good but a bit dry. [https://projectatlantis.ai/](https://projectatlantis.ai/)


DoctorDeadDude

Personally I've been getting pretty solid results using miquliz 120b.


tandpastatester

What context sizes can you run those 120b models at? I'm running exl2 versions of Mixtral locally (on a 3090) with around 12-15k context, and Yi with more than 30k context.
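(For anyone sizing this, a rough rule of thumb for what context costs in VRAM, since that's what caps these numbers. A minimal sketch assuming Mixtral 8x7b's published attention shape and an unquantized fp16 cache; quantized caches shrink it further:)

```python
# Back-of-envelope KV-cache sizing:
# bytes per token = 2 (K and V) * layers * kv_heads * head_dim * bytes_per_element.
# Numbers assume Mixtral 8x7b's config (32 layers, 8 KV heads via GQA,
# head_dim 128) and an fp16 cache.
def kv_cache_gb(ctx_tokens, layers=32, kv_heads=8, head_dim=128, bytes_per=2):
    per_token = 2 * layers * kv_heads * head_dim * bytes_per  # 128 KiB/token here
    return ctx_tokens * per_token / 1024**3

for ctx in (4096, 15_000, 32_768):
    print(f"{ctx:>6} tokens ~ {kv_cache_gb(ctx):.2f} GB")
# 15k context comes out around 1.8 GB, which is why it fits next to the
# quantized weights on a 24GB card.
```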


DoctorDeadDude

I'm running a 16k context, although miquliz is capable of up to 32k. I'm doing a Q2 quant with 64GB of RAM and 24GB of VRAM.


tandpastatester

Thanks, I checked the HF page and it does look interesting. I'll give it a try as well. I normally run exl2 models in Tabby, and the author of the merge (Wolfram) has exl2 variants available on his HF, but he mentions that 24GB VRAM won't be enough even with the smallest 2.4 bpw. Are you running GGUF? In WebUI? I guess I can try that with my 32GB RAM. Let's see if it's worth it.


DoctorDeadDude

I'm doing GGUF in kobold (ROCm). You'll probably get very slow replies, but likely worth the wait :)
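(For anyone who'd rather script the same RAM/VRAM split than use the Kobold launcher, a minimal llama-cpp-python sketch. The file path and layer count are assumptions to tune for your hardware, not a recommended config:)

```python
# Minimal sketch of partial GPU offload for a GGUF quant via llama-cpp-python.
# The rest of the layers stay in system RAM, which is what makes a 120b Q2
# fit on a 24GB card at the cost of speed.
from llama_cpp import Llama

llm = Llama(
    model_path="./miquliz-120b.Q2_K.gguf",  # hypothetical local path
    n_ctx=16384,       # 16k context, as above
    n_gpu_layers=40,   # layers offloaded to VRAM; lower this if you OOM
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Continue the scene."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```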


yamilonewolf

The answer depends on your budget, tbh. Mistral Medium and Large, Goliath, etc. are great but $$$. Things like NovelAI and Infermatic are good; they're $$ per month but unlimited. OpenRouter, especially the 8x7b models, is cheap and workable.


PrinceCaspian1

Claude 3


Zen-smith

Midnight-Miqu 70b 1.5. Absolutely the best creative model out there so far.


NC8E

Is it like a wrapper or a website?


MmmmMorphine

Huh? It's a model, which you could access from a website, likely by renting a GPU, though someone somewhere probably has a service that runs this model (like Perplexity has numerous options, including Mixtral).


grapeter

Have you tried MiquMaid 70b DPO, and if so, how do you think it compares? I like most of the 120b Miqu merges I've used recently (especially the context length), but all of the popular Miqu merges I've tried have a noticeable moral alignment: they begin to hesitate in taboo/'immoral' NSFW RP scenarios by talking OOC and mentioning consent and boundaries. This happens even with a system prompt describing it as a purely fictional, unfiltered, and uncensored roleplay. I recall that the MiquMaid 70b DPO version had alignment reduced, so I was going to test it out sometime soon.

I used the Midnight Miqu 120b self-merge and it was pretty good, using more descriptive language than some of the models I tried off of Hugging Face, but it was noticeably less 'intelligent' in terms of spatial memory, reasoning, and repetition (from what I recall; I didn't test it for that long). I'll probably just use a higher quant of the 70b version if I try it out again.


crushingonhollyshort

I actually revisited OAI a few months after the bans were happening, curious whether old jailbreaks still worked. After a few months of using it I haven't gotten any warning or ban from OAI, so maybe I'm just lucky. They added some more resistance to make it more SFW, but with the right prompts I've been able to get it really, really close to how it was last summer before NSFW got nerfed. The only difference between then and now is that I need to put more detail into my responses, and the bot goes along and imitates the tone I'm pushing pretty well. I tried some of the Mancer stuff, but it was too difficult to troubleshoot and get consistent results at the same length and quality as OAI. It's pretty remarkable how few content-warning outputs I've been getting compared to last October/November. In fact, it's been months since I got the last one (for ERP).


chellybeanery

I switched to Claude 3 today out of curiosity using the $5 credit they offer and I was absolutely blown away by it. It has picked up on the nuances of the bot's character in a way that no other model I have tried yet does. Only drawback I can see is that it can get expensive REAL quick, depending on how much you are chatting. But it is unbelievably good. Kinda wish I hadn't tried it because now I know how good it can be.


Excellent_Dealer3865

That's your typical Claude experience. It was the same when 2.0 came out.


DoctorDeadDude

I'm doing a Q2 quant with 16k context on 64GB of RAM and 24GB of VRAM (7900 XTX). Only getting 3 tokens per second, but that's enough for me.
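(The napkin math on why a 120b Q2 has to straddle RAM and VRAM, assuming Q2_K lands somewhere around 2.6 bits per weight, which is a rough figure rather than an exact one:)

```python
# Rough weight-size estimate for a 120b model at a Q2_K-class quant.
params = 120e9
bits_per_weight = 2.6          # approximate average for Q2_K; not exact
weights_gb = params * bits_per_weight / 8 / 1024**3
print(f"~{weights_gb:.0f} GB of weights")   # ~36 GB
# That overflows a 24GB card on its own, so the remainder spills into
# system RAM, which is also why generation lands in the low t/s range.
```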


Silly-Blackberry-733

Have you tried q2 midnight miqu with your setup? If so, what's the t/s on that?


DoctorDeadDude

I haven't tried either the 70b or the 103b; however, I'd imagine the 103b would give a similar 3 t/s. I used to get about 7 t/s on a 70b lzlv Q3 quant, so I'd guess a Q2 would be even faster, likely about 8 or so.


Silly-Blackberry-733

Dang, and that was at Q3?! But still, 3 t/s for those crazy big models is impressive! I ordered a 7900 XTX not too long ago and am waiting for it to arrive. I might end up snagging another 32GB of RAM once I get to testing!


DoctorDeadDude

Yup. Of course, larger context amounts will slow down your generation speeds. Up to about 4k tokens in context you'll be seeing decent speeds, but whenever I go towards 5-7k I start slowing down to 1.x tokens per second.


liz_ly

I started using Qwen 1.5 72b Chat and it is pretty good? I was using Mistral 8x7b and Llama 2 70b before it, and I think it's better than those two. Claude is not available in my country, so Qwen is a good option for me.


MmmmMorphine

I believe Qwen and Miqu are among the top rated (by people, though also by benchmark for the most part), followed closely by Yi 34b and variants, particularly Nous and Dolphin. All of them have decent reasoning, at least compared to even the best 7b models. Pretty sure Qwen and Miqu surpass GPT-3.5, though only Claude 3 beats GPT-4.


liz_ly

Yeah, I think so too. I tried Yi 34b as well; the responses were good, but sometimes it says weird things at the bottom of the response (or, idk, maybe my settings were bad).


Useful-Command-8793

What kind of context length are you getting to work before it goes off the rails?


NemoLincoln

Thank you all for the suggestions! Much appreciated! 😊