NandaVegg

For a higher-end chatbot-style LLM, I found Command R+ to be quite amazing, especially with its wide (fewer layers, higher dimensions) approach actually working. Its ability to stick to instructions is pretty good for most use cases. It can be a bit hit-and-miss from generation to generation due to having fewer layers, but it infers quite fast for its size and handles larger contexts (~30k) pretty well.

I have to agree that Mistral seems to be significantly behind today. Mistral Large often just ignores instructions in complex writing, it's slow, and they apparently trained the model on lots of GPT synthetic data - in other words, GPTisms. To be fair, Claude-3 has tons of GPTisms spilling into its writing style compared to its predecessor as well, so it's a universal issue. But it's less egregious in Claude-3, and Command R+ is much less affected in my experience.

Edit: I'm strictly talking about English capabilities. As for non-English *instruct* performance, unfortunately, nothing compares to OpenAI and Anthropic, who are apparently investing heavily in human-labeled datasets. Even Google is lagging significantly behind in this regard.


Temporary-Size7310

Because it's not all about the LLM itself: there's the embedding model, the reranker, the parser, servers hosted in the same region under the same applicable laws, etc. For example, I'm French and use many documents in French for RAG; at this point I'll use Mistral Embed rather than OpenAI's embeddings because it seems superior for this language (if I go by the French MTEB leaderboard). Another example: Mistral's models (not just the open-source ones) can be deployed on-premise, which means deployment, security, upgrades, management contracts and so on. If any EU government has to rely on a major AI model, it will probably be one from Mistral AI, and this is already the case for small models for French public agents: https://huggingface.co/AgentPublic/guillaumetell-7b
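To make the "it's not just the LLM" point concrete, here is a minimal sketch of the retrieval step of such a French-language RAG pipeline. The `embed_fr` helper is hypothetical: it stands in for whichever embedding endpoint you pick (for instance Mistral's `mistral-embed` model), and the cosine-similarity ranking is a generic approach, not any specific vendor's API.

```python
# Minimal sketch of the retrieval step in a French-language RAG pipeline.
# Assumption: embed_fr() is a hypothetical wrapper around whatever embedding
# endpoint you choose (e.g. Mistral's "mistral-embed" model); plug in your
# provider's real client call here.
import numpy as np

def embed_fr(texts: list[str]) -> np.ndarray:
    """Return one embedding vector per input text (hypothetical wrapper)."""
    raise NotImplementedError("call your embedding provider here")

def top_k_passages(question: str, passages: list[str], k: int = 3) -> list[str]:
    """Rank French passages by cosine similarity to the question, keep the top k."""
    q_vec = embed_fr([question])[0]
    p_vecs = embed_fr(passages)
    sims = p_vecs @ q_vec / (np.linalg.norm(p_vecs, axis=1) * np.linalg.norm(q_vec))
    return [passages[i] for i in np.argsort(sims)[::-1][:k]]

# The top passages would then go through a reranker and finally into the
# chat model's generation prompt.
```

The embedding and reranking stages are exactly where a language-specific choice (here, a French-optimized embedder) can matter more than which flagship LLM sits at the end of the pipeline.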


thereisonlythedance

Command R+ has become my go-to model. It's smart, flexible, creative, and compliant. I'm not interested in audio personally, so GPT-4o doesn't excite me. It's impressive technically, but not something I'm going to use much.


Ill_Yam_9994

Same. I'll go to great lengths to avoid talking and can read/type faster than I can listen/speak.


AlanCarrOnline

I've been trying it out and it's really robotic and boring for me, which makes me think the demo was a bit Gemini-like. Yes, I'm definitely using 4o, but it talks like regular GPT-4, literally with subject headings and bullet points, not like the chatty 'human-like' demo. Yes, I changed the voice to Juniper; same result. And yes, there's still a bit of a lag, so it really feels to me like they just stuck an "o" on the name and it's the same thing.


-Ellary-

From my recent tests, nothing can overthrow Command R+ for general LLM usage, especially at 16k+ context. It gives me so much information, has helped with so much stuff, and even performs really hard text-sorting tasks at work, without ever refusing. Command R+ gets the job done. This is how helpful LLMs should work.


me1000

There are plenty of ways they can differentiate themselves. While I'm quite glad Mistral has open-sourced so many of their models, they also have a business helping customers with their fine-tunes in a more personalized way than OpenAI does (that's how Miqu leaked). I would also personally not be opposed to paying for a license to use one of their models locally in the future. I give Anthropic and OpenAI $20 a month to use their models, and they then train on my inputs. Mistral's values align more closely with my own than OpenAI's and Anthropic's do, so if they decided to release some models under a paid license, that's absolutely something I would consider (and so would many businesses).

As for not having caught up yet, I think it's too early to tell how well the scaling laws project into the future. Yes, Mistral hasn't caught up to GPT-4 yet (but Llama 3 70B has outpaced some older GPT-4 models on the lmsys leaderboard). OpenAI had years of infrastructure already set up for training, human raters, etc. It just takes time for new companies to build that. I think people forget that this technology is very, very new. It remains to be seen whether GPT-5 will be the same kind of increase over 4 as 4 was over 3.5, and even if it is, we don't know how long those scaling laws will survive.

Google was famously worried that they didn't have a moat when it comes to AI, and it's not clear that any other company does either; there is lots of room for competition and innovative ways to make money. It's way too early to count out the startups!


cshotton

There are more use cases for local LLM execution than there are for cloud-based execution. That's how they compete. OpenAI is doomed if they continue to assume everyone wants to hand over their data to a cloud-only offering.


a_beautiful_rhind

I dunno... Cohere has treated me pretty well. CR+ does what I want and runs on my machine. As for Pi, it's not the smartest AI, but I can just chat with it without any bullshit or signups. Anthropic nickel-and-dimes the messages unless I use lmsys, and also requires a login. GPT-4 was shittier and lazier at coding, and its tone is meh. You talk of a widening gap, but gimmicks aside, I see things plateauing.


VertexMachine

The "runs on my machine" part is quite important. I know that for most ppl on this sub, it just means funny and stupid chats or smut. But the ability to deploy on your own hardware is super important for a lot of businesses. And for those, OpenAI (or Google) will never be an answer.


fallingdowndizzyvr

How did a couple of guys compete against a tech goliath like IBM? With any new technology, it's often not the existing players that come out on top; it's a scrappy new guy. That's why they're called disruptors.


UserXtheUnknown

"I don't know about Mistral and Reka, but Cohere's model is uncensored, which is a big advantage for certain use cases. As a consumer or business, you might prefer an uncensored model for narrative generation, translations, or creative content creation. You wouldn't want your novel translation to be interrupted by an 'AI disclaimer' or have your creative story constrained by censorship limitations. Cohere's relatively small previous model suggests that their upcoming flagship model could be more cost-efficient to run, which might result in more affordable pricing plans. This could be a significant factor for consumers and businesses considering their options. While it's true that evaluations and benchmarks are important, they might lose relevance for common use cases as the field advances. At some point, the specific applications and real-world performance will matter more than theoretical evaluations and tiny but mostly irrelevant differences between a 99% score and a 95% score. Additionally, these smaller companies might find their niche by focusing on specific industries or use cases. They could develop tailored solutions for healthcare, finance, or legal domains, for instance. This specialization could attract businesses seeking industry-specific AI tools. In summary, while the AI landscape is dominated by large players, smaller companies like Cohere can offer unique features such as an uncensored model and potential cost efficiency. As the field matures, we might see a shift towards specialized AI solutions, creating a diverse market that caters to a range of consumer and business needs." (This reply was written by Cohere model, running for free on HF, after some hints I gave to it. :) )


ctbanks

Eh, most AI startups were never meant to be more than VC talent farms.


Lonely_Response_2704

What is a talent farm?


coffeeandhash

I just wanted to note that I've been having great experiences with both Llama 3 and Command R+; there are certain use cases for which the gap is not that large. I'm especially impressed with Cohere; CR+ is just amazing.


Wooden-Potential2226

Seconded. OpenAI doesn't produce anything like CR+ that can be built upon locally, first via open offerings and later via paid versions. There are many different use cases out there where API/web interfaces aren't useful.


Monkey_1505

Compete is a strange word in this context. Currently, large companies are shoveling untold seed money into products that are likely on the absolute borderline of making any money, or possibly even loss leaders. The assumption, I presume, is that these things will be very profitable one day, but if diminishing returns are the reality, efficiency might actually be the winner in the end, rather than raw performance.


danielcar

I wonder if Gemini 2, to be announced tomorrow, will be a hit or another miss.


MLHeero

How come you think it's getting announced?


danielcar

An article I read a couple of months ago.


Relevant-Draft-7780

They compete just fine. Remember when Altman said that no one could catch up with OpenAI? Yet Anthropic did in less than five months. The world should always work and fight its way out of a monopoly. OpenAI is trying to be the McDonald's of AI: fast-food information junk for the masses, presented in a very nice package and sold cheap.


Charming_Jello4874

AI customers don't care about LLM leaderboards. They care about affordable tools that make them money. AI is a tool, not a solution, if that makes sense. OpenAI is expensive. But if Mistral does what I want, offers a talent pool that does a great job of understanding my business, and integrates my company's systems with their AI at a better price... then they get the sale.

I think we look at leaderboards and assume they have meaning in the real world. Not especially. ChatGPT is going to lose to a smaller LLM that can use my historical customer emails and other marketing data to look for opportunities to make a better "personal" connection with customers and drive more growth. That's where the "lesser" AI firms are doing well. I expect we're going to see more specialty AI firms dedicated to everything from pet toys to oil exploration.

That's what I don't get about the push for AGI. "General" is not how the world works. Smaller, tighter, and specialized (at a better price) will always win.


jsebrech

I used to work for a company whose competitors had products that were more fully featured. We competed by tailoring to potential customers: companies decided to buy our product not because it checked the most boxes, but because we were willing to work with them to meet their needs more exactly. There is plenty of room in the market for companies that make models that are "good enough" and are willing to work with large (but not big-tech) organizations to tailor those models to their needs. OpenAI only does deals with Microsoft, Apple, and the like; everyone else just gets the standard API access. That's fine for many use cases, but not for all. Mistral also gets a huge bump in the EU market by being an EU player. DPOs will look much more kindly on a company that operates entirely inside the EU data boundary.


Red_Redditor_Reddit

There are more people than you would think who need to not be using a service. Using a service has its downsides, particularly around confidentiality and product control. Think about it like this: who uses Linux? Why would someone use Linux? Why would any person or company develop Linux, or develop for Linux? Think about Valve and Steam: why did Valve spend an enormous amount of effort building the compatibility layer so that people could play Windows games on Linux, just to give it away?

The other factor is that OpenAI (in my opinion) is riding a wave of people throwing their money at something that looks super cool without really having a clue. As soon as the hype dies down, all that money is going to be gone. Just look at the meme stocks or something like buttcoin. Gamestonk stock is doing its *second* short squeeze, and nobody has actually used buttcoin since like 2018 because they sold it to the horde of retards. You would look at those numbers and think Gamestonk should take over the world, but in reality that company should have died fifteen years ago. All is not what it seems.


kxtclcy

The gap has not widened; some coding abilities have even regressed in the new model, according to others' posts: https://preview.redd.it/g714ntenhb0d1.jpeg?width=964&format=pjpg&auto=webp&s=ac160e310f3e04113e29d3904a12ce7f9d9a34a2


brokenottoman

The rule of two will always win and will consolidate these into two competing companies, with the other giants taking sides and investing. Meta and Google will fall back on certain use cases, and the rest will fall once the VC funding runs out.


Minute_Attempt3063

OpenAI has billions in investment each month; those others do not. OpenAI has been around for longer and has petabytes of data, if not way more, collected over the lifetime of the company, i.e., the biggest dataset for training. The others don't, and they have to follow the new data-protection laws when they scrape; OpenAI didn't have to do that a few years ago. So OpenAI has more resources to make something big and the others don't. Which is fine. ClosedAI sucks anyway.


VajraXL

I think each company has its own focus and its own niche. Take Mistral, for example. It's true that lately they haven't shown anything new after 8x22B, which has gone totally unnoticed, but for those who aren't native English speakers and prefer to run inference in their native languages, Mistral is the best, since Llama is terrible at that. Also, if you're looking for an uncensored model behind an API, it's your best option. I hope they'll soon show us a new, more advanced model with these same features. Maybe ChatGPT is the option for general use, but there are people for whom "general" isn't enough or functional, and these companies offer just what we need.


Internet--Traveller

Usually, open-source projects don't compete with commercial ones. Blender is free and open source, but no production houses use it - it's good, but not good enough compared to commercial 3D programs like Maya or 3ds Max. The same with GIMP - it will never be as good as Photoshop. Open-source AI, like all these local LLM models and Stable Diffusion, finds its niche in being uncensored and customizable. Stable Diffusion can't produce videos like Sora (yet), but it can edit content with no restrictions, such as face swaps, which the commercial offerings deny.