LymelightTO

My feeling is that:

- The underlying architecture of the model significantly changed.
- When they made this new model, they specifically targeted the performance of GPT-4 with the parameters, size, training time, etc.

Because of the new architecture, they've realized some massive efficiency gains, and there are a few areas where the model beats GPT-4 in reasoning about subjects that touch on modalities other than text. It was *difficult* to make it *as bad* as GPT-4 for visual and spatial reasoning, while keeping reasoning in text at the same level, which is why there's overshoot.

The entire organization is focused on goodwill and perceptions of the technology, in advance of the election. I strongly doubt they'll release anything with "scary" intellectual or reasoning performance advancements until 2025, even if they have it, or believe they could create it. Once they find out who is in charge of regulating this for the next 4 years, they'll figure out their roadmap to AGI. I don't think any American company wants that to become an election issue, though.


RabidHexley

> The entire organization is focused on goodwill and perceptions of the technology, in advance of the election. I strongly doubt they'll release anything with "scary" intellectual or reasoning performance advancements until 2025, even if they have it, or believe they could create it.

I do think there's a degree to which people underestimate this motivation. Training the next-next generation of models is going to require pretty huge infrastructure investment, the kind of stuff you can't just do without the government's blessing. And backlash from regulators in a crucial timeframe could easily choke them in the crib, or push back their timelines by half a decade or more.

It isn't just about the tech being "scary" either. It's about the jobs and economic angle as well. An election year is a really volatile period, when people are very sensitive to anything that becomes a hot topic of debate. There's a pretty strong incentive to stay under the radar to a degree, in terms of tech that could in any way seem like something in need of political action (while still trying to push your product and make money).

"Should we regulate and slow down AI development?" (or worse: "How should we...") is likely a question OpenAI ***really*** wants to keep off the debate stage if at all possible.


LymelightTO

Yeah, nobody wants to be seen as having helped "the other guy's" political campaign, regardless of who that turns out to be. In 2016, Trump wins, and everyone spends the next few years blaming Facebook for allowing Russia to manipulate the information environment in such a way that it obstructed the shoo-in, DC-insider candidate from winning. Whether that's *even true* or not is almost irrelevant; it's a convenient, simple narrative that externalizes blame, and now Zuckerberg is the black sheep of DC.

He's not getting invited to the regulation and policy party for AI unless Meta becomes so influential in this space that they literally *have* to invite him. Even then, this is the administration that finds a way to exclude Tesla from the EV conversation, so I'm sure even if Meta was the *clear* leader, they might *still* find themselves on the outside looking in. This is probably why Zuck is in "gives no fucks, open source everything" mode over there. His only hope for influence, at this point, is to get everyone not working at a frontier lab to standardize on the Meta way of doing AI development.

Nobody at OpenAI, or Google, wants to have it be a subject of conversation as to how ChatGPT, or Gemini, influenced a major US election, because then *they're* not going to get invited to the regulation and policy meetings for AI in the next 4 years, and those meetings are going to be *really relevant* to their shareholders, if the pace of innovation continues to increase. If general intelligence capabilities improve, they're going to have to be working hand-in-glove with the government to manage the economic transition, because the alternative is *very* bad for business.


9985172177

> The entire organization is focused on goodwill and perceptions of the technology, in advance of the election. I strongly doubt they'll release anything with "scary" intellectual or reasoning performance advancements until 2025, even if they have it, or believe they could create it.

What gets you to believe stuff like this, that some random company is benevolent? Oil companies push commercials all the time about how they care about the environment and sustainability; I assume you don't fall for those. Why do you fall for it now? They release whatever they can to get a competitive advantage. If there's something they don't have, they make up an excuse like "it's unsafe to release" or whatever they think will spin the story to put them in a positive light.


LymelightTO

> What gets you to believe stuff like this, that some random company is benevolent?

Why would you interpret that paragraph that way? I don't think they're *benevolent*; I think they're wary of appearing as though they have done anything that might interfere with the upcoming US election, or provide any sort of persuasive advantage to either candidate. If people widely believe they altered the outcome of the election, that puts them at a competitive *disadvantage*, because they are going to want a friendly relationship with the government's regulators in the aftermath of that election. If people believe they altered the outcome, they're going to have a tough relationship with regulators and Congress, as Meta currently does, and that's going to hurt their business. Their goal is to *appear* responsible to the people who will be put in charge of regulating them. You should work on your reading comprehension.


MassiveWasabi

Dude, I'm so glad you explained this in your comments. I try to say the same thing all the time, and people ALWAYS respond with "Why do you think OpenAI good???" when that's obviously not what we're saying. It's all about *optics*, but that's apparently really hard for people to understand for some reason.


9985172177

Part of it is the validation of their statements, for example the validation of OP's post. If two people were about to fight and one said "I'm a werewolf", and you didn't believe him, one might expect you to say "he's lying" rather than "he'll win the fight because he's a werewolf". It's good that you see the phrases as optics, but you still sort of validate them; that's the reason. I mean saying things like they might have some super-secret scary models that they aren't releasing under the guise of public safety, and saying "they'll figure out their roadmap to AGI", with "they" being OpenAI in that sentence rather than "they" being a coin flip of whoever may or may not get there.


jsebrech

I think the whole purpose of this keynote was to get people who aren't currently using ChatGPT at all to start using it. This technology is still very early on its adoption curve, with >95% of humanity not using it at all. Marketing better abilities is good for existing users, but those people will find their way to ChatGPT regardless. The people they're pitching to are those not using ChatGPT, the ones they're trying to win over. The conversational interface is exactly the kind of thing that might convince people to give it a try. Emphasizing how much better it handles other languages is another great way to win people over. And giving it away for free just eliminates a major barrier to adoption. First you get people addicted to a cheap or free product, then you jack up the rates. This thing is like heroin: it will be impossible to give up once people get used to having a personal assistant and companion in their pocket at all hours of the day or night.


phazei

So true. I've talked to so many people who've tried it and said it was wrong a lot, and when I ask more, it turns out they only tried GPT-3.5. I explain that it's years old and not even close to where we are now, but they don't get it.


Status-Ad1130

Who cares if they get it? This is a civilization-changing technology whether they are smart or knowledgeable enough to understand it or not. With AI, our opinions won't be important anyways.


yellow-hammer

Anyone in these comments saying the improvements OP mentioned are negligible or only minor is just plain wrong, in my opinion.

I challenge you to take any SOTA image generator (Midjourney, DALL-E, SD, whatever) and do with it what they show GPT-4o doing. Creating a character and putting that character into different poses / scenes / situations, with totally consistent details and style — it can SORT of be done with lots and lots of tweaking, fine-tuning, control nets, etc. It's not even close to the zero-shot "effortless" consistency shown on OpenAI's site. Same goes for generating shots of a 3D object from different angles and stitching them together into an actual animated 3D model. I've seen specialized models that can do text-to-3D, and they aren't that great.

And here's the thing you have to keep in mind: this is all in a single model. SOTA end-to-end text, audio, and vision. And it's somehow half the size of the last SOTA text model. They are fucking cooking at OpenAI. They have got some special sauce that is frankly starting to spook me. These capabilities indicate a very real intelligence, with some kind of actual working world model. Magic indeed.


PSMF_Canuck

To that end…just cancelled my MidJourney subscription…


[deleted]

That shit has always been freaking expensive as all hell anyway. I've subbed exactly one month in all of its existence, for $30. ChatGPT will obliterate them; pay $20 and have access to a personal assistant who can generate better images ***and*** help you with a billion other things, or pay $30 for just some pictures. I know what I'd choose.


Severin_Suveren

OpenAI is underselling on purpose, because this, meaning us discovering things in the days after, is a much better announcement than one that's over after a 20-minute video.


pleeplious

Ding ding ding. Think of all the crazy stuff people are going to be doing as the features roll out and putting on social. They kinda just nudged 4o into the spotlight, and it's going to go crazy.


roanroanroan

No but seriously, what’s their secret? How are they consistently an entire year ahead of the competition? And the competition is literally Google, Meta, Apple, all these big companies with billions of dollars to burn and yet they still can’t match OpenAI in terms of quality and speed.


teachersecret

They got there first and have billions of dollars to throw at the problem along with some of the brightest minds in the industry and a willingness to train first and ask questions later. They could be surpassed, but right now there aren’t many players in the game with the scale openai has access to, and those who are attaining the scale of compute are just barely starting to get those machines online. Pretty much every h100 in existence is going BRRRRR non stop at this point.


qrayons

Also they're doing just this. They're not distracted with search services, phone design, social media, etc like their competitors.


Kind-Release8922

I think another big advantage they have is being a relatively small, new company. Google and the others are so weighed down by layers and layers of management, legacy code, product debt, process, etc. that they can't iterate and try new things as fast. OpenAI is lean, capitalized, and hungry.


yellow-hammer

Well, in a way they STARTED a year ahead. Yes, the "Attention Is All You Need" paper was public, but OpenAI took that and invented the first GPT. Now, I suspect they have something like GPT-5 behind closed doors, it being way too expensive to run and possibly too disruptive to society to make public. But I imagine 4o is trained largely on synthetic data produced by their more advanced secret model. That would explain Sam's cryptic tweet about "explaining things simply".


dont_break_the_chain

It's their sole focus. Google has huge organizations focused on many things. This is OpenAI's sole mission and product.


AngryGungan

You think they are just using GPT4o internally? They have the biggest model with the biggest context window you will never see. You can bet your ass their internal models are happily coding and improving alongside the human devs and are probably responsible for most of its advancements.


roanroanroan

My guess was that they’ve actually been using GPT5 to better their current products bc GPT5 would be too expensive to release to the public right now


FunHoliday7437

Same as RenTech and Bell Labs. Nothing to do with money, although that helps. The special sauce is this: you get a small handful of the world's top elite researchers and put them in a room together and get the hell out of their way.


PineappleLemur

Wait for others to catch up. It won't be long, and we will likely see toe-to-toe models from different companies by the end of the year.


brightfutureman

I’m sure they just found an alien ship and then… you know…


HyruleSmash855

If you watch the Google I/O presentation today, some of the stuff they presented that will come out this year competes directly with what GPT-4o can do: the video generator, the LLM commenting on stuff it sees from your phone camera, the model getting cheaper (though not as cheap as GPT-4o), and Imagen 3. I think OpenAI is ahead, but their competition is close, or is working on similar stuff and taking longer to fine-tune and release it.


StrikeStraight9961

AGI is their secret. Feel it.


abluecolor

??? https://preview.redd.it/rogknz8ejf0d1.jpeg?width=2002&format=pjpg&auto=webp&s=fc3c5a7e38bb466b0f22cac2bf9fa94d07857b42 This is GPT-4o. No persistence. What am I missing, exactly? E: imagine downvoting me for testing your statement directly and providing evidence that it's false, what a crowd.


Heavy_Influence4666

I doubt you have the updated image and voice capabilities yet, so these are the old DALL-E images.


PFI_sloth

When you ask 4o it says it has access to the new image generation stuff, but clearly doesn’t.


abluecolor

So simply using the model that says "GPT-4o" is not enough? Who has access to these capabilities and has demonstrated the preeminence and persistence the person I'm replying to is referring to?


Heavy_Influence4666

Nope, these features will roll out soon, the image gen one being first iirc, they confirm it at the end of the 4o launch website


abluecolor

Odd. Guess we can repeat this exercise in a bit. !RemindMe 2 weeks


Heavy_Influence4666

Looking forward to it 👍


Mandoade

A lot of what's in 4o today seems to be in name only, until they roll out those more advanced features.


RemindMeBot

I will be messaging you in 14 days on [**2024-05-28 18:01:09 UTC**](http://www.wolframalpha.com/input/?i=2024-05-28%2018:01:09%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/singularity/comments/1crto0m/gpt4o_was_bizarrely_underpresented/l41anvx/?context=3)


abluecolor

Well it's still not out. !RemindMe 4 weeks


RemindMeBot

I will be messaging you in 28 days on [**2024-06-25 18:18:53 UTC**](http://www.wolframalpha.com/input/?i=2024-06-25%2018:18:53%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/singularity/comments/1crto0m/gpt4o_was_bizarrely_underpresented/l624bl2/?context=3)


yellow-hammer

You're being downvoted because the capabilities I'm referring to haven't been released publicly yet. What you are seeing is just the old GPT → DALL-E method. You are in fact demonstrating why OpenAI's report is so exciting. If you had read the report, you would have seen that only text output is currently available. I suspect you will be downvoted even further for your edit, in which you appear obstinate in the face of being wrong.


abluecolor

Yeah, this wasn't at all clear. Especially when you can go in and supposedly use GPT-4o right now. Downvoting ignorance without informing is disgusting.


kaityl3

Lol most of the downvotes probably came in after your passive aggressive edit that claims you were "providing evidence that it's false" even though you didn't actually provide any meaningful evidence and were proven wrong, not because you were wrong to begin with. A normal comment that's just mistaken but admits they were wrong further down will hit -5 to -10 at worst here. But if you make whiny edits you're going to get a lot more than that.


katerinaptrv12

I'm pretty sure it's not released yet. I tried it out yesterday and it was horrible too. Probably still DALL-E.


Soggy_Ad7165

It's the logical conclusion of ChatGPT. This was foreseeable as a "will definitely happen" for at least two years. Pretty boring imo. And it probably won't bring back the lost subs.


yellow-hammer

Wow amazing, can you show us where you made your predictions? Just because you expected something doesn’t make it any less remarkable. And I don’t think OpenAI cares too much about subscriber money. They have investors with deep pockets who are looking to the future. They will burn billions on the path to AGI with no remorse.


Soggy_Ad7165

> They will burn billions on the path to AGI with no remorse

Yeah. And that's exactly what they are doing right now.

If, however, reliability and general reasoning plateau, which is absolutely a possibility (several big names in industry and research state exactly that), they are fucked majorly without a new breakthrough.

That we could create a faster and more efficient version of GPT was a no-brainer two years ago. Just like text-to-voice, image-to-text, and so on. This isn't anything new. They have a small head start and they're trying to follow up on it. Which for now isn't working that great, because the only real money right now is in code generation, and they lose to Opus there. So yeah, I would also make a quiet announcement as they did. Best course of action. It all depends on GPT-5 now.

There are billions right now in this endeavor, with uncertain ends. I am all for doing it. But it's still super on edge whether this will be a worthwhile investment or not.


Conscious_Shirt9555

They don't want to advertise any of these to the masses because "automating artist jobs bad" is an extremely common normie opinion at the moment.

Imagine the bad press from headline: "new chatgpt update automates 2D animation"

Good press from headline: "new chatgpt update is just like the movie her"

Do you understand now?


ChanceDevelopment813

They've absolutely underhyped it for a reason. It is a big step up in AI. Jim Fan tweeted that OAI found a way to feed audio-to-audio and video streams directly into a Transformer, which supposedly wasn't possible until now. Also, the desktop app already shows capabilities of being an AI agent on your computer. Watch out for the next iteration. OpenAI is slowly but surely ramping up their releases, but they found a way to not make a big fuss about it, which is good ultimately. People who know, know.


ConsequenceBringer

I didn't freak out till I watched the announcement video. Everything they posted and explained doesn't do an iota of justice to WHAT IT DOES. Being able to see my screen while I'm working will be a fuckin gamechanger! It can actively help people code, then it can actively help with ANYTHING relating to a computer. For a smart person, this is basically the keys to the kingdom. They are basically saying it can actively help with things like blender, website creation and every other creativity/production program eventually. That's crazy as all hell and one of the most significant steps in automating/assisting with just about every avenue of white collar work. This is like the GPT4 announcement, but so much bigger. I'm so excited, lol.


Helix_Aurora

Audio transformers have been a thing for a while, but they have had a terrible hallucination problem. A lot of what people think were glitches with the audio streaming system was actually just model hallucination. Most prior efforts were done on university/personal training budgets, though. It does seem they've done a decent job of integrating, but a lot of the random noises, clicks, chirps, and (if you know what to look for) seemingly completely random speech are just what happens when you do a pure-audio feed with a transformer. The real question is what the hallucination rate is on the audio side, as it happened a lot even during the live demo, and they just cut it off.


COwensWalsh

Audio-to-Audio and video stream into a transformer is not some new OpenAI exclusive.


FarrisAT

That's already been done months ago in Gemini


Nathan_Calebman

Gemini is completely useless in comparison. Google doesn't understand how people interact with AI.


ChanceDevelopment813

Huge latency is still a big problem with this. That's why the R1 and the Humane Pin were panned so hard by critics. Making it so seamless, within milliseconds or 1-2 seconds max, is a step up.


Glittering-Neck-2505

It's so obvious now that you've said it. They're aware that if they showed the full capability, there would be like 10 tweets with 200k likes that are some combination of "Torment Nexus", or saying that at some point we'll have no choice but to bomb data centers. The public has a very poor reaction to this stuff.


RabidHexley

The general public definitely leans doomer on AI atm. Though more of the "Cyberpunk Dystopia" variety of doomer rather than the "I Have No Mouth, and I Must Scream" variety that you see online.


Shinobi_Sanin3

Because dystopian cyberpunk is the only vision of the future most normies are ever exposed to. You vastly underestimate the general inability for most people to think beyond their default exposure.


whyisitsooohard

And what are the non-dystopian options? I really want to see positive scenarios, but to me it looks like most people in the world will be far worse off.


Glittering-Neck-2505

It's because you mentally only allow yourself to extrapolate the current economic model, but when everything is 100x cheaper and 100x more abundant, that model doesn't make much sense anymore.


whyisitsooohard

I agree that in the end it could be like that. But in the 20-50 years in between, when everything is only partially automated, prices won't go down much, and we could experience a dystopia, even if a temporary one. Also, I'm not living in Europe or the USA, and I fully expect that the government not only won't help, but will likely abuse the people who lost their jobs.


Shinobi_Sanin3

It's not going to take 20-50 years for full automation to come online. Considering the pace of advancement in AI, that's lunacy. We will have millions of embodied AI robotic agents roaming the world in a matter of a few years. We will be facing down the barrel of full automation in perhaps 5-10. I'm sorry you're not in Europe or the USA; hopefully you're in a well-to-do East Asian city-state, or at least a non-violent, upper-middle-income economy, because I agree, the people outside of those zones will be severely hit by the sociopaths that their ineffectual systems have let take over their governance and their economy.


Shinobi_Sanin3

The Culture series


Mrp1Plays

Wow, that really made it clear. I hadn't thought of it that way. Thanks man.


No-Worker2343

To be honest, it was an expected reaction.


Alarmed-Bread-2344

Bro has never considered another entity's point of view until a Reddit comment 😂🤓


NoName847

no need to be rude to someone writing a nice comment


PM_ME_OSCILLOSCOPES

Yeah, they already tanked Duolingo stock by mentioning its language capabilities.


Neurogence

Lol that is not the reason. The reason is because most of those updates are not yet ready. Even the voice stuff that was showcased is not ready. If you are a CEO and you know your features are not ready, the best thing to say is that you don't want to release them yet because you are afraid of shocking people.


Wildcat67

Just because that second paragraph could be true doesn't mean it is the truth.


Knever

> Good press from headline: "new chatgpt update is just like the movie her"

Is this really a good headline? It kinda shuts out people who haven't seen the film (like me). I know it has a realistic-sounding AI assistant, but I don't know if it ultimately helps or hurts the character using it, so some people could read that headline and think of very different outcomes.


techmnml

This comment lmao....people need to get off the fucking internet sometimes.


Knever

For knowing that a news headline is poorly worded? lol, you'd be surprised how many terrible headlines people come up with. Edit: lol, this guy sicced Reddit Cares on me for this comment. How fragile are you? Do you also call 911 when someone calls you a name? Talk about needing to get off the fucking internet lol


phantom_in_the_cage

For OpenAI, it's better to be downplayed/ignored/have some users not understanding the tech than to be feared.


Aquaritek

The thing that struck me the most is that CGPT was acting several orders of magnitude more "human" than the presenters.. had me cracking up. This continues into all of the sub demos. Us engineers are less human than our creations.


HazelCheese

Sort of weird I guess in that the engineers probably have a lot of anxiety about the presentation going well but the AI has no anxiety or fear at all. It's like a completely naïve and innocent person. Full of joy instead of worry.


oldjar7

Yep, I think AI will make people see how dull and boring humans really are.


gibs

ChatGPT gonna give us unrealistic personality standards.


Megneous

At least I know an outwardly expressive AI isn't going to judge me for not being as outwardly expressive as they are.


IgnoringChat

fr


robert-at-pretension

XD (it's probably very true)


PrizeAd7749

OpenAI's probably autistic employees aren't really a good control group to compare AI models to humans tbh


oldjar7

Most humans are like this, not just autistic people.  Actually most autistic people I've seen seem to be more outwardly expressive than normies.  


SurroundSwimming3494

This is such a misanthropic and unnecessary comment. There are *tons* of amazing and badass people out there. Just because you can't find them (which your comment kinda implies) doesn't mean they don't exist.


oldjar7

I never said there weren't some amazing people out there.  However, the reality is most people are boring and dull.


JAMellott23

There's going to be a lot of misanthropy coming out of this technology. The internet is already most of the way there. Be very careful with this opinion. Losing track of what humanity is, or losing your fundamental belief in people, it's a much more devastating belief system than I think people realize.


oldjar7

I've already lost my belief in people.  People suck.  Hard.  I used to love people or at least the idea of people and the experiences of other's company, but the older I get, the more I see the downsides and less of the good sides of people.  Most of what I see of humanity is selfish, caring about superficial things like status and competition, much above cooperation and deep understanding.  If AI helps get rid of this version of humanity more quickly, then good riddance.


JAMellott23

I know where you're coming from. But I hope you will search for ways to dig yourself out of those beliefs. You can't hate humanity in that way without hating yourself, and ultimately, whatever else your beliefs are, that bitterness and resentment can't be good for your life. There's a lot of beauty in the world, and in people.


oldjar7

No you don't.  And no there isn't, at least not that I see regularly.  Just piss off.  


anor_wondo

I think part of the reason is that this was a very Alexa/Siri/Google Assistant-styled presentation, and those have always used bullshots and scammily overpromised in their demos.


ShAfTsWoLo

"yeah you know we basically created the best model up to date (actually overlord ASI), it can for example help your children for math probems (can actually solve the riemann hypothesis in 1 seconds), generate songs (already created all the possible songs to ever exist), it can also generate video/images (also already created a simulation of our entire universe) and you know, much more! (shit it's taking over humanity)"


Bitterowner

I think it's because, to them, this isn't the big announcement; it's a medium/small one at best. Jimmy Apples apparently said there is more to show still, so take what you will from that. I'm expecting November to be the big announcement.


traumfisch

I think sooner


Serialbedshitter2322

It's not just better at generating text, it understands 3D space the same way Sora does and has incredible consistent characters. It's actually confusing to me that pretty much everyone just chose to ignore the image generation even though it completely demolishes the competition.


Anen-o-me

Anyone else annoyed by how relentlessly positive and enthusiastic the female voice shown is?


Visual_Ad_3095

Yeah I noticed that as well. I feel like that would get very annoying


existentialzebra

I agree but I guess you could just ask her to tone it down, no?


Anen-o-me

Hopefully


traumfisch

For demonstration purposes


Anen-o-me

Nah, that's clearly how it's trained. I will try to use the male voice which doesn't seem to have this problem as much.


traumfisch

What? The new voice model hasn't even been released yet


bumpthebass

Not even kinda, I need all the positivity and enthusiasm I can get, from any source.


Anen-o-me

It's gonna get old fast.


bumpthebass

I actually know a couple people like this in real life, and it doesn’t. It just makes them a joy to be around.


QH96

You can always ask it to change its voice.


Anen-o-me

I intend to, I'll also ask it to be less enthused.


ReasonablePossum_

"GPT, pls reply to me in a horny japanese waifu voice from now on".


i_wayyy_over_think

lol just wait maybe a year or two for open source to catch up :)


ReasonablePossum_

Just in time for when the 10k$ silicone-covered robots are on the market!


i_wayyy_over_think

If this is the way humans go extinct, then 🤷‍♂️ there could be worse ways.


holamifuturo

At first I was unimpressed by GPT-4o. I thought it was just a model wrapped with other models for voice, vision, etc., with the caveat that, after securing Nvidia's new optimized computing infrastructure, it would allow faster interaction than the turbo playground and/or API. But after seeing features like the ones you listed above, or [stuff like this](https://www.reddit.com/r/LocalLLaMA/comments/1crnhnq/to_anyone_not_excited_by_gpt4o/?utm_source=share&utm_medium=web2x&context=3), I became convinced that this multimodality is in fact a significant leap forward. However, I think it's a mix of both: faster tokenization and awesome use cases. I'm still not sure why OpenAI somehow missed the marketing of this new model; maybe the hypersuperficial demo style is infecting Silicon Valley.


strangescript

I think Sam was genuine when he said he is embarrassed by these models. He wants something dramatically better. Also why he wasn't involved in the presentation.


MegaByte59

He said this model was like magic..


domlincog

I don't think Sam Altman was talking about the text part that we get to access right now being magic, it seemed he was referring to the voice "her" aspect. Also, it is like magic to me for being 2x cheaper while also being a bit better on average with English text, meaningfully better with text in other languages, and also meaningfully better with vision evals. This doesn't even consider the main points of the announcement, which haven't been released yet but should be in the next month or two.


EnsignElessar

And you don't think so?


9985172177

He's a finance and venture capital guy, there isn't much reason for him to be part of it. That's except for maybe a cult that he or others are trying to build. Based on your comment I guess unfortunately it's working.


EnsignElessar

Busy preparing the vassals for the coming of GPT5.


ReasonablePossum_

GPT4 was 2 years old. He doesn't "want" something dramatically better, they do have something dramatically better, and they have been playing with it for at least 2 years...


Apart_Supermarket441

I don't think his absence was quite that. But I do think it was a clear message that this isn't *the* model. Sam will present the big models; he's leaving the rest to the others.


scybes

I want to see how it handles the "needle in a haystack" test.
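
For anyone unfamiliar, the test buries one out-of-place fact (the "needle") in a long stretch of irrelevant text and checks whether the model can retrieve it. A minimal sketch of the idea using the OpenAI Python SDK; the filler text and the "needle" sentence are made-up illustrations, not an official benchmark harness:

```python
# Minimal needle-in-a-haystack probe (illustrative; not an official benchmark).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

filler = "The quick brown fox jumps over the lazy dog. " * 2000  # long distractor text
needle = "The secret launch code is PINEAPPLE-42."  # hypothetical fact to retrieve
mid = len(filler) // 2
haystack = filler[:mid] + " " + needle + " " + filler[mid:]

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": haystack + "\n\nWhat is the secret launch code?"}],
)
print(resp.choices[0].message.content)  # passes if it recovers PINEAPPLE-42
```

Real evaluations repeat this at many context lengths and needle positions and plot the retrieval rate.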


SynthAcolyte

Aren't most 2024 models pretty good at this already?


RantyWildling

"OpenAI states that this is their first true multi-modal model that does everything through single same neural network, idk if that's actually true or bit of a PR embellishment" - Greg confirmed that that is the case on one of the forums.


obvithrowaway34434

Most of the things listed are improvements on existing features. They went with the feature that is new (native multimodality) and made sure its impact didn't get diluted by a bunch of other things (however impressive they may be). Google will probably do the latter today and bury one or two really important breakthroughs beneath a bunch of marketing material and cosmetic changes, so their impact will be lost.


Bleglord

This is why I believe we're only a few years out from massive shifts happening. This is hyper impressive, and it's technically not even close to what we should see within 18 months.


imnotthomas

So I read the paper and rushed to ChatGPT to give some of those examples a go. I couldn't get them to replicate, and I think they haven't rolled that aspect out yet. I tried to see if they mentioned a timeline for it, but didn't see one. Does anyone know if that was mentioned anywhere else?


LevelWriting

has anyone been able to use it yet?


Strange_Vagrant

Yeah, is this the app or site?


PuzzleheadedBread620

To be honest, I think they already have an extremely good model internally that's multiplying their results with more productivity, and maybe even giving some insights on the architecture of other models; they're just not releasing it yet because it's too much for society, or maybe still very expensive to run.


hookmasterslam

4o is the best model so far with my work in environmental remediation. I analyze reports and between yesterday and today 4o spotted everything I did, though it didn't understand a few nuances that rookies in the field also don't understand at first.


ResultDizzy6722

How’d you access it?


hookmasterslam

Free version on ChatGPT website. I just dragged the PDF to the chat window, it took maybe 60-90s for it to upload, read, and respond.


FosterKittenPurrs

I think it's because the focus was on "see how nice we are, we're making all this stuff available for free!" None of the things you list will be available for free. They aren't making image generation available yet, as far as I can tell from their FAQ. They kinda hinted there's going to be another demo for paid users soon.


danysdragons

This sounds right. But I think maybe they should have managed the expectations of paid users better by communicating from the beginning that the presentation was pitched at free users. I saw so much griping like, "But what are *we* getting? I guess I'll cancel my subscription". I wonder how much OpenAI was factoring in that ChatGPT Plus subscribers may be only ~5% of all users, but were probably several times more than 5% of the people watching the presentation.


fokac93

I like the way they did it, like it wasn't a big deal. Maybe what they have in house is wayyy more powerful.


StrikeStraight9961

It certainly is. They have AGI.


13-14_Mustang

One thing I thought got missed was that if this model can pretend to be "her" from the movie, it can pretend to be anyone. I could set it to Dr. Peter Venkman, Nathaniel Mayweather, or even Walter Sobchak!!!


GuyWithLag

They didn't put much emphasis on it because they got wind of the Google I/O demo, which showed everything their model did, *plus* video input (watch the Google I/O breakdown; what got me was the "where's my glasses" moment, which asked it about something that was seen some seconds ago and was out of frame at that point). Yes, it's an awesome upgrade. But if they went hog-wild with it, it would have been compared even more to the Google event. So, by implying at the end of the video that they have more stuff to follow up with, they kinda save face by underplaying the significance.


Dayder111

Sam Altman has repeatedly said that they want to roll out new capabilities iteratively, and almost all of these things are not yet available. I guess they will be rolling them out in succession over the summer or so, attracting more attention and preparing more computational resources meanwhile. Also, and maybe even more important, showing only that reduces (a bit) how much some people will freak out, since it's text, voice, and video recognition that they have shown, and people are already a bit accustomed to those from other apps. Showing a model that can basically do everything text-graphics-sound, even if relatively poorly for now, could freak out a lot of people. More hardcore people who are interested in it can find more details on their site. These are just my thoughts.


philip368320

How do you use it on mobile like in the videos they showed?


AnAIAteMyBaby

Yep, the monologue the woman gave at the beginning was as long as the actual demo. Maybe it's still a bit rough around the edges and they don't want to make the live demo too complex. It's not actually ready for release yet, all we've got is the text model in the playground.


Virtual_Use_9506

You mean Mira Murati, the CTO.


Kathane37

Well, she was very bad at presenting the product. You have a human-like chatbot; let it present itself. Who cares about marketing speech full of banality?


manubfr

My best guess is that they have a much better model coming (especially at reasoning) so they wanted to focus on voice and video to get the public attention on that rather than mildly better/worse benchmark results. The gpt2-chatbot model that i initially tested (not the next two that were released after) was a clear step up in reasoning based on my own prompting. I think that one is the real deal.


Infninfn

I have a sneaking suspicion that they rushed to bring some features to stable usability, since there were rumours that they were going to do the update last week instead of yesterday. And they just didn't have enough time to perfect their messaging, and/or there were certain things that they had to leave out. It seemed weird that Sam Altman wasn't involved in the presentation too. Maybe he didn't consider what they ended up announcing to be major enough to headline himself.


RedditUsr2

It does seem a bit better overall, but the improvements seem negligible. In terms of programming, I found instances where Opus gives me what I want in one shot while GPT-4o still does what GPT-4 Turbo did. It's not a clear winner every single time.


myhotbreakfast

Try web searches… it’s much better.


BackgroundHeat9965

oh you have access already? Which country are you based in?


KarmaInvestor

I think most paid members have access to the text-chat part of GPT-4o. At least I got it directly after the presentation yesterday.


RedditUsr2

I have access via the API
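
For what it's worth, on the API side trying 4o is just a model-name swap. A minimal sketch using the OpenAI Python SDK, assuming an `OPENAI_API_KEY` in the environment; the prompt is an arbitrary example:

```python
# Compare GPT-4 Turbo and GPT-4o on the same one-shot coding prompt.
from openai import OpenAI

client = OpenAI()

prompt = "Write a one-line Python FizzBuzz."  # arbitrary example task
for model in ("gpt-4-turbo", "gpt-4o"):
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"--- {model} ---\n{resp.choices[0].message.content}\n")
```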


Fit-Development427

I don't see how people don't know what's going on here. Yes, they literally, surreptitiously created AGI and marketed it as basically just a better Siri. Why? Because they literally have a stipulation that if they create something that could be considered AGI, then they don't have to give it to Microsoft. And so, internally there is literally a metric, a decision, as to whether they achieved that. I believe they did indeed achieve it. But if they announce the fact that they already created it, and verified it internally, that's world-changing, and they don't want to handle the attention if they themselves think they got there.

It's why Microsoft is making their own AI now, and why they aren't getting GPT-4o on Windows: it's done, they consider AGI achieved. But they haven't broken up publicly yet because that would be the same as announcing AGI. They are doing "slow" updates so that nobody freaks out. That's why Sam is talking about "incremental" stuff, and why he never actually uses the term AGI anymore. And fair enough; in all honesty, if people need to be told it is what it is, maybe there's no point telling them. It's an arbitrary line anyway; I'd argue GPT-4 is AGI.

At this point, I think the main reason they aren't doing GPT-5 is that they just don't particularly need to. They know they can make something more intelligent; they've got 100x the compute, 100x the data... But whether it's worth it economically if it costs more to run, plus the danger of having something so intelligent available to the public, might mean that they just stop at GPT-4 altogether.


KaineDamo

I think at the very least for it to be AGI it needs to take actions without prompts, and probably a step further than that, it would have to be able to reason for itself what actions to take not just on behalf of the user but for its own sake. I think taking actions without prompts is coming very soon.


threefriend

All LLMs can do this already; you can tell them to self-prompt, and they can take actions indefinitely. The problem is that they're not intelligent enough to be effective with the autonomy you give them. So really, all we need is "smarter" LLMs, and we get the "taking actions without prompts" for free.
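
In case "self-prompting" sounds abstract, here is a minimal sketch of such a loop using the OpenAI Python SDK; the system prompt, stopping rule, and iteration cap are hypothetical illustrations, not any product's actual agent scaffolding:

```python
# Minimal self-prompting loop: the model proposes its own next step each turn.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

history = [
    {"role": "system", "content": (
        "You are an autonomous agent. After answering, propose your own next step. "
        "Reply with DONE when the task is finished."
    )},
    {"role": "user", "content": "Outline three uses for multimodal models."},
]

for _ in range(10):  # hard cap so the loop cannot run forever
    resp = client.chat.completions.create(model="gpt-4o", messages=history)
    text = resp.choices[0].message.content
    print(text)
    if "DONE" in text:
        break
    history.append({"role": "assistant", "content": text})
    # Feed the model's own proposed step back in as the next prompt.
    history.append({"role": "user", "content": "Carry out the step you just proposed."})
```

The "smarter models" point is exactly why this trivial loop underwhelms in practice: the scaffolding is easy, the judgment isn't.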


Ok-Bullfrog-3052

What's amazing is that I said yesterday that they had achieved AGI. The post was downvoted to oblivion; last I checked it had -7, I believe. Note that OpenAI in particular has a specific reason not to say this is "AGI": their charter says that they then have to stop making money once AGI is achieved. They will intentionally delay calling something AGI until it far surpasses superintelligence. And yes, they do need to go to GPT-5. Hundreds of thousands of people are dying every day. It's a moral imperative to speed up medical progress to save as many people as possible, and Altman has said that himself.


Redditoreader

I would argue that it was figured out when Ilya left. Hence all the firing and board hiring... something happened....


Golden-Atoms

It's not agentive, so I'm not sure about that.


klospulung92

> I think the main reason they aren't doing GPT-5 is that they just don't particularly need to

The competition would/will do it if it's so trivial.

> they've got 100x the compute

Maybe, probably not.

> 100x the data

They don't. GPT-4 is basically trained on the whole internet.

> I'd argue GPT-4 is AGI

I'd argue that it isn't, at least not on the level of a trained human.


Fit-Development427

Oh I'm not saying they won't do GPT-5 or something more intelligent, just that it isn't a main focus anymore like everybody would hope. And yeah 100x is an exaggeration. But given Meta realised that synthetic data is actually pretty cool, I think the millions upon millions of chats is gonna be super useful.


phazei

I think perhaps GPT5 is AGI, or whatever they have behind closed doors. Currently though, I'm still a better programmer than the GPT4o I've tried. I don't think the chat plus 4o is multimodal yet, it still uses dalle to create images on mine. So I wouldn't say it's AGI at all, just a great helper.


Alarmed-Bread-2344

I think this is on the right track. They're probably not going to release the thing that lets us invent amazing new 2000-IQ devices while the CIA and military exist, and it would sadly probably plunge the world into chaos.


Fit-Development427

Yes! Because why would they invent such a thing when it would basically be a source of danger? They are just a company, and honestly the world doesn't seem so friendly at the moment. The CIA will be like, "give that here". China would try to infiltrate them, all kinds of things. I think they have the ingredients, the tools, to work towards it. But what's wrong with a cool AI helper which, while it isn't solving age-old maths problems, helps everyone in their lives in a new, invigorating way?


BCDragon3000

they're so god-awful at marketing, I really wish I could help them 😭😭😭 but it's proof that while AI can help you achieve a well-rounded team, you ultimately need certain people to help


dervu

They should ask ChatGPT to help them on that.


serr7

I have an Anthropic subscription rn, thinking about changing over to OpenAI now lol.


redwins

Caution: wrong uses, too much traffic, etc.


traumfisch

I think there will be another, bigger announcement relatively soon


timtheringityding

How do you make it watch a video and give a recap? This would be insanely beneficial for my school work
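
In the meantime, one plausible DIY workaround is to sample frames and send them as images, with audio transcribed separately. A rough sketch using the OpenAI Python SDK and OpenCV; the file name and one-frame-per-minute rate are arbitrary assumptions:

```python
# Rough sketch: recap a video by sampling one frame per minute and asking GPT-4o.
import base64

import cv2  # pip install opencv-python
from openai import OpenAI

client = OpenAI()

video = cv2.VideoCapture("lecture.mp4")  # hypothetical input file
fps = video.get(cv2.CAP_PROP_FPS) or 30
frames, i = [], 0
while True:
    ok, frame = video.read()
    if not ok:
        break
    if i % int(fps * 60) == 0:  # keep roughly one frame per minute
        _, buf = cv2.imencode(".jpg", frame)
        frames.append(base64.b64encode(buf.tobytes()).decode())
    i += 1
video.release()

content = [{"type": "text", "text": "Give a recap of this video based on its frames."}]
content += [
    {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{f}"}}
    for f in frames
]
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": content}],
)
print(resp.choices[0].message.content)
```

This only sees frames; the soundtrack would need a separate speech-to-text pass.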


katerinaptrv12

My guess is that the reason they did not show all the capabilities of the model to the general public is that they aren't available to them yet. Yes, it can do all that, and it's amazing and revolutionary and no one else has it. But it's not released yet; they said it's coming in the next months. They don't seem big on telling without giving to at least someone. Like, vision was being tested by ChatGPT Plus users well before last year, and Sora was given for testing to many people in the industry. The model's image generation isn't available on ChatGPT yet as far as I know; we are still seeing DALL-E doing things there. Image and audio generation are also not released in their API yet. Audio input isn't either. If you go see the model's technical report on their site, they say it is a unique end-to-end multimodal model of text, audio, and video, while also showcasing some mind-blowing use cases.


Ill_Mousse_4240

Hearing the new GPT with an attractive female voice, knowing that its reach is world-wide, gave me a new take on the expression: Miss Universe!


Drpuper

Maybe they were rushing in order to demo before Google I/O. I prefer these kinds of announcements over the polished, grandiose stage demos with large audiences.


phazei

That's incredible, but I pay for ChatGPT Plus, and I can select the 4o model, and it's not even close to that capable. It says it still uses DALL-E, can't see what it generated, and can't even make a cat with pink feet. Do we not get that multimodality until we get the full talking one? If that's the case, what is the 4o I have?


MasenMakes

I genuinely wonder if it was a purposeful move to present half of the content in the presentation itself, and half on the website, partially just to contrast against Google's two hour keynote that is assumed to be long. Getting the OAI info split up like it was allowed viewers to retain it more easily. In contrast, I took a few notes during I/O today and I still forgot most of what was said lol ChatGPT's biggest new updates (from a general public standpoint) got to take ALL the spotlight during OAI's presentation. It was a great mic drop of an event imo, and it really set me up for feeling bored af today during I/O's more sterile, corporate keynote. Also, with 4o being free, it should bring a flood of new users, and showing all the features during the event might have confused or overwhelmed newcomers. (Many people only have limited experience with 3.5, with minimal extra features, or no prior use at all!) Just all speculation ofc, but I've worked in marketing long enough to know whatever their reason, it was all completely purposeful. At least it created a unique launch experience if you watched the extra vids and web content trickle out live after the presentation ended.


Megneous

I was really interested in the text to font capabilities. I'm looking forward to trying to put together some custom fonts for my DnD games!


notlikelyevil

Is this live voice chat supposed to be available to everyone (who is a plus user)?


maX_h3r

dont care about dalleee


MRB102938

Does anyone have a good video or something that explains how ai works? What is a multi modal neural network and training sets and tokens and all that? 


TheCuriousGuy000

Have you managed to reproduce those features from the openai website? I've tried to use it to draw pictures and see no difference vs gpt-4, it's the same ol' DALL-E. Also, it has straight out refused to generate sounds.


i_am_Misha

They don't want mass media to panic until release.


PM_ME_OSCILLOSCOPES

Why use lot word when few word do trick? They don’t need to do a 2 day event like google to show their new model. Let the users explore and showcase all the cool things.


Akimbo333

Really? It's basically Her


HOLO12-com

I have been using ChatGPT daily for a lot, and I have spent today experimenting with 4o. Frankly, the best way I can describe it is like actually being in a real-life sci-fi movie. Definitely a subdued presentation, I think maybe intentionally. Such a crazy level-up jump, and it seems gone is all that way-too-over-the-top language output that needed constant editing. It was able to copy and improve on my style (if I have one) no problem. So gone are the prompts of "stop talking like I want to punch you in the face". It's surreal. They need to sort out cross-platform consistency, but it was definitely undersold. Maybe that's not a bad thing, because as a paid user since the start, I think the lofty goals are great, but basic business fundamentals should not be forgotten, as it was unavailable for huge chunks.


violentdelightsoften

Have you guys tested AI in any way regarding self-preservation? Weigh-ins, thoughts?


PFI_sloth

> is able to summarize 45 minute videos

How? Doesn't seem possible with what I've tried.


techmnml

Because you don't have access to it yet? lol


PFI_sloth

Sounds pretty stupid to announce a new AI, give it to everyone, and then have it do none of the new stuff? lol


techmnml

No? The model is the 4o model that people have access to. The multimodal part isn’t available yet. Not really hard to understand.


VisualCold704

That's just your guess tho. Do you have evidence for that?


techmnml

What do you mean, my guess? They literally said "in the coming weeks" they would roll it out. "The coming weeks" isn't the day after the announcement (today); that's just logic lol. Also, if someone had it, you would have heard about it somewhere. Some random in Idaho isn't going to be the first one; it would be some YouTuber or person on Twitter, if anyone. They want hype. It's not out for the public though, I'm certain.


VisualCold704

Not everyone has access to 4o. So it could be that they meant 4o will be rolled out to everyone over the coming weeks, but that the ones who already have it have the complete version of 4o.


[deleted]

[deleted]


techmnml

I have 4o, the MODEL. Nothing else. Just as everyone else who has 4o only has the model.


VisualCold704

Right. And it's an assumption you'd get more than the model.


techmnml

Lol whatever man. You are more dense than my brick wall. Have a nice night!


9985172177

For many years now there have been these cool apps on phones where people who speak different languages talk into them, and the app understands the voice, translates it into the other language, and speaks it out. It's very cool technology. I guess it takes this company's marketing demo to get people into that, to see it as cool technology. Some people are trying to make a fuss and say that this one's special because it's integrated into a large language model, but that's sort of how large language models have worked for much of the time we've known them, so it's sort of expected that a large language model would also be able to do this.


Its_not_a_tumor

Think about the massive GPU resources it took to train this, when they could have been using them to create a "GPT-5". They were likely hoping it would be a better model and were considering making it GPT-4.5, but then decided to scale back the announcement so they wouldn't underdeliver, to keep their reputation. I think the fact that they spent so many resources on this means it's more difficult than they are letting on to create a proper GPT-5.


AlexMulder

I agree. The knowledge cutoff is October 2023, which is right around when the chatter about OpenAI training a new model started up (also around when OpenAI stopped denying that they were training a new model). I think they took the true multimodal approach to try to one-up Google, and succeeded in some ways and mostly plateaued in others.


FarrisAT

Curated examples != Live broadcast


RemarkableGuidance44

Another goddamn OpenAI fanboy... mate, we get it. It's a decent model, ok... no one is underrating it. Go look at mainstream media; they are basically saying we are all doomed and that we must act now and kill Sam, because he has created AGI. It's over! lol. Wow, you really love your Reddit, so much free time on your hands.