OpenAI Releases GPT-4o: 12 things you need to know

Resources

Join the discussion: Ask Jordan questions on GPT-4o

Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup

Connect with Jordan Wilson: LinkedIn Profile


OpenAI's GPT-4o - 12 Key Takeaways

In a ground-breaking moment for the evolving landscape of artificial intelligence, OpenAI has launched GPT-4o, its latest generative AI model. The announcement brings with it a bundle of innovative features and improvements that have the potential to significantly impact businesses and workplaces.

1. Understanding the GPT-4o Model

GPT-4o, with the "o" standing for "omni" (as in omni-modal), represents a step up in capability. The versatility of this model allows users to work with it and reason across text, video, audio, and more. This advancement sets the stage for much more dynamic and flexible interactions with AI.


2. The Broad Availability of GPT-4o

OpenAI is taking a stride towards democratizing AI, making GPT-4o available for both free and paid users. This move underscores OpenAI's commitment to making technology universally accessible - giving individuals and businesses of all sizes the ability to harness the power of a highly advanced AI model.


3. ChatGPT Paid User Benefits

While it’s intriguing that both tiers of users can access the same model, paid users have a 5x higher usage limit. Although the exact differences between the paid and free accounts, apart from capacity limits, are yet to be detailed, giving every user the same underlying model is a genuine leap forward for AI access.


4. Bridging Gaps with the GPT Store

Previously limited to the paid tier, the GPT Store will now be open to free users as well. The store lets users create custom GPTs: simplified, configurable versions of ChatGPT into which they can inject custom instructions and documents, bridging the gap and making AI more accessible.


5. OpenAI's GPT-4o Is an Omni Model

GPT-4o melds transcription, language intelligence, and text-to-speech into one impressive model. Previously, voice interactions chained three separate models (speech-to-text, a language model, and text-to-speech) in sequence; collapsing them into one yields more real-time interaction and significantly reduces the latency common in other voice AI systems.
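
For context, here is a minimal sketch (in Python, using OpenAI's official SDK) of the kind of three-step voice pipeline that GPT-4o collapses into a single model. The file names and voice choice are illustrative assumptions; the point is that each hand-off between separate models adds latency that an omni model avoids.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Step 1: transcribe the user's speech with a dedicated speech-to-text model.
with open("question.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

# Step 2: send the transcribed text to a language model for reasoning.
completion = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": transcript.text}],
)
answer = completion.choices[0].message.content

# Step 3: convert the reply back to audio with a separate text-to-speech model.
speech = client.audio.speech.create(
    model="tts-1",
    voice="alloy",
    input=answer,
)
speech.write_to_file("answer.mp3")
```

Each of the three round-trips above adds delay, which is why legacy voice assistants feel laggy; an omni model that accepts and emits audio directly can skip the intermediate hops.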


6. OpenAI ChatGPT Desktop Assistant

GPT-4o's introduction as a desktop assistant heralds a new era of workplace efficiency. By letting the assistant view the user's active screen and provide assistance accordingly, OpenAI has brought AI capabilities closer to the idea of a one-click autonomous agent.


7. Affordable Access to ChatGPT API

OpenAI has said GPT-4o will be accessible at a reduced cost via its API. Faster and cheaper access may encourage more businesses to integrate AI into their operations, enhancing efficiency and cost-effectiveness.
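
As a rough illustration, here is what a typical call to the new model looks like through OpenAI's Python SDK. The prompt is invented for the example; pricing and rate limits are whatever OpenAI publishes for your account tier.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.chat.completions.create(
    model="gpt-4o",  # the new model identifier
    messages=[
        {"role": "system", "content": "You are a concise business assistant."},
        {"role": "user", "content": "Summarize the key risks of adopting a new CRM."},
    ],
)

print(response.choices[0].message.content)
```

Because the request shape is unchanged from earlier GPT-4 models, most existing integrations should be able to adopt GPT-4o by swapping the model string.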


8. GPT-4o Live View

One of the game-changing features teased by OpenAI is the Live View mode, which gives GPT-4o vision in real time, setting it apart from other AI models.


9. GPT-4o Real-Time Feel

The reduced latency in voice-to-voice communication creates more natural, real-time interactions, and users can interrupt the model mid-response, making it a truly interactive experience.
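
Interruption in the voice interface happens inside OpenAI's own apps, but the underlying idea can be sketched with the streaming API: consume tokens as they arrive and stop early when the user cuts in. This is a hedged sketch, and `user_interrupted()` is a hypothetical helper standing in for whatever interrupt detection (for example, voice activity detection) an application would supply.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def user_interrupted() -> bool:
    # Hypothetical placeholder: a real voice app would check a microphone
    # or UI event here to detect that the user has started speaking.
    return False


# Stream the response token by token instead of waiting for the full reply.
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Walk me through this spreadsheet formula."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content or ""
    print(delta, end="", flush=True)
    if user_interrupted():
        stream.close()  # abandon the rest of the generation mid-response
        break
```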


10. GPT-4o Conversational Emotions

GPT-4o's conversational capability, including expressive, emotive voice responses, adds a human touch to interactions, increasing the engagement and immersive feel of the AI experience.


11. GPT-4o Roll-Out Strategy

The new model is set for a staggered roll-out over the coming weeks to gradually allow users to test the waters and adapt to the new system.


12. GPT-4o Challenges Google and Other Competition

With Google's I/O Developer Conference just a whisker away, OpenAI's announcement may potentially steal the show, hinting at tougher competition for tech giants in the AI race.


Future of GPT-4o

The launch of GPT-4o by OpenAI is groundbreaking in the field of artificial intelligence, enabling businesses to leverage advanced AI capabilities, regardless of their size and financial capacity. As we look towards a future where AI innovations are continually redefining possibilities, the question remains, is your business ready to ride the AI wave?


Topics Covered in This Episode

1. OpenAI announces GPT-4o model
2. Key Features of GPT-4o
3. GPT-4o Rollout Plan
4. What GPT-4o means for Google


Podcast Transcript

Jordan Wilson [00:00:18]:
OpenAI has just released a new model called GPT-4o. So this is actually our second episode of Everyday AI. We normally just do this once a day, but it was a big enough announcement from OpenAI, just about an hour ago, that we had to come back for a second time today so you can stay up to date. So we're gonna tell you not just what this new GPT-4o model is, but we're also going to give you quickly 12 things that you absolutely need to know. Alright. Well, hey. If you're tuning in, maybe for the first time ever or for the second time today, my name is Jordan Wilson. I'm the host of Everyday AI.

Jordan Wilson [00:01:02]:
We're a daily livestream podcast and free daily newsletter helping everyday people learn and leverage generative AI to grow their companies and to grow their careers. So, hey, thanks for tuning in if it's your second time. But if it's your very first time, make sure to go to youreverydayai.com. Sign up for the free daily newsletter. It'll be going out much later today, right, because we have a lot of news that we're trying to sort through. So earlier today, we brought you the AI news that matters for the week. So now let's just dive straight into this new announcement from OpenAI. A little bit about what it is and why it definitely matters.

Jordan Wilson [00:01:36]:
Alright. So today, we're just sticking with the facts and some observations. But tomorrow, we're gonna have a much more in-depth episode full of hot takes as always. It's hot take Tuesday, so, like, you know we're gonna come with it. Alright. So here's what you need to know. Well, right now, the new version, GPT-4o, is what it's called. And if you log in to your ChatGPT account right now, you may already have access to this new model called GPT-4o.

Jordan Wilson [00:02:03]:
Alright. So, it's pretty exciting. That was a fast turnaround, a pretty quick release. Also, you will need to make sure you select that model. So if you are an avid GPT user, such as myself, make sure when you log in that you select the newest model, GPT-4o. Alright. So here are 12 things that you need to know. So first, the basics.

Jordan Wilson [00:02:28]:
So the new model is called GPT-4o, which stands for omni modal. Right? So essentially what that means is the ability to kind of work with it and reason in text, video, audio, and more. So more on that here in a minute. Alright. Second thing you need to know, GPT-4o will be available to free and paid users. Right? I kind of found that interesting. I don't know if now there is as much of a reason to have a paid account. So, we'll get into some of the differences between what a paid account actually gets you, but a big part of OpenAI's announcement today was really about their desire to make this technology available to all.

Jordan Wilson [00:03:14]:
So pretty big, pretty big kind of thought there. Pretty big move by OpenAI. I'm actually, you know, like I said, this is brand new, so I'm checking this as I even type to see if you have access. So it looks like at least right now, you can only access this model if you are on the paid plan. Right now, if you are on a free plan, even without logging in, you can still use the old model, GPT-3.5, but it doesn't look like everyone has it yet; users who aren't logged in, at least, cannot access this. At least as of, you know, Monday afternoon at 2 PM Central Time. Things could move quickly. Alright.

Jordan Wilson [00:03:52]:
3rd thing that you need to know about this new GPT-4o is that paid users will have 5x the capacity limit of free users. So again, we don't know, at least right now, what other differentiators there will be between the free account and paid account. If nothing else, this is huge news, I think, for free users. To have the ability to use the exact same model as paid users is actually something we haven't really seen out of the big large language model players. So as an example, up until, you know, this recent announcement, ChatGPT, so OpenAI's model, had, you know, a big difference in the model between the free and paid tier. Google, with their Gemini 1.5, big difference between the free and the paid tier. Claude 3, you know, a big kind of jump between, you know, Haiku and Sonnet versus the paid version in Opus. So, this is the first large language model that we've seen, aside from Meta's Llama 3, which is kind of free and pretty much open source. But aside from that, this is the first kind of big play from a large language model maker that gives free and paid users the exact same model and level of access.

Jordan Wilson [00:05:10]:
Again, at least now that's what they said. Maybe there might be future capabilities or future features for the paid version, ChatGPT Plus. But right now, it looks like the only differentiator might just be the limits. So having 5 times the capacity if you are a paid user. Alright. Number 4. Well, speaking of more things for free users, even free users will soon be able to access the GPT Store, which was previously limited to ChatGPT Plus paid users only. So if you don't know anything about that, essentially, ChatGPT and OpenAI have what's called a GPT Store, which is a very simple way for people to go and create essentially a custom, pretty simple version of ChatGPT.

Jordan Wilson [00:06:01]:
So you can give it kind of custom instructions, some custom configurations. You can drop in your own documents, things like that. So now all of a sudden, you're gonna open that up to every single free user out there. So, I don't believe you will be able to use this if you're just, you know, in an incognito window or not logged in, but OpenAI did say today that even if you are a free user, you will soon have access to the ChatGPT GPT Store. So pretty big news there. And I am checking live in real time, y'all. This is like literally as quickly as we could get a podcast episode out. So it looks like right now I'm checking, and even in a free account, because we have multiple paid accounts and multiple free accounts.

Jordan Wilson [00:06:46]:
So I just went into a free account and don't have access to the new GPT-4o model yet, but we do have access to it, obviously, both on our iOS, so the mobile app, as well as obviously in the browser on the paid account. So, yeah. Let me know. Did anyone out there, I know we're going live here as well as the podcast, but wondering if anyone here live has jumped in. Number 5 thing that you need to know is, well, GPT-4o combines transcription, intelligence, and text to speech, essentially, all in one model. So, OpenAI kind of went through a historical look at how they've traditionally handled, you know, conversations with an AI. Right? So conversations with a large language model.

Jordan Wilson [00:07:29]:
In a lot of it, you know, there's a lot of things going on, especially if you're using your voice to talk to ChatGPT or if ChatGPT is talking back to you, which the read aloud option has been available for quite some time. But, you know, we're gonna leave some demos in our newsletter today, which should be going out here hopefully in an hour or 2. But, you know, the ability now to have all of this happen in one instance without a bunch of latency is pretty impressive. Alright. The 6th thing you need to know, and this is actually, I think, maybe the biggest, is there will be a new desktop assistant that can, quote, unquote, hear and see what you're working on. So, again, and we're gonna talk about this more in our hot take Tuesday tomorrow. But OpenAI did what they said were live demos, and they looked pretty live, unlike Google's kind of snafu back from December when they previewed their Gemini model for the first time. And then we kind of came to find out much later that it was all a marketing stunt and none of it was actually live.

Jordan Wilson [00:08:38]:
Today's demo did truly look live, and it was confirmed by, you know, at least 1 or 2 OpenAI senior staff members that it was live. It wasn't a prerecorded demo, and, you know, I watched it myself live. It looked like it was live. There were a couple hiccups as well, but the new desktop app, I think, is huge. There were also some videos that OpenAI put out on their YouTube channel, about 10 or so short videos showing some of the different capabilities. And I like that they do a split screen so you can kind of see what's happening inside of the GPT-4o interface, as well as, presumably, what is a live recording of the user or users who are interacting with this kind of new desktop assistant and on the app. The desktop assistant, I think, is going to be huge, at least if the way that they demoed it actually comes to fruition, where you can actually, you know, have a desktop app. You can be working on something on your computer, you know, and an overlay on the screen is what it looked like, but the ChatGPT app would come up.

Jordan Wilson [00:09:45]:
You can click to speak to it and say, like, hey, you know, look at this on my computer screen. You click one button. It gets what's on your computer screen. Help me solve this problem. Help me improve this code. Help me finish this email. Right? So, pretty cool. You know, it's technically not an AI agent in the form that we thought of it, but this is kind of agent capabilities.

Jordan Wilson [00:10:09]:
Right? Like, maybe not autonomous agents, but one-click agents. Right? So, think with this desktop app, no matter really what you're working on, being able to instantly have a one-click conversation with this new GPT-4o and to be able to share with it, within what looked like one click, kind of what's happening on your screen, and for it to be able to help walk you through it, maybe help you, you know, create a blog, finish a blog post. You know, I think that changes really how we work, because before, you know, even people who are great at ChatGPT, you would have to do a lot of things. You know, uploading files, screenshots, documents, etcetera. And now it seems like this new desktop app, whenever it is released, is really going to make that experience much more seamless. So if you haven't already gone through that process, whether actually physically working with AIs in your day to day or even just in your mind, right, we're gonna have more on that tomorrow. But most people out there, right, and presumably, you know, OpenAI is kind of the pace car here, and everyone else, I'm guessing, will be scrambling to catch up.

Jordan Wilson [00:11:20]:
But, you know, if you haven't already come to the conclusion, and, you know, in our 2024 prediction show back in December, we predicted this, that, you know, most people are gonna be working with agent workflows in 2024. And this is, I think, the first example of that: just having essentially a dedicated agent in the form of a desktop app that can see what you're working on. You can talk to it quickly, and it can help you work in real time. At least to me, that was probably the most impressive piece of the demo. There are a lot of other things that were really cool, but that was one of the most impressive to me. Number 7. And, hey, if you do have a question, get it in. We don't have a ton of time, but we'll try to get to any of your comments.

Jordan Wilson [00:12:03]:
You know, Douglas said he doesn't have 4o yet. You might have to, you know, if you are listening to this live or on the podcast, usually, and this is just me being an Internet dork: try logging out, clearing your cache, clearing your cookies, logging back in, and presumably, you might have it at that point. But I'm guessing this, like most updates from not just OpenAI and ChatGPT but any large language model, you know, they're iterative. They're slow to roll out. They go in phases. Sometimes it's available to everyone. Sometimes, you know, you might have to wait a couple hours, a couple days.

Jordan Wilson [00:12:35]:
So, yeah, make sure to go check that out. Alright. So here's another big piece. Number 7, GPT-4o is rolling out to the API at a reduced cost. So, OpenAI said it is much faster and much cheaper to use the API as well. So, what that means is there are literally thousands, and probably tens of thousands or even more, products that you probably use every single day that are connected to ChatGPT via OpenAI's API. So presumably what that means is those programs are going to be getting faster and better because they will have access to this new model. And OpenAI said it is much cheaper and much faster.

Jordan Wilson [00:13:14]:
So, presumably also, right, especially if you're an enterprise company and if you're paying a lot of money, maybe you, you know, got into a contract with a company maybe a year ago and it was pretty expensive, you should be revisiting that contract, because, especially over the last year, you know, the API has gotten much more affordable and faster. But now, even with this, it is getting even less expensive. So pretty big news, especially if, within your company's tech stack, you're working with multiple, you know, pieces of enterprise software that have an OpenAI GPT connection via the API. And, you know, nowadays, it seems like just about any enterprise software, whether it's marketing, advertising, communications, CRMs, etcetera, they all have some connection to GPT. Alright. Number 8. So OpenAI demoed a live view mode, presumably being able to use vision in real time.

Jordan Wilson [00:14:11]:
I say presumably, right, because I never really wanna report on things even when they look true, because, you know, Google really just, you know, I won't say that they straight up lied to everyone, but they were not very truthful in their original Gemini marketing video in December. But it did look very legit and very live from OpenAI. But, essentially, you had some people on the stage, some developers, you know, literally just turn on a camera, and it is a live view on the camera, and they say, hey, ChatGPT, what is my reaction? You know? And the developer was smiling, and ChatGPT said it looks like you're happy. So the new model was able to literally recognize video in real time, which was crazy. Another one was solving a math problem, a simple equation, in real time. Again, the developer was showing the camera and, you know, literally working out a math problem by hand and asking ChatGPT for directions on how to solve it. This is essentially, it looks like, OpenAI delivering on what Google teased with its marketing but never actually had working 6 months ago. So pretty impressive there.

Jordan Wilson [00:15:20]:
Number 9, reduced latency with a real-time feel in voice-to-voice communication. This piece was huge. So normally, if you've used any voice model, even if you were using the voice mode on ChatGPT's app previously or anything, right, so Google Assistant, Siri from Apple, you know, Alexa from Amazon, most of these systems have a noticeable delay. Even if it's maybe only a second, it's pretty noticeable. It doesn't seem like real-time conversation. At least in the demos with OpenAI, it didn't seem like that.

Jordan Wilson [00:15:57]:
The latency was very low. At times, not even noticeable. Right? Like, I was listening and observing, and I'm like, wait. That's probably faster than I can respond if someone asked me a question. Right? Sometimes, like, I take a second to, like, actually process something and think about it. And I'm like, this is a pretty quick response time. The latency was super low. Another thing is you can kind of cut off ChatGPT, or GPT-4o, when it's responding to you.

Jordan Wilson [00:16:25]:
So if it looks like it's going in the wrong direction, you can just speak and essentially cut it off and correct course. So, the latency seemed pretty low. Alright. Number 10, a couple more things. 10 was just a much more human feel, including, yes, there were some mistakes, which led me to believe, alright, maybe this was actually live and not staged. Right? Yeah. Unfortunately, with Google Gemini, I think a lot of people, you know, have a mistrust of AI models. Right? And I think the Google Gemini kind of marketing stunt, so to speak, in December with their Gemini model only increased people's mistrust in, you know, these big tech companies who are saying, like, oh, look at our AI.

Jordan Wilson [00:17:07]:
It does this, this, and this. And, you know, turns out with Gemini, none of that was really the case. It was all kind of manufactured behind the scenes. But with this, there were some mistakes. There were some mistakes in OpenAI's demo, which I actually liked, because that told me, like, okay. This is believable. Right? So at one time, the developer asked OpenAI a question, and it was responding about something else. It was responding about, you know, oh, the wood in the table.

Jordan Wilson [00:17:33]:
So this was probably from a previous response. The developer just said, oh, no. Not that. That was our last conversation. I'm asking about this, and course corrected, and then ChatGPT instantly got the question correct. So it did have a much more human feel, including that. Yeah. It was getting a thing or 2 wrong, which obviously over time, and when millions of people are using it and providing feedback in real time, like, yes, this is good.

Jordan Wilson [00:17:57]:
No. This is not good. You know, presumably this model is gonna get smarter. So, next piece here, number 11 of the 12 things that you need to know about the new GPT-4o. It will start to roll out to users in the, quote, coming weeks. Alright. So like I said, a lot of people, myself included, already have access to this new model. If you are a paid user, go ahead and try now.

Jordan Wilson [00:18:23]:
Love what Liz said here. She said Jordan is every IT manager. Try turning it off and back on again. Yeah. Exactly. So if you don't have access yet, don't worry. You'll probably get access in the coming days, but not all of these features are available yet. Right? You know, right now, with this new model, it's 2 different things.

Jordan Wilson [00:18:40]:
You have to think of it as that. There is a new model, and then there are all of these features that work with the new model. So it looks like OpenAI is probably first going to be rolling out just access to the new GPT-4o model. Again, o stands for omni, you know, kind of what they're hoping is or will be referred to as the everything model, the omni model. So most people are gonna probably have access to that first, before all of these other features, but do pay attention. You know, we don't know what's gonna be rolling out first. As an example, will all of these updates first be coming to the app? Will the desktop app, you know, kind of the smart desktop assistant, be rolling out, you know, in the coming weeks as well? Not sure, but do check. And, obviously, if you tune in to Everyday AI, we do this show literally every single weekday.

Jordan Wilson [00:19:29]:
We go live at 7:30 AM CST. So maybe this is your first time listening. So, you know, as these new updates get rolled out, we'll obviously talk about them on the show. And then we have Google. Our last thing to know is Google is likely in trouble. So, you know, we have a commenter here, Cyber S, from YouTube, saying, is anyone excited about Google tomorrow? Yeah. So the timing on this one is interesting. Right? So OpenAI just officially announced this event about 3 days ago.

Jordan Wilson [00:20:01]:
Whereas we've known for a couple of months that Google has their I/O developer conference tomorrow. So OpenAI just kind of swooped in here and maybe potentially stole the limelight. I mean, we'll see what Google announces tomorrow, but, man, if I'm sitting in the seat at Google, I'm not feeling super great. Right? Number 1, you know, a lot of these kind of features or capabilities were teased by Google, like I said, 6 months ago in their original Gemini marketing video, and it turned out that a lot of it was manufactured behind the scenes. Kind of this ability to interact with an AI in real time? It wasn't real. None of it existed. Right? Google later shared a research paper that said, oh, here's how we actually put it all together. It was multiple steps, and a human was involved.

Jordan Wilson [00:20:48]:
We took the video. We grabbed frames, then we prompted and re-prompted the AI, and then, you know, we spit out this result. So it wasn't actually true. So if I'm Google, I am not feeling good heading into the I/O developer conference tomorrow, if I'm being honest, because OpenAI just came and kind of took their lunch and their dinner. Alright. So tune in for that. Actually, tomorrow, we're gonna be coming in with hot takes on what this actually means. Alright.

Jordan Wilson [00:21:16]:
So, today is kind of an extra edition, just bringing you the facts on this new GPT-4o model. So, again, we're gonna go over it very quickly here. Here are the things you need to know. 1, the new model is a GPT-4 variation called GPT-4o, which stands for omni model. 2, GPT-4o will be available to free and paid users. 3, paid users will have 5x the capacity limit as free users. We don't know what other differences there will be. 4, even free users will soon be able to access the GPT Store.

Jordan Wilson [00:21:47]:
5, GPT-4o combines transcription, intelligence, and text to speech all in one model. 6, a new desktop assistant will be coming out that can hear and see what you're working on. 7, GPT-4o will be rolling out to the API at a reduced cost. 8, OpenAI demoed a live view mode, presumably being able to use vision in real time. 9, we saw reduced latency with a very real-time feel in voice-to-voice communication with the new model. 10, it had a much more human feel, even including mistakes and the ability to cut off the model. 11, it will start rolling out to users in the coming weeks. A lot of people have access to the model.

Jordan Wilson [00:22:24]:
Features will be rolling out. And 12, I personally think Google is in trouble. So we'll be talking about this tomorrow. If this was helpful, let me know in the comments. Hit that repost. Share this with your friends. If you're listening on the podcast, maybe for the 2nd time today, thank you for your support. If you wanna leave a review on Spotify or Apple, we'd super appreciate that.

Jordan Wilson [00:22:43]:
So we hope to see you back tomorrow and every day for more everyday AI. Thanks, y'all.



Embrace the Omni Modal Era: How OpenAI's GPT-4o is Set to Disrupt How Businesses Leverage AI


As technology continues to evolve by leaps and bounds, businesses are constantly on the lookout for the next revolutionary trend that will steer them towards growth and success. Enter the omni-modal era, an exciting tipping point in AI evolution that could significantly impact your business strategy. In harnessing AI technology, businesses need to move past tunnel vision and broaden their horizons - because the multidimensional potential promised by omni-modal technology is finally here.

In a recent episode of the much-esteemed podcast 'Everyday AI', host Jordan Wilson revealed insights about this burgeoning trend. Highlighting the launch of OpenAI's latest model, GPT-4o, the episode swiftly but comprehensively delved into twelve pivotal aspects of this development that businesses need to understand.

One of the major takeaways from this episode is the game-changing versatility of GPT-4o. The 'o' in GPT-4o signifies 'omni,' representing the model's ability to reason not only in text but also in video, audio, and more. This is a veritable leap in AI capability, removing the hurdles between different types of data processing for a seamless, comprehensive technological experience.

Moreover, this isn't restricted to businesses that can shell out the big bucks: OpenAI confirms GPT-4o's accessibility to both free and paid users. As Jordan Wilson pointed out, the model works with greater speed and reduced costs via the API - a potentially transformative development for the plethora of products that lean on ChatGPT.

Of all the impressive features demonstrated, the new desktop assistant capability, which allows the AI to "see and hear" what users are working on, stands out. As Wilson notes, this is a first step towards agent workflows, bringing AI closer to a one-click, human-like interaction model.

Working with GPT-4o also brings with it a reduced latency for a real-time feel in voice-to-voice communication. This characteristic humanizes AI, making it an increasingly desirable co-worker in a decentralized, digital workspace.

Amid all these monumental changes, Wilson concluded with an interesting opinion - Google might find itself in treacherous waters. The revolutionary edge GPT-4o brings may put it ahead in the technological race, leaving Google to play catch-up.

This is a critical time for businesses to reassess their AI strategy and question: Are we merely keeping pace or spearheading advances in the way we leverage AI? The launch of GPT-4o has unfolded a multitude of possibilities, and the race is on for businesses to capitalize on them.

Gain Extra Insights With Our Newsletter

Sign up for our newsletter to get more in-depth content on AI