Ep 211: OpenAI’s Sora – The larger impact that no one’s talking about

Impressiveness of OpenAI's Sora: The Game Changer

OpenAI's first-generation video model, Sora, has exceeded expectations and raised questions about the future of creative work. With video quality that makes it hard for humans to distinguish between real and AI-generated content, Sora is challenging the perception of reality in videos.

Comparison of Sora and its Contemporaries

Compared to other AI technologies such as Runway Gen 2, which lacks detail and appears artificial, Sora generates more realistic and detailed content. The potential implications of these advancements could accelerate the achievement of artificial general intelligence (AGI) - where AI outperforms the average human at multiple tasks simultaneously.

The Influence of Compute Power

Surprisingly, the level of compute power deployed significantly impacts the realism of Sora's outputs. OpenAI shared three videos, each generated with different levels of compute power: base compute, 4x compute, and 32x compute. The results clearly establish the correlation between increased compute power and improved video realism. Thus, future models, particularly those aimed at achieving AGI, will require significantly more compute power.

Tipping the Scales Towards AGI

Sora's release is not only a big leap for OpenAI but also a significant stride towards AGI. The tool aims to understand and simulate the real world, including three-dimensional physics, which will help to predict real-world scenarios. This is a constructive step away from simply creating visually impressive content and towards understanding and predicting the real world, a vital prerequisite for AGI.

The Edge of OpenAI's Sora Over Other Platforms

With its unprecedented capabilities, OpenAI's Sora stands ahead of other text-to-video platforms like Runway, Pika Labs, and even Meta and Google's new text-to-video platforms. The sophisticated, minute-long videos created by Sora with multiple shots and impressive details outshine the outputs of other platforms.

Implications for Business and Industry Trends

OpenAI's cutting-edge technology and partnership with Microsoft could give them a distinct advantage in the AI field. While tech giants like Amazon, Meta, Google, Apple, Nvidia, and Microsoft are potential acquirers of AI startups with unique technology, to compete with OpenAI and Microsoft, they may need to consider combining multiple acquisitions.

In Conclusion

AI technology, particularly advancements like Sora, is progressing at a rapid pace. Businesses and industry stakeholders must remain informed and ready to adapt to these developments. They should also consider supporting the evolution of AI, such as by sharing educational content on the subject to increase understanding and appreciation of its capabilities. With models like Sora able to understand and predict real-world scenarios, AGI is closer than ever - the future is now!

Topics Covered in This Episode

1. Detailed Examination of Sora’s Technology
2. Comparison of AI Art Technologies
3. The Push towards Artificial General Intelligence

Podcast Transcript

Jordan Wilson [00:00:16]:
I think most people have OpenAI's Sora all wrong. So if you don't know much about it, OpenAI last week just released a new text-to-video model called Sora, and it is extremely impressive. And everyone has been collectively talking about just that, how impressive the video is. But there's something larger at work here that I don't really see anyone talking about. We're gonna talk about that today and more on Everyday AI. Welcome. What's going on, everyone? My name's Jordan Wilson and I'm the host of Everyday AI. Everyday AI, it's for you.

Jordan Wilson [00:01:02]:
It's your daily live stream, podcast, and free daily newsletter helping everyday people learn and leverage generative AI. So if you haven't already, make sure to go to youreverydayai.com and sign up for the free daily newsletter. So every single day, we not only go over what's happening in the world of AI news, we cover kind of, new tools, fresh finds from across the internet, but we also break down, each and every podcast conversation in pretty great detail by me, a human, writing this for other humans. Right? So we tell you not just what's going on in the world of AI, but, a topic each and every day on how you can learn and leverage what we're talking about. I tell people it is a free generative AI university. No matter what you care about, you can go on our website, you can go listen and watch to now more than 210 different podcast episodes, videos, etc, and go read every single newsletter we've ever written. So it is I still don't know any other source, any other single source, that has more free generative AI education across every single medium. It's actually wild how much content we have there.

Jordan Wilson [00:02:08]:
So before we get to that topic on what's actually the larger impact of OpenAI's Sora, let's start as we do every single day by going over the AI news. All right. So let's start with Scale AI and the U.S. Department of Defense are joining forces. So Scale AI and the US Department of Defense are partnering to develop a comprehensive test and evaluation framework for the responsible use of large language models within the Department of Defense. So this framework will allow the DOD to deploy AI safely and accurately in military applications. This partnership between Scale AI and the DOD will create benchmark tests tailored specifically to DOD use cases to measure LLM performance and provide real time feedback for war use.

Jordan Wilson [00:02:59]:
So, pretty big news there worth taking a look at. Next, Google has released a new AI feature, and just like Gemini Ultra 1.5, this one is also a limited release. So Google is introducing a new feature called shop with Google AI that allows users to search for products and generate AI inspired outfit variations in a single click. That's interesting. So this will enhance the user experience by providing personalized and effortless shopping options. And it will obviously be driving a lot more business and growth for retailers partnering with Google. So right now, this is just slowly rolling out to a select few users inside the Google search app. So, hey, holler at us if you have access.

Jordan Wilson [00:03:46]:
I kind of want to know, right, can I use AI to just make myself dress better? That's what it sounds like. Alright. And last but not least, the internet has all at once fallen in love with a large language model that is not new. So this could be a Valentine's Day hangover, but Internet users have all seemingly fallen in love with an older large language model named Groq at the same time. So let me just say this out loud. This is Grok, with a k. Or or sorry.

Jordan Wilson [00:04:17]:
Not the Grok with a k. See? It's so confusing. So Twitter has their Grok with a k. This is Groq with a q. Okay? So Groq is actually a California based semiconductor company. So Groq with a q is a generative AI solutions company that has developed a unique technology called LPU, which is language processing unit, and this is their inference engine and it's designed to accelerate the processing of large language models. So the reason why I think everyone is all of a sudden in the last 24 hours sharing about this on social media is because last week, Groq with a q won the, won a large language model benchmarking competition. So now the Internet is all at the same time going wild over it.

Jordan Wilson [00:05:06]:
And, yes, you should check it out. It is very impressive. And, essentially, by using this, kind of LPU, language processing unit, technology versus the general GPU technique that all other large language models are using, it can generate text based output much faster. Right? Is the quality there? You know? Maybe. You know? It just runs other, it runs other local models such as, you know, models from Meta. So, you know, the quality is really just obviously dependent on what model that you're using, but the speed is wild. So, yeah, it's worth checking out. So we'll be sending a link and probably a video of that so you can just see it in our newsletter.

Jordan Wilson [00:05:50]:
So make sure, if you haven't already, go to youreverydayai.com and sign

Jordan Wilson [00:05:55]:
up for that free daily newsletter. Alright. That's the intro. Now I'm excited for today. I'm very excited for today to talk about OpenAI's Sora and the larger impact that no

Jordan Wilson [00:06:06]:
one is really talking about. And, hey, to our livestream audience, thank you so much. If you're listening to the podcast, come join us live. It's a good time. You know, you can network. That's the good thing. There's a lot of networking happening in our comments, here at Everyday AI with our daily livestream. So you can come and connect with, you know, Tara, who's joining us here from Nashville, and Dr. Harvey Castro, a top voice in AI.

Jordan Wilson [00:06:29]:
Rolando, thanks for joining us. Nancy and Jay, Douglas, everyone. Excited to have you all on here. But, hey, I wanna know right now. What are your thoughts, you know, before I get too deep? And I'm gonna tell you right off the bat. Alright? So if you're on if you're on a limited, you know, time schedule here and you're like, alright, Jordan. Get to the point. I'm gonna tell you right away.

Jordan Wilson [00:06:49]:
Don't worry. I'm not gonna drag you on for 20 or 30 minutes. But I wanna know from our livestream audience, specifically, what do you think of OpenAI's? Alright. So let me just, before we get there, ask 1 more question. And I want to know from everyone who's joining this live, give me a yes or a no, or if you're joining us on the podcast, I always leave in the show notes, you know, in the episode description. You can email me. You can connect with me on LinkedIn. I I read all your emails, all your messages.

Jordan Wilson [00:07:14]:
Sometimes it just takes a minute. But let me know. Should hot take Tuesday? This is hot take Tuesday. Right? Every Tuesday, I come with a hot take. You know, something that's outside of the normal AI news or bringing on guests, which is what we do the rest of the week. Should this be a call in show? Right? So the the software we use to stream, called StreamYard, there's there's a feature where we can kinda have a little waiting room and you can call in just like this was, you know, an AM radio show. Right? And you can call in with your opinion. So, shout out to, to to Niciotti who who dropped this, recommendation, yesterday.

Jordan Wilson [00:07:49]:
Should we do that for Tuesday? Should you come on and, you know, you can you can join and, you know, we'll have someone kind of, quote, unquote, take your call, you know, in the waiting room and you can wait in line to come in and and drop your hot take? Let me know yes or no. Does that sound fun, or is it too early in the morning for for for you to come with some spicy hot takes? I wanna know. You know, let me know. Yes or no. Should should this be a call in, or should this just be a a random, a random rant from me? Alright, woozy says do it old school radio style. I love some, some

Jordan Wilson [00:08:24]:
just call in radio. AM, right? Alright. Let's get cookin'. And it's Hot Take Tuesday. I promise you, we're gonna deliver. So let's just go

Jordan Wilson [00:08:36]:
over the facts and then I'm getting straight to the end point. All right. So Sora is a new text-to-video model from OpenAI, and it is very impressive. All right. It is very impressive. But let's just go ahead and tell you the 3 things that you need to know that are about the larger impact, and 2 of them are things that no one

Jordan Wilson [00:08:59]:
is talking about.

Jordan Wilson [00:09:01]:
I'm sure maybe someone is somewhere. I can't find anyone. Ready? Here's 3 things that you need to know, and 2 that no one is talking about. So 1, Sora is light years ahead of any other text-to-video platform right now. So including Runway, Pika Labs, and also Meta and Google have previewed their new text-to-video. Right? So, Meta's Emu Video and Google's Lumiere. Alright. It's also worth noting, right now, out of the 5 that I just mentioned, so OpenAI's Sora, Runway Gen 2, Pika Labs 1.0, Meta's Emu, and Google's Lumiere, only 2 of them are publicly available right now.

Jordan Wilson [00:09:42]:
So only Runway and Pika Labs. Right? So OpenAI's Sora is not publicly available. However, people do have access. So kind of red teamers, or those people putting it through safety precautions, and a select group of visual artists. So, yes, it is technically kind of out there, but not really. The only ones publicly available for all of us are Runway and Pika. But OpenAI's Sora, it's in its own. I would say even to say it's in its own category is doing it an injustice.

Jordan Wilson [00:10:12]:
It is in its own sphere.

Jordan Wilson [00:10:14]:
You can't compare. At least early results. You you you literally cannot compare the quality of of these products. Alright. So that's number 1, and we're gonna talk about that more. Number 2, the timing of Sora, with all that's been going on in February at OpenAI, means that AGI is near. Artificial general intelligence is closer than we all think. Yeah.

Jordan Wilson [00:10:43]:
Hot take Tuesday. We're bringing some fire, and you know I got receipts. Stick around. Right? And then number 3, I don't think anyone will catch the combination of OpenAI and Microsoft. Alright. And I'm gonna lay out, you know, a couple scenarios, but I think that Sora and what Sora means specifically as it comes to artificial general intelligence, AGI. Right? If you don't know much about AGI, I'm gonna get into it here in a minute, but that's essentially when, you know, when AI systems become smarter than the average human or the smartest humans at general things. Right? You already have narrow kind of AGI or narrow AI that's outperforming humans in specific tasks.

Jordan Wilson [00:11:31]:
But AGI is when, okay, when these AI systems are better than most humans at general tasks.

Jordan Wilson [00:11:38]:
We're closer than we think, and I don't think anyone right now is able to catch

Jordan Wilson [00:11:45]:
the combination of OpenAI and Microsoft. Right? Microsoft reportedly owns 49% of OpenAI with a large, many-billion-dollar investment in the

Jordan Wilson [00:11:55]:
company. I don't think anyone's gonna catch them. I really don't. I'm gonna lay it I'm I'm gonna lay down a couple scenarios in which it could happen, but it's an unfair advantage right now. Alright. So let's get to those 3 things.

Jordan Wilson [00:12:10]:
I cut them down. Sora's light years ahead, number 1. Number 2, the timing of this means AGI is near. And number 3, I don't think anyone will catch OpenAI and Microsoft. So let's look at number 1. Let's look at number 1 first. Right? And if you're joining us on the podcast, I apologize. You're not gonna be able to see this very well, but I'm gonna go ahead and I'm gonna show some examples.

Jordan Wilson [00:12:31]:
Alright? So make sure to check the show notes, you can come see this. I'm sure other people have done this, but I took 4 different clips from OpenAI's Sora. Okay? The good thing that I liked in their research paper, OpenAI allows you to download their results as well as see the prompt that they used to generate these results. Alright. But there's a lot more that goes on behind the scenes than just that. Right? We don't know how many attempts. Right? Because if you've used something like Runway or Pika Labs, like I have, sometimes, you know, the 10th attempt is better than the first. Right? Or sometimes after 20 attempts, you'll get something you're like, oh, this is much better than the 1st attempt.

Jordan Wilson [00:13:13]:
So we don't know kind of the process, of how OpenAI generated this, or it could have just been, yeah, here's literally 1 prompt. Copy and paste, 1 shot. It could be. Alright? But I'm gonna go ahead for our livestream audience anyways, and I'm gonna share. Alright? I'm gonna share this video that I did, took the exact same prompt, downloaded the videos from OpenAI. So we have OpenAI first, and then we have Runway Gen 2. Alright? And I want you all to see it, and I want to hear all of your thoughts as well. I'm gonna do some light narration here just to tell you what's going on.

Jordan Wilson [00:13:49]:
Alright. So our first one here, we have woolly mammoths walking through the snow. And again, first we have OpenAI, and second we have Runway. So with OpenAI's, it is fantastic. So here we go, the video's playing. It looks somewhat real, right? No one knows what a woolly mammoth actually looks like. And then we have Runway, version 2. It doesn't look bad, Runway Gen 2, but you'll see some of these woolly mammoths are kind of walking backwards.

Jordan Wilson [00:14:16]:
Some are missing limbs, where the one from OpenAI is pretty sound. It's pretty solid. Alright. And then we have OpenAI's Sora model with this kind of astronaut, theatrical, you know, wearing these red hats on another planet. OpenAI's is fantastic. Also, I'm gonna pause here. And I'm gonna say one thing that's worth noting that I should've started at the top of the show is what you can produce, allegedly. Right? A minute of video in OpenAI with a single prompt.

Jordan Wilson [00:14:45]:
And you'll see in this example here, this kind of astronauts red hat example from Sora. It actually cuts to multiple shots, which is something that right now you cannot do in Runway Gen 2 without regenerating. Right? So you have a minute. Right? So OpenAI's Sora creates a minute of video with, at times, multiple shots spliced together. Sometimes it doesn't. I'm sure there's ways to control that. And with Runway Gen 2 and all the other AI video models right now, it's essentially 4 seconds. You can kind of stitch them together and extend it.

Jordan Wilson [00:15:17]:
You know, there's kinda some workarounds, you can get maybe up to 16 seconds, but it's all kinda the same. So OpenAI's Sora model. Great. Runway Gen 2 didn't do that good. Right? It's just a random guy here in a motorcycle helmet not really moving. Nothing very theatrical. Alright. Then we have our Sora.

Jordan Wilson [00:15:35]:
This is supposed to be a drone shot over a, kind of like a gold mining town from back in the day. And Sora looks pretty nice. It's pretty smooth. It does just that. Runway Gen 2, you know, it has a similar look and feel, but if you look at the subjects of the image, it looks like maybe 2 horses and 1 person, but if you look, kind of the person or the horse morphs into something else at the end. Not super cinematic. Doesn't really seem like a drone shot. Alright.

Jordan Wilson [00:16:03]:
And then our last one here, OpenAI's Sora. It is a close-up, it is supposed to be a close-up of a woman's eye, kind of her stare. The detail on this one is mind boggling. I wish I had more words this early in the morning to describe this one from OpenAI's Sora, but you can see the, oh gosh. I mean the details of the eyelashes, the details of the skin, the lighting right from the pupil. There is a reflection in the pupil, and even the reflection in the pupil seemingly has correct composition. This is wild. This video is wild.

Jordan Wilson [00:16:42]:
Right? And then in Runway Gen 2, that you you know, to Runway's credit here, this doesn't look real, but it's actually a pretty decent shot on this last one from Runway. Right? You don't get the detail, but you do get a woman. Her her skin tone is is very looks kind of AI generated, I guess. It looks either, like, very AI generated. The skin tones are too smooth, but there are some nice shadows. There's some nice, lighting going on with this last one from Runway. So it's actually not too bad from Runway. Alright.

Jordan Wilson [00:17:14]:
But long story short, if you're not gonna watch this one, the comparisons are, that's apples and basketballs. Right? We're not comparing apples and apples, we're not comparing apples and banana chips, we're not comparing apples and fruit. Sora is in another sphere. It's not even in the same category. Right? I have a little bit of a background. I talk about this sometimes in Martech communications, but a lot of times that involved creating videos. Right? I've created a lot of video in my day, so I have a decent eye. You know, more than the average person, I've spent probably more than 1,000 hours safely, I would say, multiple thousands of hours either shooting or editing or, you know, video production.

Jordan Wilson [00:18:02]:
It's not close. It's not even fair. How far ahead for a 1st generation model. Right? For a 1st generation model, which is what OpenAI's Sora is. It should not be this good. Right? And that's gonna lead us to our 2nd point. You know, even if you want to go back and look when ChatGPT came out, right, so when ChatGPT came out, GPT-3.5. The world was shook. I was not impressed. I'm being honest because our team had been using the GPT technology for 3 years.

Jordan Wilson [00:18:39]:
And I thought at the time of its release, if you wanted to use the GPT technology to create text from text, right, so a text to text prompt, I didn't think ChatGPT was even a top two option. At the time, I thought, what was then called Jarvis, what is now called Jasper, was better. I thought Copy AI was better. So even OpenAI, right, which now I think they're really well known for ChatGPT and kind of text to text and multimodal with text. It wasn't even a, I don't think, a top use of its own technology at

Jordan Wilson [00:19:11]:
the time. Right? The same thing with DALL E. You know, if you look at DALL E 3, DALL E 3, if

Jordan Wilson [00:19:18]:
I'm being honest, is not, I would say, in the top 3. Maybe it's in the top 3. Right? But Midjourney is so far ahead of everyone else. Then you have Stability's, you know, image generation. You have Leonardo. You have some other great products. Is DALL E top 3? I don't know. Maybe.

Jordan Wilson [00:19:34]:
Maybe they're number 3 or number 4, but they're not number 1. Right? So when you look at OpenAI's 1st iteration or first attempt at something, it's usually not mind mind boggling. The Sora model is mind boggling. I don't understand. What this means for the future of creative, we're gonna have another show on that because that's not what this is about. That's what everyone else is talking about. We'll talk about this. Let me just say this.

Jordan Wilson [00:20:03]:
It is going to be extremely difficult for the average human to understand what is real and what's not. Right? Because we always thought, oh, okay. Well, you know, on photos, you know, if you look at a photo long enough, I think it's actually kind of easier on photos because it's still. And you can inspect it, right? You can look at the fingers. Oh, are there 6 fingers? Right? Or, oh, is the arm bent in a weird way? With video, I mean, some of this video looks so smooth. Right? And how your brain processes things, it is much harder if the quality passes a bar, if it's a yes or no, if the quality passes a bar, it is much harder to detect AI video than it is AI images. And because we've had this long period over the last probably year now where AI images have been good enough to almost pass for real. So, you know, people out there on social media, your average consumer, etc, we've been exposed to this on a large scale.

Jordan Wilson [00:21:01]:
And we've had time for our brains to kind of rewire things and to first see something and be like, oh, okay. Is that AI image is that an AI image? Maybe. There's no warning sign. There's no ramp up here with OpenAI. It is with with with Sora, it is entirely too powerful. Yes. There's still some instances where, you know, you could clearly tell. Right? But if you can, in 1 prompt, create a minute video and maybe you generate it 2 or 3 times I can guarantee that the at least from samples we're seeing so far, the majority of that 1 minute, if you cut it up, no one's gonna be able to tell.

Jordan Wilson [00:21:37]:
This is coming from someone again. I've spent thousands of hours in video photo production. You're not gonna be able to tell. And this is

Jordan Wilson [00:21:46]:
the 1st model. Oh my

Jordan Wilson [00:21:47]:
gosh. Wild times, y'all. Yeah. What Juan is saying here Juan, thanks for joining us. Juan says, wow. That's a huge difference between Sora and Runway Gen 2. Definitely lot a lot more realistic, for Sora. Yeah.

Jordan Wilson [00:22:02]:
Absolutely. Mind yeah. Mind blown emoji. A lot of mind blown emoji. If you're listening on the, podcast, make sure to check the show notes today, the episode description. We'll leave a link. You gotta you gotta watch the side by side comparison. Alright? It's pretty telling.

Jordan Wilson [00:22:19]:
Hey. Here we go. Nancy. Nancy, am I paying you today? Nancy says, something she says this will have to be a premium add on because tokens. Yes. Alright. We're gonna get to tokens and computes because that's a big piece. So let's now talk about number 2, which I think is the hottest topic for today's hot top, Hot Take Tuesday.

Jordan Wilson [00:22:45]:
I didn't ask for the normal, flame emojis, but I saw some. Yep. Tara Tara wanted me to burn this down. We'll we'll we'll try to keep it real, that's what we do here, but we bring receipts.

Jordan Wilson [00:22:57]:
Alright. So Sora. Sora signals that OpenAI is close to AGI, or maybe it has already achieved it. Alright? No. I don't say this lightly. Don't worry. I bring receipts. I bring a paper trail.

Jordan Wilson [00:23:13]:
The fact that this is a first model, and for some of the facts that I'm about to lay out, we have to wonder,

Jordan Wilson [00:23:23]:
are we at that point of AGI? Right? If you read a lot about AGI like I do, or maybe you don't, a common train of thought around artificial general intelligence, and, again, I'm oversimplifying it here because our audience is for the everyday people. Right? We're not talking to, you know, an entire audience of people who, you know, build machine learning applications on a daily basis. I'm trying to make this simple to understand. Essentially, artificial general intelligence is different than artificial intelligence. So AGI, this is what OpenAI has been openly working toward for years. Right? Meta and Mark Zuckerberg have recently said that they are openly working toward AGI. Right? So AGI is kind of scary. It's unknown.

Jordan Wilson [00:24:07]:
Right? And that's essentially what happens, to oversimplify things. It's when the AI becomes smarter than us across the board. It's when the AI doesn't necessarily need us. It's when the AI can fix itself and improve itself. Right? Essentially, it's when it displays intelligence across the general, kind of general field of expertise or general field of skills. Right? So you have something that's called kind of, like, narrow artificial intelligence and general. Right? So narrow is fact, or, sorry, kinda more skill set based.

Jordan Wilson [00:24:40]:
Right? And in those instances, and I'm gonna have a chart here from DeepMind, a very famous chart, at the end to explain it. But, essentially, in certain tasks, obviously, AI is already way outperforming humans. But when we talk about artificial general intelligence, AGI, that's when AI can outperform humans, the average human, on a variety of tasks all at once. Not 10 specific individual, you know, tasks, but 1 AI system, can it outperform the average human on general tasks?

Jordan Wilson [00:25:16]:
Let's get to receipts here. Let's look at the timeline. Alright.

Jordan Wilson [00:25:23]:
The last month. And I've been talking about this a lot. So if you follow the show every day, first of all, thank you. I appreciate you guys. Second of all, you'll know I've kind of hinted at this before. The writing's been on the wall for a while now, and I saved it for a Hot Take Tuesday. Alright? But let's just take a look. In the last month, so Sam Altman has talked multiple times in multiple interviews about the next version of GPT, you know, presumably GPT-5, or, you know, who knows, we might see a 4.5 first.

Jordan Wilson [00:25:55]:
But the next version of GPT-4, or GPT-5, Sam Altman has talked multiple times about the increased ability to reason, which leads to AGI implications. Right?

Jordan Wilson [00:26:08]:
We've talked about agents here on the show. Right? We've known for a long time that OpenAI has been working on agents, but we just saw the first kind of official report, I believe it was from The Information, we'll link it in the newsletter. But we saw the 1st official report on OpenAI's agents. Right? And I talk about it here on the show. Essentially, 2 different kinds. One that can control your device, whether that's your computer or your phone, we'll see. And another agent that can perform actions on your behalf. Right? So it can perform actions on a website or an app.

Jordan Wilson [00:26:45]:
Alright? You see how that leads to AGI? Yeah. You can't have AGI without, a system being able to perform like

Jordan Wilson [00:26:54]:
a human. Alright? So it has

Jordan Wilson [00:26:56]:
to be able to control the device, and it has to be able to perform actions on your behalf, and it has to have a better model. Right? Okay. So we've crossed those 2 things out. Number 1, there's a new model. Number 2, it can perform actions like a human. Number 3, compute. Compute. Sam Altman has, you know, it's been rumored and widely reported.

Jordan Wilson [00:27:22]:
He even joked about it on Twitter. I still can't call Twitter X, y'all. It just sounds weird. So Sam Altman joked, but, you know, it's been reported he's trying to raise $7 trillion, you know, for essentially compute, for energy, for these GPU chips, or for who knows, whatever a chip is after a GPU. He's really doubling, tripling, 10x-ing down on compute, on chips, on powering generative AI. Alright? So combine those 3 things. Next model, being better at reasoning, number 1. Number 2, OpenAI's agents.

Jordan Wilson [00:28:01]:
Number 3, raising $7 trillion for more compute. And then last but not least, OpenAI previewing Sora. Alright. So I get what you might be saying. Hey, Jordan. You're just a nerd. Sora has nothing to do with AGI. It's video.

Jordan Wilson [00:28:23]:
No, it's not. No, it's not. That's topical.

Jordan Wilson [00:28:31]:
This is not

Jordan Wilson [00:28:31]:
video. Is it video on the surface? Yes, but you have to understand what that means. Alright. Let's unwrap that, shall we? Because I think most people are so blown away by the output. They're not looking beneath the surface. And y'all guess what? It is literally right in front of us.

Jordan Wilson [00:28:58]:
Alright. So OpenAI released quite a few things all at once when this Sora announcement came out. We've now seen that there's been a small team working on this for more than a year. Alright? But, reportedly, a lot of even internal employees at OpenAI just found out about this right before its release last week. So there's been a small team going stealth on this, investing presumably a lot of time, energy, and resources into this project. So most people just stopped and they only looked at the OpenAI page. And they looked at all the videos. Right? I was talking about those videos that I downloaded.

Jordan Wilson [00:29:35]:
But there is something, I wouldn't say hidden, but you had to really care to read the research paper. It was a completely separate piece. Right? Everyone just went and, you know, oh, you know, all our Billy boys out there that are just trying to, you know, trick you into, you know, reading their newsletter or whatever that's written by AI and they're just selling you crap products, they didn't care to look at this. Alright? We do. I'm sure someone else on the Internet has, but, you know, this isn't where the conversation is. But you gotta read the research paper, y'all. I think I read the research paper before I even downloaded any of the videos. And I said, okay. Here's our hot take Tuesday. Alright.

Jordan Wilson [00:30:17]:
So let me just read excerpts

Jordan Wilson [00:30:21]:
of this. Okay. So in the research paper, which I recommend you all read, it doesn't take a lot, it

Jordan Wilson [00:30:29]:
says "Video generation models as world simulators." Alright? And then I'm gonna read the last sentence and 1 more sentence of this little excerpt from the OpenAI research paper on Sora. So they said, "Our results suggest that scaling video generation models is a promising path towards building general purpose simulators of the physical world." One more sentence. "Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI." Okay. Let me highlight those things one more time in case you're not watching them and you can see them highlighted on my screen.

Jordan Wilson [00:31:26]:
OpenAI is calling this world simulators, building general purpose simulators of the physical world, and they're saying, "Sora serves as a foundation for models that can understand and simulate the real world, a capability we believe will be an important milestone for achieving AGI."

Jordan Wilson [00:31:48]:
Okay? Guess what, y'all? No other OpenAI releases were this direct about their correlation to AGI. Okay. Not the

Jordan Wilson [00:32:01]:
new version of GPT-4 Turbo, not custom GPTs, not the new memory feature, right, that was released last week, not DALL-E, none of these things. None of these things did OpenAI draw this strong of a correlation between a new product or a new release and AGI. Alright? Let's continue to unpack and talk about this. So Sora is a world simulator. I'm gonna try to remember this and, you know, actually, shout out. I mean, you do have to give a shout out to Runway here as well. Right? Because Runway, quite a while ago, this is many months ago, they kind of introduced what they called general world models. And I do believe that in a similar vein, Sora is kind of positioning, or sorry.

Jordan Wilson [00:32:55]:
OpenAI is kind of positioning Sora as such, a world simulator. So this is more about understanding how the world works. So we're talking about real world intelligence and three-dimensional physics. Okay? Will it mainly just be used for "look at this cool video I can create with text"? Yes. But OpenAI is using it for a much larger purpose. And we're gonna get a little bit into the technical aspect here as well. But in short, this is helping OpenAI understand and predict the real world. Let me say that one more time and tie it back to AGI.

Jordan Wilson [00:33:38]:
Sora helps OpenAI understand and predict the real world. What does AGI need? It needs to be able to understand how the

Jordan Wilson [00:33:47]:
world works physically, spatially, physics, relationships between things, etcetera. It can't do that right now. Can it do that with Sora?

Jordan Wilson [00:34:02]:
Maybe. That's what it seems like where we're going. Alright. So, again, let's go into more receipts. Alright. Let's look at again, Sora's research paper for

Jordan Wilson [00:34:11]:
an example of what this real world understanding is. Alright. So if you're joining on the podcast, I'm gonna do my best here. But OpenAI shared 3 different examples, 3 different videos, applying different tiers of compute. Again, if you're a machine learning researcher, I mean, yeah, you can email me and tell

Jordan Wilson [00:34:34]:
me how I'm wrong. Again, when I'm talking here on Everyday AI, I try to oversimplify things. Okay? So people can understand that. But essentially, think of compute as layers of technology that are applied to something. Okay? So in this example, there's 3 different videos of a cute little dog playing in the snow. Right? So your base level of compute. Right? Let's just say if you apply this technology behind Sora on one layer, you can't really tell. You can't really tell.

Jordan Wilson [00:35:06]:
Is it a dog? Is it the snow morphing into something? You can't tell. It doesn't even look like a dog. You can barely tell anything on a base level of compute. Alright. Then they're showing an example of 4x compute. Right? So let's just say, so we can easily visualize it, the technology behind Sora, let's just say you stack it times 4 or you run it through it 4 times or you apply 4 times the compute, four times the technology toward it. At this point, in the middle here, you have what, you know, looks like a dog.

Jordan Wilson [00:35:39]:
Looks like an AI-generated dog. You know, this is kind of what we kind of have now, I think, with the current models. You know? You can go look at it. That video looks a little closer. Maybe Runway and Pika are a little better, but it looks a little closer to our current day text to video. So who knows what Pika has cooked up for 2.0 or what Runway has cooked up for Gen 3? But at least right now, and when you look at 32x compute, that is what is on the right on my screen or if you go look at those 3 videos, that's what it is. So now think of applying that technology 32 times or applying 32 times the compute power behind OpenAI's Sora technology. Then you get a video that looks pretty real.

Jordan Wilson [00:36:32]:
Right? We'll share this one in the newsletter as well, but there's a very famous, you know, famous video. One of the first AI videos that kind of went viral on the Internet was Will Smith eating spaghetti. Right? And Will Smith just keeps on morphing and, you know, even in every single frame, everything changes. And, you know, the spaghetti is morphing with his face and his mouth is morphing with his eyes. Right? Like, it's a mess. Right? And that's kind of on the left hand side here. That's when you look at something like a base compute or a 4x compute in this one specific example. Right? But this 32x compute, it looks like real life.

Jordan Wilson [00:37:09]:
But again, think. This is not just for allowing us all to generate hyper personalized, realistic looking 1-minute videos. This model, and as we use it, right, and as millions of people use it, right, when you use a model, people don't know this. Sometimes you get a split screen and it says which one's better. You know, there's always a thumbs up or a thumbs down. Right? We are training these models. So we are telling OpenAI and we are telling Sora this is how the world works. This is not correct. This is correct.

Jordan Wilson [00:37:48]:
And as millions of people use this technology, OpenAI gains a better understanding of the

Jordan Wilson [00:37:55]:
world beyond what models currently can comprehend.

Jordan Wilson [00:38:01]:
And it starts with more compute. We need more compute. Here's another thing to keep in mind. Again, I'm speaking in generalities here. But, generally, current large language models, not just ChatGPT, but most large language models, they try to use as few tokens as possible while still giving a good answer. Right? But they could use more if they wanted to. Right? And people, you know, people much smarter than me, you know, have examples of this online. You can get much better responses from current large language models if you can fine tune them to not care about tokens, to not care about memory.

Jordan Wilson [00:38:40]:
Right? So the reason why all the big companies do this, it makes sense. Right? Because you have to balance cost with quality. Right? But what I'm saying is current models and future models are obviously capable of much, much more. What we get out of them on a daily basis, right, for most of us, for everyday use, for everyday people, right, I'm not talking about, you know, your LLM hackers out there and your machine learning experts, but for the majority of us, what we see out of large language models is a balance that does not show us the full capabilities of models. These models have to balance compute, they have to balance cost with quality. They're capable of, obviously, much, much more. So, you know, earlier, we had a great comment from Nancy. Right? She said something about, you know, tokens and costs. And, yes, this will have to be a premium add-on, I would assume.

Jordan Wilson [00:39:42]:
Right? I would assume, but I also wouldn't be surprised if this does get released and OpenAI is just for a while taking a bath on this. It's going to be very expensive. Even if we have to pay $20 a month, additionally, to use this and we get, you know, a couple generations an hour, maybe we get 1 or 2, I would assume that OpenAI will still be losing money on this. I will assume that they will not care early on because they want us all to use it. They want us all to help train their model. You know, we might get 2 different variations and you might choose which one's better. They want to see those thumbs up and thumbs down. They need real world humans to train a model that is for the future.

Jordan Wilson [00:40:24]:
For real world AGI. Right? We have to talk about compute for today versus compute for tomorrow. Right? Like, we talk about it on the show all the time on Everyday AI. We always talk about, oh, you know, more companies than NVIDIA, you know, Microsoft creating their own chips, Amazon creating their own chips, you know, Sam Altman trying to raise $7,000,000,000,000, which is an asinine amount. Right? Why? Because all of us right now, are we suffering from this lack of compute power? No. We're not.

Jordan Wilson [00:41:02]:
We can go on and use these models. Right?

Jordan Wilson [00:41:04]:
Pretty much. Yeah. There's limits. There's throttling. There's caps. But we can use them, essentially, fairly well. We're not crippled by today's lack of compute. This is for tomorrow.

Jordan Wilson [00:41:16]:
This is for AGI. Right? There's a reason. The 2 leading voices in the push for AGI: obviously, it's Sam Altman and OpenAI, so far ahead of everyone else in that they've been on the AGI kind of bandwagon for longer. And recently, you know, Mark Zuckerberg has become more vocal. So, you know, you have Zuckerberg and Meta being very vocal and saying, yeah, we're investing billions of dollars in chips. Right? And here's Sam Altman who says, okay, I see your billions, I say 7,000,000,000,000, or joking around on Twitter, 8,000,000,000,000. It's compute for

Jordan Wilson [00:41:51]:
tomorrow. So let's combine all

Jordan Wilson [00:41:54]:
of that. I know we're still on point 2 here. We're gonna wrap up. Point 3 is pretty straightforward. But combine all of that, everything that we just talked about with AGI, and a lot of people don't understand what OpenAI can currently do with its different products. Right? You can't use them all in

Jordan Wilson [00:42:13]:
the same interface. A lot of them, you can. Ready? I'm gonna run them down. You have GPT Vision, so OpenAI can see. You have GPT Voice. OpenAI can talk. You have Whisper. OpenAI can listen. Jukebox.

Jordan Wilson [00:42:34]:
Which they've been sitting on for years and it's actually pretty impressive.

Jordan Wilson [00:42:41]:
OpenAI can make music and sing. Data analysis: OpenAI can write code and understand data. Just the GPT-4 technology: OpenAI can understand. That is your general use case. Then you have Sora, where it can understand relationships in the real world now. And then, in the blue, you have 2 things in the future, right, which I would presume are in the works. You have agents, which can perform tasks, and you have that $7,000,000,000,000

Jordan Wilson [00:43:17]:
of compute, which is essentially unlimited resources. Right? Yeah. That could take many months or multiple years to achieve. Are you getting it now?

Jordan Wilson [00:43:32]:
Are you understanding? What's happening? Yes. Sora is text to video. But OpenAI told us. OpenAI told us. Just most people don't wanna read. This is a step toward AGI. Y'all? You not see it? Receipts on the board. So right now, OpenAI can see, it can talk, it can listen and understand voices,

Jordan Wilson [00:44:05]:
It can sing and create music. It can write code and analyze data. It can actually understand. And future models will be better with human reasoning. It is starting to understand relationships, and it will soon reportedly be able to perform tasks like a human. You wanted the hot take. This is a mild take. Right? Because I don't wanna be, you know, too much hyperbole.

Jordan Wilson [00:44:32]:
I don't wanna be, you know, too much sensationalized clickbait. The writing is literally on my wall right here. And they are telling us. They are telling us, if you bother to read the research paper and to

Jordan Wilson [00:44:48]:
look past whatever is going viral on Twitter or LinkedIn, you will see. This is about AGI.

Jordan Wilson [00:45:01]:
I referenced this earlier. So it's important, you know, this Google DeepMind chart from a couple months ago. And also, if you don't know much about Google DeepMind, I'd say they are by far the leading AI research team in the world. Alright. So they released a couple months ago this little chart. I'm not gonna spend a lot of time on it. I'm gonna give you a high level overview. But, essentially, it's different levels of AI and different levels of AGI.

Jordan Wilson [00:45:26]:
Right? So you have narrow, which are certain tasks. Right? And right now, AI can already perform, when you look at specific tasks, better than any human. Right? You have your level 2, which is, you know, average. You have your level 3, which is expert, let's just say 90th percentile. And then you have your level 4, your virtuoso, which is 99th percentile, which is better than almost anyone. So on narrow tasks, individual tasks, AI already wipes the floor. Right? And there's already all these studies. It's already been done.

Jordan Wilson [00:45:54]:
Right? General is a different story. That's what all these big companies are working toward. AI, artificial general intelligence, AGI, that's just better at everything than humans. Are we there yet? Maybe. Maybe not. Probably. We're close. Will we know when we get there? Not necessarily.

Jordan Wilson [00:46:13]:
Could be happening before our very eyes. People who follow AI and AGI, and have been following it for much longer than I have, say it's just gonna happen. Right? There's not gonna be

Jordan Wilson [00:46:22]:
a warning. All of a sudden, it's gonna be, oh, okay. Yeah. We have AGI. Right? So level 1 emerging AGI, it's already there. Right? So

Jordan Wilson [00:46:31]:
that's: is AI better than essentially unskilled humans? Right? There's not a nice way to say that, but, you know, people who are unskilled, uneducated, is AI in general better than those people? Yes. But you don't really start talking about AGI until you get to level 2, which is competent, which is when, at general tasks, AI is better than the average human. Are we

Jordan Wilson [00:46:56]:
there today? No. Could we be there soon? Yes.
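
For anyone reading along, the level thresholds described above can be sketched as a simple lookup. This is only an illustration of the percentiles mentioned in the episode, not DeepMind's own tooling. The 50th percentile cutoff for "competent" is an assumption standing in for "better than the average human", the 0 cutoff for "emerging" is a simplification, and the names `LEVELS` and `level_for_percentile` are hypothetical.

```python
from bisect import bisect_right

# Percentile cutoffs paraphrased from the episode (illustrative only).
# The 0 and 50 cutoffs are assumptions: "emerging" is treated as the
# floor, and "better than the average human" is treated as the
# 50th percentile.
LEVELS = [
    (0, "Level 1: Emerging"),
    (50, "Level 2: Competent"),
    (90, "Level 3: Expert"),
    (99, "Level 4: Virtuoso"),
]


def level_for_percentile(percentile: float) -> str:
    """Return the highest level whose cutoff the percentile meets."""
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    cutoffs = [cutoff for cutoff, _ in LEVELS]
    # bisect_right counts how many cutoffs are <= percentile.
    index = bisect_right(cutoffs, percentile) - 1
    return LEVELS[index][1]


if __name__ == "__main__":
    for p in (10, 50, 90, 99):
        print(p, level_for_percentile(p))
```

The same lookup applies whether you are scoring a narrow task or general performance; what changes is how broad the set of tasks behind the percentile is.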

Jordan Wilson [00:47:00]:
You know, these predictions go back and look. 5 years ago 5 years ago, I'll try to find these studies. I have so many studies floating around in my head. 5 years ago, I believe they said, oh, we'll have AGI by 2060. Alright? In 2019, they said 2060. You know, they said we were decades out. Today, they're saying, oh, it could be a year. It could be a year or less.

Jordan Wilson [00:47:25]:
You know? Because of all of this new technology in this race right now. People did not, 5 years ago, understand how important generative AI would be to the US economy. They did not understand that every single one of the largest, like, companies in the US would be investing billions of dollars into AI. We did not foresee that 5 years ago. Most people did not. That is why the developments are coming way faster than even the smartest researchers 5, 6, 7 years ago could have ever predicted. They said decades. And then 3 years ago, they said, maybe a decade. And now, today, people are saying, it might be a year.

Jordan Wilson [00:48:06]:
It could be as quick as a year. That's enough on AGI. Let's get to point 3. I don't think anyone will catch OpenAI and Microsoft. It'll take what I'm calling a double acquisition to get close. And if you did listen to my 2024 bold predictions, which, hey, you never would have thought this. A show from 2 months ago, so many of them have already come true. People were like, no, Jordan.

Jordan Wilson [00:48:34]:
These won't come true. Anyways, I said 2 or 3 months ago, I said there is going to

Jordan Wilson [00:48:38]:
be a very large acquisition in 2024, an acquisition that most people aren't expecting. And here's kind of the rationale or

Jordan Wilson [00:48:46]:
the reasoning behind it. Companies are now, like, now what you see with Sora, and when the general public understands that Sora is about more than video, it's about more than text to video. Once the rest of the world and the tech world starts to understand that, you're gonna see pressure. Big companies and their stocks, once the analysts and the investors and the general public understand what this means from OpenAI and Microsoft. Again, Microsoft owns 49% of OpenAI, so you gotta talk about them in tandem. Once the rest of the world catches up, whether that's in 2 weeks, 2 months, or 2 years, an acquisition is the only way out for these other companies, period. Whether that happens, like I said, tomorrow, who knows? Maybe an acquisition is close. Maybe it's gonna still be a couple months, but a big acquisition is going to happen.

Jordan Wilson [00:49:34]:
There is no other way around it in 2024. Alright. So I'm gonna categorize these. We have our tech titans and what I call startups that can burst or startups that can ignite. I will say highly flammable startups. Right? Startups that once they combine with the tech titan, they can do something big. So our tech titans, we essentially have Amazon, Meta, Alphabet, which is Google, Apple, and NVIDIA. Right? So aside from Microsoft, I'd say and, you know, you could throw Tesla

Jordan Wilson [00:50:05]:
in there as well, but they're kind of competing in a different space. So you have Amazon, Meta, Google, Apple, Nvidia. Alright.

Jordan Wilson [00:50:13]:
And then you have your startups that can burst. So your startups that can ignite. You have Anthropic. You have Midjourney. You have Cohere. You have Stability AI. You have Hugging Face. You have Runway.

Jordan Wilson [00:50:24]:
You have Pika. Yeah. There's probably 1 or 2 more you might be able to throw in there, but I'm not talking, you know, companies that are just, like, wrappers, like GPT wrappers. Not talking about this. Yeah. I know there's companies that are valued at billions of dollars that are essentially GPT wrappers. I'm talking about companies that have unique technology that they've built in house. Right? So Anthropic, Midjourney, Cohere, Stability AI, Hugging Face, Runway, Pika.

Jordan Wilson [00:50:51]:
Yeah. There's probably 1 or 2 more. It is going to take either 2 tech titans combining forces. You know? I think you also have to throw in probably, you know, IBM in there. You know? Some of those that are more in hardware as well. Right? But then, I think you're gonna have to acquire multiple. Right? If you're Amazon, you might have to acquire Anthropic and Midjourney. They've already invested heavily into Anthropic.

Jordan Wilson [00:51:24]:
Right? If you're Meta, I would keep my eyes on Meta. Right? Meta, I think, from a multimodality standpoint, is the one closest to where OpenAI will be soon, if that makes sense. Right? Meta might have to acquire, you know, as an example, a Hugging Face and a Runway. Or, you know, Apple might have to acquire a Pika and a Stability AI. I don't know. But it is going to take either multiple tech titans combining forces, which I don't see happening. You know, I should have thrown IBM and some others in the mix here. Or it will take one of these tech titans acquiring multiple igniting startups to compete with the combination of OpenAI and Microsoft. They are so far.

Jordan Wilson [00:52:13]:
They are so far ahead of everyone else. Right? NVIDIA, you know, I guess NVIDIA is also kind of in its own category. Right? Because, yeah, they just released their Chat with RTX, but, technically, you know, most of these big tech titans are clients, or they pay NVIDIA for their chips. So I know NVIDIA is kind of on the outside. Tesla is kind of on the outside. IBM is kind of on the outside. But, you know, even specifically, if

Jordan Wilson [00:52:39]:
we're looking at Amazon, Meta, Google, Apple, they're gonna have to acquire multiple of these companies, I think, to compete. Because, jeez, look at Gemini 1.5. I'm not super impressed. Look at these, you

Jordan Wilson [00:52:56]:
know, these video models. Right? Even the ones that Meta and Google have previewed. Oh, gosh. I feel bad for those very smart people who have spent a lot of time building what are very impressive models. And then you see Sora. Right? It's in its own hemisphere. It is in its own stratosphere. It is not close.

Jordan Wilson [00:53:22]:
I wanna hear from you. That's all I got for today. It is about so much more. To recap. Right? It is about so much more than just text to video.

Jordan Wilson [00:53:37]:
You have to look at the bigger picture. Alright? Number 1, Sora is light years ahead of everyone else with text to video. So that's if you're just looking at it at face value. Number 2, the timing. You gotta look at the timing. That has to mean AGI is much closer than we think. And number 3, I don't think anyone right now, without doing something drastic, can catch OpenAI and Microsoft. I hope this was helpful, y'all.

Jordan Wilson [00:54:03]:
We spend so much time doing this. People always ask me, hey, Jordan. How can we support Everyday AI? Well, we're gonna be officially launching some consulting services soon. But right now, hey. If this was helpful, please share this episode. You know? If you're listening on LinkedIn, you know, repost it. If you're on Twitter, retweet it or re-X it or whatever that's called. Share this with friends.

Jordan Wilson [00:54:24]:
Tag them in the comments here. You know, text them. Talk to them about it. We are trying to be your best friend in AI. We are trying to cut through the noise, cut through the Billy boys. They're just trying to make a buck off you. They're just trying to lead you down the wrong road. We bring receipts.

Jordan Wilson [00:54:39]:
We bring you the facts. Yeah. Every Tuesday, we spice in some hot takes and we give you some opinions. But I think that generative AI education is essential for all of us to grow our companies and to grow our careers. I'd appreciate it if you'd let others know. And join us tomorrow. Your voice and your context, how to scale a content engine with AI. I'm excited for tomorrow's conversation.

Jordan Wilson [00:55:03]:
I'm excited. Thank you for joining us. Make sure to go to youreverydayai.com. This is gonna be a big newsletter, so make sure you check it out. Thank you for joining us. We'll see you back tomorrow and every day for more Everyday AI. Thanks, y'all.
