Ep 163: Google Gemini – ChatGPT killer or a marketing stunt?

Episode Categories:

Resources

Join the discussion: Ask Jordan questions about Google Gemini

Upcoming Episodes: Check out the upcoming Everyday AI Livestream lineup

Connect with Jordan Wilson: LinkedIn Profile


Overview

In the world of artificial intelligence (AI), advancements are constantly reshaping the way we live and work. However, the recent release of Google's Gemini model has sparked important discussions about transparency and integrity in AI marketing. As businesses look to leverage AI technologies, it's crucial to draw valuable lessons from this episode and ensure a balanced approach to evaluating AI solutions.

The Unveiling of Google's Gemini Model

Google's announcement of the Gemini model, purportedly the first true multimodal base model, stirred excitement in the AI community. Touting capabilities to handle text, images, and audio, the model promised a new era of AI integration with Google services. However, the circumstances surrounding its release garnered widespread attention, raising questions about the adequacy of AI marketing practices.

Misrepresented Capabilities and Marketing Missteps

The unveiling of Gemini was marred by controversy as Google presented a marketing video showcasing the model's capabilities. Criticism mounted as it became apparent that the video's portrayal of real-time interaction was inaccurate. This misrepresentation sparked concerns about the veracity of Google's marketing tactics and the potential consequences for the larger AI industry.

Implications for Trust and Credibility

The fallout from Google's marketing missteps extended beyond immediate public scrutiny. The palpable impact on the company's stock value underscored the tangible repercussions of misguided marketing in the AI space. As businesses assess prospective AI solutions, the diminished trust in Google's Gemini model highlights the central role of transparency and accountability in establishing the credibility of AI offerings.

Navigating AI Marketing: Building a Foundation of Integrity and Reliability

In the wake of Google's Gemini controversy, businesses are well-advised to adopt a discerning approach when evaluating AI technologies. While the allure of ground-breaking AI models may be compelling, it is imperative to prioritize an evidence-based examination of their capabilities. By emphasizing truth, accuracy, and real-world performance metrics, organizations can fortify their AI strategies against the pitfalls of deceptive marketing.

In Conclusion

The unveiling of Google's Gemini model has not only deepened our collective understanding of AI marketing but also underscored the critical need for transparency and reliability in the AI industry. For businesses navigating the AI landscape, this episode serves as a powerful reminder of the imperative to prioritize truth and integrity in evaluating AI solutions. By anchoring AI strategies in verifiable performance and ethical marketing practices, businesses can chart a course towards sustainable AI integration and innovation.

Topics Covered in This Episode

1. Introduction to Google's Gemini Model
2. Google Gemini's Marketing Controversy
3. Assessing Gemini's Performance and Functionality
4. Comparison with ChatGPT
5. Importance of Transparency and Truth in AI Industry


Podcast Transcript

Jordan Wilson [00:00:17]:

Google got so many things wrong when it came to the release and the marketing behind their new Gemini model. It's been almost a week. Yes. I took that long to decide how I'm gonna deliver today's episode, because I think that there's a problem going on right now with Google. And I really wanna talk about whether this new Google Gemini model, its latest large language model, is a ChatGPT killer or just a marketing stunt gone horribly wrong. Alright. So we're gonna talk about that today on Everyday AI. This is your daily livestream, podcast, and free daily newsletter helping everyday people like you and me not just learn what's going on in the world of generative AI, because there's a lot, but how we can all actually leverage it. How we can use everything that's going on day to day, these new tools, tips, techniques, and strategies; how we can use them and understand them to help grow our companies, grow our careers. My name's Jordan Wilson.

Jordan Wilson [00:01:28]:

I'm your host, and I wanna know your thoughts. What do you think so far of this Google Gemini? Have you used it yet inside Google Bard? Do you think Google handled it well? I wanna know from you. Today's gonna be one of those shows where I ask a lot of questions of our live audience. If you are joining us on the podcast, thank you as always. We appreciate your support. Please follow us. Leave us a rating.

Jordan Wilson [00:01:53]:

But also check the show notes. Every single day, we leave show notes in there so you can come back here. You can join the conversation going on on LinkedIn and other social media. So please, I wanna hear from everyone out there, because I think today's episode on Google Gemini is extremely important to talk about. It's actually about the bigger picture. It's not just about Google Gemini. It's about the company, Google. It's about the state of large language models. So I do wanna hear from you.

Daily AI news



Jordan Wilson [00:02:18]:

So before we get started, as we always do, let's talk about what's going on in the world of AI news. So it was only a matter of time, but AI can officially read our minds. Scientists have developed groundbreaking new technology that can read human thoughts and convert them into text through an EEG cap and an AI model. So this technology was developed by the GrapheneX-UTS Human-centric Artificial Intelligence Centre at the University of Technology Sydney in Australia, and it was recently featured at a prestigious AI conference. That was a mouthful. Right? But this system uses an EEG cap to record and decode brain activity and an AI model named DeWave to translate EEG signals into coherent words and sentences. Yes. This is not science fiction.

Jordan Wilson [00:03:08]:

This is real science. But, yeah, apparently now, and this was just unveiled today, AI can read our minds and actually put it into coherent words and sentences. I'm actually happy for this, for those times when I'm thinking about something AI related and can just get it down on paper. Right? Alright. Next piece of news. How prominent are hallucinations in AI models? Well, they're so prominent that the word hallucinate is now officially the Dictionary.com word of the day, or sorry, word of the year.

Jordan Wilson [00:03:39]:

Yeah. Word of the year. So hallucinate, if you don't know, means when an AI model gives you false or inaccurate information. So every single year, toward the end of the year, Dictionary.com names their word of the year. And this year, it's hallucinate. Right? It's something we talk about on the Everyday AI Show all the time. We've done multiple episodes specifically on how to get large language models to hallucinate less. So if you're like every other person out there and you're worried about hallucinations, make sure to go to youreverydayai.com.

Jordan Wilson [00:04:11]:

Search the word hallucination or hallucinate, and you'll find so many different episodes that will help you. And a little more about hallucinate. This term emerged from technical jargon in the 1970s, which I didn't know, and it has seen a 46% increase in lookups this year. Oh, man. Side note, I used to just look up words all the time in the dictionary for fun. Weird. Right? Uh-huh. Alright.

Jordan Wilson [00:04:35]:

Last but not least in AI news, political robocalling is changing due to AI. So a Democratic congressional candidate in Pennsylvania's 10th district is using an AI-powered phone campaign to reach voters and fundraise, making it the first of its kind officially known, at least, in US politics. So Civox, the company behind the AI campaign, hopes to start a public conversation about the role of AI in politics. I'm not sure how y'all feel about this, but especially here in the US, political robocalls and robotexts are terribly annoying, but they're extremely prominent. You know, especially when we get into that election season, with an election now less than 11 months away here in the US. So if you haven't started getting calls from actual AIs that you can talk back and forth with, unfortunately, I think you're gonna have to expect that soon. Alright.

Jordan Wilson [00:05:33]:

So, there's more. There always is more news, more what's happening in the world, and how it affects you. Make sure you go to youreverydayai.com. Sign up for that free daily newsletter, where every single day we break down the conversation that we have on the podcast and the livestream. But we also cover just about everything else. We have our AI news, which is, you know, some of the things that I previewed as well as others, and Fresh Finds, which are different, important things happening in the AI world across the Internet. And we have a whole lot more, as well as previews of other shows that we have coming up on Everyday AI. Alright. Big wind up this morning.

Jordan Wilson [00:06:13]:

If you're joining us, let me know. I wanna know from you. How hot should this hot take Tuesday be when I'm talking about Google Gemini? Let me know if you're joining us live. Should I take it easy on them? Should I tell them how it really is, or should I just burn all the bridges? Just go full hot take mode and make Google never wanna talk to the Everyday AI Show? Let me know. I wanna know from you, and thank you, everyone, for joining us as always. Mike joining us. Good morning. Woozy and Tara joining us as well.

Jordan Wilson [00:06:49]:

Thank you all. Liz, so many people. Thank you. But, let me know. What are your thoughts so far on Google Gemini? I'm gonna get into it. I'm gonna get into it, but I always wanna hear from our audience. So, what are your thoughts? What are your questions? I love this from Douglas saying, quick, the wave.

Jordan Wilson [00:07:12]:

What am I thinking now? Yes. That AI brain model. If I could get it live as well, that would be fun. Alright. But let's talk a little bit. And so far, the votes are in. It's a lot of flame emojis. People want full heat.

Overview of Google Gemini


Jordan Wilson [00:07:28]:

Full heat. Alright. Well, Cecilia says never burn your bridges, but everyone else is giving me, like, fifty flame emojis, so I'm gonna have to go with the majority of people here. Alright. We're going full hot take, but let's talk high-level overview. Let's talk high-level overview on Google Gemini, what this is, what it means. So, yes, I've waited almost a full week. So they announced this almost, let's see.

Jordan Wilson [00:07:57]:

It's been six days. Today's day seven. So about a week ago, Google announced Gemini. So here's the overview. Google Bard has been using the PaLM 2 large language model. Google is replacing that with a much more sophisticated and much better model in Gemini. Alright.

Jordan Wilson [00:08:17]:

So PaLM is out. Gemini is in. There are essentially three flavors of Gemini: Ultra, Pro, and Nano. Ultra is essentially for the largest use cases. Pro is kind of like the day-to-day use cases. And Nano is a much smaller version of the model that you can actually fit on a physical device. So Google is going to be putting it in their hardware.

Jordan Wilson [00:08:41]:

So let's start off with the good things. That's awesome. Right? I love seeing smaller, more capable models that don't even necessarily need to be connected to the Internet and can run locally on a device. That is one of the big pieces of the future of large language models. So let's start with the good, right, before I just start, you know, busting out the flamethrower. Also, it is the first true multimodal base model. Okay? So what that means is, you know, when you think of GPT, the model from OpenAI, it is multimodal.

Jordan Wilson [00:09:11]:

So being able to input not just text, but images and audio, and then on the output, being able to receive, you know, multiple media types. So being able to input and output text, photo, audio. But GPT was built as a text model first, and then these multimodal functionalities were built on top of it, so to speak. Gemini is, at least according to all reporting and everything that I can see, the first true multimodal-first model. So that's a big step in the correct direction on generative AI and large language models being the future of how we work. Right? Another good thing is, obviously, built-in integration with Google services. And then, like we talked about, the ability to run on physical devices.

Jordan Wilson [00:09:58]:

So what most people don't know is, Ultra is not out yet. Right? That's what everyone's talking about, this Ultra mode, you know, inside Google Gemini. But if you go on to Bard right now, it is actually the Pro model. So more on that later, but keep that in mind. Google launched by giving people access inside of Google Bard to this middle-tier model. That's extremely important. It's extremely important. Alright? So now, as I sip my good morning coffee, let me know. I mean, are y'all drinking coffee when we go over this? Are you walking the dog? Are you on the treadmill? Let me know.

Google lied about Gemini release


Jordan Wilson [00:10:38]:

Getting caffeinated up, because I saw a lot of people wanting the hot takes. Alright. So let's talk about this. Google pretty much lied about Gemini's capabilities. I said pretty much. Alright? I'm not gonna say they straight up lied. We'll say they hallucinated. Alright? We'll say they hallucinated.

Jordan Wilson [00:10:59]:

They did not tell the full truth. Alright. So let's look at why and what they said. So much of this was based around a video release. Alright? And everyone, everywhere, if you care about generative AI, if you're on Twitter, if you're on LinkedIn, if you follow AI news, you saw this video. Okay? Google released a short video, and it essentially showed a person interacting with Gemini in real time. Right? And I have some screenshots of it that I'll show you here in a second.

Jordan Wilson [00:11:33]:

But, essentially, it was someone on the left side interacting, and there was an overhead video with different objects, asking questions and talking to the Gemini model, and Gemini was seeing what was going on on the screen and interacting in real time. Alright? And the only note that Google put up initially on the video said this: we've been testing the capabilities of Gemini, our new multimodal AI model. We've been capturing footage to test it on a wide range of challenges, showing it a series of images and asking it to reason about what it sees. Alright. So the thing that I wanna point out is interactions. Right? That's what I have underlined there.

Jordan Wilson [00:12:23]:

So here's one example. Here's one example of what was shown. Okay. So, again, this is a video, so this is a screenshot from a video. But, essentially, in the video, the actor, or the main participant who is talking with Gemini, does paper, rock, scissors. Right? And, yes, I say paper, rock, scissors. I know that's weird. I know 99.9% of the world says rock, paper, scissors. I just can't. I grew up calling it paper, rock, scissors.

Jordan Wilson [00:12:54]:

So, the person's playing paper, rock, scissors, and then Google Gemini says, I know what you're doing. You're playing rock, paper, scissors. And the collective Internet lost its mind. Everyone was going nuts, but here's the problem. None of that was actually in real time like Google said. It wasn't. The person was not actually talking to Google Gemini. Google Gemini was not actually listening. But that is the exact message, and it was a very well done marketing video.

Jordan Wilson [00:13:35]:

Right? And that is what people really started running with immediately. When Google announced this video, just about everyone who's anyone in the world of AI ran with it. I was gonna call people out by name. Yeah. All those, you know, quote, unquote influencers on Twitter and LinkedIn, those 22-year-old crypto bros that you're buying your ChatGPT prompts from, they all said, ChatGPT killer. Look at this model in real time. This is the future. This is the best thing ever.

Jordan Wilson [00:14:06]:

Google is taking over. But none of it was actually in real time. Yeah. And here, the former journalist in me has a lot of problems with this. Alright. A lot of problems, because, yeah, there were falsehoods, untruths, white lies all over the place. The only other thing that Google said, aside from that message on the screen, was this, in the YouTube video description. It said, for the purposes of this demo, latency has been reduced, and Gemini outputs have been shortened for brevity.

Jordan Wilson [00:14:51]:

That is nonsense. That is hot garbage untruths. That is a hallucination. That is terribly inaccurate, because let's break it down, shall we? Let's break down Google's own words and let them know that, no, wrong, the BS detector is blaring. Latency has been reduced. Yeah.

Jordan Wilson [00:15:20]:

Latency has been reduced by completely reconstructing how this was done? It was not done in real time. I'm gonna show you. And then it says, and Gemini outputs have been shortened for brevity. No. They haven't. They've been completely reproduced in a different way that was not represented in this demo video that was seen by millions of people, that was shared by tens of thousands of the biggest names in tech and AI. All those people you're following, but should stop following, because everyone is just blindly reposting anyone's marketing messages without digging into it. Y'all, I saw this.

Jordan Wilson [00:16:00]:

I was at the AI Summit in New York City speaking. When I saw this demo video on Wednesday, two thoughts immediately came through my mind. Number one, I said, that's impressive. Number two, I said, I gotta dig into that, because I don't think it's possible. Right. I wasn't gonna share it with you all. I wasn't gonna say, hey.

Jordan Wilson [00:16:20]:

This new Gemini model is the best thing ever. We talked about its release, but we didn't say anything about it being a ChatGPT killer or it correctly doing this in real time. It's not. Let's take a look at how they actually did it and break down those words. Yeah. So latency has been reduced. No.

Jordan Wilson [00:16:38]:

It wasn't. The video was produced. Gemini was not in real time. And then when they said Gemini outputs have been shortened for brevity, no. Those multimodal inputs and outputs were faked. They were text and photos. They were not real time. So we're gonna leave links in the newsletter. So if you're listening on the podcast or on the livestream and you are not already signed up for the newsletter, what the heck is going on? Go to youreverydayai.com.

How Gemini demo was created


Jordan Wilson [00:17:09]:

Sign up for the newsletter. We will leave a link to the paper, because on Google's developer blog, to their credit, Google did eventually release how this video was actually made and how it was put together. I don't know the timing. I believe it was after people started questioning the validity of this video, but I'm not sure on that, because I don't think it's timestamped. So this paper, rock, scissors. Yes. Paper, rock, scissors. It was in the video.

Jordan Wilson [00:17:39]:

The person goes paper, rock, scissors, and Google Gemini says, oh, I see what you're doing. That's not how it worked. There was no real video. What they did is they recorded the video, then they took many screenshots, and then they uploaded those screenshots with text into Google Gemini. So know, for this demo that set the Internet world on fire, there was no seeing. Google Gemini was not seeing anything in real time. It was not hearing anything in real time.

Jordan Wilson [00:18:18]:

It was just the same as every other model. They took screenshots, many screen grabs. Go look at it on the developer blog that we'll link to. And in this example, we have three photos: one of a paper, one of a rock, and one of scissors. It's someone's hand making these different shapes. So they didn't just upload the video. Right? Look at how many levels of hallucinations from Google's marketing. Right? They didn't upload it.

Jordan Wilson [00:18:50]:

Number one, it wasn't real time. Number two, they didn't upload a video into Gemini. Number three, they didn't just do screenshots of the paper, rock, scissors. They did photos. Three photos. One of a hand making the paper, one of a hand making the rock, one of a hand making the scissors. And guess what? Even at that point, they couldn't just upload it and have Google Gemini tell them. They had to say, What do you think I'm doing? Hint.

Jordan Wilson [00:19:22]:

It's a game. What do y'all think? Are Google's pants on fire? Did they lie? Did they hallucinate? Is this marketing? What I would like to say is it's not truthful. It is so far from the truth. Mhmm. Jerry says don't hold back. More flame emojis. Alright. Alright, Jerry.

Jordan Wilson [00:19:55]:

Yeah. Mike says the hot takes are scorching. Alright. Well, we'll keep it going then. We'll keep it going. But let's keep investigating. And let me tell everyone this. Yeah.

Jordan Wilson [00:20:06]:

Since people are asking for the hot takes, here's one of them. Stop listening to other people on the Internet. Right? I know that sounds weird. Like, oh, only listen to me. I'm a former journalist. When a company tells me something, I never just talk about it on the Everyday AI Show. I never put it blindly in the newsletter, where thousands of people are reading and trusting that what we're putting out there is true. If your mother says she loves you, get it in writing.

Jordan Wilson [00:20:34]:

Y'all, I was an award-winning journalist. When I see this kind of stuff, I know it is hot marketing garbage, and I'm never gonna go on and just trumpet what all these other crypto bros are doing to get you to sign up for their newsletter so they can sell you some crap AI products that are gonna go out of business anyway. That's not me. That's not Everyday AI. If you're listening, if you're watching, if you're reading our newsletter, we are giving you the truth. Alright? And this was not the truth. From Google, it was not the truth. So yeah.

Jordan Wilson [00:21:09]:

In other words, the entire premise of the video was faked. Okay? It's not like Google was announcing, you know, new features on a camera, or some new Gmail feature. No. The whole premise of this video, the whole secret sauce, the whole USP, the unique selling proposition of this new Gemini model, was the multimodality. Right? It was the fact that, presumably, Gemini could see and hear and talk in real time. And guess what? It was no better and no different than what we can already do with other models. And eventually, when people caught on days later, Google got scorched, rightfully so. You see these articles. One says Google's Gemini marketing trick.

Jordan Wilson [00:22:10]:

An article from CNBC: Google faces controversy over edited Gemini AI demo video. Yes. But the initial reaction was Google's stock shot up through the roof. And now, since that happened and since people realized, yeah, Google, you faked this, this wasn't truthful, their stock has been going down, and I think it should continue to go down. But, also, I'm curious for our livestream audience. Did you believe the video? 99% of people did.

Jordan Wilson [00:22:49]:

I'm a skeptic. I'm a little bit cynical at times. The old journalist in me just didn't believe a single thing. But I think most people, even very smart people that I trust, were duped, and Google has been getting dragged through the mud ever since. Now, you thought the takes were hot before. Here we go. I'm settling into my seat, because the takes are gonna get even hotter. Here's the thing, Google.

Comparing ChatGPT to Gemini


Jordan Wilson [00:23:22]:

Why would you put out a video like this? Did you really think that no one was going to notice? Did you think that no one was going to dig in? Did you think that no one was going to explore how this seemingly revolutionary piece of technology worked? Y'all, if it worked like that, live and in real time, that is not only bigger than ChatGPT. That's one of the biggest technological innovations since the smartphone. But it wasn't true. Here's the thing. No one's talking about this. You can do the exact same thing in ChatGPT. I literally took the exact same prompt and the same pictures. And, obviously, I uploaded the same thing.

Jordan Wilson [00:24:09]:

I said, what do you think? The same photo inside of ChatGPT, inside of their multimodal GPT-4. I uploaded the photo. I said, what do you think I'm doing? Hint. It's a game. The exact same thing that Google did. It wasn't a video. They uploaded photos and text. My gosh.
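If you want to try this replication yourself, here's a minimal sketch of what the "demo" actually reduces to: one still photo plus a hinted text prompt, sent to a vision-capable model. The model name, image URL, and helper function below are illustrative assumptions for how OpenAI's chat API accepted mixed text and image content at the time, not anything from Google's or OpenAI's materials.

```python
# Sketch: reproducing the "demo" as a single image-plus-text API request.
# build_vision_request assembles the multimodal message payload: a user
# message whose content is a list mixing a text part and an image_url part.

def build_vision_request(image_url: str, prompt: str) -> dict:
    """Return a request body for one image-plus-text question."""
    return {
        "model": "gpt-4-vision-preview",  # assumed vision-capable model name
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
    }

# The same hinted prompt described in the episode:
request = build_vision_request(
    "https://example.com/hand-scissors.jpg",  # placeholder image URL
    "What do you think I'm doing? Hint: it's a game.",
)

# Sending it would then be one call, e.g.:
#   from openai import OpenAI
#   reply = OpenAI().chat.completions.create(**request)
```

The point is that there's no video and no audio anywhere in the request: the whole interaction is a still photo and a leading text hint, which any multimodal model of that era could handle.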

Jordan Wilson [00:24:28]:

This is embarrassing for them. But guess what? GPT-4 did it just fine, obviously. So here's my other bone to pick. We talked about the three models: Ultra, Pro, and Nano. Right now, Google Bard has the Gemini Pro model, which is not very good. So we are comparing, even in these screenshots, this Ultra model that's not even out yet; it will reportedly be available sometime in 2024. So we are comparing an unknown model that no one out there can test or use to GPT-4, which is already out there.

Jordan Wilson [00:25:10]:

So guess what? We could have, in theory, recreated this entire thing in GPT-4, which is already out, and I'm showing you an example of it. Huge fail from Google marketing. Huge fail. But that might not be the most concerning part. Right? The fact that, hey, it actually wasn't live. Hey.

Jordan Wilson [00:25:36]:

We actually just took a bunch of screenshots and had to do a lot of prompting, a lot of prompting, inside Gemini to get this to work correctly. And then we took those responses and narrated them back. Oh my gosh. So deceitful. But that might not be the most concerning part, because, like I talked about, the public perception was already set by the time, three days later, that people realized, wait, this isn't how it worked. It wasn't real time. Right? Because every single person out there, every little influencer on Twitter trying to get you to buy something crappy from them, every single news outlet just blindly said, you know, Gemini, more powerful than GPT-4.

Jordan Wilson [00:26:28]:

And, you know, everyone on Twitter essentially put the same thing out there. Here is the ChatGPT killer. No. Gemini was not a ChatGPT killer. They were promoting a model that none of us can see, use, or benchmark. Alright. So here's the problem. Here's the problem.

Jordan Wilson [00:26:51]:

Yeah. Like Douglas said, I thought it was an impressive video, but we need to see it, touch it, feel it before we can believe it. Absolutely. The truth is important, like Brian is saying. Cecilia brings up a great point. She said, did Google lie, or did we all assume? And you know what that makes out of you and me. Yeah. I think it was both, Cecilia. I think the majority of people just assumed and took, you know, their spoon-fed information from big Google and just reposted it to try to get people to sign up for whatever crappy product or service they're offering ultimately in the end.

Jordan Wilson [00:27:36]:

But I think also, yes, Google faked it. Google represented that this was happening live and in real time. And not only did they misrepresent it, but even their language. Right? When they said, hey, these are interactions. Those weren't interactions. Those were not interactions. Right? And when they said it's been edited for brevity, no.

Jordan Wilson [00:28:02]:

It wasn't. No. It wasn't. It was just screenshots. It was completely reproduced and reconstructed from scratch. None of those live interactions actually happened. Like I said, that's why I said they pretty much lied. But they faked it. This was full-blown big tech hallucination.

Jordan Wilson [00:28:31]:

Bad. It was bad. So, the problems. Gemini Ultra, which was, quote, unquote, better than ChatGPT, isn't even out yet. Okay. Gemini Pro, which we're gonna get to in a second from a benchmarks perspective, is what we can all go use right now. Yeah.

Jordan Wilson [00:28:49]:

You can go use Gemini inside Google Bard. Like I said, they swapped out the PaLM 2 model for Gemini Pro. It is in line with GPT-3.5. Okay. The benchmarks were not favorable. So the combination of those things: number one, the best model you can't use. Number two, the model you can use is testing like GPT-3.5. And number three, the benchmarks were not favorable.

Jordan Wilson [00:29:16]:

Let me ask Google. Why did you release this now? This was a huge, huge failure, releasing this now. Not just the way that you released it, but the timing of it all. Because, actually, about four days before this announcement, there was all this reporting saying that the Gemini model was getting delayed until 2024. Right? So I'm guessing it might have been a, you know, shareholder response. Hey. We need to get something out there. You know, initially, this was supposed to be a live event, and instead, it just got released through a press release and a YouTube video.

Benchmarks of Gemini vs ChatGPT


Jordan Wilson [00:29:56]:

So, obviously, there's a lot of things going on behind the scenes that you and I don't have access to. However, anyone can see now that this release was disastrous. Not only the delivery, and what I will call the deceit, but the timing. Why would Google release this when we can't even use or test the model that they said is so fantastic, and when the one that we can use is comparatively running at the same speed and the same power as models that are almost two years old? Let's look at those benchmarks. Let's dig in. Let's dig in, shall we? So here we go. I gotta make my screen bigger for this.

Jordan Wilson [00:30:48]:

So, the big thing is there's all these different benchmarks. Okay? And, again, we're gonna have the link, and you can go read this paper. It's a roughly 60-page PDF. I read through most of it. Specifically, people are talking about the benchmarks in this table here, okay, where it's comparing Gemini Ultra, Gemini Pro, GPT-4, GPT-3.5, and some other models across different benchmarks. And what Google said was Gemini Ultra was ahead of GPT-4 in 30 out of 32 benchmarks. Is that the truth? No. No.

Jordan Wilson [00:31:35]:

It's not the truth. Not the full truth. Let's take a look. Alright. So now, again, if you're listening on the podcast, I'm gonna do my best to describe this, but it is some fairly technical benchmarking. Okay. But right now, keep in mind, Google is comparing a model, Ultra, that no one has access to.

Jordan Wilson [00:32:01]:

You only have access to Pro, okay? A model that no one can test, and they're also using different testing methodologies. So I have highlighted here Gemini Ultra and, or sorry, Gemini Ultra and GPT 4. And you'll see that in every single one of these except HellaSwag, Gemini Ultra outperformed GPT 4. GPT 4 wiped the floor with Gemini Ultra in HellaSwag, which is actually a very unique test. Well, number 1, I would hope so. I would hope that an unreleased model would outperform the model that's already been in production for almost a year. Gosh.

Jordan Wilson [00:32:50]:

I would hope. How embarrassing would that be? Okay. But look in the details, y'all. Look at the details. I know it's kind of hard to see. So let's look at this test, which is GSM8K, which is grade school math. Alright. The difference is that Gemini Ultra went through a 32-shot.

Jordan Wilson [00:33:12]:

And the same thing with the MMLU, which we're gonna talk about here in a second. It went through a chain-of-thought 32-shot. So without getting too technical and too into the weeds, chain of thought is a methodology of prompting that much more increases the likelihood of a model getting it correct. So a 32-attempt chain of thought. That's what they are comparing to GPT 4, where they're only looking at the 5-shot. Okay. So they're kind of cherry-picking. Right.
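To make the few-shot versus chain-of-thought distinction concrete, here's a minimal Python sketch. The example problems, answers, and helper names are made up for illustration; this is not Google's or OpenAI's actual evaluation harness.

```python
# Hypothetical sketch of the two prompting styles being compared.
# A plain few-shot prompt shows only final answers; a chain-of-thought
# prompt also shows worked reasoning, which raises the odds the model
# answers a multi-step math question correctly.

def few_shot_prompt(examples, question):
    """Plain few-shot: each exemplar is question -> final answer."""
    parts = [f"Q: {q}\nA: {a}" for q, a in examples]
    parts.append(f"Q: {question}\nA:")
    return "\n\n".join(parts)

def cot_prompt(examples, question):
    """Chain of thought: each exemplar includes intermediate steps."""
    parts = [f"Q: {q}\nA: {steps} So the answer is {a}."
             for q, steps, a in examples]
    parts.append(f"Q: {question}\nA: Let's think step by step.")
    return "\n\n".join(parts)

plain = few_shot_prompt([("What is 2 + 3?", "5")], "What is 4 + 6?")
cot = cot_prompt([("What is 2 + 3?", "2 plus 3 makes 5.", "5")],
                 "What is 4 + 6?")
print(plain)
print(cot)
```

The point of the comparison above: scoring one model with the richer chain-of-thought setup while scoring the other with the plain few-shot setup changes the difficulty of the test, not just the model being tested.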

Jordan Wilson [00:33:47]:

So when you look at the GSM8K, which is grade school math, it's a test. All of these are different tests and benchmarks where you can see definitively how models perform against each other. So you can say, oh, this one's best at a, b, and c. This one's best at d, e, and f. Right? But when you even look at the GSM8K, Gemini Ultra did a 32-shot and GPT 4 did a 5-shot. So you're comparing apples and orange slices here, Google, and no one else can run these tests to confirm that. Let's keep looking. Because here's the real story that no one is talking about.

Jordan Wilson [00:34:32]:

What we can all use right now is Gemini Pro. It gets straight up smoked in almost all of these benchmarks by GPT 4. Which, again, Google, what's the logic behind releasing a model? You know, all this hype, all this marketing, and saying, well, number 1, we didn't actually do this, it wasn't actually real time, and that's our way better model, and you can't use it yet. And, obviously, everyone rushes out to go try Gemini Pro inside Google Bard, and it's straight garbage. Anyone that used it said, yeah, it's way worse than GPT 4, and the benchmarks even confirm it. So, Google, why on earth would you even... I would not have released Gemini Pro.

Jordan Wilson [00:35:19]:

Number 1, I wouldn't have done this announcement at all. It was botched on every level. Number 2, if you're gonna do this, don't put Gemini Pro out there at all. Again, I know there's a lot of things that are above my pay grade and I don't understand. But if you put out and you release something, and people go out, and all they can use is Pro. Number 1, the average consumer is not going to know the difference. I'm not the average consumer.

Jordan Wilson [00:35:45]:

Most of the people listening to the show aren't. But the average person, they saw this on social media, and they go into their Google Bard, and they start using it, and they see it's still hot garbage. It is outperformed by GPT 4 on almost every single benchmark, and it is not even close. It is not even close. Not close. Alright. And here's something that no one else is talking about. So MMLU testing. That is massive multitask language understanding.

Jordan Wilson [00:36:19]:

So this is newer, and what a lot of people who are smarter than me call the gold standard for evaluating large language models. So essentially think of it as like an SAT. Right? It's not that simple, but this is kind of a standardized test to see how well different models can do in this massive multitask language understanding. Okay? So even when we look at Gemini Ultra, okay, the model that's not out yet, and if you look at the same testing methodology, a 5-shot, right, which is much more realistic. A 32-shot chain of thought, that's not how people use large language models. You know? A 5-shot test is much more realistic.
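For anyone wondering what the "32" buys you in practice, one common way to use many sampled attempts is consensus voting over the final answers. Here's a rough sketch of that idea with made-up sample data; it illustrates consensus sampling generally, not the exact method Google used for its reported number.

```python
from collections import Counter

def consensus_answer(sampled_answers):
    """Sample many reasoning chains, then keep the most common final
    answer. With 32 attempts, occasional reasoning slips get outvoted,
    so the reported score runs higher than a single-attempt 5-shot."""
    return Counter(sampled_answers).most_common(1)[0][0]

# Made-up run: 32 samples where the model usually lands on choice "B".
samples = ["B"] * 25 + ["C"] * 5 + ["A"] * 2
print(consensus_answer(samples))  # "B" wins the vote
```

That's why the testing methodology matters so much here: a 32-sample consensus is a fundamentally easier setting than one 5-shot attempt, which is closer to how a person actually types a single question into a chatbot.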

Jordan Wilson [00:37:04]:

That is much more indicative of and truthful to how most people might be using one of these models. Right? So, yeah, when you're comparing average use case across the quote, unquote gold standard of large language model testing, Gemini Ultra scored an 83.7% on this MMLU, whereas GPT 4 scored a much higher 86.4%. Again, Google, did you not think this through? Because here's the other reality. This is not even out yet. This is not out yet. Okay? So by the time it comes out, you know, whether it's Q1 of 2024, Q2, how long until OpenAI releases GPT 5? Right? Yes.

Why did Google release Gemini?


Jordan Wilson [00:38:03]:

It looks like Ultra is going to outperform GPT 4 on some metrics and through some testing methodologies. But for the most part, I think your best-case scenario is the average person sees it as pretty similar to GPT 4. Why would you do that? Why did Google release Gemini? I think, y'all, it was a massive misstep. It was a masterclass in what not to do in marketing, and I actually think it set the Gen AI industry back. Right? Because you had thousands of news organizations first reporting out, oh, hey, this is the new ChatGPT killer. Right? Because they're essentially just copying and pasting Google's press release.

Jordan Wilson [00:38:56]:

Stop doing that, tech journalists. Investigate, because it is not a ChatGPT killer. Because we don't know. We can't use it. All we can use right now is Gemini Pro, and it is not good. It is not good at all. Google shouldn't have released Gemini. They shouldn't have. And they dropped the ball with Bard again.

Jordan Wilson [00:39:26]:

Again. Yeah. Kevin says GPT 4.5 will probably be released this month. Yeah, Kevin, if you know anything, let me know. I'm always following to see when things are gonna be updated and released. But, regardless, Google dropped the ball with Bard again. So I don't know if y'all remember this, but when Google first announced Google Bard, right, they did it in a big way, but also a huge failure.

Jordan Wilson [00:39:59]:

Because in the demo video, if y'all don't remember this, in the actual demo video that they showed, that millions and millions of people ended up seeing, there was inaccurate information that took all of 5 seconds to disprove. Right? So there was some, you know, something flashed up on screen about, you know, being able to see something with a telescope, and it got it wrong. And Google didn't even realize until after the fact. Didn't even fact check it. They just took whatever Bard produced in this demo as truth. Just like they wanted us all to do with Google Gemini Ultra. They botched the Bard release.

Jordan Wilson [00:40:54]:

They botched the Ultra release, and I think they're going to pay for it. And what I mean by that is their stock. Alright. So if you look at what's called the magnificent seven tech stocks, which now make up a majority of the S&P 500 here in the US. So we're talking Microsoft, Apple, Meta, Nvidia, Amazon, Google, Tesla. So Tesla's kind of almost in a different category than the rest of them. Yes, Tesla's in AI with their auto drive, but they're having issues that are completely different than what these other companies are going through. But if we look at the other 6: Microsoft, Apple, Meta, NVIDIA, Amazon, and Google.

Jordan Wilson [00:41:36]:

Those 6 companies are heavily involved in generative AI. They're heavily involved in large language models, whether it's their own or investing billions of dollars in other companies that are creating them. And Google, at least out of those 6, is the only one whose stock is down in the last 3 months. Microsoft, crushing it, up 12%. Apple, crushing it, up 10%. Meta, eating people for lunch, up 8%. NVIDIA, up 4%. Amazon, up 3%. Tesla's obviously getting crushed, down 10%, but they have some other issues with their auto driving scandal.

Jordan Wilson [00:42:16]:

But Google is down 1%. Here's what happened. When the announcement first came out, their stock shot up, and it stayed up for a day or 2. And then the reporting came out that said, oh, actually, this was faked. This video was faked. Their stock has continued to go down, and I think it will continue to go down. As more and more people talk about this, as more and more people report about this, and as you look at the big picture: Google had months and months to get this right. After completely fumbling the Google Bard release, they did it again with Google Gemini.

Consequences of botched release


Jordan Wilson [00:43:03]:

There's no transparency there. There's no trust, and that is paramount. Right? If you want big companies to start using your product, if you want enterprise companies to use and promote and to feel good about your product, you have to establish trust. You have to be transparent, and what Google did is the exact opposite of that. Alright. So they not only hurt themselves, their stock, and their credibility, but I think this was a shot to the generative AI industry. Right? Because they are now planting seeds of distrust that will impact all other companies.

Jordan Wilson [00:43:38]:

Right? Because now, when you have big Fortune 100 and Fortune 500 companies who were already on the fence about adopting Gen AI, now they're just gonna be like, oh, no, we can't trust this. Look at the Google stuff. Right? This gave large language models a black eye because Google fumbled it. Google fumbled it. Hey, thank you for the support, Michael. He says I'm a force to be reckoned with.

Jordan Wilson [00:44:14]:

You know what? I think it was my background that set me up to do this. Right? I've been a tech geek my whole life. I have a digital strategy company. We've been using different generative AI tools with our clients for 4 years. We've been using the GPT technology since it was publicly released, you know, in, what was that? Copy AI was the first third-party service to offer it back in 2020. So we've been using the GPT technology now for almost 3 years for ourselves and for our clients. So I'm not new to this. Right? I'm not an expert, but I'm not new.

Jordan Wilson [00:44:54]:

But y'all, you have to investigate. When a company says, oh, look at this great new thing, don't just repost it. Don't just copy and paste and say, oh, they're a ChatGPT killer. No. Take your time and investigate. Alright. I hope this was helpful, y'all.

Jordan Wilson [00:45:14]:

If it was, please, as always, go to youreverydayai.com. Why? Well, we have a newsletter that's actually based in facts. Right? We're not like everyone else. We don't just, you know, take whatever a big tech company says and give it to you. We investigate everything. We talk to the experts. We ask them the hard questions, because if you are going to grow your company, grow your career with generative AI, you have to actually know the truth. You have to actually know how to use it.

Jordan Wilson [00:45:51]:

Alright? So please, if you haven't already, go to youreverydayai.com. Sign up for that free daily newsletter. Hit us back. Right? I answer all messages. It does take me a while, especially the LinkedIn DMs, but drop me an email. I try to respond to every single message. Whatever your question is, I try to respond.

Jordan Wilson [00:46:10]:

We try to point you in the right direction. We are here to help you learn and leverage generative AI to grow your company and to grow your career. If this was helpful, please share this episode with a friend. Leave us a rating, but also share this episode with a friend. So if you're on social media, hit that repost or share button, whatever button is on social media, I don't know. But, also, if you're listening on the podcast, if this was helpful, please share it with a friend. And I hope to see you back tomorrow and every day for more Everyday AI.

Jordan Wilson [00:46:34]:

Thanks, y'all.

Gain Extra Insights With Our Newsletter

Sign up for our newsletter to get more in-depth content on AI