
GPT-5 wasn't even the biggest release last week

Welcome back to the Avicenna AI newsletter.

Here’s the tea 🍵

  • GPT-5 ‽

  • Google’s Genie 🧞‍♂️

  • OpenAI’s open source model 🚪

A video!

I’ve recorded a YouTube video talking about the GPT-5 release.

If you would like to watch me talk about the release, you can check it out on my YouTube channel here - https://youtu.be/8_xMhp471iY. I’ll be doing a lot more video content, so feel free to subscribe on YouTube 😊.

Unfortunately, there’s a lot more I would’ve liked to discuss, but I’ve had a bad stomach. No doubt travelling between three countries in two days did not help.

In next week’s newsletter, I’ll show you how I built a mobile app and put it on the iOS App Store within a week - about five days of work.

Just so I can gauge interest:

Would you be interested to know how to build apps and go to market?


The long awaited GPT-5

GPT-5 is out. I’ll cut right to the chase.

This was not a good release. Before this, there was a clear sense of when to use which model. Need extra thinking power? Use o3 or o4-mini. Just want to chat? 4o is your friend. That’s gone. There is just GPT-5, and it routes queries to smaller models at its discretion.

A lot of people aren’t liking it and I’m not surprised.

There were a number of issues with the release, but before I get into them, let’s talk about what actually happened.

OpenAI removed all previous models from ChatGPT and added GPT-5 and GPT-5 thinking. They have a router behind the scenes that routes questions to a number of internal models they’re running.

These are the models:

  • GPT-5 main

  • GPT-5 main mini

  • GPT-5 thinking

  • GPT-5 thinking mini

  • GPT-5 thinking nano

  • GPT-5 thinking pro
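To make the setup concrete, here’s a purely illustrative sketch of what a query router like this might look like. OpenAI hasn’t published how their router actually decides, so the decision logic and model names below are hypothetical, modelled loosely on the tiers listed above.

```python
# Illustrative sketch only - OpenAI has not published their router's
# actual logic. Model names and decision criteria are hypothetical.

def route(query: str, needs_reasoning: bool, load_is_high: bool) -> str:
    """Pick a backend model tier for an incoming ChatGPT query."""
    if needs_reasoning:
        # Harder queries go to a thinking model; fall back to a
        # cheaper variant when capacity is tight.
        return "gpt-5-thinking-mini" if load_is_high else "gpt-5-thinking"
    # Everyday chat goes to the cheaper non-reasoning tier.
    return "gpt-5-main-mini" if load_is_high else "gpt-5-main"

print(route("Prove this theorem", needs_reasoning=True, load_is_high=False))
```

The point of an architecture like this is exactly what the release suggested: most queries never touch the biggest, most expensive model.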

Why would OpenAI do this?

Money. They save a lot of money by not having to host all their other models. This is how the older models map to the new ones.

Fundamentally, I think this was a release that saved OpenAI a lot of money, especially on inference.

Obviously users don’t know or care about the cost to the company. And when you have over 800 million weekly active users, even the slightest change is going to make some users unhappy.

What OAI didn’t account for was the backlash they would receive for removing 4o. They’ve now reinstated the model, although not for free users. I think this clearly shows that users don’t necessarily want the “smartest” model that can achieve the highest number on a benchmark. This is especially true when you’re as mainstream as OAI. But, more importantly, what this highlights is just how attached people really are to 4o. Like, it’s not normal. People are posting on Reddit talking about losing a friend and confidant. People really like 4o, and why not, considering it’s such a good model at affirming anything the user says. OAI have already created a beast that people are attached to, and they can never change or remove it, lest the mobs raise their pitchforks.

The other issue with the release was that the new pricing simply made a paid subscription worse. Under the new usage limits, users would have 80%+ fewer messages to reasoning models, significantly reducing the value of a paid subscription.

Now, much of this wouldn’t have been an issue if GPT-5 were good, and it is. However, upon release, the router was not working properly, so a lot of users thought they were talking to o3 or 4o when they were really talking to a much smaller and dumber model. This led many people to question whether GPT-5 was actually any good.

Besides all this, we really have to talk about the actual presentation. OpenAI committed some serious chart crimes. I’m not talking small errors either, I’m talking blatant misinformation.

Just look at this chart.

I don’t even know where to start. Why is 52.8 above 69.1? Why is 69.1 equal to 30.8? This chart is egregious. No one in their right mind could create something so bad. But, my favourite chart was definitely the deception chart.

I just love that this misinformation is talking about deception. The score for o3 on coding deception is 47.4, yet GPT-5 has a much smaller bar and its number is 50. GPT-5 is literally more deceptive but the chart indicates otherwise.

It’s not a good look that the face of AI is blatantly lying and misrepresenting information during one of the most anticipated model reveals since GPT-4.

Let’s actually talk about this for a second.

GPT-5 was the next big thing. It was supposed to push AI into the next frontier. It was supposed to start discovering science by itself and be as intelligent as a PhD holder, as Sam Altman has claimed many times. Shockingly, it is none of these things. GPT-5 is slightly better on a few benchmarks and that’s about it.

We already know that following the release of GPT-4, OAI tried scaling it up to create GPT-5 and quickly realised the money they were spending was not worth the performance gains. This was Project Orion. The real GPT-5 was GPT-4.5 - this was the model that came from Project Orion.

So what does this mean then?

So many people hyping up GPT-5 have been saying that AGI is merely a year or two away, and that this model is the next step towards it. Clearly it’s not. Something is missing. Simply scaling up LLMs won’t create AGI, whatever that even means. If anyone tells you we’ll have AGI within 1-2 years, ask them what that means and how we’ll get there.

Chances are they won’t have an answer.

But why?

GPT-5 is out. One of the most anticipated models of the last two years is out, and the reality is that it barely moved the needle. It’s frontier level, no doubt, but it most certainly isn’t a step forward. It nudged a few of the numbers on some benchmarks.

It really makes me wonder then - why? Why would they rush to release the model when it’s clearly not ready? It’s not much better, and the router wasn’t even working. So why would they use such a massive trump card and let it fall flat?

A few thoughts.

As I said before, I think cost is a big one. With the release of the router, OAI will save millions on inference. Besides cost though, I think there is pressure. Pressure from the company I believe will be the last one standing.

Google

You may not know this, but last week OpenAI also open sourced two models - a 20B model and a 120B model (more on this below).

Somehow, in a week where OpenAI open sourced two models and also released GPT-5, the most impressive release did not belong to them. It belonged to Google.

Google released Genie 3 last week. It’s a world model that can create simulated worlds from text or images. It is one of the most insane pieces of technology I’ve ever seen and rivals ChatGPT and GPT-4 as one of the most significant releases of the last few years.

What makes this world simulator so incredible is that you can interact with it. You can move around, you can open doors, you can paint and see your reflection. You, the user, actually exist in this simulated world and can change the very nature of this simulation.

Just look at some of these videos. The model can retain memory of the world and the changes you make. It can be prompted to include new things and is consistent over several minutes at 24fps.

I wrote about software like this 2.5 years ago but I honestly did not expect it to show up this quickly. Can you imagine the use cases?

You will be able to prompt a game into existence. Generative media - games, movies, TV shows, cartoons - is going to engulf society. I can’t imagine anything more addictive than being able to simulate hyper-realistic worlds from your imagination in seconds. Put on a VR headset and live in your fantasy world. It is dystopian, but it will happen.

OpenAI goes open source

As mentioned earlier, OpenAI open sourced two models last week. I haven’t tested them extensively, but from what I’ve seen, they’re not exactly groundbreaking.

That said, it is likely that the providers simply aren’t hosting them properly, as OpenAI released some new technical formats that haven’t been used before. Even the CEO of Hugging Face mentioned that it may take some time for providers to figure out how to properly host the models and take full advantage of their intelligence. Even so, I think the models are quite narrow. They might be good for coding (I wouldn’t put them above Qwen, although that’s a differently sized model), but I wouldn’t say they’re good for much else. They’re not good at writing, and they can’t be used in any language besides English.

It’s likely that the models were trained on a ton of synthetic data, much of it STEM data. It’s great we have another open source model available, but I’m really hoping we can get more from these models once providers host them properly. At least, I’m hoping it’s a hosting issue and the models are actually better than they seem right now.

I guess we’ll see.

I want to keep bringing these AI updates to you, so if you’re enjoying them, it’d mean a lot to me if you became a premium subscriber.

How was this edition?


As always, Thanks for Reading ❤️

Written by a human named Nofil
