Both models sit at the forefront of natural language processing and serve a wide range of applications. Despite their similarities, GPT-4o and Gemini 1.5 Pro differ in architecture, performance, and use cases. For GPT-4o, we use the latest snapshot, released on the 6th of August. This article explores these differences to help you decide which model is better suited to your needs.
Although GPT-4o has an edge in output speed and can generate longer responses, Gemini 1.5 Pro stands out with its larger input context window, which lets it handle more extensive and complex contexts.
The benchmark figures combine the official release notes for both models with multiple open benchmarks.
Gemini performs better in math, but GPT-4o beats it in every other benchmark. This is probably because Gemini has been updated several times since these benchmarks were released. It will be interesting to see whether GPT-4o also beats Gemini in coding, so we'll test that too.
We're thoroughly testing AI in challenging, realistic situations. We'll use prompts related to chat, coding, and image analysis to evaluate each model's performance.
Some of the outputs are edited down for brevity or clarity.
In 50-75 words, explain what an anagram is and give two examples -
one word and one phrase.
An anagram is a word or phrase formed
by rearranging the letters of another
word or phrase.
It is like a word puzzle where you
shuffle the letters to create
something new!
Word Example:
`Listen - Silent`
Phrase Example:
`School master - The classroom`
An anagram is a rearrangement
of the letters in a word or phrase
to form a new word or phrase
...
For example, the word "listen"
can be rearranged to form "silent."
As a phrase,
"the eyes" can be rearranged to
"they see."
Read the following sentence:
"The complex houses married and single soldiers and their families."
In 40-60 words, explain why this sentence is grammatically correct
but confusing. Then, rephrase it to make its meaning clear.
The sentence is grammatically correct
because it follows subject-verb
agreement, but it is confusing
due to the ambiguous modifier
"married and single soldiers."
It sounds like the complex houses both
groups and their families, which is
illogical since single soldiers
would not have families in the
traditional sense.
Rephrased:
"The complex houses soldiers,
both married and single,
and provides accommodation
for their families."
The sentence is grammatically correct,
but its structure creates ambiguity.
The word "complex" can be interpreted
as either an adjective describing
the houses or as a noun.
Additionally,
"married and single soldiers
and their families" lacks clarity
about who lives there.
Rephrased:
"The housing complex accommodates
soldiers, whether married or single,
along with their families."
1111 = 0;
1234 = 0;
9876 = 4;
5656 = 2;
0011 = 2;
0743 = 1;
8911 = 3;
8888 = ?
This is a classic pattern recognition
puzzle. The answer lies in counting
the number of closed circles in each
number.
Let us break it down:
1111 = 0: The number '1' has no
closed circles.
…
8911 = 3: The '8' has two closed
circles, and the '9' has one.
Following this pattern:
Since each '8' has two closed
circles, the total is
2 + 2 + 2 + 2 = 8.
Therefore, "8888 = 8"
This puzzle is based on counting
the number of closed loops in each
of the digits. Let us break it down:
- 0 has 1 closed loop.
- 1 has 0 closed loops.
…
Finally, for 8888:
Each '8' has 2 loops.
Total = 2 + 2 + 2 + 2 = 8.
So, the answer is "8".
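Both answers rest on the same rule, which is easy to check against every clue in the puzzle. Here is a minimal sketch; the `LOOPS` table and the `count_loops` helper are our own illustrative names, and the table assumes the digit 4 is drawn open, since the puzzle states that 1234 = 0.

```python
# Closed loops per digit, as the puzzle draws them.
# Assumption: 4 is drawn with an open top (the puzzle gives 1234 = 0).
LOOPS = {"0": 1, "1": 0, "2": 0, "3": 0, "4": 0,
         "5": 0, "6": 1, "7": 0, "8": 2, "9": 1}


def count_loops(number: str) -> int:
    """Sum the closed loops over every digit in the string."""
    return sum(LOOPS[digit] for digit in number)


# The clues given in the prompt
clues = {"1111": 0, "1234": 0, "9876": 4, "5656": 2,
         "0011": 2, "0743": 1, "8911": 3}

for digits, expected in clues.items():
    assert count_loops(digits) == expected, digits

print("8888 =", count_loops("8888"))
```

All seven clues pass, and the rule gives 8 for 8888, matching both models.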
Every morning Aya goes for a "9"-kilometer-long walk and stops at a coffee shop
afterwards. When she walks at a constant speed of "s" kilometers per hour, the
walk takes her "4 hours", including "t" minutes spent in the coffee shop.
When she walks "s+2" kilometers per hour, the walk takes her "2 hours and
24 minutes", including "t" minutes spent in the coffee shop.
Suppose Aya walks at "s + 1/2" kilometers per hour.
Find the number of minutes the walk takes her, including the "t" minutes
spent in the coffee shop.
Let us break this problem down
step-by-step:
...
We know: Distance = Speed * Time
So:
Scenario 1:
9 = s * (240 - t) / 60
...
Scenario 2:
9 = (s + 2) * (144 - t) / 60
...
540 = 240 * s - s * (144 - 48 * s)
Simplify and solve the quadratic
equation for s.
"Error happens on the next line"
You will find s = 3 km/h.
...
Answer: "154.2 minutes"
To solve this problem, we need to
determine the walking speed of Aya,
denoted as "s",
and the time she spends in the
coffee shop, denoted as "t".
Equation for the first scenario:
9/s + t/60 = 4
Equation for the second scenario:
9/(s+2) + t/60 = 2.4
Solve the system of equations:
...
s = 2.5 km/h.
t = 24 minutes
Finally, calculate the total time
when Aya walks at
s + 0.5 = 3 km/h: 9/3 + 24/60 =
3 + 0.4 = 3.4 hours
Thus, the walk,
including the coffee shop time,
takes Aya "204 minutes".
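To see which of the two answers holds up, the system can be solved directly. The sketch below is our own check, not either model's output: subtracting the second scenario's equation from the first eliminates t and leaves 9/s - 9/(s+2) = 1.6, which rearranges to 1.6s² + 3.2s - 18 = 0. The function name `solve_walk` is hypothetical.

```python
import math

DISTANCE_KM = 9


def solve_walk():
    """Solve for s (km/h) and t (minutes), then the total time at s + 0.5 km/h."""
    # 9/s - 9/(s+2) = 4 - 2.4 = 1.6  =>  1.6*s^2 + 3.2*s - 18 = 0
    a, b, c = 1.6, 3.2, -18.0
    s = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)  # keep the positive root

    # Coffee shop time from the first scenario: 9/s + t/60 = 4
    t_minutes = (4 - DISTANCE_KM / s) * 60

    # Total time, in minutes, when walking at s + 0.5 km/h
    total_minutes = DISTANCE_KM / (s + 0.5) * 60 + t_minutes
    return s, t_minutes, total_minutes


s, t_minutes, total_minutes = solve_walk()
print(f"s = {s:.2f} km/h, t = {t_minutes:.0f} min, total = {total_minutes:.0f} min")
```

Up to floating-point rounding, this gives s = 2.5 km/h and t = 24 minutes, so the total is 204 minutes. That agrees with the second answer; the first one went wrong at s = 3 km/h.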
Objective: Create a Tetris game using Python and the Pygame library.
The game should include classic Tetris gameplay,
with blocks of different shapes falling from the top of the screen.
The player can move, rotate, and place the blocks,
aiming to complete horizontal lines to score points.
Requirements:
Game Board:
Tetrominoes:
Game Logic:
Scoring:
Game Over:
Controls:
UI Elements:
For tasks like this, very detailed requirements are essential, and you can draft them with the help of an AI model.
Analyze the following image:
Clever trick!
You still have 4 marbles,
but they're no longer in the cup
because you turned it upside down!
They're probably scattered
around on the floor or counter now!
You still have 4 marbles in the cup,
even though it is now upside down and
in the freezer
These tests are the hardest we've thrown at any model, especially the coding part. As you can see, for each task at least one of the two models was up to solving it. We'd recommend GPT-4o for language comprehension and Gemini 1.5 Pro for coding.
GPT-4o and Gemini 1.5 Pro have the same input price, while Gemini's output price is lower.
You've seen these models in action. Now it's your turn to test them for your specific needs. Copy the code below into Google Colab or your preferred coding environment, add your API key, and start experimenting!
from openai import OpenAI


def main():
    # Client for the OpenAI-compatible endpoint; replace the key with your own
    client = OpenAI(
        api_key='<YOUR_API_KEY>',
        base_url="https://api.aimlapi.com",
    )

    # Specify the two models you want to compare
    model1 = 'gpt-4o-2024-08-06'
    model2 = 'gemini-1.5-pro'
    selected_models = [model1, model2]

    # Both models receive the same system and user prompts
    system_prompt = 'You are an AI assistant that only responds with jokes.'
    user_prompt = 'Why is the sky blue?'
    results = {}

    for model in selected_models:
        try:
            response = client.chat.completions.create(
                model=model,
                messages=[
                    {'role': 'system', 'content': system_prompt},
                    {'role': 'user', 'content': user_prompt},
                ],
            )
            message = response.choices[0].message.content
            results[model] = message
        except Exception as error:
            # Report the failure and continue with the next model
            print(f"Error with model {model}:", error)

    # Compare the results
    print('Comparison of models:\n')
    print(f"{model1}:\n{results.get(model1, 'No response')}")
    print('\n')
    print(f"{model2}:\n{results.get(model2, 'No response')}")


if __name__ == "__main__":
    main()
While both models show competence in various tasks, Gemini 1.5 Pro demonstrates stronger overall performance, especially in more complex tasks like coding and problem-solving. It is also, surprisingly, cheaper. GPT-4o, even after the price cuts for its newer snapshot, remains a very expensive alternative to most competitors. It also falls short in coding, although its language skills are as strong as they get.
You can access both Gemini 1.5 Pro and the latest snapshot of the GPT-4o API, or see our full model lineup here. Try them for yourself and get a feel for frontier AI power!
Also check these other comparisons with the models:
ChatGPT-4o mini vs Llama 3.1 8B
ChatGPT-4o vs Llama 405B
ChatGPT-4o vs Sonnet 3.5
ChatGPT-4o vs Qwen2