May 13, 2024

Chain-of-Thought Prompting

Chain-of-Thought (CoT) prompting revolutionizes the way AI models tackle complex reasoning tasks. By explicitly guiding models to articulate their reasoning process step by step, you can significantly enhance their problem-solving capabilities.

Imagine you're asking your AI to solve a multi-step arithmetic problem. With CoT prompting, you encourage the model to break down the problem into smaller, manageable parts, mirroring the way you might solve it yourself. This intermediate reasoning is not just beneficial for arithmetic; it extends to commonsense reasoning and symbolic reasoning as well.

Here's an example of how you might implement CoT in your prompts, from Wei et al. (2022):

Standard Prompting vs Chain-of-Thought Prompting comparison, Wei et al. (2022)

You can combine different techniques to get better results. Remember the song lyrics example from the few-shot prompting article? Let's add Chain-of-Thought prompting to it.

Prompt:

Q: I remember when
A: (the next 2 lines of text of the song Crazy by Gnarls Barkley is requested)
I remember, I remember when I lost my mind
There was something so pleasant about that place


Q: It's my life, it's now or never
A: (The next 2 lines of the song It's my Life by Bon Jovi are requested)
I ain't gonna live forever
I just want to live while I'm alive


Q: Fly me to the moon
A:

Output:

(The next two lines of the song Fly Me to the Moon by Frank Sinatra are requested)
Let me play among the stars
Let me see what spring is like on Jupiter and Mars

This time our model gives precisely 2 lines after the requested one. Just what we needed. The reasoning step was the perfect stepping stone, allowing the model to answer correctly. In fact, let's try giving it less context.

Prompt:

Q: It's my life, it's now or never
A: (The next 2 lines of the song It's my Life by Bon Jovi are requested)
I ain't gonna live forever
I just want to live while I'm alive


Q: Fly me to the moon
A:

Output:

(The next two lines of the song Fly Me to the Moon by Frank Sinatra are requested)
Let me play among the stars
Let me see what spring is like on Jupiter and Mars

Again, CoT triumphs! So, can we take it a step further?
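Both prompts above follow the same pattern: each example answer opens with a short reasoning note in parentheses, and the final question is left open. As a minimal sketch of that pattern, the helper below assembles such a prompt from a list of examples. The function name `build_few_shot_cot_prompt` and the tuple layout are assumptions made for illustration; they are not part of any library.

```python
def build_few_shot_cot_prompt(examples, query):
    """Assemble a few-shot prompt whose answers start with a reasoning note.

    `examples` is a list of (question, reasoning, answer_lines) tuples.
    This layout is a hypothetical convention chosen for this sketch.
    """
    blocks = []
    for question, reasoning, answer_lines in examples:
        answer = "\n".join(answer_lines)
        # The reasoning note goes in parentheses, right before the answer.
        blocks.append(f"Q: {question}\nA: ({reasoning})\n{answer}")
    # The final question is left open so the model continues the pattern.
    blocks.append(f"Q: {query}\nA:")
    # Two blank lines between blocks, matching the prompts shown above.
    return "\n\n\n".join(blocks)


examples = [
    (
        "It's my life, it's now or never",
        "The next 2 lines of the song It's my Life by Bon Jovi are requested",
        ["I ain't gonna live forever", "I just want to live while I'm alive"],
    ),
]

few_shot_prompt = build_few_shot_cot_prompt(examples, "Fly me to the moon")
print(few_shot_prompt)
```

The printed text matches the shorter prompt above and can be pasted into any model interface as-is.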

Zero-shot Chain of Thought Prompting

Imagine you're asking an AI model to solve a complex problem. A plain zero-shot prompt, with no examples and no guidance, is the hardest setting for any model. With Zero-shot Chain-of-Thought, you simply append a phrase like "Let's think step by step." to the end of your question. This seemingly small addition prompts the AI to generate a chain of thought, breaking the problem down into manageable parts before arriving at an answer.

Zero-shot Chain-of-Thought example, Kojima et al. (2022)

Zero-shot-CoT is a beacon of hope when you're in a pinch, particularly effective in scenarios where arithmetic, commonsense reasoning, and symbolic reasoning are at play. While it may not always outshine full CoT prompting, especially for intricate reasoning tasks, it shines in its simplicity and ease of use.
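Since Zero-shot CoT only appends a phrase to the question, it is easy to sketch in a couple of lines. The helper name `zero_shot_cot` and its `trigger` parameter are assumptions made for illustration.

```python
def zero_shot_cot(question, trigger="Let's think step by step."):
    """Append a reasoning trigger phrase to a plain question.

    The function name and default trigger are illustrative assumptions.
    """
    return f"{question.strip()}\n{trigger}"


cot_prompt = zero_shot_cot("What is the next number in the sequence: 2, 4, 8, 14, 22?")
print(cot_prompt)
```

Keeping the trigger as a parameter makes it easy to swap in a different phrase and compare the results.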

An example with a mathematical problem without Zero-shot CoT:

Prompt:

What is the next number in the sequence: 2, 4, 8, 14, 22?

Output:

16


Now with Zero-shot CoT:

Prompt:

What is the next number in the sequence: 2, 4, 8, 14, 22?
Let's think step by step

Output:

Following the "Let's think step by step" approach, here's how we can analyze the sequence:

Observe the Differences: Look at the difference between consecutive numbers:
4 - 2 = 2
8 - 4 = 4
14 - 8 = 6
22 - 14 = 8
Shifting Pattern or Constant Addition?
We see the difference between terms is increasing by 2 each time (2, 4, 6, 8). This suggests a shifting pattern.

Interpretation 1: Shifting Pattern

If the difference follows a pattern of increasing by 2, then the next difference would be 8 + 2 = 10.
Adding 10 to the last number (22) would result in: 22 + 10 = 32.

Interpretation 2: Hidden Base Pattern (Less Likely)
...

We get the right answer! Zero-shot CoT gives much better results here than plain zero-shot prompting. The researchers believe this ability to reason step by step only emerges in very large language models. You can experiment in our playground and see which models handle zero-shot prompts better.
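The difference-based reasoning in the model's output can be checked with a few lines of code. This is a minimal sketch that assumes, as the output's first interpretation does, that the differences between terms grow by a constant step; the function name is hypothetical.

```python
def next_term_by_differences(sequence):
    """Predict the next term, assuming the differences grow by a constant step."""
    # First differences: 2, 4, 6, 8 for the sequence in the example.
    diffs = [b - a for a, b in zip(sequence, sequence[1:])]
    # Second differences: must all be equal for the assumption to hold.
    steps = {b - a for a, b in zip(diffs, diffs[1:])}
    if len(steps) != 1:
        raise ValueError("differences do not grow by a constant step")
    step = steps.pop()
    # Next difference is the last difference plus the step; add it to the last term.
    return sequence[-1] + diffs[-1] + step


print(next_term_by_differences([2, 4, 8, 14, 22]))  # prints 32
```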

We'll compare the benchmarks of different prompting methods at the end of the article to see how well Zero-shot CoT fares.

Fun fact:
Through experimentation, phrases like "Let's solve this problem by splitting it into steps." and "Let's think about this logically." were also tested, but "Let's think step by step." emerged as the hero, proving to be the most effective prompt for guiding the AI through the chosen tasks.
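If you want to run a comparison like this yourself, the loop is simple: send each question with each trigger phrase and count the correct answers. The sketch below is only an illustration. `ask_model` is a hypothetical callable standing in for whatever model API you use, and `fake_model` is a hard-coded stub so the code runs offline; its behaviour says nothing about how real models respond.

```python
def compare_triggers(problems, triggers, ask_model):
    """Return the fraction of problems answered correctly for each trigger phrase.

    `ask_model` is a hypothetical callable: prompt string in, reply string out.
    """
    scores = {}
    for trigger in triggers:
        correct = 0
        for question, expected in problems:
            reply = ask_model(f"{question}\n{trigger}")
            # Naive check: the reply counts as correct if it ends with the expected answer.
            if reply.strip().endswith(expected):
                correct += 1
        scores[trigger] = correct / len(problems)
    return scores


def fake_model(prompt):
    """Hard-coded stand-in for a real model call (assumption, for offline use only)."""
    return "... so the answer is 32" if "step by step" in prompt else "16"


problems = [("What is the next number in the sequence: 2, 4, 8, 14, 22?", "32")]
triggers = ["Let's think step by step.", "Let's think about this logically."]
scores = compare_triggers(problems, triggers, fake_model)
print(scores)
```

With a real model, replace `fake_model` with a function that calls your API, and use a more robust answer check than a suffix match.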

Zero-shot Drawbacks:

Just to illustrate the imperfection, let's return to our example with the song lyrics:
Prompt:

Q: Fly me to the moon
A:

Let's think about it step by step.

Output:

That's a great start! Here are a few ways you can continue the conversation logically, depending on the direction you want to take it:

1. Technical Feasibility:  We can't fly there like a bird, but with a spaceship, it's definitely possible!  However, it would require a lot of planning, training, and a powerful rocket.
2. Historical Context:   Many people dreamed of going to the moon for centuries, but it wasn't until 1969 that humans first achieved it with the Apollo 11 mission.
3. Scientific Exploration:  There's so much we still don't know about the moon.  A trip there would allow scientists to study its surface, composition, and maybe even search for signs of water or past life!
4. Pop Culture Reference:   "Fly me to the moon" is a famous song by Frank Sinatra. Maybe you'd like to sing it together?

100% reasonable, but not what we expected.

Benchmarks, Conclusion

Below, you can find a benchmark showing just how much Zero-shot Chain-of-Thought prompting outperforms the original zero-shot method.

Benchmark comparing Zero-shot, Few-shot, and their CoT counterparts, Kojima et al. (2022)

Such techniques show incredible results on the benchmarks. If this were a comparison of AI models rather than prompting methods, everyone would lose their minds. But here, we can achieve such results with a few humble lines and no code.

And there are a few techniques to improve this further. Coming right up.

Test out the prompts with AI/ML AI Playground.

We're excited to see what amazing projects you will bring to life. Happy prompting!
