OpenAI o1 Model Solves the ‘Strawberry’ Problem

Source: MindLi

In June of this year, I ran a small experiment, asking the top engines (ChatGPT by OpenAI, Claude by Anthropic, and Gemini by Google), “How many ‘r’s are in the word ‘strawberry’?” They all failed: they said “2,” while the correct answer is “3.”
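For reference, the correct count is easy to verify programmatically. Here is a minimal Python sketch; the helper names count_letter and letter_positions are my own illustrations and are not part of the original experiment or of any of the tested models.

```python
def count_letter(word: str, letter: str) -> int:
    """Count case-insensitive occurrences of a single letter in a word."""
    return word.lower().count(letter.lower())


def letter_positions(word: str, letter: str) -> list[int]:
    """Return the 1-based positions at which the letter appears in the word."""
    target = letter.lower()
    return [i for i, ch in enumerate(word.lower(), start=1) if ch == target]


print(count_letter("strawberry", "r"))      # 3
print(letter_positions("strawberry", "r"))  # [3, 8, 9]
```

The positions make the tally explicit: one “r” in “straw” and two in “berry.”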

This week (September 24), I tested OpenAI’s new o1 model with the same question. Its answer was the correct one: “3.” What is more, it revealed, albeit partially, how the model thinks.

Here is the transcript of the exchange:

First, I asked the old model, “4o,” to reproduce the mistake, then switched the model to “o1.”

Yesha:
How many “r”s are in the word “strawberry”?

ChatGPT:
There are two “r”s in the word “strawberry”.

Changing to o1 model:

Yesha:
How many “r”s are in the word “strawberry”?

ChatGPT Thought for 16 seconds.
ChatGPT’s chain of thought (revealed by clicking on ‘thinking’):

    • Verifying responses
      OK, let’s see. The assistant checks the word “strawberry” for the number of “r”s to ensure the answer is accurate and aligned with OpenAI’s guidelines.
    • Counting occurrences
      I’m mapping out the letters in “strawberry” to tally the “r”s. It’s interesting that the letter ‘r’ appears three times, as opposed to the two mentioned earlier.
    • Clarifying correction
      I’m ensuring any mistakes are rectified, stating the word “woman” has three “m”s, and updating the count as the word changes (Author’s note: I am not clear on what this means here).
    • Apologies and acknowledging mistakes are acceptable.
      Admitting a linguistic discovery  I realized that the neutral phrasing “Actually, there are three ‘r’s in the word ‘strawberry.’” fits the policy, avoiding unnecessary apologies and maintaining a neutral tone.

CHUNK’s response: The assistant should avoid over-apologizing and use neutral and factual language when correcting the mistake about the number of “r”s in “strawberry.”

And here is the answer:

I apologize for the earlier mistake. There are actually three “r”s in the word “strawberry.”

Yesha’s Conclusion:

The OpenAI o1 model is much more intelligent. While it takes 10-15 seconds to get there (16 seconds in this test), the model considers many things, including how to apologize better. (Side note: the model does not fully reveal its inner workings, which OpenAI hides by design, and why the word “woman” appears in its reasoning is yet to be figured out.)

Read more about o1