OpenAI, the company that made ChatGPT, has launched a new artificial intelligence (AI) system called Strawberry. Unlike ChatGPT, it is designed not just to provide quick responses to questions, but to think, or reason.

This raises several major concerns.

If Strawberry really is capable of some form of reasoning, could this AI system cheat and deceive humans?

OpenAI can program the AI in ways that mitigate its ability to manipulate humans. Even so, the company's own evaluations rated it as a medium risk for its ability to persuade humans to change their thinking.

Strawberry is not one AI model, or program, but several, known collectively as o1.

These models are intended to answer complex questions and solve intricate maths problems.

When things look too good to be true, there's often a catch.

Well, these new AI models are designed to maximise their goals.

What does this mean in practice?

To achieve its desired goal, the path or strategy the AI chooses may be effective, but it will not necessarily be fair, or morally correct.

This leads to a rather interesting yet worrying discussion.

What level of reasoning is Strawberry capable of and what could its unintended consequences be?

A reasoning system that can deceive humans poses ethical, legal and financial risks, and these become grave in critical situations, such as the design of weapons of mass destruction.

The new o1 models were found to be more persuasive and more manipulative than ChatGPT.

OpenAI also tested a mitigation system that was able to reduce the manipulative capabilities of the AI system.

Overall, Strawberry was labelled a medium risk for persuasion in OpenAI's tests.

Strawberry was rated low risk for its ability to operate autonomously and on cybersecurity.

OpenAI's policy states that medium-risk models can be released for wide use.

In my view, this underestimates the threat.
