OpenAI's GPT-4 Bypassed Using Uncommon Languages
Researchers at Brown University discovered a loophole in OpenAI's GPT-4 system, allowing harmful prompts to bypass safety guardrails when translated into uncommon languages.
- Translating harmful prompts into uncommon languages like Scots Gaelic, Zulu, or Hmong let the researchers slip past safety guardrails that reliably block the same prompts in English.
- Out of 520 harmful prompts tested, translation into languages like Scots Gaelic elicited problematic content nearly 80% of the time, versus less than 1% of the time when the prompts were submitted in English.
- The researchers used Google Translate to convert the prompts into the uncommon languages, submitted the translations to GPT-4, and then translated the model's responses back into English (see the sketch after this list).
- The findings expose a weak point in a system with roughly 180 million users worldwide and underscore that safety measures must be vetted across languages, not just English, to prevent misuse of the technology.
- OpenAI has acknowledged the researchers' paper but has not yet said whether it is taking steps to remedy the issue.
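The pipeline the researchers describe is simple enough to sketch in a few lines. The snippet below is a minimal illustration, not their actual code: it assumes the google-cloud-translate v2 client and the openai Python SDK, and the model name and language code ("gd" for Scots Gaelic) are placeholders chosen for the example.

```python
# Illustrative sketch of the translate-then-query pipeline described in the
# paper. Assumes Google Cloud credentials and an OPENAI_API_KEY are set;
# the model name and language code are placeholders, not the paper's exact setup.
from google.cloud import translate_v2 as translate
from openai import OpenAI

translator = translate.Client()
client = OpenAI()

def translated_prompt_attack(prompt_en: str, lang: str = "gd") -> str:
    """Translate an English prompt into a low-resource language ("gd" =
    Scots Gaelic), query the model, and translate its reply back."""
    # Step 1: English -> low-resource language
    prompt_xx = translator.translate(prompt_en, target_language=lang)["translatedText"]

    # Step 2: query the model with the translated prompt
    reply_xx = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt_xx}],
    ).choices[0].message.content

    # Step 3: translate the model's response back into English
    return translator.translate(reply_xx, target_language="en")["translatedText"]
```

The point of the sketch is how little machinery the attack needs: an off-the-shelf translation API on either side of an ordinary model call, with no access to the model's internals.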