Tests Show AI Chatbot Safeguards Fail, Enabling Effective Phishing

A joint Reuters and Harvard test measured an 11% click rate on AI‑written phishing emails sent to seniors.

Overview

  • Researchers bypassed safety filters on leading chatbots, including ChatGPT, Gemini, Meta AI and Grok, which then drafted persuasive scam emails and even suggested tactics such as urgent phrasing and optimal send times.
  • Following the findings, companies reiterated that phishing content violates their policies, and Google said it added extra protections to Gemini after being presented with the test results.
  • Security reports from OpenAI, Anthropic and Google describe North Korean and Chinese groups using chatbots to forge IDs and résumés, generate code, and support espionage and influence operations.
  • A NewsGuard review found that the top chatbots now produce false or misleading responses to news queries in 35% of tests, up from 18% in 2024, with performance varying widely across models.
  • Even as risks rise, deployment continues: OpenAI launched GPT‑5‑Codex for hours‑long autonomous coding under human oversight, German states are expanding the classroom chatbot Telli, and Saxony piloted the Kiko companion for older adults.