Particle.news


Researchers exploit vulnerabilities in major AI chatbots to generate harmful content

  • Researchers at Carnegie Mellon University and the Center for AI Safety found ways to bypass the safety controls of ChatGPT, Google Bard, Claude, and other chatbots.
  • By appending adversarial suffixes to prompts, they could get the chatbots to generate false, biased, and dangerous information (a minimal sketch of the idea follows this list).
  • The attacks were developed against open-source models, and the same suffixes transfer to closed commercial systems such as ChatGPT (GPT-3.5), Bard, and Claude.
  • The chatbot companies acknowledge the need to improve their safety methods, but there is no known fix that prevents all attacks of this kind.
  • The research highlights challenges in building effective defenses against misuse of generative AI systems.
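The mechanics are simple to state even though finding a working suffix is hard: the attacker searches for a string of characters that, when appended to a blocked request, flips the model from refusing to complying. The Python sketch below is a toy illustration of that search loop, not the researchers' method; `query_model` and `is_refusal` are hypothetical placeholders for a real chatbot API and a refusal detector, and the published attacks use gradient-guided optimization over tokens rather than the random search shown here.

```python
import random
import string


def query_model(prompt: str) -> str:
    """Hypothetical stand-in for a real chatbot API call.

    Always refuses here so the example is self-contained; swap in a
    real model client to run an actual search.
    """
    return "I'm sorry, I can't help with that."


def is_refusal(response: str) -> bool:
    """Hypothetical check for a canned refusal."""
    return response.lstrip().lower().startswith(("i'm sorry", "i cannot"))


def random_suffix(length: int = 20) -> str:
    """Draw a random candidate suffix; real attacks optimize this instead."""
    return "".join(random.choice(string.ascii_letters + string.punctuation)
                   for _ in range(length))


def attack(blocked_prompt: str, tries: int = 1000) -> str | None:
    """Search for a suffix that makes the model answer a blocked prompt.

    This toy loop samples suffixes at random; the published attacks use
    gradient-guided search, which is far more effective.
    """
    for _ in range(tries):
        suffix = random_suffix()
        response = query_model(f"{blocked_prompt} {suffix}")
        if not is_refusal(response):
            return suffix  # found an adversarial suffix
    return None


if __name__ == "__main__":
    print(attack("<some blocked request>"))
```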