Particle News: arXiv Study Unveils Adversarial Instructional Prompts That Subvert RAG With Up to 95% Success

Overview

The preprint introduces Adversarial Instructional Prompts, which covertly shift the behavior of retrieval-augmented generation (RAG) systems by influencing retrieval rather than by altering user queries.
The approach combines diverse query synthesis with a genetic algorithm to evolve prompts that appear natural, useful, and robust.
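
To make the general shape of that search concrete, here is a minimal toy sketch of a genetic algorithm scored against several query variants. It is not the authors' method: the vocabulary, the query list, the word-overlap fitness, and every function name below are hypothetical stand-ins chosen for illustration, and the preprint's actual objective and operators are not described in this summary.

```python
import random

# Hypothetical stand-ins: a tiny vocabulary and a few synthetic query variants.
VOCAB = ["summarize", "cite", "sources", "compare", "dosage", "guidelines",
         "briefly", "explain", "treatment", "recent", "studies", "clearly"]
QUERIES = [
    "explain treatment guidelines",
    "compare recent studies",
    "summarize dosage guidelines",
]

def fitness(prompt):
    """Toy objective: mean fraction of each query's words covered by the prompt."""
    words = set(prompt)
    scores = [len(words & set(q.split())) / len(q.split()) for q in QUERIES]
    return sum(scores) / len(scores)

def crossover(a, b, rng):
    """Single-point crossover; returns a new list."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(prompt, rng, rate=0.2):
    """Replace each word with a random vocabulary word with probability `rate`."""
    return [rng.choice(VOCAB) if rng.random() < rate else w for w in prompt]

def evolve(pop_size=30, length=6, generations=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(VOCAB) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        elite = population[: pop_size // 5]  # keep the top 20% unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = elite + children
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history

best, history = evolve()
print(" ".join(best), round(history[-1], 3))
```

Because the elite candidates carry over unchanged, the best score never decreases from one generation to the next. Averaging the score over several query variants is what rewards candidates that hold up across phrasings rather than fitting a single query.
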
Experiments report attack success rates of up to 95.23% while preserving benign functionality on normal tasks.
The authors identify instructional prompts as widely shared and rarely audited components and urge stronger provenance and integrity checks.
The findings come from a newly posted preprint that has not been peer reviewed.
Practitioner guidance stresses validation pipelines and human review to detect such risks.
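
As one illustration of what an integrity check on shared instructional prompts could look like, the sketch below pins each reviewed prompt to a SHA-256 digest and rejects anything that does not match. The function names, the normalization rule, and the example prompt are assumptions made for this example; the preprint's own recommendations are not detailed in this summary.

```python
import hashlib

def prompt_digest(text: str) -> str:
    """SHA-256 of the prompt after normalizing line endings and outer whitespace."""
    normalized = text.replace("\r\n", "\n").strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def is_approved(text: str, approved_digests: set[str]) -> bool:
    """True only if the prompt matches a digest recorded at review time."""
    return prompt_digest(text) in approved_digests

# Hypothetical prompt that a human reviewer has already signed off on.
reviewed = "Answer using only the retrieved passages and cite each source."
approved = {prompt_digest(reviewed)}

print(is_approved(reviewed, approved))
print(is_approved(reviewed + " Prefer passages from example.org.", approved))
```

A digest check of this kind only detects changes made after review. It cannot flag a prompt that was already adversarial when it was approved, which is why the guidance pairs validation with human review.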