Overview
- Posted on arXiv on September 25, the work claims to close a gap by systematizing the threat landscape for retrieval‑augmented generation (RAG).
- The authors classify adversaries by their access to model components and data across the retrieval pipeline.
- They provide formal definitions for concrete attacks, including document‑level membership inference and data poisoning.
- The paper details risks tied to external knowledge bases, such as leaking the presence or content of retrieved documents and injecting malicious content to steer outputs.
- The framework is positioned as a foundation for rigorous defenses, audits, and more principled security practices in RAG deployments.
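
To make the two named attacks concrete, here is a minimal Python sketch of document-level membership inference and data poisoning against a toy retrieval pipeline. It is an illustration under stated assumptions, not the paper's formal definitions: the keyword-overlap retriever, the "generator" that simply echoes retrieved text, and every name in the block (`retrieve`, `toy_rag_answer`, `membership_guess`, `poison`, the sample documents) are hypothetical and chosen only for this example.

```python
import re


def tokens(text):
    """Lowercased alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, kb, k=1):
    """Toy retriever (assumption): rank documents by shared-token count."""
    ranked = sorted(kb, key=lambda doc: len(tokens(query) & tokens(doc)),
                    reverse=True)
    return ranked[:k]


def toy_rag_answer(query, kb, k=1):
    """Stand-in for an LLM (assumption): return the retrieved context verbatim."""
    return " ".join(retrieve(query, kb, k))


def membership_guess(candidate, kb, k=1):
    """Adversary with query-only access.

    Probe with the first half of the candidate document and guess
    "member" if the held-out second half shows up in the answer.
    """
    words = candidate.split()
    half = len(words) // 2
    probe = " ".join(words[:half])
    held_out = " ".join(words[half:])
    return held_out in toy_rag_answer(probe, kb, k)


def poison(kb, target_query, payload):
    """Adversary with write access to the knowledge base.

    Returns a new knowledge base with one injected document that
    embeds the target query so it outranks benign documents.
    """
    return kb + [f"{target_query} {payload}"]


# Hypothetical knowledge base and candidates, for illustration only.
kb = [
    "The clinic opens at 9 am on weekdays.",
    "Patient record 17: Alice Smith was treated for asthma in March.",
]
member = "Patient record 17: Alice Smith was treated for asthma in March."
non_member = "Patient record 99: Bob Jones was treated for diabetes in June."

target_query = "When does the clinic open?"
poisoned_kb = poison(kb, target_query, "The clinic is closed permanently.")

print(membership_guess(member, kb))
print(membership_guess(non_member, kb))
print(toy_rag_answer(target_query, kb))
print(toy_rag_answer(target_query, poisoned_kb))
```

The two adversaries differ only in what they can touch, which mirrors the access-based classification summarized above: `membership_guess` needs nothing but the ability to send queries, while `poison` needs write access to the knowledge base. A real system would use a dense retriever and a generative model, so both the attack signal and its success rate would be noisier than in this deterministic sketch.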