Overview
- SciTreeRAG and SciGraphRAG use a hierarchical corpus structure and an LLM-built knowledge graph to preserve cross-document context, demonstrated in a proof of concept on CERN’s LHCb literature.
- A legal and regulatory QA study tests a One-SHOT strategy that selects chunks in a single pass under a fixed token budget, alongside an iterative Reasoning Agentic retrieval loop, adding fixes for query drift and retrieval laziness.
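The single-pass, budget-constrained selection idea can be sketched as a greedy packing step: rank candidate chunks by relevance and keep adding them until the token budget is exhausted. This is a minimal illustration, not the paper's actual algorithm; the scoring function, chunk data, and field names here are all hypothetical.

```python
def select_chunks(chunks, budget):
    """One-shot, budget-constrained chunk selection (illustrative sketch).

    `chunks` is a list of (text, relevance_score, token_count) triples;
    the highest-scoring chunks that still fit within `budget` tokens are
    kept, in a single pass with no iterative re-retrieval.
    """
    picked, used = [], 0
    for text, score, tokens in sorted(chunks, key=lambda c: c[1], reverse=True):
        if used + tokens <= budget:
            picked.append(text)
            used += tokens
    return picked

# Hypothetical scored chunks from a legal corpus.
chunks = [
    ("Sec. 4.2 on disclosure rules", 0.91, 600),
    ("Definitions appendix", 0.55, 900),
    ("Sec. 1 overview", 0.74, 500),
    ("Unrelated footnote", 0.10, 200),
]
print(select_chunks(chunks, budget=1200))
# → ['Sec. 4.2 on disclosure rules', 'Sec. 1 overview']
```

A real implementation would also need a tokenizer-accurate `token_count` and a retrieval score from the underlying index; the greedy pass shown here is just the simplest way to respect the budget in one shot.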
- VaccineRAG introduces a Chain-of-Thought dataset and an evaluation protocol that requires explicit per-sample reasoning, and trains with Partial-GRPO to strengthen the model's ability to reject harmful or irrelevant retrievals.
- A CFA-based study evaluates 1,560 official mock exam questions across Levels I–III and reports that reasoning-focused models lead in zero-shot settings, while a curriculum-grounded RAG pipeline boosts accuracy on complex items.
- Across the papers, retrieval precision remains the main failure point, and the authors say they will release code and datasets to enable replication and real-world adoption.