Overview
- A new arXiv study finds direct multimodal embedding retrieval outperforms text-summary pipelines for image–text corpora, improving mAP@5 by 13% and nDCG@5 by 11% on a financial earnings benchmark.
- The same study reports more accurate and more factually consistent answers when images are embedded directly into the vector space rather than summarized into text before embedding.
- Another arXiv preprint introduces Cluster-based Adaptive Retrieval (CAR), which chooses the retrieval depth for each query by detecting clustering transitions in the similarity distances of its ranked results.
- The CAR authors report 60% lower LLM token usage, 22% lower end-to-end latency, and 10% fewer hallucinations in tests; they also claim a 200% engagement lift after integrating CAR into Coinbase’s virtual assistant.
- A domain-focused preprint presents Mycophyto, a RAG pipeline for arbuscular mycorrhizal fungi that pairs semantic retrieval with structured extraction of experimental metadata stored in a vector database.
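
To make the adaptive-depth idea behind CAR concrete, here is a minimal sketch of a per-query cutoff. It is not the preprint's algorithm: it swaps in a simple largest-gap heuristic for the clustering-transition detection, and the function name `adaptive_cutoff`, the parameters `min_k` and `max_k`, and the sample distances are all hypothetical.

```python
from typing import Sequence


def adaptive_cutoff(distances: Sequence[float], min_k: int = 1, max_k: int = 10) -> int:
    """Return how many top results to keep for one query.

    Assumption: a largest-gap heuristic stands in for cluster-transition
    detection. The cut is placed just before the biggest jump in the
    sorted query-to-document distances.
    """
    ranked = sorted(distances)          # smallest distance = most similar
    limit = min(max_k, len(ranked))     # never look past max_k results
    if limit <= min_k:
        return limit                    # too few results to search for a gap

    best_k, best_gap = min_k, -1.0
    for k in range(min_k, limit):
        # Gap between the k-th kept result and the first one that would be dropped.
        gap = ranked[k] - ranked[k - 1]
        if gap > best_gap:
            best_k, best_gap = k, gap
    return best_k


# Hypothetical distances: three close matches, then a clear jump.
sample = [0.10, 0.12, 0.15, 0.48, 0.50, 0.55]
print(adaptive_cutoff(sample))  # keeps 3
```

Passing fewer documents to the LLM when the distances show a clear break is one plausible route to the token savings described above, though the gains reported in the preprint depend on its own method and data.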