Abstract
Retrieval-Augmented Generation suffers from heavy distraction by top-ranked passages, making LLM positional bias less impactful than previously thought.
Retrieval-Augmented Generation (RAG) enhances LLM accuracy by adding passages retrieved from an external corpus to the LLM prompt. This paper investigates how positional bias, the tendency of LLMs to weight information differently depending on where it appears in the prompt, affects not only the LLM's ability to capitalize on relevant passages but also its susceptibility to distracting ones. Through extensive experiments on three benchmarks, we show that state-of-the-art retrieval pipelines, while aiming to retrieve relevant passages, systematically bring highly distracting ones into the top ranks: over 60% of queries contain at least one highly distracting passage among the top-10 retrieved results. As a consequence, the impact of LLM positional bias, which related works often report as very prominent in controlled settings, turns out to be marginal in realistic scenarios, because relevant and distracting passages are penalized alike and the two effects largely cancel out. Indeed, our findings reveal that sophisticated strategies that rearrange passages according to the LLM's positional preferences perform no better than random shuffling.
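For intuition, here is a minimal sketch of the two prompt orderings being compared. It is not the paper's code: the edge-first heuristic is assumed as one concrete instance of "lost in the middle" style positional-preference reordering, and all function names and toy passages are illustrative.

```python
import random

def edge_first_order(passages):
    """Hypothetical positional-preference reordering: place the
    highest-ranked passages at the prompt edges (rank 1 first,
    rank 2 last, rank 3 second, rank 4 second-to-last, ...),
    following the common 'lost in the middle' heuristic.
    """
    ordered = [None] * len(passages)
    left, right = 0, len(passages) - 1
    for i, passage in enumerate(passages):
        if i % 2 == 0:
            ordered[left] = passage
            left += 1
        else:
            ordered[right] = passage
            right -= 1
    return ordered

def shuffled_order(passages, seed=0):
    """Random-shuffle baseline, which the paper finds performs
    on par with preference-based reordering."""
    rng = random.Random(seed)
    out = list(passages)
    rng.shuffle(out)
    return out

# Toy example: five passages in retriever rank order.
ranked = [f"passage_{i}" for i in range(1, 6)]
print(edge_first_order(ranked))  # ['passage_1', 'passage_3', 'passage_5', 'passage_4', 'passage_2']
print(shuffled_order(ranked))
```

In both orderings, relevant and distracting passages move through favored and penalized positions together, which is consistent with the paper's observation that the net effect on average accuracy is negligible.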
Community
The paper shows that, in real RAG scenarios, positional bias penalizes relevant and distracting passages alike, so the two effects compensate each other and the impact on average LLM accuracy is negligible.
Librarian Bot found the following similar papers, recommended by the Semantic Scholar API:
- The Distracting Effect: Understanding Irrelevant Passages in RAG (2025)
- Benchmarking the Myopic Trap: Positional Bias in Information Retrieval (2025)
- GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis (2025)
- On the Consistency of Multilingual Context Utilization in Retrieval-Augmented Generation (2025)
- CAFE: Retrieval Head-based Coarse-to-Fine Information Seeking to Enhance Multi-Document QA Capability (2025)
- MIRAGE: A Metric-Intensive Benchmark for Retrieval-Augmented Generation Evaluation (2025)
- Multilingual Retrieval-Augmented Generation for Knowledge-Intensive Task (2025)