What is Reciprocal Rank Fusion (RRF)?


TL;DR

Reciprocal Rank Fusion (RRF) is a simple, effective algorithm for merging ranked result lists from multiple retrieval sources into a single combined ranking. When you have results from multiple searches, whether from multi-query retrieval, different retrieval methods (semantic + keyword), or multiple indices, RRF combines them by assigning each document a score based on its rank position in each list, then sorting by the combined score. Documents that appear in multiple lists and rank highly in each receive the highest combined scores. RRF requires no training, no tuning of weights, and no normalization of score scales, making it the standard approach for result fusion in modern RAG pipelines. LM-Kit.NET uses RRF internally when QueryGenerationMode.MultiQuery is enabled on PdfChat and RagEngine, automatically merging results from multiple query variants.


What Exactly is Reciprocal Rank Fusion?

When you run multiple retrieval queries, you get multiple ranked lists of results. The challenge is combining them into a single list that reflects the best of all sources:

Query variant 1 results:    Query variant 2 results:    Query variant 3 results:
  Rank 1: Doc A               Rank 1: Doc C               Rank 1: Doc A
  Rank 2: Doc B               Rank 2: Doc A               Rank 2: Doc D
  Rank 3: Doc C               Rank 3: Doc E               Rank 3: Doc C
  Rank 4: Doc D               Rank 4: Doc B               Rank 4: Doc F
  Rank 5: Doc E               Rank 5: Doc F               Rank 5: Doc B

Question: What should the combined ranking be?

This seems straightforward until you consider the complications:

  • Score incompatibility: Different queries produce different score scales. A similarity score of 0.85 from one query is not comparable to 0.72 from another.
  • Different coverage: Some documents appear in all lists, others in only one.
  • Rank vs. score: Is a document ranked #1 in one list and #5 in another better than a document ranked #2 in all three?

RRF solves all of these problems elegantly by working entirely with rank positions and ignoring raw scores.

The RRF Formula

RRF_score(document) = Σ  1 / (k + rank_i(document))
                      i

Where:
  k = a constant (typically 60)
  rank_i(document) = the rank position of the document in result list i
                     (undefined if the document doesn't appear in list i)
  Σ = sum over all result lists where the document appears

For the example above:

Doc A: 1/(60+1) + 1/(60+2) + 1/(60+1) = 0.01639 + 0.01613 + 0.01639 = 0.04891
Doc B: 1/(60+2) + 1/(60+4) + 1/(60+5) = 0.01613 + 0.01563 + 0.01538 = 0.04714
Doc C: 1/(60+3) + 1/(60+1) + 1/(60+3) = 0.01587 + 0.01639 + 0.01587 = 0.04813
Doc D: 1/(60+4) + 1/(60+2)            = 0.01563 + 0.01613           = 0.03176
Doc E: 1/(60+5) + 1/(60+3)            = 0.01538 + 0.01587           = 0.03125
Doc F: 1/(60+5) + 1/(60+4)            = 0.01538 + 0.01563           = 0.03101

RRF ranking: Doc A > Doc C > Doc B > Doc D > Doc E > Doc F

Doc A wins because it ranks highly in all three lists. Doc C comes second, helped by its #1 rank in the second list. Docs D and E each appear in only two lists; Doc D edges out Doc E because its rank positions (4 and 2) are slightly better than Doc E's (5 and 3).
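The whole computation fits in a few lines. A minimal Python sketch of the algorithm (the `rrf_merge` name is ours for illustration, not an LM-Kit.NET API), reproducing the worked example:

```python
from collections import defaultdict

def rrf_merge(ranked_lists, k=60):
    """Merge best-first lists of document IDs with Reciprocal Rank Fusion."""
    scores = defaultdict(float)
    for ranked in ranked_lists:
        for rank, doc in enumerate(ranked, start=1):  # ranks are 1-based
            scores[doc] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# The three result lists from the example above.
lists = [
    ["A", "B", "C", "D", "E"],
    ["C", "A", "E", "B", "F"],
    ["A", "D", "C", "F", "B"],
]
print(rrf_merge(lists))  # → ['A', 'C', 'B', 'D', 'E', 'F']
```

Note that Python's stable sort breaks exact ties by first-seen order; a production system might prefer to break ties by best original rank instead.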

Why k = 60?

The constant k controls how much rank position matters:

  • Small k (e.g., 1): Top ranks are weighted much more heavily. The difference between rank 1 and rank 2 is enormous.
  • Large k (e.g., 1000): All ranks are weighted nearly equally. Being #1 is barely better than being #10.
  • k = 60: The original paper's recommended value. Provides a smooth weighting where top ranks matter more but lower ranks still contribute meaningfully. This value works well across a wide range of applications and rarely needs adjustment.
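The effect of k is easiest to see by comparing the weight of a #1 hit against a #2 hit at different values (a quick sketch; any adjacent rank pair behaves the same way):

```python
# How much more a rank-1 hit contributes than a rank-2 hit, for various k:
# ratio = (1/(k+1)) / (1/(k+2)) = (k+2)/(k+1).
ratios = {}
for k in (1, 60, 1000):
    ratios[k] = (k + 2) / (k + 1)
    print(f"k={k:>4}: rank-1 weight / rank-2 weight = {ratios[k]:.3f}")
# k=   1: rank-1 weight / rank-2 weight = 1.500
# k=  60: rank-1 weight / rank-2 weight = 1.016
# k=1000: rank-1 weight / rank-2 weight = 1.001
```

At k = 1 a top hit is worth 50% more than the runner-up; at k = 60 the gradient is gentle but still strictly decreasing.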

Why RRF Matters

  1. Score Normalization is Unnecessary: Different retrieval methods produce scores on different scales (cosine similarity ranges from -1 to 1, BM25 produces unbounded scores). RRF uses rank positions only, making it agnostic to score scales.

  2. Rewards Consensus: Documents that appear in multiple result lists receive higher combined scores. This naturally surfaces documents that are broadly relevant across different phrasings or retrieval methods.

  3. No Training Required: Unlike learned fusion methods (which require labeled training data), RRF works out of the box with a single constant (k = 60). This makes it practical for any application without a training pipeline.

  4. Handles Missing Results Gracefully: If a document appears in only one out of five result lists, it simply receives a score from that one list. No special handling is needed for partial overlap.

  5. Proven Effectiveness: RRF has been extensively tested and is used in production by major search engines and RAG frameworks. It consistently performs comparably to or better than learned fusion methods despite its simplicity.


Technical Insights

RRF in a Multi-Query RAG Pipeline

The most common use of RRF in RAG is merging results from multi-query retrieval:

[Multi-Query Generation]
    ↓
  Q1: "memory optimization for LLMs"
  Q2: "reducing VRAM usage during inference"
  Q3: "model compression techniques for deployment"
    ↓
[Parallel Retrieval]
    ↓
  Results_Q1: [A:r1, B:r2, C:r3, D:r4, E:r5]
  Results_Q2: [C:r1, F:r2, A:r3, G:r4, B:r5]
  Results_Q3: [H:r1, A:r2, D:r3, C:r4, I:r5]
    ↓
[RRF Merge]
    ↓
  Combined: [A, C, B, D, H, F, ...]
    ↓
[MMR Diversity Filter] (optional)
    ↓
[Rerank] (optional)
    ↓
[Generate Answer]
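The RRF merge step in this pipeline can be checked numerically against the diagram's lists (a standalone sketch, not LM-Kit.NET code):

```python
from collections import defaultdict

# The per-query result lists from the diagram, best-first.
results = {
    "Q1": ["A", "B", "C", "D", "E"],
    "Q2": ["C", "F", "A", "G", "B"],
    "Q3": ["H", "A", "D", "C", "I"],
}

# RRF merge with k = 60: sum 1/(60 + rank) over every list a doc appears in.
scores = defaultdict(float)
for ranked in results.values():
    for rank, doc in enumerate(ranked, start=1):
        scores[doc] += 1.0 / (60 + rank)

combined = sorted(scores, key=scores.get, reverse=True)
print(combined[:6])  # → ['A', 'C', 'B', 'D', 'H', 'F']
```

A and C lead because all three query variants retrieved them; single-list hits like H and F survive, but rank below the consensus documents.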

Hybrid Search: Semantic + Keyword Fusion

RRF is also the standard approach for hybrid search, combining dense embedding-based retrieval with sparse keyword-based retrieval (BM25):

Dense retrieval (semantic):
  Query embedding → Vector similarity search → Ranked by cosine similarity

Sparse retrieval (keyword):
  Query terms → BM25/TF-IDF → Ranked by term frequency scores

RRF merges both:
  Documents found by both methods rank highest
  Documents found by only one method still appear, ranked lower

This is particularly valuable because semantic and keyword search have complementary strengths:

Aspect                  Semantic (Dense)          Keyword (Sparse)
Synonyms                Handles well              Misses unless exact match
Exact terms             May miss specific terms   Handles perfectly
Typos                   Robust                    Sensitive
Rare domain terms       May not encode well       Matches exactly
Conceptual similarity   Strong                    Weak
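A toy end-to-end sketch of hybrid fusion, with invented documents, a keyword-overlap stand-in for BM25, and hard-coded similarities standing in for real embeddings:

```python
docs = {
    "d1": "reduce vram usage during llm inference",
    "d2": "model compression and quantization techniques",
    "d3": "cooking pasta with less water",
}
query = "lower vram when running llm inference"

# Sparse side: rank by keyword overlap with the query (toy BM25 stand-in).
def overlap(text):
    return len(set(query.split()) & set(text.split()))
sparse = sorted(docs, key=lambda d: overlap(docs[d]), reverse=True)

# Dense side: hypothetical cosine similarities (stand-ins for embeddings).
dense_scores = {"d1": 0.91, "d2": 0.74, "d3": 0.12}
dense = sorted(docs, key=dense_scores.get, reverse=True)

# RRF merge: ranks only, so the 0-3 overlap counts and the 0-1 cosines
# never need to be put on a common scale.
k = 60
rrf = {d: 1/(k + sparse.index(d) + 1) + 1/(k + dense.index(d) + 1)
       for d in docs}
ranking = sorted(docs, key=rrf.get, reverse=True)
print(ranking)  # → ['d1', 'd2', 'd3']
```

d1 tops both rankings and therefore dominates the fused list; swapping in real BM25 and embedding scores changes only how the two input rankings are produced, not the merge.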

RRF Properties

  • Commutative: The order in which you merge the lists does not matter
  • Monotonic: If a document improves its rank in any list, its RRF score improves
  • Bounded: Each list contributes at most 1/(k+1) per document
  • Linear scaling: Computation is O(N × L) where N is total documents and L is number of lists
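Each of these properties can be verified in a few lines (a quick sanity-check sketch):

```python
k = 60

# Bounded: a list's contribution per document peaks at 1/(k + 1), the #1 rank.
assert max(1 / (k + r) for r in range(1, 101)) == 1 / (k + 1)

# Monotonic: moving up one rank in any list strictly raises the contribution.
assert 1 / (k + 3) > 1 / (k + 4)

# Commutative: the order the lists are merged in never changes the total.
def rrf_score(doc, lists):
    return sum(1 / (k + l.index(doc) + 1) for l in lists if doc in l)

lists = [["A", "B"], ["B", "A"]]
assert rrf_score("A", lists) == rrf_score("A", lists[::-1])
```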

When to Use RRF

Scenario                                         Use RRF?
Multi-query retrieval (3-4 query variants)       Yes, the standard approach
Hybrid search (semantic + keyword)               Yes, the standard approach
Multi-index search (different knowledge bases)   Yes, combines results naturally
Single query, single retriever                   No, nothing to merge
Different embedding models on same data          Yes, leverages model diversity

RRF vs. Other Fusion Methods

Method          Approach                                   Pros                                  Cons
RRF             Rank-based fusion with constant k          No training, score-agnostic, robust   Fixed weighting of sources
CombSUM         Sum of normalized scores                   Considers score magnitude             Requires score normalization
CombMNZ         CombSUM × number of lists containing doc   Rewards consensus more strongly       Requires score normalization
Learned fusion  ML model trained on relevance labels       Optimal weighting                     Requires training data

RRF's advantage is that it works well without any configuration, making it the default choice when you do not have labeled training data for learning fusion weights.
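The contrast is easiest to see side by side. A toy sketch of the three unsupervised methods from the table, with invented score lists (unbounded BM25-like scores and bounded cosine-like scores):

```python
lists = [
    {"A": 12.0, "B": 7.5, "C": 3.1},    # e.g. unbounded BM25 scores
    {"A": 0.82, "C": 0.79, "D": 0.40},  # e.g. bounded cosine similarities
]
docs = {d for l in lists for d in l}

# CombSUM / CombMNZ must first min-max normalize each list to [0, 1].
def minmax(scores):
    lo, hi = min(scores.values()), max(scores.values())
    return {d: (s - lo) / (hi - lo) for d, s in scores.items()}

comb_sum = {d: sum(minmax(l).get(d, 0.0) for l in lists) for d in docs}
comb_mnz = {d: comb_sum[d] * sum(d in l for l in lists) for d in docs}

# RRF needs no normalization: only each list's internal ordering is used.
k = 60
def rank(l, d):
    return sorted(l, key=l.get, reverse=True).index(d) + 1
rrf = {d: sum(1 / (k + rank(l, d)) for l in lists if d in l) for d in docs}
```

All three methods agree on the consensus winner here; CombMNZ additionally doubles the fused score of documents found in both lists, which is its stronger consensus reward, but both Comb variants break down if the raw scales cannot be sensibly normalized.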


Practical Use Cases

  • Multi-Query RAG: The primary use case. When multi-query retrieval generates 3-4 query variants, RRF merges the results into a single ranked list that benefits from the diversity of all variants. See Build RAG Pipeline.

  • Hybrid Search Systems: Combining dense embedding search with keyword-based BM25 search using RRF produces results that are both semantically relevant and keyword-precise.

  • Multi-Collection Search: When a knowledge base spans multiple collections or databases (e.g., product docs, support tickets, and blog posts), RRF merges results from each collection into a unified ranking. See Build Private Document Q&A.

  • Ensemble Retrieval: Using multiple embedding models (different sizes, different training) to search the same corpus, then merging with RRF, produces more robust retrieval than any single model.

  • Cross-Lingual Retrieval: When documents exist in multiple languages, running queries in each language and merging with RRF captures relevant documents regardless of language.


Key Terms

  • Reciprocal Rank Fusion (RRF): An algorithm that merges multiple ranked lists by assigning each document a score of 1/(k + rank) from each list and summing across lists.

  • Rank Fusion: The general problem of combining multiple ranked result lists into a single unified ranking.

  • k Constant: The damping parameter in RRF (default: 60) that controls how steeply scores decrease with rank position.

  • Hybrid Search: Combining semantic (dense vector) and keyword (sparse) retrieval methods, typically merged with RRF.

  • Consensus Boosting: The property of RRF where documents appearing in multiple lists receive higher combined scores, surfacing broadly relevant results.

  • Score Agnostic: RRF's key property of using only rank positions, not raw similarity scores, making it compatible with any retrieval method regardless of score scale.


Related APIs

  • PdfChat: PDF-based RAG using RRF for multi-query result merging
  • RagEngine: Core RAG engine with RRF-based multi-query support
  • MultiQueryOptions: Configuration for multi-query retrieval that uses RRF internally



Summary

Reciprocal Rank Fusion (RRF) is the standard algorithm for combining multiple ranked result lists in RAG pipelines. By scoring each document as 1/(k + rank) and summing across lists, RRF produces a merged ranking that rewards documents appearing highly in multiple sources. Its key strengths are simplicity (one constant, k = 60), score agnosticism (works with any retrieval method regardless of score scale), and robustness (no training data required). LM-Kit.NET uses RRF internally when QueryGenerationMode.MultiQuery is enabled, automatically merging results from multiple query variants. In a production RAG pipeline, RRF sits between retrieval and generation: multi-query generates variants, parallel retrieval produces multiple result lists, RRF merges them, MMR ensures diversity, and reranking refines precision. This combination delivers comprehensive, relevant, and diverse context to the LLM for accurate answer generation.